Infrastructure Layering
This document explains the layered architecture approach used in the kup6s.com cluster, separating infrastructure bootstrapping from application deployments.
Why Layered Architecture?
Problem: Kubernetes clusters have a chicken-and-egg problem:
Applications need storage (Longhorn, S3)
Applications need networking (Traefik, cert-manager)
Applications need GitOps (ArgoCD)
But these components ARE applications themselves!
Solution: Two-tier architecture with clear separation of concerns:
Infrastructure Tier: Bootstrap essential platform components (OpenTofu-managed)
Application Tier: Deploy applications assuming platform exists (ArgoCD-managed)
Architecture Layers
┌─────────────────────────────────────────────────────┐
│  Developer Workstation                              │
├─────────────────────────────────────────────────────┤
│                                                     │
│  source .env && tofu apply                          │
│                          │                          │
│                          ▼                          │
└─────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────┐
│  INFRASTRUCTURE TIER (Bootstrap)                    │
│  Managed via: OpenTofu (kube-hetzner/)              │
├─────────────────────────────────────────────────────┤
│                                                     │
│  • Storage: Longhorn, SMB CSI Driver                │
│  • Networking: Traefik, cert-manager                │
│  • Provisioning: Crossplane                         │
│  • Secrets: External Secrets Operator               │
│  • GitOps: ArgoCD itself                            │
│                                                     │
│  Rationale: Must exist before apps can deploy       │
└─────────────────────────────────────────────────────┘
                           │
                           ▼ ArgoCD syncs from git repositories
┌─────────────────────────────────────────────────────┐
│  APPLICATION TIER (Platform & Apps)                 │
│  Managed via: ArgoCD (from dp-infra/ repos)         │
├─────────────────────────────────────────────────────┤
│                                                     │
│  Platform Services:                                 │
│  • Database Operators (CloudNativePG)               │
│  • Monitoring Stack (Prometheus, Thanos, Loki)      │
│                                                     │
│  Infrastructure Applications:                       │
│  • GitLab BDA, Mailu, etc.                          │
│                                                     │
│  Application Services:                              │
│  • PostgreSQL databases                             │
│  • Redis, RabbitMQ, etc.                            │
│  • S3 buckets (Crossplane-managed)                  │
│                                                     │
│  Customer Deployments:                              │
│  • Websites, APIs, microservices                    │
│                                                     │
│  Rationale: Assumes platform components exist       │
└─────────────────────────────────────────────────────┘
Infrastructure Tier (Bootstrap)
Purpose
Components that MUST exist before applications can be deployed. These form the platform foundation.
Management
Tool: OpenTofu v2.17.4
Configuration: kube-hetzner/kube.tf and extra-manifests/
Deployment: During initial cluster provisioning
Updates: Infrequent, planned, via tofu apply
Components
Storage Operators:
Longhorn: Cloud-native distributed block storage
Provides PersistentVolumes for applications
Enables replication, snapshots, backups
Required by: Almost all stateful applications
SMB CSI Driver: Hetzner Storage Box integration
Provides shared file storage
Required by: Applications needing shared volumes
Crossplane: Dynamic S3 bucket provisioning
Provisions S3 buckets on demand
Required by: Applications needing object storage
Networking:
Traefik: Ingress controller
Routes HTTP/HTTPS traffic to services
Required by: All public-facing applications
cert-manager: TLS certificate automation
Provisions Let’s Encrypt certificates
Required by: HTTPS endpoints
Cilium: CNI (Container Network Interface)
Pod-to-pod networking
Required by: All cluster communication
Secrets Management:
External Secrets Operator (ESO)
Syncs secrets from external sources
Required by: Applications using centralized secret management
GitOps Engine:
ArgoCD
Syncs applications from git to cluster
Required by: All application deployments in the next tier
Why These Components?
These components are dependencies of everything else:
❌ Can’t deploy database without storage (Longhorn)
❌ Can’t deploy webapp without ingress (Traefik)
❌ Can’t deploy monitoring without ArgoCD (sync from git)
❌ Can’t deploy apps needing secrets without ESO
Therefore: They must be bootstrapped first, before ArgoCD can manage anything.
Update Process
cd kube-hetzner
source .env # Load credentials
# Plan changes (review before applying)
tofu plan
# Apply infrastructure changes
# IMPORTANT: For kube.tf changes, use the script:
bash scripts/apply-and-configure-longhorn.sh
# For other changes (extra-manifests, variables):
tofu apply
CRITICAL: Always use apply-and-configure-longhorn.sh for kube.tf changes to ensure proper Longhorn node configuration.
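The rule above (script for kube.tf changes, plain tofu apply for everything else) can be sketched as a small helper that looks at the list of changed files. This is only an illustration: the function name choose_apply_command is hypothetical and not part of the repository, and it assumes the changed paths arrive on stdin, one per line, for example from git diff --name-only.

```shell
#!/usr/bin/env bash
# Sketch: pick the apply command based on which files changed.
# choose_apply_command is a hypothetical helper, not part of kube-hetzner/.
# Input: changed file paths on stdin, one per line.
choose_apply_command() {
  local path
  while IFS= read -r path || [ -n "$path" ]; do
    case "$path" in
      # Any change to kube.tf must go through the Longhorn-aware script.
      kube.tf|*/kube.tf)
        echo "bash scripts/apply-and-configure-longhorn.sh"
        return 0
        ;;
    esac
  done
  # Everything else (extra-manifests, variables) uses plain tofu apply.
  echo "tofu apply"
}

# Example (inside kube-hetzner/, after `source .env` and `tofu plan`):
#   git diff --name-only HEAD | choose_apply_command
```

The helper only prints the command instead of running it, so the result can still be reviewed before anything touches the cluster.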
Application Tier (Platform & Apps)
Purpose
Applications and services that assume the platform exists. These can be deployed via GitOps because ArgoCD is already running.
Management
Tool: ArgoCD
Source Repositories: dp-infra/*, external repos
Deployment: Automated via ArgoCD sync from git
Updates: Frequent, automated, via git push
Components
Platform Services (infrastructure-level applications):
Database Operators (from dp-infra/cnpg/):
CloudNativePG operator
Barman-cloud plugin
Enables PostgreSQL cluster provisioning
Monitoring Stack (from dp-infra/monitoring/):
Prometheus (metrics collection)
Thanos (long-term metrics storage)
Loki (log aggregation)
Grafana (visualization)
Alloy (log collector)
Alertmanager (alert routing)
Infrastructure Applications (from dp-infra/*/):
GitLab BDA (from dp-infra/gitlabbda/)
Mailu (from dp-infra/mailu/, planned)
Other infrastructure services
Application Services (from external repos):
PostgreSQL databases (using CloudNativePG operator)
Redis, RabbitMQ, Kafka
Custom microservices
S3 buckets (using Crossplane from infrastructure tier)
Customer Deployments:
Websites and web applications
APIs and backend services
Static site generators
Why Application Tier?
These components depend on infrastructure:
✅ Database needs storage (Longhorn from infrastructure tier)
✅ Monitoring needs ingress (Traefik from infrastructure tier)
✅ Apps need ArgoCD (from infrastructure tier to deploy them)
Therefore: They’re deployed via ArgoCD AFTER infrastructure exists.
Update Process
For dp-infra/ repositories (CDK8S-based):
cd dp-infra/monitoring # Or any dp-infra subdirectory
# Edit TypeScript constructs or config.yaml
vim config.yaml
# Build manifests
npm run build
# Commit and push (triggers ArgoCD sync)
git add manifests/ config.yaml
git commit -m "Update monitoring configuration"
git push
# ArgoCD automatically syncs (or manually trigger):
argocd app sync monitoring
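Because ArgoCD syncs the manifests that are committed to the repository, a rebuilt but uncommitted manifests/ directory never reaches the cluster. A minimal guard for that case might look like the sketch below. The function name has_uncommitted_manifests is hypothetical; it assumes the output of git status --porcelain on stdin, where each line has two status characters, a space, and a path relative to the repository root.

```shell
#!/usr/bin/env bash
# Sketch: detect built manifests that have not been committed yet.
# has_uncommitted_manifests is a hypothetical helper, not part of dp-infra/.
# Input: `git status --porcelain` output on stdin ("XY <path>" per line).
# Exit status: 0 if any changed path lies in a manifests/ directory, else 1.
has_uncommitted_manifests() {
  awk '
    { path = substr($0, 4) }                    # drop "XY " status prefix
    path ~ /(^|\/)manifests\// { found = 1 }    # manifests/ at any depth
    END { exit (found ? 0 : 1) }
  '
}

# Example (inside a dp-infra subdirectory, after `npm run build`):
#   if git status --porcelain | has_uncommitted_manifests; then
#     echo "manifests/ has uncommitted changes: commit before pushing" >&2
#   fi
```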
For external repositories:
Push changes to git repository
ArgoCD automatically detects and syncs changes (if auto-sync enabled)
Or manually sync via ArgoCD UI/CLI
Source Repositories
Infrastructure Tier Sources
kube-hetzner/ (OpenTofu):
Repository: git@git.bluedynamics.eu:kup6s/kube-hetzner.git
Contains: kube.tf, extra-manifests/, variable definitions
Deployment: source .env && tofu apply
Application Tier Sources
argoapps/ (ArgoCD Application definitions):
Repository: git@git.bluedynamics.eu:kup6s/argoapps.git
Contains: CDK8S-based ArgoCD Application definitions
Purpose: Defines WHAT to deploy (points to dp-infra/ or external repos)
Deployment: npm run build && kubectl apply -f dist/
dp-infra/ (Infrastructure application manifests):
Repository: git@git.bluedynamics.eu:kup6s/dp/dp-infra.git
Contains: CDK8S-based deployments (monitoring, cnpg, gitlabbda, mailu, etc.)
Purpose: HOW to deploy infrastructure applications
Deployment: ArgoCD syncs from git (manifests committed to repo)
External repositories:
Custom application repositories
Referenced by ArgoCD Applications in argoapps/
Deployment: ArgoCD syncs from git
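To make the split between WHAT (argoapps/) and HOW (dp-infra/ or an external repo) concrete, the sketch below prints the general shape of an ArgoCD Application that points at a path in a git repository. It is not the output of the actual CDK8S code in argoapps/. The function name render_application is hypothetical, and the project name, target revision, destination namespace, automated sync policy, and the example path monitoring/manifests are all assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print an ArgoCD Application manifest for a git-hosted path.
# render_application is a hypothetical helper; the real definitions are
# generated by CDK8S in argoapps/. Values marked "assumed" are illustrative.
#   $1 = application name, $2 = repository URL, $3 = path inside the repo
render_application() {
  local name="$1" repo_url="$2" source_path="$3"
  cat <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ${name}
  namespace: argocd
spec:
  project: default                 # assumed project
  source:
    repoURL: ${repo_url}
    path: ${source_path}
    targetRevision: HEAD           # assumed revision
  destination:
    server: https://kubernetes.default.svc
    namespace: ${name}             # assumed: one namespace per application
  syncPolicy:
    automated: {}                  # assumed: auto-sync enabled
EOF
}

# Example (the path is an assumption about the repository layout):
#   render_application monitoring \
#     git@git.bluedynamics.eu:kup6s/dp/dp-infra.git monitoring/manifests
```

The Application object itself lives in the argocd namespace, while spec.source tells ArgoCD where the manifests to deploy are stored.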
Update Strategy Summary
| Layer | Tool | Frequency | Trigger | Risk |
|---|---|---|---|---|
| Infrastructure Tier | OpenTofu | Infrequent (weeks/months) | Manual | High - affects all apps |
| Database Operators | ArgoCD (dp-infra/cnpg) | Occasional (months) | Git push + ArgoCD sync | Medium - affects databases |
| Monitoring Stack | ArgoCD (dp-infra/monitoring) | Regular (weeks) | Git push + ArgoCD sync | Low - monitoring-only |
| Infrastructure Apps | ArgoCD (dp-infra/*) | Regular (days/weeks) | Git push + ArgoCD sync | Medium - specific apps |
| Application Services | ArgoCD (external repos) | Frequent (daily) | Git push + ArgoCD sync | Low - app-specific |
Risk Management:
High risk (infrastructure): Plan carefully, test in a dev cluster, have a rollback strategy
Medium risk (platform services): Use canary deployments, monitor closely
Low risk (applications): Deploy continuously, roll back quickly if issues arise
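The risk column of the summary can be encoded as a simple lookup, for instance to label a change in a review checklist. The sketch below is illustrative only: the function name risk_for_path is hypothetical, and it assumes paths are given relative to a working directory that contains both the kube-hetzner/ and dp-infra/ checkouts. Anything else is treated as an external application repository.

```shell
#!/usr/bin/env bash
# Sketch: map a changed path to the risk level from the summary table.
# risk_for_path is a hypothetical helper; the path layout is an assumption.
risk_for_path() {
  case "$1" in
    kube-hetzner/*)        echo "high"   ;;  # Infrastructure Tier
    dp-infra/cnpg/*)       echo "medium" ;;  # Database Operators
    dp-infra/monitoring/*) echo "low"    ;;  # Monitoring Stack
    dp-infra/*)            echo "medium" ;;  # Infrastructure Apps
    *)                     echo "low"    ;;  # Application Services
  esac
}

# Example:
#   risk_for_path kube-hetzner/kube.tf
```

The more specific dp-infra/ patterns come first because case uses the first matching branch.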
Benefits of Layering
Separation of Concerns
Infrastructure Team:
Manages platform components (storage, networking, GitOps)
Updates infrequently, with careful planning
Focused on cluster stability
Application Teams:
Deploy applications via ArgoCD
Update frequently via git push
Focused on feature delivery
Risk Isolation
Infrastructure changes don’t mix with application changes
Failed application deployment doesn’t affect platform
Platform stability enables rapid application iteration
Clear Dependencies
Infrastructure provides: Storage, networking, GitOps engine
Applications consume: Storage, networking, GitOps automation
No circular dependencies
Update Independence
Platform updates: Planned, controlled, infrequent
Application updates: Automated, frequent, low-ceremony
Teams don’t block each other
Anti-Patterns to Avoid
❌ Deploying Storage Operators via ArgoCD:
Problem: ArgoCD needs storage to work (PVCs for Redis, etc.)
Chicken-and-egg: Can’t deploy storage provider using storage
Solution: Bootstrap storage via OpenTofu
❌ Deploying ArgoCD via ArgoCD:
Problem: ArgoCD can’t manage itself during initial bootstrap
Bootstrap paradox: ArgoCD doesn’t exist to deploy itself
Solution: Bootstrap ArgoCD via OpenTofu, THEN let it manage apps
❌ Manual kubectl apply for Applications:
Problem: Bypasses GitOps, creates drift between git and cluster
Loses audit trail and versioning
Solution: Always deploy applications via ArgoCD (git → cluster)
❌ Mixing Infrastructure and Application Changes:
Problem: Complex rollback if something fails
Unclear which change caused issues
Solution: Separate commits, separate deployments
Troubleshooting
Infrastructure Change Broke Applications
Symptom: After tofu apply, applications fail or become degraded
Diagnosis:
Check what changed:
cd kube-hetzner
git diff HEAD~1 HEAD
Check affected components:
kubectl get pods -A | grep -v Running
kubectl get events -A --sort-by='.lastTimestamp' | tail -20
Resolution:
Roll back OpenTofu: git revert HEAD && tofu apply
Or fix forward: Address root cause and re-apply
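The grep -v Running filter in the diagnosis step also prints the header row and any pods of finished jobs in the Completed state. A slightly stricter filter could look like the sketch below. The function name unhealthy_pods is hypothetical; it assumes the default tabular output of kubectl get pods -A on stdin, where the fourth column is STATUS.

```shell
#!/usr/bin/env bash
# Sketch: list pods that are neither Running nor Completed.
# unhealthy_pods is a hypothetical helper. Input: the default output of
# `kubectl get pods -A` (columns: NAMESPACE NAME READY STATUS RESTARTS AGE).
unhealthy_pods() {
  # NR > 1 skips the header row; $4 is the STATUS column.
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" {
    print $1 "/" $2 " " $4
  }'
}

# Example:
#   kubectl get pods -A | unhealthy_pods
```

A pod can report Running while its containers are not ready, so the READY column is still worth checking by hand.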
ArgoCD Can’t Sync Applications
Symptom: ArgoCD Applications stuck in “OutOfSync” or “Degraded”
Diagnosis:
Check ArgoCD Application status:
kubectl get applications -n argocd
kubectl describe application <app-name> -n argocd
Check sync errors:
kubectl get application <app-name> -n argocd -o jsonpath='{.status.conditions}'
Common Causes:
Infrastructure tier missing (e.g., ArgoCD not running)
Storage class unavailable (Longhorn not ready)
Network policy blocking (Cilium misconfigured)
Resolution: Fix infrastructure tier first, THEN re-sync applications.
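The "infrastructure first, then re-sync" order can be sketched as a small triage helper that takes an Application's sync and health status and prints a suggested next step. The function name triage_app and the wording of the suggestions are hypothetical. The two inputs correspond to the .status.sync.status and .status.health.status fields of the Application resource.

```shell
#!/usr/bin/env bash
# Sketch: suggest a next step from an Application's sync and health status.
# triage_app is a hypothetical helper; the suggestions paraphrase this section.
#   $1 = sync status   (e.g. Synced, OutOfSync, Unknown)
#   $2 = health status (e.g. Healthy, Progressing, Degraded, Missing)
triage_app() {
  case "$1/$2" in
    Synced/Healthy)       echo "ok" ;;
    */Degraded|*/Missing) echo "check infrastructure tier, then re-sync" ;;
    OutOfSync/*)          echo "re-sync" ;;
    *)                    echo "inspect status conditions" ;;
  esac
}

# Example against a live cluster (requires kubectl access):
#   state="$(kubectl get application <app-name> -n argocd \
#     -o jsonpath='{.status.sync.status} {.status.health.status}')"
#   triage_app $state   # unquoted on purpose: splits into two arguments
```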
New Application Won’t Deploy
Symptom: New ArgoCD Application created, but nothing happens
Diagnosis:
Check Application created:
kubectl get application <app-name> -n argocd
Check Application definition:
kubectl get application <app-name> -n argocd -o yaml
Common Causes:
Application definition not applied (kubectl apply -f dist/app.yaml)
Repository credentials missing (private repo)
Invalid path in Application spec
Resolution: Verify Application manifest, apply with kubectl, check ArgoCD logs.
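For the "invalid path" cause, one quick check is to compare the path in the Application spec with a local checkout of the source repository. The sketch below assumes such a checkout exists. The function name path_exists_in_checkout and the example directory are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: verify that an Application's source path exists in a local checkout.
# path_exists_in_checkout is a hypothetical helper, not an ArgoCD command.
#   $1 = local clone of the repository the Application points to
#   $2 = value of .spec.source.path from the Application
path_exists_in_checkout() {
  local checkout="$1" source_path="$2"
  if [ -d "${checkout}/${source_path}" ]; then
    echo "ok: ${source_path}"
  else
    echo "missing: ${source_path}"
    return 1
  fi
}

# Example against a live cluster (requires kubectl access; the checkout
# location is an assumption):
#   p="$(kubectl get application <app-name> -n argocd \
#     -o jsonpath='{.spec.source.path}')"
#   path_exists_in_checkout "$HOME/src/dp-infra" "$p"
```

This only checks the local working tree, so the checkout should be on the same revision that the Application targets.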
Further Reading
Architecture Overview - Complete system architecture
ArgoCD GitOps - GitOps principles and workflows
Apply Infrastructure Changes - OpenTofu workflow
Deploy Applications with ArgoCD - Application deployment
Monitoring Deployment - Example application tier deployment