Architecture Overview
This document provides a high-level overview of the kup6s.com Kubernetes cluster architecture, explaining the key components, their relationships, and the design decisions behind the system.
System Design Philosophy
The kup6s.com infrastructure is built on three core principles:
Infrastructure as Code (IaC): All infrastructure is defined declaratively using OpenTofu and CDK8S
GitOps: Application deployments managed through ArgoCD, syncing from git repositories
Layered Architecture: Clear separation between infrastructure tier (bootstrap) and application tier
Cluster Infrastructure (kube-hetzner/)
Platform Foundation
Kubernetes Distribution: K3S - lightweight, production-ready Kubernetes
Cloud Provider: Hetzner Cloud (cost-effective European provider)
Provisioning Tool: kube-hetzner Terraform module
IaC Tool: OpenTofu v2.17.4 (open-source Terraform fork)
Key Infrastructure Components
Deployed via OpenTofu extra-manifests during cluster bootstrap:
Storage Layer:
Longhorn: Cloud-native distributed block storage with replication and snapshots
SMB CSI Driver: Integration with Hetzner Storage Box for shared file storage
Crossplane: Dynamic S3 bucket provisioning on Hetzner Object Storage
Networking Layer:
Traefik: Ingress controller for HTTP/HTTPS routing
cert-manager: Automatic Let’s Encrypt TLS certificate management
GitOps Layer:
ArgoCD: Continuous deployment engine, syncs applications from git to cluster
Security Layer:
External Secrets Operator (ESO): Centralized secret management, syncs from external stores
Automation:
Auto-Uncordon CronJob: Automatically detects and uncordons nodes stuck in SchedulingDisabled state after K3s upgrades (runs every 5 minutes)
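The selection step of such a job can be sketched as follows. This is a minimal illustration, not the CronJob's actual implementation: the node names are hypothetical, and the sketch does not distinguish nodes that were cordoned on purpose from nodes left cordoned by an upgrade. It relies only on the fact that kubectl shows SchedulingDisabled for nodes whose spec.unschedulable field is true.

```typescript
// Minimal shape of a Kubernetes Node, limited to the fields this sketch needs.
interface NodeLike {
  metadata: { name: string };
  spec: { unschedulable?: boolean };
}

// Standard five-field cron expression for "every 5 minutes".
const UNCORDON_SCHEDULE = "*/5 * * * *";

// Return the names of nodes that are currently cordoned
// (shown by kubectl with the status "SchedulingDisabled").
function nodesToUncordon(nodes: NodeLike[]): string[] {
  return nodes
    .filter((node) => node.spec.unschedulable === true)
    .map((node) => node.metadata.name);
}

// Hypothetical node names, for illustration only.
const exampleNodes: NodeLike[] = [
  { metadata: { name: "agent-1" }, spec: {} },
  { metadata: { name: "agent-2" }, spec: { unschedulable: true } },
];

// Names that a job like this would pass to "kubectl uncordon".
const cordonedNames: string[] = nodesToUncordon(exampleNodes);
```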
Credential Management Philosophy
CRITICAL: All credentials are managed via environment variables (TF_VAR_*):
The kube.tf file contains no hardcoded credentials
.env file stores credentials (git-ignored for security)
Before running OpenTofu commands: source .env
.env.example serves as template for required variables
See Security Model for detailed security architecture.
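A pre-flight check for this convention might look like the sketch below. The variable names are hypothetical placeholders (the real list is whatever .env.example declares), and the helper is an illustration rather than part of the repository.

```typescript
// Hypothetical variable names; the real list lives in .env.example.
const REQUIRED_TF_VARS: string[] = [
  "TF_VAR_example_token",
  "TF_VAR_example_ssh_key",
];

// Environment as a plain name-to-value map, so the check stays testable.
type Env = { [name: string]: string | undefined };

// Return the required variables that are unset or blank.
function missingTfVars(env: Env, required: string[]): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example: only one of the two hypothetical variables is set.
const missingExample: string[] = missingTfVars(
  { TF_VAR_example_token: "dummy-value" },
  REQUIRED_TF_VARS,
);
```

In a Node.js script, the same function could be called with process.env after sourcing .env, aborting before tofu plan if the returned list is not empty.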
Application Deployments (argoapps/)
CDK8S for ArgoCD Applications
Uses CDK8S (Cloud Development Kit for Kubernetes) with TypeScript to define ArgoCD Applications as code:
Type-safe: Full IDE autocomplete and compile-time validation
Programmatic: Generate manifests using real programming language
DRY: Share common configuration patterns across applications
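The type-safe and DRY points can be illustrated with a small sketch. The field names below are assumptions made for illustration; the real AppConfig interface is defined in apps/config.ts and may look different. The example URL is a placeholder.

```typescript
// Hypothetical fields; the real AppConfig interface lives in apps/config.ts.
interface AppConfigSketch {
  name: string;
  repoURL: string;
  path: string;
  targetRevision: string;
  namespace: string;
}

// Per-application input: required fields plus optional overrides.
interface AppOverrides {
  name: string;
  repoURL: string;
  path: string;
  targetRevision?: string;
  namespace?: string;
}

// Fill in shared defaults (assumed values) once, instead of repeating
// them in every application definition.
function defineApp(overrides: AppOverrides): AppConfigSketch {
  return {
    name: overrides.name,
    repoURL: overrides.repoURL,
    path: overrides.path,
    targetRevision: overrides.targetRevision ?? "HEAD",
    namespace: overrides.namespace ?? overrides.name,
  };
}

// A misspelled or missing field here would fail at compile time.
const myApp: AppConfigSketch = defineApp({
  name: "myapp",
  repoURL: "https://example.invalid/placeholder/myapp.git",
  path: "manifests",
});
```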
Directory Structure
argoapps/
├── main.ts              # Entry point that synthesizes all charts
├── apps/
│   ├── registry.ts      # Central registry of all application charts
│   ├── config.ts        # AppConfig interface definition
│   ├── kup/             # KUP organization apps
│   ├── programmatic/    # Programmatic organization apps
│   └── infra/           # Infrastructure apps
├── dist/                # Generated Kubernetes manifests (git-ignored)
└── imports/             # Generated K8S API types (cdk8s import)
Workflow
1. Define ArgoCD Application in TypeScript (e.g., apps/kup/myapp.ts)
2. Register in apps/registry.ts
3. Build: npm run build → generates dist/myapp.k8s.yaml
4. Apply: kubectl apply -f dist/myapp.k8s.yaml
5. ArgoCD syncs application from external git repository
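The relationship between registered charts and the files to apply can be sketched as below. The registry contents are hypothetical, and the helper names are illustrative; only the dist/myapp.k8s.yaml naming is taken from the workflow above.

```typescript
// Hypothetical chart IDs; the real list lives in apps/registry.ts.
const registeredCharts: string[] = ["myapp", "otherapp"];

// One output file per chart, following the dist/<id>.k8s.yaml naming
// shown in the workflow.
function outputFile(chartId: string, outdir: string = "dist"): string {
  return `${outdir}/${chartId}.k8s.yaml`;
}

// The kubectl command for each generated file.
function applyCommands(chartIds: string[]): string[] {
  return chartIds.map((id) => `kubectl apply -f ${outputFile(id)}`);
}

const commands: string[] = applyCommands(registeredCharts);
```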
Infrastructure Deployments (dp-infra/)
Separate Repository for CDK8S Applications
Repository: git@git.bluedynamics.eu:kup6s/dp/dp-infra.git
This repository contains CDK8S-based infrastructure application deployments. Unlike argoapps/ (which defines ArgoCD Applications), dp-infra/ contains the actual application manifests.
Directory Structure
dp-infra/
├── monitoring/          # Monitoring stack (Prometheus, Thanos, Loki, Grafana)
│   ├── charts/          # CDK8S TypeScript source code
│   ├── manifests/       # Generated K8S manifests (committed to git)
│   ├── config.yaml      # Central configuration
│   └── main.ts          # Entry point
├── gitlabbda/           # GitLab BDA platform deployment
│   ├── charts/          # CDK8S TypeScript source code
│   ├── manifests/       # Generated manifests (committed to git)
│   └── config.yaml      # Configuration
└── cnpg/                # CloudNativePG operator deployment
    ├── charts/
    ├── manifests/
    └── config.yaml
How It Works
1. Development: Edit TypeScript constructs in charts/ directory
2. Build: npm run build generates manifests in manifests/ directory
3. Commit: Manifests are committed to git (ArgoCD reads from git)
4. ArgoCD Sync: ArgoCD Application (defined in argoapps/) points to dp-infra/ repository:
   repoURL: https://git.bluedynamics.eu/kup6s/dp/dp-infra.git
   path: monitoring/manifests (for monitoring stack)
5. Automatic Deployment: ArgoCD syncs changes from git to cluster
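The resulting ArgoCD Application could look roughly like the object built below. The repoURL and path come from the steps above. The namespace, project, target revision, destination and sync policy are assumptions for illustration and may differ from what argoapps/ actually generates.

```typescript
// Subset of the ArgoCD Application resource used in this sketch.
interface ArgoApplication {
  apiVersion: "argoproj.io/v1alpha1";
  kind: "Application";
  metadata: { name: string; namespace: string };
  spec: {
    project: string;
    source: { repoURL: string; path: string; targetRevision: string };
    destination: { server: string; namespace: string };
    syncPolicy: { automated: { prune: boolean; selfHeal: boolean } };
  };
}

// Build an Application for one dp-infra deployment directory.
function dpInfraApplication(deployment: string): ArgoApplication {
  return {
    apiVersion: "argoproj.io/v1alpha1",
    kind: "Application",
    metadata: { name: deployment, namespace: "argocd" }, // assumed namespace
    spec: {
      project: "default", // assumed project
      source: {
        repoURL: "https://git.bluedynamics.eu/kup6s/dp/dp-infra.git",
        path: `${deployment}/manifests`, // committed, generated manifests
        targetRevision: "HEAD", // assumed revision
      },
      destination: {
        server: "https://kubernetes.default.svc", // in-cluster API server
        namespace: deployment, // assumed target namespace
      },
      // Assumed policy: sync automatically and correct drift.
      syncPolicy: { automated: { prune: true, selfHeal: true } },
    },
  };
}

const monitoringApp: ArgoApplication = dpInfraApplication("monitoring");
```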
Why Commit Manifests?
ArgoCD syncs from git and, in this setup, does not run npm run build itself, so the generated manifests must be committed. This approach:
✅ Enables GitOps workflow
✅ Provides manifest versioning
✅ Allows ArgoCD to track drift
✅ Supports rollback to previous versions
Documentation (documentation/)
Sphinx + MyST + Diátaxis
Build System: Sphinx with MyST Markdown parser
Framework: Diátaxis (tutorials, how-to, explanation, reference)
Build Tool: mxmake
Documentation Structure
Cluster-Level Explanations (documentation/sources/explanation/):
Universal concepts: CDK8S, ArgoCD, storage tiers, security model
Referenced by multiple deployments
Single source of truth
Deployment-Specific Documentation (documentation/sources/deployments/):
Each deployment has its own Diátaxis-structured docs
Focus only on deployment-specific details
Reference cluster-level explanations for shared concepts
Key Principle: Centralize common concepts, keep deployment docs focused and concise.
Workspace Structure
Meta Repository Pattern
This workspace uses a meta repository pattern for managing multiple independent git repositories:
Workspace Repository (workspace-kup6s):
Tracks workspace-level files: CLAUDE.md, repos.yaml, scripts/, README.md
Ignores subrepo directories (they have their own git history)
Subrepos (independent git repositories):
kube-hetzner/ - Cluster infrastructure (OpenTofu)
argoapps/ - ArgoCD Application definitions (CDK8S)
dp-infra/ - Infrastructure deployments (CDK8S)
documentation/ - Sphinx documentation
Advantages:
No git submodule/subtree complexity
Subrepos remain fully independent
Can clone/use subrepos separately
Version control for workspace-level automation
Onboarding:
git clone git@git.bluedynamics.eu:kup6s/workspace-kup6s.git kup6s
cd kup6s
./scripts/clone-all.sh
See Workspace Structure for detailed explanation.
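What a clone-all step does in a meta repository can be sketched as follows. The contents of repos.yaml and scripts/clone-all.sh are not shown in this document, so the data shape and helper below are assumptions. Only the dp-infra URL appears in this document; the second URL is a placeholder.

```typescript
// Assumed shape of one subrepo entry; the real repos.yaml may differ.
interface Subrepo {
  directory: string;
  url: string;
}

const subrepos: Subrepo[] = [
  { directory: "dp-infra", url: "git@git.bluedynamics.eu:kup6s/dp/dp-infra.git" },
  // Placeholder URL, for illustration only.
  { directory: "argoapps", url: "git@example.invalid:placeholder/argoapps.git" },
];

// Produce clone commands for subrepos that are not present yet,
// so the step can be re-run safely.
function clonePlan(repos: Subrepo[], alreadyCloned: string[]): string[] {
  return repos
    .filter((repo) => alreadyCloned.indexOf(repo.directory) === -1)
    .map((repo) => `git clone ${repo.url} ${repo.directory}`);
}

// Example: argoapps/ already exists, so only dp-infra is cloned.
const plan: string[] = clonePlan(subrepos, ["argoapps"]);
```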
Component Relationships
┌─────────────────────────────────────────────────────────────┐
│                    Developer Workstation                    │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. Edit kube-hetzner/*.tf                                  │
│     source .env && tofu apply                               │
│     └──> Provisions/updates cluster infrastructure          │
│                                                             │
│  2. Edit argoapps/apps/*.ts                                 │
│     npm run build && kubectl apply -f dist/                 │
│     └──> Deploys ArgoCD Application definitions             │
│                                                             │
│  3. Edit dp-infra/monitoring/charts/*.ts                    │
│     npm run build && git commit manifests/ && git push      │
│     └──> ArgoCD auto-syncs from git to cluster              │
│                                                             │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│             Kubernetes Cluster (K3S on Hetzner)             │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Infrastructure Tier (OpenTofu-managed):                    │
│  ┌────────────────────────────────────────────────────┐     │
│  │ Storage:    Longhorn, SMB CSI, Crossplane          │     │
│  │ Networking: Traefik, cert-manager                  │     │
│  │ GitOps:     ArgoCD                                 │     │
│  │ Secrets:    External Secrets Operator              │     │
│  └────────────────────────────────────────────────────┘     │
│                            │                                │
│                            ▼                                │
│  Application Tier (ArgoCD-managed):                         │
│  ┌────────────────────────────────────────────────────┐     │
│  │ Monitoring:   Prometheus, Thanos, Loki, Grafana    │     │
│  │ Databases:    CloudNativePG operator, PostgreSQL   │     │
│  │ Applications: GitLab BDA, Mailu, etc.              │     │
│  │ S3 Buckets:   Crossplane-managed buckets           │     │
│  └────────────────────────────────────────────────────┘     │
│                                                             │
└─────────────────────────────────────────────────────────────┘
Infrastructure Layering
The architecture uses a two-tier approach to separate concerns:
Infrastructure Tier (Bootstrap)
Managed via OpenTofu, deployed during cluster provisioning:
Storage operators (Longhorn, SMB CSI, Crossplane)
Networking (Traefik, cert-manager)
GitOps engine (ArgoCD itself)
Secrets management (ESO)
Rationale: These components must exist before applications can be deployed.
Application Tier
Managed via ArgoCD, deployed after platform is ready:
Database operators (CloudNativePG)
Monitoring stack (Prometheus, Thanos, Loki, Grafana)
Infrastructure applications (GitLab BDA, Mailu)
Application databases and services
S3 buckets (using Crossplane)
Rationale: Applications assume the platform exists and are deployed/updated via GitOps.
See Infrastructure Layering for detailed explanation.
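The ordering rule behind the two tiers can be modeled in a few lines. This is an illustrative sketch with a handful of the components named above, not a description of how the cluster enforces the order.

```typescript
type Tier = "infrastructure" | "application";

interface Component {
  name: string;
  tier: Tier;
}

// A few components from the lists above, deliberately out of order.
const components: Component[] = [
  { name: "Monitoring stack", tier: "application" },
  { name: "ArgoCD", tier: "infrastructure" },
  { name: "CloudNativePG", tier: "application" },
  { name: "Longhorn", tier: "infrastructure" },
];

// Lower rank is deployed first: the platform must exist before applications.
const tierRank: Record<Tier, number> = { infrastructure: 0, application: 1 };

function deploymentOrder(items: Component[]): Component[] {
  // Copy before sorting so the input array is left unchanged.
  return items.slice().sort((a, b) => tierRank[a.tier] - tierRank[b.tier]);
}

const ordered: Component[] = deploymentOrder(components);
```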
Deployment Workflow
Infrastructure Changes
1. Edit OpenTofu configuration in kube-hetzner/
2. Plan: source .env && tofu plan
3. Apply: bash scripts/apply-and-configure-longhorn.sh (for kube.tf changes)
4. Cluster infrastructure updated
Application Deployments
1. Edit CDK8S code in dp-infra/monitoring/charts/
2. Build: npm run build (generates manifests/)
3. Commit: git add manifests/ && git commit && git push
4. ArgoCD auto-syncs from git to cluster
5. Application deployed/updated
Adding New Applications
1. Create ArgoCD Application in argoapps/apps/
2. Build: npm run build
3. Apply: kubectl apply -f dist/myapp.k8s.yaml
4. ArgoCD syncs application from external repository
Monitoring and Observability
Monitoring Stack (deployed via ArgoCD + CDK8S):
Prometheus: Metrics collection (2 replicas, 3-day local retention)
Thanos: Long-term metrics storage (S3, 2-year retention with downsampling)
Grafana: Visualization and dashboards
Loki: Log aggregation (SimpleScalable mode, S3 storage, 31-day retention)
Alloy: Log collection (DaemonSet, one per node)
Alertmanager: Alert routing (email via SMTP)
Management: All monitoring components managed via dp-infra/monitoring/ repository using CDK8S TypeScript.
See Monitoring Architecture for detailed explanation.
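The retention figures above can be kept in one typed structure and converted where a tool expects hour-based durations. The structure and helper are hypothetical and are not taken from dp-infra/monitoring/config.yaml. The 2-year value is approximated as 730 days.

```typescript
// Hypothetical summary of the retention values listed above.
interface RetentionSketch {
  component: string;
  days: number;
}

const retention: RetentionSketch[] = [
  { component: "Prometheus (local)", days: 3 },
  { component: "Loki (S3)", days: 31 },
  { component: "Thanos (S3)", days: 2 * 365 }, // "2-year" approximated as 730 days
];

// Express a retention period as an hour-based duration string.
function toHours(days: number): string {
  return `${days * 24}h`;
}

// One "component: duration" line per entry.
const retentionSummary: string[] = retention.map(
  (entry) => `${entry.component}: ${toHours(entry.days)}`,
);
```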
Key Design Decisions
Why OpenTofu Instead of Terraform?
OpenTofu is an open-source fork of Terraform, created after HashiCorp’s license change. It provides:
✅ True open-source (MPL 2.0 license)
✅ Community-driven development
✅ Drop-in replacement for Terraform
✅ No vendor lock-in
Why CDK8S Instead of Helm/Kustomize?
CDK8S provides:
✅ Full programming language (TypeScript) with type safety
✅ IDE autocomplete and refactoring support
✅ Easier testing and validation
✅ Better code reuse via constructs
✅ Compile-time error detection
Why ArgoCD Instead of Flux/Jenkins?
ArgoCD provides:
✅ Declarative GitOps workflow
✅ Web UI for visualization and troubleshooting
✅ Multi-cluster support (if needed)
✅ Strong RBAC and security
✅ Native Kubernetes resource health checks
Why Meta Repository Instead of Monorepo?
Meta repository provides:
✅ Independent git history per subrepo
✅ No git submodule/subtree complexity
✅ Subrepos can be cloned/used independently
✅ ArgoCD references remain simple (no monorepo paths)
✅ Flexible access control (different teams, different repos)
Further Reading
Infrastructure Layering - Bootstrap vs platform vs application tiers
Storage Architecture and Tiers - Multi-tier storage strategy
Security Model - Firewall, credentials, ESO
Resource Management - QoS classes, sizing guidelines
Monitoring Architecture - Metrics and logs collection
CDK8S Infrastructure as Code - CDK8S patterns
ArgoCD GitOps - GitOps principles and workflows
Cluster Capabilities Reference - Complete infrastructure capabilities