Architecture Overview

This document provides a high-level overview of the kup6s.com Kubernetes cluster architecture, explaining the key components, their relationships, and the design decisions behind the system.

System Design Philosophy

The kup6s.com infrastructure is built on three core principles:

Infrastructure as Code (IaC): All infrastructure is defined declaratively using OpenTofu and CDK8S
GitOps: Application deployments managed through ArgoCD, syncing from git repositories
Layered Architecture: Clear separation between infrastructure tier (bootstrap) and application tier

Cluster Infrastructure (kube-hetzner/)

Platform Foundation

Kubernetes Distribution: K3s - lightweight, production-ready Kubernetes
Cloud Provider: Hetzner Cloud (cost-effective European provider)
Provisioning Tool: kube-hetzner Terraform module
IaC Tool: OpenTofu v2.17.4 (open-source Terraform fork)

Key Infrastructure Components

Deployed via OpenTofu extra-manifests during cluster bootstrap:

Storage Layer:

Longhorn: Cloud-native distributed block storage with replication and snapshots
SMB CSI Driver: Integration with Hetzner Storage Box for shared file storage
Crossplane: Dynamic S3 bucket provisioning on Hetzner Object Storage

Networking Layer:

Traefik: Ingress controller for HTTP/HTTPS routing
cert-manager: Automatic Let’s Encrypt TLS certificate management

GitOps Layer:

ArgoCD: Continuous deployment engine, syncs applications from git to cluster

Security Layer:

External Secrets Operator (ESO): Centralized secret management, syncs from external stores

Automation:

Auto-Uncordon CronJob: Automatically detects and uncordons nodes stuck in SchedulingDisabled state after K3s upgrades (runs every 5 minutes)
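The selection logic behind such a job can be sketched as follows. This is a minimal illustration, not the cluster's actual implementation: the rule that a node is "stuck" when it is cordoned but already reports Ready is an assumption, and `NodeLike`, `nodesToUncordon`, and the node names are hypothetical. Only the `spec.unschedulable` field and the `Ready` condition come from the Kubernetes Node API.

```typescript
// Minimal subset of the Kubernetes Node object used by this sketch.
interface NodeCondition {
  type: string;
  status: "True" | "False" | "Unknown";
}

interface NodeLike {
  metadata: { name: string };
  spec: { unschedulable?: boolean };
  status: { conditions: NodeCondition[] };
}

// Standard cron expression for "every 5 minutes".
const SCHEDULE = "*/5 * * * *";

// Assumption: a node counts as "stuck" when it is cordoned
// (spec.unschedulable is true) but already reports Ready again.
function nodesToUncordon(nodes: NodeLike[]): string[] {
  return nodes
    .filter((node) => node.spec.unschedulable === true)
    .filter((node) =>
      node.status.conditions.some(
        (c) => c.type === "Ready" && c.status === "True"
      )
    )
    .map((node) => node.metadata.name);
}

// Hypothetical cluster state: only agent-1 is cordoned and Ready.
const exampleNodes: NodeLike[] = [
  {
    metadata: { name: "agent-1" },
    spec: { unschedulable: true },
    status: { conditions: [{ type: "Ready", status: "True" }] },
  },
  {
    metadata: { name: "agent-2" },
    spec: { unschedulable: true },
    status: { conditions: [{ type: "Ready", status: "False" }] },
  },
  {
    metadata: { name: "agent-3" },
    spec: {},
    status: { conditions: [{ type: "Ready", status: "True" }] },
  },
];

const selected = nodesToUncordon(exampleNodes);
```

A rule this simple would also uncordon nodes that an operator cordoned on purpose, so a real job would likely need an additional marker (for example a label or annotation) to tell the two cases apart.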

Credential Management Philosophy

CRITICAL: All credentials are managed via environment variables (TF_VAR_*):

The kube.tf file contains no hardcoded credentials
.env file stores credentials (git-ignored for security)
Before running OpenTofu commands: source .env
.env.example serves as template for required variables

See Security Model for the detailed security architecture.
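A pre-flight check for this convention might look like the sketch below. OpenTofu maps an environment variable `TF_VAR_<name>` to the input variable `<name>`; everything else here is an assumption for illustration. The variable name `TF_VAR_hcloud_token` and the helper `missingTfVars` are hypothetical, and the authoritative list of required variables is whatever .env.example declares.

```typescript
// Hypothetical list of required variables. The real list is defined
// by .env.example in the kube-hetzner/ repository.
const REQUIRED_TF_VARS: string[] = ["TF_VAR_hcloud_token"];

// Returns the names of required variables that are unset or blank
// in the given environment.
function missingTfVars(
  env: Record<string, string | undefined>,
  required: string[]
): string[] {
  return required.filter((variable) => {
    const value = env[variable];
    return value === undefined || value.trim() === "";
  });
}

// Example: a blank value is treated the same as a missing one.
const missingVars = missingTfVars(
  { TF_VAR_hcloud_token: "" },
  REQUIRED_TF_VARS
);
```

Running a check like this before `tofu plan` catches a forgotten `source .env` early, instead of failing partway through a plan.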

Application Deployments (argoapps/)

CDK8S for ArgoCD Applications

argoapps/ uses CDK8S (Cloud Development Kit for Kubernetes) with TypeScript to define ArgoCD Applications as code:

Type-safe: Full IDE autocomplete and compile-time validation
Programmatic: Generate manifests using a real programming language
DRY: Share common configuration patterns across applications
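To make the idea concrete, here is a dependency-free sketch of generating an ArgoCD Application manifest from a typed configuration object. It deliberately avoids the cdk8s library so that it stays self-contained; the real repository uses CDK8S constructs. The fields of `AppConfig`, the `argoApplication` helper, the `default` project, the `monitoring` namespace, and the sync policy are assumptions. The `apiVersion`, `kind`, and `spec` field names follow the ArgoCD Application resource.

```typescript
// Hypothetical shape; the real AppConfig interface lives in apps/config.ts.
interface AppConfig {
  name: string;
  repoURL: string;
  path: string;
  targetNamespace: string;
  targetRevision?: string;
}

interface ArgoApplication {
  apiVersion: "argoproj.io/v1alpha1";
  kind: "Application";
  metadata: { name: string; namespace: string };
  spec: {
    project: string;
    source: { repoURL: string; path: string; targetRevision: string };
    destination: { server: string; namespace: string };
    syncPolicy: { automated: { prune: boolean; selfHeal: boolean } };
  };
}

// Builds the manifest object. A misspelled or missing field in the
// config is rejected by the TypeScript compiler before any YAML exists.
function argoApplication(config: AppConfig): ArgoApplication {
  return {
    apiVersion: "argoproj.io/v1alpha1",
    kind: "Application",
    metadata: { name: config.name, namespace: "argocd" },
    spec: {
      project: "default", // assumed project name
      source: {
        repoURL: config.repoURL,
        path: config.path,
        targetRevision: config.targetRevision ?? "HEAD",
      },
      destination: {
        server: "https://kubernetes.default.svc", // in-cluster API server
        namespace: config.targetNamespace,
      },
      syncPolicy: { automated: { prune: true, selfHeal: true } },
    },
  };
}

const monitoringApp = argoApplication({
  name: "monitoring",
  repoURL: "https://git.bluedynamics.eu/kup6s/dp/dp-infra.git",
  path: "monitoring/manifests",
  targetNamespace: "monitoring", // assumed namespace
});
```

Sharing one builder function across applications is what the DRY point above refers to: defaults such as the sync policy are defined once and overridden only where an application differs.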

Directory Structure

argoapps/
├── main.ts              # Entry point that synthesizes all charts
├── apps/
│   ├── registry.ts      # Central registry of all application charts
│   ├── config.ts        # AppConfig interface definition
│   ├── kup/             # KUP organization apps
│   ├── programmatic/    # Programmatic organization apps
│   └── infra/           # Infrastructure apps
├── dist/                # Generated Kubernetes manifests (git-ignored)
└── imports/             # Generated K8S API types (cdk8s import)

Workflow

1. Define ArgoCD Application in TypeScript (e.g., apps/kup/myapp.ts)
2. Register in apps/registry.ts
3. Build: npm run build → generates dist/myapp.k8s.yaml
4. Apply: kubectl apply -f dist/myapp.k8s.yaml
5. ArgoCD syncs application from external git repository
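The relationship between the registry, the build output, and the apply step can be sketched as follows. The `RegisteredApp` type and both helper functions are hypothetical and do not reflect the actual contents of apps/registry.ts; only the `dist/<name>.k8s.yaml` naming and the `kubectl apply -f` command come from the workflow above.

```typescript
// Hypothetical registry entry; the real registry holds CDK8S charts.
interface RegisteredApp {
  name: string;
  organization: "kup" | "programmatic" | "infra";
}

const registry: RegisteredApp[] = [{ name: "myapp", organization: "kup" }];

// One manifest file is generated per registered application.
function outputFile(app: RegisteredApp): string {
  return `dist/${app.name}.k8s.yaml`;
}

// The apply step is then one kubectl command per generated file.
function applyCommands(apps: RegisteredApp[]): string[] {
  return apps.map((app) => `kubectl apply -f ${outputFile(app)}`);
}

const commands = applyCommands(registry);
```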

Infrastructure Deployments (dp-infra/)

Separate Repository for CDK8S Applications

Repository: git@git.bluedynamics.eu:kup6s/dp/dp-infra.git

This repository contains CDK8S-based infrastructure application deployments. Unlike argoapps/ (which defines ArgoCD Applications), dp-infra/ contains the actual application manifests.

Directory Structure

dp-infra/
├── monitoring/          # Monitoring stack (Prometheus, Thanos, Loki, Grafana)
│   ├── charts/          # CDK8S TypeScript source code
│   ├── manifests/       # Generated K8S manifests (committed to git)
│   ├── config.yaml      # Central configuration
│   └── main.ts          # Entry point
├── gitlabbda/           # GitLab BDA platform deployment
│   ├── charts/          # CDK8S TypeScript source code
│   ├── manifests/       # Generated manifests (committed to git)
│   └── config.yaml      # Configuration
└── cnpg/                # CloudNativePG operator deployment
    ├── charts/
    ├── manifests/
    └── config.yaml

How It Works

1. Development: Edit TypeScript constructs in charts/ directory
2. Build: npm run build generates manifests in manifests/ directory
3. Commit: Manifests are committed to git (ArgoCD reads from git)
4. ArgoCD Sync: ArgoCD Application (defined in argoapps/) points to dp-infra/ repository:
   - repoURL: https://git.bluedynamics.eu/kup6s/dp/dp-infra.git
   - path: monitoring/manifests (for monitoring stack)
5. Automatic Deployment: ArgoCD syncs changes from git to cluster
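The path convention implied by the directory structure can be captured in a small helper. This is a sketch under the assumption that every deployment follows the same `<deployment>/manifests` layout as the monitoring stack; the `Deployment` type and `manifestSource` function are hypothetical names.

```typescript
// Repository URL as given in the ArgoCD Sync step.
const DP_INFRA_REPO = "https://git.bluedynamics.eu/kup6s/dp/dp-infra.git";

// Deployment directories listed in the dp-infra/ tree.
type Deployment = "monitoring" | "gitlabbda" | "cnpg";

// Assumption: each deployment commits its generated output
// under <deployment>/manifests, as monitoring does.
function manifestSource(deployment: Deployment): {
  repoURL: string;
  path: string;
} {
  return { repoURL: DP_INFRA_REPO, path: `${deployment}/manifests` };
}

const monitoringSource = manifestSource("monitoring");
```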

Why Commit Manifests?

ArgoCD reads manifests from git and does not run npm run build itself, so the generated output has to be committed. This approach:

✅ Enables GitOps workflow
✅ Provides manifest versioning
✅ Allows ArgoCD to track drift
✅ Supports rollback to previous versions
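One consequence of committing generated files is that they can fall out of date when someone edits charts/ and forgets to rebuild. The sketch below shows how a check for that could work. It is a hypothetical addition, not something the repository is known to contain: `ManifestSet` and `staleManifests` are illustrative names, and the file contents are placeholders.

```typescript
// Maps a manifest file name to its content.
type ManifestSet = Record<string, string>;

// Returns the sorted names of files that are missing on either side
// or whose freshly generated content differs from the committed copy.
function staleManifests(
  generated: ManifestSet,
  committed: ManifestSet
): string[] {
  const names = Object.keys(generated).concat(
    Object.keys(committed).filter((file) => !(file in generated))
  );
  return names
    .filter((file) => generated[file] !== committed[file])
    .sort();
}

// Placeholder contents: prometheus changed, grafana was never committed.
const stale = staleManifests(
  { "prometheus.yaml": "replicas: 2", "grafana.yaml": "replicas: 1" },
  { "prometheus.yaml": "replicas: 1" }
);
```

An empty result means the committed manifests match a fresh build, which is the state ArgoCD should be syncing from.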

Documentation (documentation/)

Sphinx + MyST + Diátaxis

Build System: Sphinx with MyST Markdown parser
Framework: Diátaxis (tutorials, how-to, explanation, reference)
Build Tool: mxmake

Documentation Structure

Cluster-Level Explanations (documentation/sources/explanation/):

Universal concepts: CDK8S, ArgoCD, storage tiers, security model
Referenced by multiple deployments
Single source of truth

Deployment-Specific Documentation (documentation/sources/deployments/):

Each deployment has its own Diátaxis-structured docs
Focus only on deployment-specific details
Reference cluster-level explanations for shared concepts

Key Principle: Centralize common concepts, keep deployment docs focused and concise.

Workspace Structure

Meta Repository Pattern

This workspace uses a meta repository pattern for managing multiple independent git repositories:

Workspace Repository (workspace-kup6s):

Tracks workspace-level files: CLAUDE.md, repos.yaml, scripts/, README.md
Ignores subrepo directories (they have their own git history)

Subrepos (independent git repositories):

kube-hetzner/ - Cluster infrastructure (OpenTofu)
argoapps/ - ArgoCD Application definitions (CDK8S)
dp-infra/ - Infrastructure deployments (CDK8S)
documentation/ - Sphinx documentation

Advantages:

No git submodule/subtree complexity
Subrepos remain fully independent
Can clone/use subrepos separately
Version control for workspace-level automation

Onboarding:

git clone git@git.bluedynamics.eu:kup6s/workspace-kup6s.git kup6s
cd kup6s
./scripts/clone-all.sh

See Workspace Structure for a detailed explanation.
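What a script like clone-all.sh has to do can be sketched in a few lines. The format of repos.yaml is not shown in this document, so the `SubRepo` shape below is an assumption, as is the `cloneCommands` helper. Only the dp-infra URL appears in this document; the second entry uses a placeholder URL.

```typescript
// Hypothetical shape of one parsed repos.yaml entry.
interface SubRepo {
  directory: string;
  url: string;
}

const repos: SubRepo[] = [
  {
    directory: "dp-infra",
    url: "git@git.bluedynamics.eu:kup6s/dp/dp-infra.git",
  },
  // Placeholder URL, not the real location of this repository.
  {
    directory: "argoapps",
    url: "git@example.invalid:placeholder/argoapps.git",
  },
];

// Produces one clone command per subrepo that is not checked out yet,
// so the script can be re-run safely on an existing workspace.
function cloneCommands(subRepos: SubRepo[], existing: string[]): string[] {
  return subRepos
    .filter((repo) => existing.indexOf(repo.directory) === -1)
    .map((repo) => `git clone ${repo.url} ${repo.directory}`);
}

// Example: argoapps/ already exists, so only dp-infra is cloned.
const pending = cloneCommands(repos, ["argoapps"]);
```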

Component Relationships

┌─────────────────────────────────────────────────────────────┐
│ Developer Workstation                                        │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  1. Edit kube-hetzner/*.tf                                  │
│     source .env && tofu apply                               │
│     └──> Provisions/updates cluster infrastructure          │
│                                                              │
│  2. Edit argoapps/apps/*.ts                                 │
│     npm run build && kubectl apply -f dist/                 │
│     └──> Deploys ArgoCD Application definitions             │
│                                                              │
│  3. Edit dp-infra/monitoring/charts/*.ts                    │
│     npm run build && git commit manifests/ && git push      │
│     └──> ArgoCD auto-syncs from git to cluster              │
│                                                              │
└─────────────────────────────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│ Kubernetes Cluster (K3s on Hetzner)                         │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Infrastructure Tier (OpenTofu-managed):                    │
│  ┌────────────────────────────────────────────────────┐    │
│  │ Storage: Longhorn, SMB CSI, Crossplane             │    │
│  │ Networking: Traefik, cert-manager                  │    │
│  │ GitOps: ArgoCD                                      │    │
│  │ Secrets: External Secrets Operator                 │    │
│  └────────────────────────────────────────────────────┘    │
│                          │                                   │
│                          ▼                                   │
│  Application Tier (ArgoCD-managed):                         │
│  ┌────────────────────────────────────────────────────┐    │
│  │ Monitoring: Prometheus, Thanos, Loki, Grafana      │    │
│  │ Databases: CloudNativePG operator, PostgreSQL      │    │
│  │ Applications: GitLab BDA, Mailu, etc.              │    │
│  │ S3 Buckets: Crossplane-managed buckets             │    │
│  └────────────────────────────────────────────────────┘    │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Infrastructure Layering

The architecture uses a two-tier approach to separate concerns:

Infrastructure Tier (Bootstrap)

Managed via OpenTofu, deployed during cluster provisioning:

Storage operators (Longhorn, SMB CSI, Crossplane)
Networking (Traefik, cert-manager)
GitOps engine (ArgoCD itself)
Secrets management (ESO)

Rationale: These components must exist before applications can be deployed.

Application Tier

Managed via ArgoCD, deployed after platform is ready:

Database operators (CloudNativePG)
Monitoring stack (Prometheus, Thanos, Loki, Grafana)
Infrastructure applications (GitLab BDA, Mailu)
Application databases and services
S3 buckets (using Crossplane)

Rationale: Applications assume the platform exists and are deployed/updated via GitOps.

See Infrastructure Layering for a detailed explanation.
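The two tiers can be written down as data, which makes the ownership rule easy to query. The component lists are taken from the sections above; the `TIERS` constant and the `managedBy` helper are illustrative names, not part of the actual codebase.

```typescript
type Tier = "infrastructure" | "application";

// Each tier has exactly one tool that owns its components.
const TIERS: Record<Tier, { managedBy: string; components: string[] }> = {
  infrastructure: {
    managedBy: "OpenTofu",
    components: [
      "Longhorn",
      "SMB CSI",
      "Crossplane",
      "Traefik",
      "cert-manager",
      "ArgoCD",
      "External Secrets Operator",
    ],
  },
  application: {
    managedBy: "ArgoCD",
    components: [
      "CloudNativePG",
      "Prometheus",
      "Thanos",
      "Loki",
      "Grafana",
      "GitLab BDA",
      "Mailu",
    ],
  },
};

// Looks up which tool manages a component, or undefined if unknown.
function managedBy(component: string): string | undefined {
  const tiers: Tier[] = ["infrastructure", "application"];
  for (const tier of tiers) {
    if (TIERS[tier].components.indexOf(component) !== -1) {
      return TIERS[tier].managedBy;
    }
  }
  return undefined;
}
```

Note that ArgoCD itself sits in the infrastructure tier: `managedBy("ArgoCD")` yields OpenTofu, because the GitOps engine has to exist before it can manage anything else.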

Deployment Workflow

Infrastructure Changes

1. Edit OpenTofu configuration in kube-hetzner/
2. Plan: source .env && tofu plan
3. Apply: bash scripts/apply-and-configure-longhorn.sh (for kube.tf changes)
4. Cluster infrastructure updated

Application Deployments

1. Edit CDK8S code in dp-infra/monitoring/charts/
2. Build: npm run build (generates manifests/)
3. Commit: git add manifests/ && git commit && git push
4. ArgoCD auto-syncs from git to cluster
5. Application deployed/updated

Adding New Applications

1. Create ArgoCD Application in argoapps/apps/
2. Build: npm run build
3. Apply: kubectl apply -f dist/myapp.k8s.yaml
4. ArgoCD syncs application from external repository

Monitoring and Observability

Monitoring Stack (deployed via ArgoCD + CDK8S):

Prometheus: Metrics collection (2 replicas, 3-day local retention)
Thanos: Long-term metrics storage (S3, 2-year retention with downsampling)
Grafana: Visualization and dashboards
Loki: Log aggregation (SimpleScalable mode, S3 storage, 31-day retention)
Alloy: Log collection (DaemonSet, one per node)
Alertmanager: Alert routing (email via SMTP)

Management: All monitoring components managed via dp-infra/monitoring/ repository using CDK8S TypeScript.

See Monitoring Architecture for a detailed explanation.
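The retention figures above translate into a simple lookup for where data of a given age can be found. This sketch simplifies deliberately: it treats 2 years as 730 days and assumes recent metrics are served only from Prometheus, whereas in practice Thanos may also hold recent blocks. `RETENTION_DAYS`, `metricsStore`, and `logsAvailable` are hypothetical names.

```typescript
// Retention periods from the component list, in days.
// Assumption: "2-year retention" is approximated as 730 days.
const RETENTION_DAYS = {
  prometheusLocal: 3,
  thanos: 2 * 365,
  loki: 31,
};

type MetricsStore = "prometheus" | "thanos" | "expired";

// Which store is expected to answer a metrics query for data of this age.
function metricsStore(ageDays: number): MetricsStore {
  if (ageDays <= RETENTION_DAYS.prometheusLocal) {
    return "prometheus";
  }
  if (ageDays <= RETENTION_DAYS.thanos) {
    return "thanos";
  }
  return "expired";
}

// Whether logs of this age are still within Loki's retention.
function logsAvailable(ageDays: number): boolean {
  return ageDays <= RETENTION_DAYS.loki;
}
```

For example, a metric from last month is expected in Thanos, while logs from 40 days ago are already past Loki's 31-day retention.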

Key Design Decisions

Why OpenTofu Instead of Terraform?

OpenTofu is an open-source fork of Terraform created after HashiCorp’s license change. It provides:

✅ True open-source (MPL 2.0 license)
✅ Community-driven development
✅ Drop-in replacement for Terraform
✅ No vendor lock-in

Why CDK8S Instead of Helm/Kustomize?

CDK8S provides:

✅ Full programming language (TypeScript) with type safety
✅ IDE autocomplete and refactoring support
✅ Easier testing and validation
✅ Better code reuse via constructs
✅ Compile-time error detection

Why ArgoCD Instead of Flux/Jenkins?

ArgoCD provides:

✅ Declarative GitOps workflow
✅ Web UI for visualization and troubleshooting
✅ Multi-cluster support (if needed)
✅ Strong RBAC and security
✅ Native Kubernetes resource health checks

Why Meta Repository Instead of Monorepo?

Meta repository provides:

✅ Independent git history per subrepo
✅ No git submodule/subtree complexity
✅ Subrepos can be cloned/used independently
✅ ArgoCD references remain simple (no monorepo paths)
✅ Flexible access control (different teams, different repos)
