Architecture Overview

This document provides a high-level overview of the kup6s.com Kubernetes cluster architecture, explaining the key components, their relationships, and the design decisions behind the system.

System Design Philosophy

The kup6s.com infrastructure is built on three core principles:

  1. Infrastructure as Code (IaC): All infrastructure is defined declaratively using OpenTofu and CDK8S

  2. GitOps: Application deployments are managed through ArgoCD, which syncs from git repositories

  3. Layered Architecture: Clear separation between the infrastructure tier (bootstrap) and the application tier

Cluster Infrastructure (kube-hetzner/)

Platform Foundation

  • Kubernetes Distribution: K3s - lightweight, production-ready Kubernetes

  • Cloud Provider: Hetzner Cloud (cost-effective European provider)

  • Provisioning Tool: kube-hetzner Terraform module

  • IaC Tool: OpenTofu v2.17.4 (open-source Terraform fork)

Key Infrastructure Components

Deployed via OpenTofu extra-manifests during cluster bootstrap:

Storage Layer:

  • Longhorn: Cloud-native distributed block storage with replication and snapshots

  • SMB CSI Driver: Integration with Hetzner Storage Box for shared file storage

  • Crossplane: Dynamic S3 bucket provisioning on Hetzner Object Storage

Networking Layer:

  • Traefik: Ingress controller for HTTP/HTTPS routing

  • cert-manager: Automatic Let’s Encrypt TLS certificate management

GitOps Layer:

  • ArgoCD: Continuous deployment engine, syncs applications from git to cluster

Security Layer:

  • External Secrets Operator (ESO): Centralized secret management, syncs from external stores

Automation:

  • Auto-Uncordon CronJob: Automatically detects and uncordons nodes stuck in SchedulingDisabled state after K3s upgrades (runs every 5 minutes)
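
The selection step of such a job might look roughly like the sketch below. It is an illustrative model, not the CronJob's actual script: the NodeInfo shape and the rule "cordoned but Ready" are assumptions made here, and the real job may use different criteria.

```typescript
// Hypothetical, simplified view of a Kubernetes node (not the real API type).
interface NodeInfo {
  name: string;
  unschedulable: boolean; // mirrors spec.unschedulable (shown as SchedulingDisabled)
  ready: boolean;         // mirrors the node's Ready condition
}

// Standard cron expression for "every 5 minutes".
const SCHEDULE = "*/5 * * * *";

// Assumption: a node counts as "stuck" when it is cordoned but otherwise Ready.
function nodesToUncordon(nodes: NodeInfo[]): string[] {
  return nodes.filter((n) => n.unschedulable && n.ready).map((n) => n.name);
}

// Made-up sample data for illustration.
const sampleNodes: NodeInfo[] = [
  { name: "agent-1", unschedulable: true, ready: true },
  { name: "agent-2", unschedulable: false, ready: true },
  { name: "agent-3", unschedulable: true, ready: false },
];

const stuckNodes = nodesToUncordon(sampleNodes);
```

Each returned name would then be passed to kubectl uncordon. A node that is cordoned and not Ready is left alone in this sketch, on the assumption that it is still mid-upgrade.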

Credential Management Philosophy

CRITICAL: All credentials are managed via environment variables (TF_VAR_*):

  • The kube.tf file contains no hardcoded credentials

  • The .env file stores the credentials and is git-ignored so they never reach the repository

  • Before running OpenTofu commands, load them with source .env

  • .env.example serves as a template for the required variables
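
A pre-flight check along these lines could catch an incomplete .env before tofu runs. This is a minimal sketch: the variable names are hypothetical placeholders (the real list lives in .env.example), and the parser only handles simple KEY=value lines.

```typescript
// Hypothetical variable names; the real list lives in .env.example.
const REQUIRED_VARS = ["TF_VAR_hcloud_token", "TF_VAR_example_secret"];

// Parse simple KEY=value lines; skips blanks, comments, and an optional "export " prefix.
function parseDotEnv(text: string): Record<string, string> {
  const vars: Record<string, string> = {};
  for (const raw of text.split("\n")) {
    const line = raw.trim().replace(/^export\s+/, "");
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // not a KEY=value line
    vars[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return vars;
}

// Names that are absent or set to an empty value.
function missingVars(text: string, required: string[]): string[] {
  const vars = parseDotEnv(text);
  return required.filter((name) => !vars[name]);
}

// Made-up file content for illustration.
const exampleEnv = [
  "# credentials (never commit this file)",
  "export TF_VAR_hcloud_token=abc123",
  "TF_VAR_example_secret=",
].join("\n");

const missing = missingVars(exampleEnv, REQUIRED_VARS);
```

Here missing contains only TF_VAR_example_secret, because that variable is present but empty.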

See Security Model for detailed security architecture.

Application Deployments (argoapps/)

CDK8S for ArgoCD Applications

The argoapps/ repository uses CDK8S (Cloud Development Kit for Kubernetes) with TypeScript to define ArgoCD Applications as code:

  • Type-safe: Full IDE autocomplete and compile-time validation

  • Programmatic: Generate manifests using a real programming language

  • DRY: Share common configuration patterns across applications

Directory Structure

argoapps/
├── main.ts              # Entry point that synthesizes all charts
├── apps/
│   ├── registry.ts      # Central registry of all application charts
│   ├── config.ts        # AppConfig interface definition
│   ├── kup/             # KUP organization apps
│   ├── programmatic/    # Programmatic organization apps
│   └── infra/           # Infrastructure apps
├── dist/                # Generated Kubernetes manifests (git-ignored)
└── imports/             # Generated K8S API types (cdk8s import)

Workflow

  1. Define ArgoCD Application in TypeScript (e.g., apps/kup/myapp.ts)

  2. Register in apps/registry.ts

  3. Build: npm run build → generates dist/myapp.k8s.yaml

  4. Apply: kubectl apply -f dist/myapp.k8s.yaml

  5. ArgoCD syncs application from external git repository
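
The relationship between the registry, the build output, and the apply step can be sketched as follows. The AppEntry shape and the registry contents are hypothetical stand-ins for apps/config.ts and apps/registry.ts, which are not shown here; only the dist/<name>.k8s.yaml naming is taken from the workflow above.

```typescript
// Hypothetical shape; the real AppConfig interface lives in apps/config.ts.
interface AppEntry {
  name: string; // chart id, also used for the output file name
  org: "kup" | "programmatic" | "infra";
}

// Hypothetical registry contents, standing in for apps/registry.ts.
const registry: AppEntry[] = [
  { name: "myapp", org: "kup" },
  { name: "monitoring", org: "infra" },
];

// Path of the manifest that the build writes for an entry (step 3).
function outputFile(entry: AppEntry): string {
  return `dist/${entry.name}.k8s.yaml`;
}

// The kubectl commands for step 4, one per registered application.
function applyCommands(entries: AppEntry[]): string[] {
  return entries.map((e) => `kubectl apply -f ${outputFile(e)}`);
}

const commands = applyCommands(registry);
```

An application that is defined but never added to the registry produces no file in dist/, so it is never applied.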

Infrastructure Deployments (dp-infra/)

Separate Repository for CDK8S Applications

Repository: git@git.bluedynamics.eu:kup6s/dp/dp-infra.git

This repository contains CDK8S-based infrastructure application deployments. Unlike argoapps/ (which defines ArgoCD Applications), dp-infra/ contains the actual application manifests.

Directory Structure

dp-infra/
├── monitoring/          # Monitoring stack (Prometheus, Thanos, Loki, Grafana)
│   ├── charts/          # CDK8S TypeScript source code
│   ├── manifests/       # Generated K8S manifests (committed to git)
│   ├── config.yaml      # Central configuration
│   └── main.ts          # Entry point
├── gitlabbda/           # GitLab BDA platform deployment
│   ├── charts/          # CDK8S TypeScript source code
│   ├── manifests/       # Generated manifests (committed to git)
│   └── config.yaml      # Configuration
└── cnpg/                # CloudNativePG operator deployment
    ├── charts/
    ├── manifests/
    └── config.yaml

How It Works

  1. Development: Edit TypeScript constructs in charts/ directory

  2. Build: npm run build generates manifests in manifests/ directory

  3. Commit: Manifests are committed to git (ArgoCD reads from git)

  4. ArgoCD Sync: ArgoCD Application (defined in argoapps/) points to dp-infra/ repository:

    • repoURL: https://git.bluedynamics.eu/kup6s/dp/dp-infra.git

    • path: monitoring/manifests (for monitoring stack)

  5. Automatic Deployment: ArgoCD syncs changes from git to cluster
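
The Application resource that ties the two repositories together might be shaped like the sketch below. It builds a plain object following the ArgoCD Application schema rather than using the project's actual CDK8S constructs. The repoURL and path are the values quoted above; the project, target revision, namespaces, and sync policy are assumptions.

```typescript
// Hypothetical input type for this sketch.
interface SourceConfig {
  name: string;
  repoURL: string;
  path: string;
  namespace: string; // assumed target namespace
}

// Builds a plain object following the ArgoCD Application schema (argoproj.io/v1alpha1).
function argoApplication(cfg: SourceConfig) {
  return {
    apiVersion: "argoproj.io/v1alpha1",
    kind: "Application",
    metadata: { name: cfg.name, namespace: "argocd" }, // assumes ArgoCD runs in "argocd"
    spec: {
      project: "default", // assumption
      source: {
        repoURL: cfg.repoURL,
        path: cfg.path,
        targetRevision: "HEAD", // assumption: track the default branch
      },
      destination: {
        server: "https://kubernetes.default.svc", // the in-cluster API server
        namespace: cfg.namespace,
      },
      // Assumed auto-sync settings; the real policy may differ.
      syncPolicy: { automated: { prune: true, selfHeal: true } },
    },
  };
}

const monitoringApp = argoApplication({
  name: "monitoring",
  repoURL: "https://git.bluedynamics.eu/kup6s/dp/dp-infra.git",
  path: "monitoring/manifests",
  namespace: "monitoring",
});
```

Because the source path points at the committed manifests/ directory, ArgoCD only needs to read plain YAML from git.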

Why Commit Manifests?

ArgoCD deploys what it finds in git and does not run npm run build itself, so the generated manifests must be committed. This approach:

  • ✅ Enables GitOps workflow

  • ✅ Provides manifest versioning

  • ✅ Allows ArgoCD to track drift

  • ✅ Supports rollback to previous versions

Documentation (documentation/)

Sphinx + MyST + Diátaxis

  • Build System: Sphinx with MyST Markdown parser

  • Framework: Diátaxis (tutorials, how-to, explanation, reference)

  • Build Tool: mxmake

Documentation Structure

Cluster-Level Explanations (documentation/sources/explanation/):

  • Universal concepts: CDK8S, ArgoCD, storage tiers, security model

  • Referenced by multiple deployments

  • Single source of truth

Deployment-Specific Documentation (documentation/sources/deployments/):

  • Each deployment has its own Diátaxis-structured docs

  • Focus only on deployment-specific details

  • Reference cluster-level explanations for shared concepts

Key Principle: Centralize common concepts, keep deployment docs focused and concise.

Workspace Structure

Meta Repository Pattern

This workspace uses a meta repository pattern for managing multiple independent git repositories:

Workspace Repository (workspace-kup6s):

  • Tracks workspace-level files: CLAUDE.md, repos.yaml, scripts/, README.md

  • Ignores subrepo directories (they have their own git history)

Subrepos (independent git repositories):

  • kube-hetzner/ - Cluster infrastructure (OpenTofu)

  • argoapps/ - ArgoCD Application definitions (CDK8S)

  • dp-infra/ - Infrastructure deployments (CDK8S)

  • documentation/ - Sphinx documentation

Advantages:

  • No git submodule/subtree complexity

  • Subrepos remain fully independent

  • Can clone/use subrepos separately

  • Version control for workspace-level automation

Onboarding:

git clone git@git.bluedynamics.eu:kup6s/workspace-kup6s.git kup6s
cd kup6s
./scripts/clone-all.sh

See Workspace Structure for detailed explanation.
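
What a script like clone-all.sh does can be modelled roughly as below. This is a sketch, not the actual script: the RepoEntry shape is a guess at what repos.yaml might contain, and the example.invalid URLs are placeholders. Only the dp-infra URL is taken from this document.

```typescript
// Hypothetical entry shape; the actual repos.yaml schema is not shown here.
interface RepoEntry {
  dir: string; // target directory inside the workspace
  url: string; // git remote
}

// Placeholder URLs for illustration, except dp-infra, which is quoted from this document.
const repos: RepoEntry[] = [
  { dir: "kube-hetzner", url: "git@git.example.invalid:org/kube-hetzner.git" },
  { dir: "argoapps", url: "git@git.example.invalid:org/argoapps.git" },
  { dir: "dp-infra", url: "git@git.bluedynamics.eu:kup6s/dp/dp-infra.git" },
];

// Skip directories that already exist so the step can be re-run safely.
function cloneCommands(entries: RepoEntry[], existingDirs: string[]): string[] {
  return entries
    .filter((r) => existingDirs.indexOf(r.dir) === -1)
    .map((r) => `git clone ${r.url} ${r.dir}`);
}

// Example: dp-infra is already cloned, the other two are not.
const pendingClones = cloneCommands(repos, ["dp-infra"]);
```

Since the workspace repository ignores the subrepo directories, each clone keeps its own history and remote.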

Component Relationships

┌─────────────────────────────────────────────────────────────┐
│ Developer Workstation                                        │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  1. Edit kube-hetzner/*.tf                                  │
│     source .env && tofu apply                               │
│     └──> Provisions/updates cluster infrastructure          │
│                                                              │
│  2. Edit argoapps/apps/*.ts                                 │
│     npm run build && kubectl apply -f dist/                 │
│     └──> Deploys ArgoCD Application definitions             │
│                                                              │
│  3. Edit dp-infra/monitoring/charts/*.ts                    │
│     npm run build && git commit manifests/ && git push      │
│     └──> ArgoCD auto-syncs from git to cluster              │
│                                                              │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Kubernetes Cluster (K3s on Hetzner)                         │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Infrastructure Tier (OpenTofu-managed):                    │
│  ┌────────────────────────────────────────────────────┐    │
│  │ Storage: Longhorn, SMB CSI, Crossplane             │    │
│  │ Networking: Traefik, cert-manager                  │    │
│  │ GitOps: ArgoCD                                      │    │
│  │ Secrets: External Secrets Operator                 │    │
│  └────────────────────────────────────────────────────┘    │
│                          │                                   │
│                          ▼                                   │
│  Application Tier (ArgoCD-managed):                         │
│  ┌────────────────────────────────────────────────────┐    │
│  │ Monitoring: Prometheus, Thanos, Loki, Grafana      │    │
│  │ Databases: CloudNativePG operator, PostgreSQL      │    │
│  │ Applications: GitLab BDA, Mailu, etc.              │    │
│  │ S3 Buckets: Crossplane-managed buckets             │    │
│  └────────────────────────────────────────────────────┘    │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Infrastructure Layering

The architecture uses a two-tier approach to separate concerns:

Infrastructure Tier (Bootstrap)

Managed via OpenTofu, deployed during cluster provisioning:

  • Storage operators (Longhorn, SMB CSI, Crossplane)

  • Networking (Traefik, cert-manager)

  • GitOps engine (ArgoCD itself)

  • Secrets management (ESO)

Rationale: These components must exist before applications can be deployed.

Application Tier

Managed via ArgoCD, deployed after the platform is ready:

  • Database operators (CloudNativePG)

  • Monitoring stack (Prometheus, Thanos, Loki, Grafana)

  • Infrastructure applications (GitLab BDA, Mailu)

  • Application databases and services

  • S3 buckets (using Crossplane)

Rationale: Applications assume the platform exists and are deployed/updated via GitOps.
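
The layering rule can be expressed as a small model: infrastructure components are deployed first and must not depend on anything in the application tier. The sketch below uses component names from this document, but the needs lists are hypothetical and only illustrate the check.

```typescript
type Tier = "infrastructure" | "application";

interface Component {
  name: string;
  tier: Tier;
  needs: string[]; // hypothetical dependency list for illustration
}

const components: Component[] = [
  { name: "CloudNativePG", tier: "application", needs: ["Longhorn", "ArgoCD"] },
  { name: "Longhorn", tier: "infrastructure", needs: [] },
  { name: "ArgoCD", tier: "infrastructure", needs: [] },
  { name: "S3 buckets", tier: "application", needs: ["Crossplane", "ArgoCD"] },
  { name: "Crossplane", tier: "infrastructure", needs: [] },
];

// Reports infrastructure components that depend on an application-tier component.
function layeringViolations(all: Component[]): string[] {
  const tierOf: Record<string, Tier> = {};
  for (const c of all) tierOf[c.name] = c.tier;
  const problems: string[] = [];
  for (const c of all) {
    for (const dep of c.needs) {
      if (c.tier === "infrastructure" && tierOf[dep] === "application") {
        problems.push(`${c.name} -> ${dep}`);
      }
    }
  }
  return problems;
}

// Bootstrap order: infrastructure tier first, then application tier.
function deployOrder(all: Component[]): string[] {
  const rank = (t: Tier): number => (t === "infrastructure" ? 0 : 1);
  return all.slice().sort((a, b) => rank(a.tier) - rank(b.tier)).map((c) => c.name);
}

const violations = layeringViolations(components);
const order = deployOrder(components);
```

With the sample data, violations is empty and every infrastructure component appears in order before any application-tier component.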

See Infrastructure Layering for detailed explanation.

Deployment Workflow

Infrastructure Changes

  1. Edit OpenTofu configuration in kube-hetzner/

  2. Plan: source .env && tofu plan

  3. Apply: bash scripts/apply-and-configure-longhorn.sh (for kube.tf changes)

  4. Cluster infrastructure updated

Application Deployments

  1. Edit CDK8S code in dp-infra/monitoring/charts/

  2. Build: npm run build (generates manifests/)

  3. Commit: git add manifests/ && git commit && git push

  4. ArgoCD auto-syncs from git to cluster

  5. Application deployed/updated

Adding New Applications

  1. Create ArgoCD Application in argoapps/apps/

  2. Build: npm run build

  3. Apply: kubectl apply -f dist/myapp.k8s.yaml

  4. ArgoCD syncs application from external repository

Monitoring and Observability

Monitoring Stack (deployed via ArgoCD + CDK8S):

  • Prometheus: Metrics collection (2 replicas, 3-day local retention)

  • Thanos: Long-term metrics storage (S3, 2-year retention with downsampling)

  • Grafana: Visualization and dashboards

  • Loki: Log aggregation (SimpleScalable mode, S3 storage, 31-day retention)

  • Alloy: Log collection (DaemonSet, one per node)

  • Alertmanager: Alert routing (email via SMTP)

Management: All monitoring components are managed via the dp-infra/monitoring/ repository using CDK8S TypeScript.
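
The retention settings listed above could be captured in a typed configuration along these lines. The field names are hypothetical and do not reflect the actual config.yaml schema; the values are the ones stated in this section, with two years approximated as 730 days.

```typescript
// Hypothetical field names; values are the ones listed in this section.
interface MonitoringConfig {
  prometheus: { replicas: number; retentionDays: number };
  thanos: { retentionDays: number };
  loki: { retentionDays: number };
}

const monitoringConfig: MonitoringConfig = {
  prometheus: { replicas: 2, retentionDays: 3 },
  thanos: { retentionDays: 2 * 365 }, // 2 years, approximated
  loki: { retentionDays: 31 },
};

// Sanity checks: all retentions are positive, and long-term storage
// outlives the local Prometheus retention.
function retentionProblems(cfg: MonitoringConfig): string[] {
  const problems: string[] = [];
  const values = [cfg.prometheus.retentionDays, cfg.thanos.retentionDays, cfg.loki.retentionDays];
  if (values.some((d) => d <= 0)) {
    problems.push("retention periods must be positive");
  }
  if (cfg.thanos.retentionDays <= cfg.prometheus.retentionDays) {
    problems.push("Thanos retention must exceed Prometheus local retention");
  }
  return problems;
}

const problems = retentionProblems(monitoringConfig);
```

Keeping local Prometheus retention short only makes sense because Thanos holds the long-term copy in S3, which is the relationship the second check encodes.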

See Monitoring Architecture for detailed explanation.

Key Design Decisions

Why OpenTofu Instead of Terraform?

OpenTofu is an open-source fork of Terraform created after HashiCorp’s license change. It provides:

  • ✅ True open-source (MPL 2.0 license)

  • ✅ Community-driven development

  • ✅ Drop-in replacement for Terraform

  • ✅ No vendor lock-in

Why CDK8S Instead of Helm/Kustomize?

CDK8S provides:

  • ✅ Full programming language (TypeScript) with type safety

  • ✅ IDE autocomplete and refactoring support

  • ✅ Easier testing and validation

  • ✅ Better code reuse via constructs

  • ✅ Compile-time error detection

Why ArgoCD Instead of Flux/Jenkins?

ArgoCD provides:

  • ✅ Declarative GitOps workflow

  • ✅ Web UI for visualization and troubleshooting

  • ✅ Multi-cluster support (if needed)

  • ✅ Strong RBAC and security

  • ✅ Native Kubernetes resource health checks

Why Meta Repository Instead of Monorepo?

Meta repository provides:

  • ✅ Independent git history per subrepo

  • ✅ No git submodule/subtree complexity

  • ✅ Subrepos can be cloned/used independently

  • ✅ ArgoCD references remain simple (no monorepo paths)

  • ✅ Flexible access control (different teams, different repos)

Further Reading