Architecture Overview

This document explains the high-level architecture of the GitLab BDA deployment: how the pieces fit together and why they are designed this way.

System Overview

GitLab BDA (Blue Dynamics Alliance) is a complete Git platform with integrated CI/CD, container registry, and static site hosting, deployed using modern cloud-native patterns.

graph TB
    subgraph "Developer Workflow"
        DEV[Developer] -->|1. Edit Code| CDK8S[CDK8S TypeScript]
        CDK8S -->|2. npm run build| MAN[manifests/]
        MAN -->|3. git commit/push| GIT[Git Repository]
    end

    subgraph "GitOps Deployment"
        GIT -->|4. Monitors| ARGOCD[ArgoCD]
        ARGOCD -->|5. Applies| K8S[Kubernetes Cluster]
    end

    subgraph "Running Application"
        K8S --> GL[GitLab Platform]
        K8S --> HAR[Harbor Registry]
        K8S --> PG[PostgreSQL CNPG]
        K8S --> RED[Redis Cache]
    end

    subgraph "Storage"
        GL --> LH[Longhorn PVs]
        GL --> S3[Hetzner S3]
        PG --> LH
        PG --> S3B[S3 Backups]
        HAR --> S3
    end

    subgraph "External Access"
        USERS[Users] -->|HTTPS| ING[Traefik Ingress]
        ING --> GL
        ING --> HAR
    end

    style ARGOCD fill:#f96
    style CDK8S fill:#6cf
    style GL fill:#fc6
    style K8S fill:#ccf
    

Core Design Principles

1. Infrastructure as Code

Everything is code, nothing is manual.

  • CDK8S (TypeScript): All Kubernetes resources defined in type-safe code

  • GitOps (ArgoCD): Git is the single source of truth

  • Declarative: Desired state in git, ArgoCD makes it reality

  • Reproducible: Entire deployment can be recreated from code

Benefits:

  • Catch configuration errors at compile-time (TypeScript)

  • Version control for infrastructure changes

  • Easy rollback (git revert)

  • Audit trail (git history)
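
As a minimal sketch of the compile-time benefit, the snippet below builds a PersistentVolumeClaim manifest from a typed request. It is plain TypeScript rather than the project's actual CDK8S constructs; the `VolumeRequest` type, the `toPvcManifest` helper, the claim name and the `hcloud-volumes` storage class name are all assumptions made for illustration.

```typescript
// Hypothetical types for illustration; these are not the project's real constructs.
// "hcloud-volumes" is an assumed storage class name for Hetzner Cloud Volumes.
type StorageClassName = "longhorn-redundant-app" | "hcloud-volumes";

interface VolumeRequest {
  claimName: string;
  storageClass: StorageClassName;
  sizeGi: number;
}

// Build a PersistentVolumeClaim manifest as a plain object.
function toPvcManifest(req: VolumeRequest) {
  if (req.sizeGi <= 0) {
    throw new Error("sizeGi must be positive, got " + req.sizeGi);
  }
  return {
    apiVersion: "v1",
    kind: "PersistentVolumeClaim",
    metadata: { name: req.claimName },
    spec: {
      accessModes: ["ReadWriteOnce"],
      storageClassName: req.storageClass,
      resources: { requests: { storage: req.sizeGi + "Gi" } },
    },
  };
}

const gitalyPvc = toPvcManifest({
  claimName: "gitaly-data", // hypothetical name
  storageClass: "hcloud-volumes",
  sizeGi: 20,
});

// A typo such as storageClass: "hcloud-volume" fails type checking,
// so the mistake never reaches the synthesized manifests.
console.log(JSON.stringify(gitalyPvc, null, 2));
```

The union type is the design choice that matters here: only known storage class names compile, whereas raw YAML would accept any string and fail at deploy time.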

2. Cloud-Native Architecture

Use Kubernetes operators and CRDs instead of managing components ourselves.

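Where a mature operator exists, we use it instead of hand-managing StatefulSets. PostgreSQL, for example, runs under CloudNativePG; the single-instance Redis cache is the one exception and remains a plain StatefulSet. The operators in use: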

  • CloudNativePG: PostgreSQL operator with HA, backups, pooling

  • Crossplane: Infrastructure-as-code for S3 buckets

  • External Secrets Operator (ESO): Automated secret synchronization

  • cert-manager: Automated TLS certificate management

Benefits:

  • Best practices built-in (HA, backups, monitoring)

  • Automated operations (failover, backup, rotation)

  • Less custom code to maintain

  • Battle-tested solutions

3. Separation of Concerns

Clear boundaries between infrastructure, platform, and applications.

Layer 1: Cluster Infrastructure
├─ Kubernetes (K3S)
├─ Operators (CNPG, Crossplane, ESO, cert-manager)
├─ Storage (Longhorn, SMB CSI)
└─ Networking (Traefik, Cilium)

Layer 2: Platform Services (this deployment)
├─ PostgreSQL Database (CNPG Cluster)
├─ Redis Cache
├─ S3 Buckets (Crossplane)
└─ Secrets (ESO ExternalSecrets)

Layer 3: Applications
├─ GitLab (Helm chart integration)
└─ Harbor (Custom deployment)

Each layer depends only on layers below it, never above.
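
The layering rule can be checked mechanically. The sketch below is a hedged illustration, not part of the deployment: the component ids and their dependency lists are assumptions loosely derived from the layers above, and the check flags only upward dependencies (same-layer dependencies are left permitted, which is also an assumption).

```typescript
interface ComponentSpec {
  id: string;
  layer: 1 | 2 | 3;
  dependsOn: string[];
}

// Hypothetical component inventory based on the three layers listed above.
const platformComponents: ComponentSpec[] = [
  { id: "cnpg-operator", layer: 1, dependsOn: [] },
  { id: "crossplane", layer: 1, dependsOn: [] },
  { id: "postgres-cluster", layer: 2, dependsOn: ["cnpg-operator"] },
  { id: "s3-buckets", layer: 2, dependsOn: ["crossplane"] },
  { id: "redis", layer: 2, dependsOn: [] },
  { id: "gitlab", layer: 3, dependsOn: ["postgres-cluster", "redis", "s3-buckets"] },
  { id: "harbor", layer: 3, dependsOn: ["s3-buckets"] },
];

// Return one message per dependency that points to a higher layer
// or to a component that is not in the inventory.
function findUpwardDependencies(components: ComponentSpec[]): string[] {
  const layerById: { [id: string]: number } = {};
  components.forEach(function (c) {
    layerById[c.id] = c.layer;
  });

  const violations: string[] = [];
  components.forEach(function (c) {
    c.dependsOn.forEach(function (depId) {
      const depLayer = layerById[depId];
      if (depLayer === undefined) {
        violations.push(c.id + " depends on unknown component " + depId);
      } else if (depLayer > c.layer) {
        violations.push(
          c.id + " (layer " + c.layer + ") depends on " + depId + " (layer " + depLayer + ")"
        );
      }
    });
  });
  return violations;
}

console.log(findUpwardDependencies(platformComponents));
```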

4. GitOps Automation

No manual kubectl apply, no cluster credentials on laptops.

  1. Developer updates TypeScript code

  2. CDK8S synthesizes Kubernetes manifests

  3. Manifests committed to git

  4. ArgoCD detects changes automatically

  5. ArgoCD applies to cluster

  6. ArgoCD continuously monitors and self-heals

Benefits:

  • No cluster credentials needed on developer machines

  • Audit trail (who changed what, when)

  • Easy rollback (git revert, ArgoCD syncs)

  • Self-healing (ArgoCD reverts manual kubectl changes)
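
The self-healing behaviour amounts to comparing the desired state in git with the live cluster state and acting on the difference. The sketch below models that comparison in plain TypeScript; it is a simplified illustration and not ArgoCD's implementation. Resource keys and the string "fingerprints" are assumptions, and resources that exist only in the cluster are deliberately ignored.

```typescript
// Maps a resource key (for example "StatefulSet/redis") to a fingerprint of its spec.
type ResourceState = { [key: string]: string };

interface SyncAction {
  key: string;
  action: "create" | "revert";
}

// Compare desired (git) against live (cluster) and list what must change.
function planSync(desired: ResourceState, live: ResourceState): SyncAction[] {
  const actions: SyncAction[] = [];
  Object.keys(desired)
    .sort()
    .forEach(function (key) {
      if (!Object.prototype.hasOwnProperty.call(live, key)) {
        actions.push({ key: key, action: "create" });
      } else if (live[key] !== desired[key]) {
        // Drift: someone changed the live object by hand.
        actions.push({ key: key, action: "revert" });
      }
    });
  return actions;
}

const desiredState: ResourceState = {
  "ExternalSecret/s3-credentials": "v1",
  "StatefulSet/redis": "replicas=1",
};
const liveState: ResourceState = {
  "StatefulSet/redis": "replicas=3", // e.g. after a manual kubectl scale
};

console.log(planSync(desiredState, liveState));
```

In this example the plan creates the missing ExternalSecret and reverts the Redis StatefulSet to the state recorded in git.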

Component Architecture

GitLab Platform

GitLab is deployed from the official GitLab Helm chart, with its database and cache provided externally.

GitLab consists of multiple services working together:

  • Webservice - Rails application handling HTTP requests

  • Gitaly - Git server managing repositories (stored on Hetzner Cloud Volumes)

  • Sidekiq - Background job processing (CI/CD, email, maintenance)

  • GitLab Shell - SSH access for git operations

  • GitLab Pages - Static site hosting

  • GitLab KAS - Kubernetes Agent Server (optional)

All components use external PostgreSQL (CNPG) and external Redis instead of built-in versions.

For detailed component architecture, see GitLab Components.

Harbor Container Registry

Harbor runs separately from GitLab to provide vulnerability scanning and better image management.

Harbor provides:

  • Vulnerability scanning - Trivy scanner for container images

  • OAuth integration - Users authenticate with GitLab credentials

  • Project management - Fine-grained access control

  • S3 storage - Container image layers stored in Hetzner S3

Why Harbor instead of GitLab’s built-in registry? See Harbor Integration.

PostgreSQL Database (CloudNativePG)

High-availability PostgreSQL cluster (2 replicas) managed by the CNPG operator, with:

  • Streaming replication and automatic failover

  • PgBouncer connection pooling

  • Daily backups to S3 (offsite) with 30-day retention

  • The longhorn-redundant-app storage class (1 Longhorn replica per volume, since CNPG already provides 2 DB replicas)

Why external PostgreSQL? See GitLab Components.
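
To make the settings above concrete, here is a hedged sketch of how they might map onto CloudNativePG resources, written as plain TypeScript objects rather than the project's CDK8S code. Field names follow the CNPG `Cluster` and `ScheduledBackup` resources; the resource names, volume size, bucket path and backup time are placeholders, and reading "2 replicas" as `instances: 2` is an assumption.

```typescript
// Resource names, volume size and bucket path are hypothetical placeholders.
const pgClusterName = "gitlab-postgres";

const cnpgCluster = {
  apiVersion: "postgresql.cnpg.io/v1",
  kind: "Cluster",
  metadata: { name: pgClusterName, namespace: "gitlabbda" },
  spec: {
    instances: 2, // assumed: one primary plus one streaming replica
    storage: {
      storageClass: "longhorn-redundant-app",
      size: "5Gi", // assumption; the volume size is not stated in this document
    },
    backup: {
      retentionPolicy: "30d",
      barmanObjectStore: {
        destinationPath: "s3://example-postgres-backups/", // hypothetical bucket
        // endpointURL and s3Credentials are omitted in this sketch
      },
    },
  },
};

const dailyBackup = {
  apiVersion: "postgresql.cnpg.io/v1",
  kind: "ScheduledBackup",
  metadata: { name: pgClusterName + "-daily", namespace: "gitlabbda" },
  spec: {
    // Six-field cron expression (seconds first): 02:00 every day. The time is an assumption.
    schedule: "0 0 2 * * *",
    cluster: { name: pgClusterName },
  },
};

console.log(JSON.stringify([cnpgCluster, dailyBackup], null, 2));
```

PgBouncer pooling would be a third resource (a CNPG `Pooler`) and is left out to keep the sketch short.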

Redis Cache

Dedicated Redis StatefulSet for caching and Sidekiq job queuing:

  • Single instance (sufficient for 2-5 users)

  • 10Gi persistent storage (Longhorn, 2 replicas)

  • Handles sessions, cache, and background job queue

Storage Architecture

Three storage tiers working together:

  1. Hetzner Cloud Volumes - Gitaly git repositories (20Gi, network-attached)

  2. Longhorn - PostgreSQL and Redis (30Gi raw, with replication)

  3. Hetzner S3 - Artifacts, uploads, LFS, pages, backups (8 buckets)

For complete storage strategy and rationale, see Storage Architecture.
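
The block storage figures can be cross-checked with a little arithmetic. The sketch below assumes raw Longhorn usage is volume size × instances × Longhorn replicas. The Redis numbers come from this document; the 5Gi PostgreSQL volume size is an assumption chosen so that the total matches the stated 30Gi raw.

```typescript
interface VolumePlan {
  label: string;
  sizeGi: number;
  instances: number;
  longhornReplicas: number;
}

// Raw capacity consumed on Longhorn nodes by one workload.
function rawGi(v: VolumePlan): number {
  return v.sizeGi * v.instances * v.longhornReplicas;
}

const longhornVolumes: VolumePlan[] = [
  { label: "redis", sizeGi: 10, instances: 1, longhornReplicas: 2 },
  { label: "postgres", sizeGi: 5, instances: 2, longhornReplicas: 1 }, // sizeGi is assumed
];

const longhornRawGi = longhornVolumes.reduce(function (sum, v) {
  return sum + rawGi(v);
}, 0);
const hetznerVolumeGi = 20; // Gitaly repositories
const blockStorageGi = hetznerVolumeGi + longhornRawGi;

console.log("Longhorn raw: " + longhornRawGi + "Gi, block total: " + blockStorageGi + "Gi");
// prints "Longhorn raw: 30Gi, block total: 50Gi"
```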

Security & Secrets

Layered security approach:

  • ESO (External Secrets Operator) - Syncs secrets from the application-secrets namespace

  • TLS certificates - Automated via cert-manager (Let’s Encrypt)

  • RBAC - Three ServiceAccounts with least-privilege access

  • S3 credentials - Shared Hetzner S3 credentials via ClusterSecretStore

All secrets are managed centrally in the application-secrets namespace and synchronized to the gitlabbda namespace via ExternalSecrets.

For complete security architecture, see Security Model and Secrets Reference.
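
As an illustration of the synchronization described above, the sketch below builds an `ExternalSecret` manifest as a plain TypeScript object. The field layout follows the External Secrets Operator resource, but the `apiVersion` should be checked against the ESO release in use, and the store name, secret names, keys and refresh interval are all hypothetical.

```typescript
interface SecretKeyMapping {
  secretKey: string; // key in the Secret created in the target namespace
  remoteKey: string; // name of the source Secret
  property: string; // key inside the source Secret
}

function buildExternalSecret(
  targetName: string,
  storeName: string,
  mappings: SecretKeyMapping[]
) {
  return {
    apiVersion: "external-secrets.io/v1", // assumption: verify for your ESO version
    kind: "ExternalSecret",
    metadata: { name: targetName, namespace: "gitlabbda" },
    spec: {
      refreshInterval: "1h", // assumed value
      secretStoreRef: { kind: "ClusterSecretStore", name: storeName },
      target: { name: targetName },
      data: mappings.map(function (m) {
        return {
          secretKey: m.secretKey,
          remoteRef: { key: m.remoteKey, property: m.property },
        };
      }),
    },
  };
}

// Hypothetical names throughout.
const s3ExternalSecret = buildExternalSecret("s3-credentials", "example-cluster-store", [
  { secretKey: "accessKey", remoteKey: "hetzner-s3", property: "access-key" },
  { secretKey: "secretKey", remoteKey: "hetzner-s3", property: "secret-key" },
]);

console.log(JSON.stringify(s3ExternalSecret, null, 2));
```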

Data Flow

Git Push Workflow

1. Developer pushes code
   git push origin main
2. GitLab Shell (SSH)
   Authenticates user
3. Gitaly (Git Server)
   Writes to Hetzner Cloud Volume (20Gi)
4. Sidekiq (Background Jobs)
   Triggers CI/CD pipeline
5. GitLab Runner (if configured)
   Executes pipeline jobs
6. Artifacts & Cache
   Stored in S3 buckets

CI/CD Build Workflow

1. Pipeline triggered
2. GitLab Runner pulls code
   From Gitaly
3. Runner executes build
   Downloads dependencies (cache from S3)
4. Build artifacts generated
   Stored in S3 (artifacts bucket)
5. Docker image built
6. Image pushed to Harbor
   Layers stored in S3 (registry bucket)
7. Vulnerability scan
   Harbor scans image (Trivy)

Static Site Deployment (GitLab Pages)

1. Pages job in .gitlab-ci.yml
2. Generates static HTML/CSS/JS
   In CI/CD pipeline
3. Uploads to pages S3 bucket
4. GitLab Pages daemon
   Serves from S3 at pages.staging.bluedynamics.eu

Deployment Pipeline (GitOps)

Developer Workflow

1. Edit TypeScript code
   charts/constructs/*.ts
2. Build manifests
   bash -c 'source .env && npm run build'
3. Review changes
   git diff manifests/gitlab.k8s.yaml
4. Commit and push
   git commit -m "Update configuration"
   git push
5. ArgoCD automatically syncs
   At the next repository poll (every 3 minutes by default)

ArgoCD Sync Process

1. ArgoCD monitors git repository
   Every 3 minutes (configurable)
2. Detects changes in manifests/
3. Compares cluster state vs git state
   (drift detection)
4. Applies changes in sync waves
   Wave 1: Infrastructure (namespace, RBAC, S3)
   Wave 2: Secrets (ExternalSecrets wait for store)
   Wave 3: Applications (wait for secrets)
5. Self-healing
   Reverts any manual kubectl changes

See ArgoCD GitOps Workflow for detailed explanation.
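
The wave ordering in step 4 can be sketched as a grouping over the `argocd.argoproj.io/sync-wave` annotation, which is the annotation ArgoCD reads for this purpose. The code below is a simplified model and not ArgoCD's implementation; the resource names are hypothetical, and resources without the annotation are treated as wave 0.

```typescript
const SYNC_WAVE_ANNOTATION = "argocd.argoproj.io/sync-wave";

interface WaveResource {
  kind: string;
  resourceName: string;
  annotations?: { [key: string]: string };
}

// Read the wave number from the annotation, defaulting to 0.
function waveOf(resource: WaveResource): number {
  const raw = resource.annotations ? resource.annotations[SYNC_WAVE_ANNOTATION] : undefined;
  const parsed = raw === undefined ? 0 : parseInt(raw, 10);
  return isNaN(parsed) ? 0 : parsed;
}

// Group resources into waves, lowest wave first, keeping input order within a wave.
function groupByWave(resources: WaveResource[]): { wave: number; members: string[] }[] {
  const indexed = resources.map(function (r, i) {
    return { r: r, i: i, wave: waveOf(r) };
  });
  indexed.sort(function (a, b) {
    return a.wave - b.wave || a.i - b.i;
  });

  const groups: { wave: number; members: string[] }[] = [];
  indexed.forEach(function (entry) {
    const label = entry.r.kind + "/" + entry.r.resourceName;
    const last = groups[groups.length - 1];
    if (!last || last.wave !== entry.wave) {
      groups.push({ wave: entry.wave, members: [label] });
    } else {
      last.members.push(label);
    }
  });
  return groups;
}

// Hypothetical resources, listed out of order on purpose.
const exampleResources: WaveResource[] = [
  { kind: "StatefulSet", resourceName: "redis", annotations: { [SYNC_WAVE_ANNOTATION]: "3" } },
  { kind: "Namespace", resourceName: "gitlabbda", annotations: { [SYNC_WAVE_ANNOTATION]: "1" } },
  { kind: "ExternalSecret", resourceName: "s3-credentials", annotations: { [SYNC_WAVE_ANNOTATION]: "2" } },
  { kind: "ServiceAccount", resourceName: "gitlab", annotations: { [SYNC_WAVE_ANNOTATION]: "1" } },
];

console.log(JSON.stringify(groupByWave(exampleResources)));
```

Each group is applied only after the previous one is healthy, which is why ExternalSecrets (wave 2) exist before the applications that consume them (wave 3).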

Technology Stack

Core Infrastructure

Component       Version       Purpose
Kubernetes      1.34 (K3S)    Container orchestration
ArgoCD          Latest        GitOps deployment
Traefik         v3            Ingress controller
cert-manager    v1.16         TLS certificate automation
Longhorn        v1.7          Persistent block storage

Database & Caching

Component       Version       Purpose
CloudNativePG   v1.24         PostgreSQL operator
PostgreSQL      16            Application database
PgBouncer       (included)    Connection pooling
Redis           7             Caching and job queue

Application Platform

Component       Version       Purpose
GitLab CE       v18.5.1       Git platform (Helm chart)
Harbor          v2.14.0       Container registry
Crossplane      v1.18         S3 bucket management
ESO             v0.20.4       Secret synchronization

Development Tools

Component       Version       Purpose
CDK8S           v2.x          Infrastructure as code
cdk8s-plus      33            K8S 1.33 API constructs
TypeScript      5.x           Type-safe configuration
Node.js         22.x          Runtime for CDK8S

See Version Compatibility for the compatibility matrix.

Resource Requirements

Optimized for 2-5 concurrent users:

  • CPU: ~2.5 cores (requests), ~5 cores (limits)

  • Memory: ~4Gi (requests), ~8Gi (limits)

  • Block Storage: 50Gi total (20Gi Hetzner + 30Gi Longhorn raw)

  • Object Storage: Variable (typically 10-50Gi for small team)

For detailed resource breakdown by component, see Resource Requirements.

External Access

Three public domains (all HTTPS with Let’s Encrypt):

  • gitlab.staging.bluedynamics.eu - GitLab Web UI and API

  • pages.staging.bluedynamics.eu - GitLab Pages (+ wildcard for user sites)

  • registry.staging.bluedynamics.eu - Harbor container registry

All traffic routed through Traefik ingress with automatic HTTPS redirect.

For complete endpoint and port reference, see Endpoints & Ports.
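
The routing described above can be modelled as a small host table. This is a hedged sketch in plain TypeScript, not Traefik configuration: the backend names are hypothetical, the exact wildcard pattern for user sites is an assumption, and the wildcard here matches any subdomain depth.

```typescript
interface HostRule {
  hostPattern: string;
  backend: string; // hypothetical service name
}

const hostRules: HostRule[] = [
  { hostPattern: "gitlab.staging.bluedynamics.eu", backend: "gitlab-webservice" },
  { hostPattern: "pages.staging.bluedynamics.eu", backend: "gitlab-pages" },
  { hostPattern: "*.pages.staging.bluedynamics.eu", backend: "gitlab-pages" }, // assumed pattern
  { hostPattern: "registry.staging.bluedynamics.eu", backend: "harbor" },
];

// Exact match, or suffix match for patterns that start with "*.".
function matchesHost(pattern: string, hostname: string): boolean {
  if (pattern.indexOf("*.") !== 0) {
    return pattern === hostname;
  }
  const suffix = pattern.slice(1); // keeps the leading dot
  return hostname.length > suffix.length && hostname.slice(-suffix.length) === suffix;
}

// Return the backend for a hostname, or null when no rule matches.
function resolveBackend(hostname: string): string | null {
  const normalized = hostname.toLowerCase();
  for (let i = 0; i < hostRules.length; i++) {
    if (matchesHost(hostRules[i].hostPattern, normalized)) {
      return hostRules[i].backend;
    }
  }
  return null;
}

// Plain HTTP requests are redirected to the same host and path over HTTPS.
function httpsRedirect(scheme: string, hostname: string, path: string): string | null {
  return scheme === "http" ? "https://" + hostname + path : null;
}

console.log(resolveBackend("docs.pages.staging.bluedynamics.eu"));
console.log(httpsRedirect("http", "registry.staging.bluedynamics.eu", "/v2/"));
```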

Monitoring & Observability

Integrated with cluster monitoring:

  • Prometheus - Scrapes metrics via ServiceMonitor CRDs (30-day retention)

  • Grafana - Dashboards for GitLab, PostgreSQL, Harbor, Longhorn

  • Loki - Log aggregation via Alloy (31-day retention, S3 storage)

For monitoring strategy and configuration, see Monitoring & Observability.

Why This Architecture?

Cloud-Native Approach

Why operators instead of manual StatefulSets?

  • Best practices built-in: CNPG knows PostgreSQL better than we do

  • Automated operations: Backups, failover, pooling without custom scripts

  • Battle-tested: Used by thousands of production deployments

  • Less maintenance: Operator updates bring improvements automatically

Infrastructure as Code

Why CDK8S instead of raw YAML?

  • Type safety: Catch errors at compile-time, not deploy-time

  • Reusability: Constructs are composable and shareable

  • Tooling: IDE autocomplete, refactoring, linting

  • Maintainability: Functions and variables instead of copy-paste YAML

See CDK8S Approach for detailed rationale.
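
The maintainability point can be shown with a small helper that replaces several near-identical YAML documents. This sketch is plain TypeScript and does not use the project's constructs or any Crossplane API; the `BucketRequest` shape and the naming scheme are hypothetical, and the list covers only the bucket purposes named in this document, not all 8 buckets.

```typescript
interface BucketRequest {
  bucketName: string;
  purpose: string;
}

// One function call instead of one copied YAML document per bucket.
function bucketsFor(prefix: string, purposes: string[]): BucketRequest[] {
  return purposes.map(function (purpose) {
    return { bucketName: prefix + "-" + purpose, purpose: purpose };
  });
}

// Purposes mentioned in this document; the real deployment defines 8 buckets.
const bucketPurposes = ["artifacts", "uploads", "lfs", "pages", "backups", "registry"];
const bucketRequests = bucketsFor("gitlabbda", bucketPurposes);

console.log(
  bucketRequests.map(function (b) {
    return b.bucketName;
  })
);
```

Adding a bucket then means adding one string to the list, and a change to the naming scheme is made in one place.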

GitOps Deployment

Why ArgoCD instead of kubectl apply?

  • Audit trail: Git history shows who changed what and when

  • Rollback: git revert is instant and safe

  • Self-healing: ArgoCD reverts manual changes automatically

  • No credentials: No cluster access needed on developer machines

See ArgoCD GitOps Workflow for detailed explanation.

External PostgreSQL

Why CNPG instead of GitLab’s built-in PostgreSQL?

  • High availability: Automatic failover to the standby replica

  • Automated backups: Daily full backups plus continuous WAL archiving

  • Connection pooling: PgBouncer reduces connection overhead

  • Better monitoring: CNPG exports rich PostgreSQL metrics

See GitLab Components for details.