Explanation

Storage Architecture for GitLab BDA

This document explains GitLab BDA’s specific storage tier assignments and S3 bucket strategy. For general storage architecture, tier selection criteria, and Longhorn storage classes, see Storage Architecture and Tiers.

Overview

GitLab BDA uses all three storage tiers to optimize cost, performance, and reliability for a small team (2-5 users):

| Component | Storage Tier | Storage Class | Size | Rationale |
|---|---|---|---|---|
| Gitaly (Git repos) | Hetzner Cloud Volumes | hcloud-volumes | 20Gi | Managed storage, daily S3 backups |
| PostgreSQL (2 instances) | Longhorn | longhorn-redundant-app | 10Gi each | CNPG provides replication |
| Redis | Longhorn | longhorn | 10Gi | Single instance needs storage HA |
| GitLab artifacts/uploads/etc. | Hetzner S3 | N/A | Variable | 8 buckets for different purposes |

Total cluster storage: 40Gi Longhorn + 20Gi Hetzner Volumes = 60Gi

Tier 1: Hetzner Cloud Volumes

Gitaly (Git Repository Storage)

Storage tier: Hetzner Cloud Volumes
Size: 20Gi
Storage class: hcloud-volumes

Why Hetzner Volumes instead of Longhorn?

Decision rationale:

  1. Simplicity - Hetzner handles replication, no cluster overhead

  2. Network-attached - Gitaly pod can reschedule freely between nodes

  3. Backup-based redundancy - Daily GitLab backups to S3 (see below)

  4. Avoid triple redundancy - Hetzner replication + Longhorn replication + S3 backups would be wasteful

  5. Cost-effective - €0.05/GB/month managed storage

Key insight: GitLab’s daily backup job uploads all Git repository data to the gitlab-backups-gitlabbda-kup6s S3 bucket. Combined with Hetzner’s built-in volume redundancy, this provides adequate protection without needing Longhorn replication.

PVC configuration (from GitLab Helm chart):

storageClass: hcloud-volumes
size: 20Gi
accessModes: [ReadWriteOnce]
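
For reference, these settings sit under gitlab.gitaly.persistence in the chart's values. A minimal sketch, assuming the standard GitLab chart layout (only the persistence keys are shown; everything else is omitted):

# Sketch of the relevant Helm values for Gitaly persistence.
gitlab:
  gitaly:
    persistence:
      storageClass: hcloud-volumes   # Tier 1: network-attached Hetzner Cloud Volume
      size: 20Gi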

Tier 2: Longhorn Distributed Storage

PostgreSQL (CloudNativePG Cluster)

Storage tier: Longhorn
Size: 10Gi per instance × 2 instances = 20Gi total (logical)
Storage class: longhorn-redundant-app (1 Longhorn replica)

Why 1 replica when PostgreSQL is critical?

Decision rationale:

  • CNPG provides replication: 2 PostgreSQL instances with streaming replication

  • App-level redundancy: Primary instance replicates to standby instance

  • Avoid double redundancy: Each CNPG instance has its own PVC with 1 Longhorn replica

  • Result: 2 total copies of data (CNPG replication) + S3 backups (Barman)

Formula:

2 CNPG instances × 1 Longhorn replica each = 2 total copies
vs.
2 CNPG instances × 2 Longhorn replicas each = 4 total copies (wasteful!)

PVC configuration (from DatabaseConstruct):

storage: {
  storageClass: 'longhorn-redundant-app',  // 1 replica
  size: '10Gi',  // Per instance
}

Backup strategy:

  1. CNPG Barman Cloud Plugin - WAL archiving + base backups to gitlab-postgresbackups-gitlabbda-kup6s S3 bucket

  2. Longhorn volume backups - Snapshots to Hetzner Storage Box (CIFS)

See Storage Tiers for detailed explanation of this pattern.
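
Putting the pieces together, the CNPG Cluster resource is where both decisions land: instances: 2 provides app-level replication, and the storage stanza pins each instance's PVC to the 1-replica Longhorn class. A minimal sketch assuming CNPG's standard Cluster schema; the cluster name, credentials secret, and endpoint are illustrative, and the in-tree barmanObjectStore stanza is shown only to illustrate the S3 wiring (the actual setup uses the Barman Cloud Plugin, which moves this configuration into its own resource):

# Illustrative CNPG Cluster; names, secrets, and endpoint are assumptions.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: gitlab-postgres                      # hypothetical name
spec:
  instances: 2                               # primary + streaming-replication standby
  storage:
    storageClass: longhorn-redundant-app     # 1 Longhorn replica per instance
    size: 10Gi
  backup:
    barmanObjectStore:                       # shown for illustration of the S3 wiring
      destinationPath: s3://gitlab-postgresbackups-gitlabbda-kup6s/
      endpointURL: https://fsn1.your-objectstorage.com   # assumed Hetzner endpoint
      s3Credentials:
        accessKeyId:
          name: s3-credentials               # hypothetical secret
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: s3-credentials
          key: SECRET_ACCESS_KEY
      wal:
        compression: gzip                    # continuous WAL archiving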

Redis Cache

Storage tier: Longhorn
Size: 10Gi
Storage class: longhorn (2 Longhorn replicas)

Why 2 replicas for Redis?

Decision rationale:

  • Single instance: Redis not clustered for 2-5 users (adequate performance)

  • Storage-level HA: 2 Longhorn replicas provide redundancy

  • Data locality: Best-effort placement (one replica on same node as pod for performance)

  • Tolerable data loss: Redis is cache layer, can be rebuilt if needed

PVC configuration (from RedisConstruct):

storageClass: longhorn  # 2 replicas (default)
size: 10Gi
accessModes: [ReadWriteOnce]
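
The replica counts behind both Tier 2 classes come from Longhorn's StorageClass parameters. A minimal sketch of what the two classes likely look like (parameter values are assumptions based on the replica counts described here, not copied from the cluster):

# Illustrative StorageClass definitions for the two Longhorn classes referenced above.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-redundant-app     # for apps that replicate themselves (CNPG)
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "1"
  dataLocality: "best-effort"
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn                   # default class; storage-level HA (Redis)
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "2"
  dataLocality: "best-effort"      # keeps one replica on the pod's node when possible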

Tier 3: Hetzner S3 Object Storage

Eight S3 Buckets

All buckets are in the fsn1 region (Falkenstein, the same location as the cluster, for low latency):

| Bucket Name | Purpose | Typical Size |
|---|---|---|
| gitlab-artifacts-gitlabbda-kup6s | CI/CD artifacts (build outputs, test results) | Variable |
| gitlab-uploads-gitlabbda-kup6s | User uploads (images, attachments) | 1-5 GB |
| gitlab-lfs-gitlabbda-kup6s | Git LFS objects (large files tracked in git) | 5-20 GB |
| gitlab-pages-gitlabbda-kup6s | GitLab Pages static sites | 1-10 GB |
| gitlab-registry-gitlabbda-kup6s | Harbor OCI container images | 10-50 GB |
| gitlab-backups-gitlabbda-kup6s | GitLab application backups (repos, DB, uploads) | 20-100 GB |
| gitlab-postgresbackups-gitlabbda-kup6s | PostgreSQL CNPG Barman WAL/base backups | 10-50 GB |
| gitlab-cache-gitlabbda-kup6s | GitLab Runner build cache | Variable |

Total estimated: 50-200 GB (grows with usage)

For complete bucket specifications, see S3 Buckets Reference.

Why fsn1 Region for All Buckets?

Decision: All buckets in production region (fsn1) instead of spreading across regions.

Rationale:

  • Latency: Same datacenter as cluster = lowest upload/download latency

  • Bandwidth: No cross-region egress fees (Hetzner internal network)

  • Simplicity: Single Crossplane ProviderConfig endpoint

Trade-off: Backup buckets (gitlab-backups-gitlabbda-kup6s, gitlab-postgresbackups-gitlabbda-kup6s) also live in fsn1, so they share a failure domain with the primary data.

Alternative considered: Move backup buckets to hel1 (Helsinki) for geographic redundancy.

Why not multi-region for backups?

  • Hetzner S3 already replicates within region

  • Current scale (2-5 users) doesn’t justify complexity

  • Future improvement: Move backup buckets to hel1 when implementing DR strategy

Bucket Provisioning

Via Crossplane (GitOps-managed):

// charts/constructs/s3-buckets.ts
new Bucket(this, 'artifacts', {
  metadata: {
    name: 'gitlab-artifacts-gitlabbda-kup6s',
    annotations: { 'argocd.argoproj.io/sync-wave': '1' },
  },
  spec: {
    forProvider: { region: 'fsn1' },
    providerConfigRef: { name: 'hetzner-s3' },
    managementPolicies: ['Observe', 'Create', 'Delete'],  // Skip Update
    deletionPolicy: 'Orphan',  // Safety
  },
});
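
Rendered by cdk8s, the construct above produces a managed-resource manifest along these lines. The apiVersion is an assumption (an S3-compatible Crossplane provider such as Upbound's provider-aws-s3 pointed at Hetzner's endpoint); everything else mirrors the construct:

# Illustrative rendered manifest; apiVersion assumed, fields mirror the construct above.
apiVersion: s3.aws.upbound.io/v1beta1
kind: Bucket
metadata:
  name: gitlab-artifacts-gitlabbda-kup6s
  annotations:
    argocd.argoproj.io/sync-wave: "1"
spec:
  forProvider:
    region: fsn1
  providerConfigRef:
    name: hetzner-s3
  managementPolicies: ["Observe", "Create", "Delete"]   # skip Update (tagging unsupported)
  deletionPolicy: Orphan                                # keep the bucket if the CR is deleted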

Why managementPolicies: [Observe, Create, Delete]?

  • Skips Update operations (Hetzner S3 doesn’t support tagging)

  • Without this: buckets show SYNCED=False due to 501 Not Implemented errors

See S3 Bucket Architecture for details.

Storage Allocation Summary

Block Storage

| Component | Hetzner Volumes | Longhorn | Replicas | Total |
|---|---|---|---|---|
| Gitaly | 20Gi | - | Hetzner-managed | 20Gi (billed) |
| PostgreSQL (2×) | - | 20Gi (logical) | 1 per instance | 20Gi (cluster) |
| Redis | - | 10Gi | 2 | 20Gi (cluster) |
| Subtotal | 20Gi | 40Gi (with replicas) | - | 60Gi |

Object Storage (S3)

| Purpose | Buckets | Estimated Size |
|---|---|---|
| Application data | 5 buckets (artifacts, uploads, LFS, pages, cache) | 20-100 GB |
| Container registry | 1 bucket | 10-50 GB |
| Backups | 2 buckets (GitLab, PostgreSQL) | 30-150 GB |
| Subtotal | 8 buckets | 60-300 GB |

Total storage cost estimate (monthly):

  • Hetzner Volumes: 20Gi × €0.05/GB = €1.00

  • Longhorn: Cluster overhead (included in node costs)

  • Hetzner S3: ~100GB × €0.005/GB = €0.50

  • Total: ~€1.50/month for storage

Backup Strategy: Defense in Depth

GitLab BDA has four backup layers:

1. Application-Level Replication

  • PostgreSQL: CNPG streaming replication (2 instances)

  • Redis: Single instance (no clustering needed for 2-5 users)

2. Storage-Level Redundancy

  • Gitaly: Hetzner Cloud Volumes (provider-managed)

  • PostgreSQL: Longhorn 1 replica per CNPG instance (2 total copies)

  • Redis: Longhorn 2 replicas

  • S3: Hetzner multi-datacenter replication

3. Snapshot Backups

  • Longhorn PVCs: Daily snapshots to Hetzner Storage Box (CIFS)

  • PostgreSQL: Continuous WAL archiving to S3 (Barman Cloud Plugin)
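
On the Longhorn side this is typically expressed as a RecurringJob with task: backup, with the cluster-wide backup target pointing at the Storage Box over CIFS. A sketch under those assumptions (job name, schedule, retention, and share path are illustrative):

# Assumed RecurringJob; the backup target itself is a Longhorn setting such as
# cifs://<storage-box-host>/<share> (host and share are placeholders).
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: daily-volume-backup
  namespace: longhorn-system
spec:
  task: backup           # export a snapshot to the configured backup target
  cron: "0 3 * * *"      # assumed: daily at 03:00
  groups:
    - default            # applies to volumes in the default group
  retain: 7              # assumed retention: one week
  concurrency: 2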

4. Application Backups

  • GitLab Toolbox: Daily full backup to gitlab-backups-gitlabbda-kup6s S3 bucket

    • Includes: Git repositories, database dump, uploads, LFS, artifacts
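
In Helm values terms, the daily job is the toolbox backup CronJob wired to the backups bucket. A minimal sketch (the bucket name is taken from this document; the schedule is an assumption, and secret/connection wiring is omitted):

# Sketch of the relevant GitLab chart values for the daily application backup.
global:
  appConfig:
    backups:
      bucket: gitlab-backups-gitlabbda-kup6s
gitlab:
  toolbox:
    backups:
      cron:
        enabled: true
        schedule: "0 2 * * *"   # assumed: daily at 02:00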

Recovery Scenarios

| Scenario | Recovery Method | RTO | RPO |
|---|---|---|---|
| Pod crash | Kubernetes restart, same PVC | < 1 min | 0 |
| Node failure | Pod reschedule, Longhorn/CNPG replica | < 5 min | 0 |
| PVC corruption | Restore from Longhorn backup (CIFS) | < 30 min | 24h (daily) |
| Database corruption | Restore from Barman backup (S3, PITR) | < 1 hour | Minutes (WAL-based) |
| GitLab data loss | Restore from GitLab backup (S3) | < 4 hours | 24h (daily) |
| Cluster destroyed | Rebuild cluster, restore from S3/CIFS | < 1 day | 24h (daily) |

RTO = Recovery Time Objective; RPO = Recovery Point Objective.

Resource Efficiency

Storage Optimization Comparison

Without optimization (naive 2-replica Longhorn for everything):

| Component | Size | Replicas | Total |
|---|---|---|---|
| Gitaly | 20Gi | 2 | 40Gi |
| PostgreSQL (2×) | 20Gi | 2 | 40Gi |
| Redis | 10Gi | 2 | 20Gi |
| Total | - | - | 100Gi |

With optimization (tiered storage + avoid double redundancy):

| Component | Storage | Total |
|---|---|---|
| Gitaly | Hetzner Volumes | 20Gi (billed) |
| PostgreSQL (2×) | Longhorn, 1 replica each | 20Gi (cluster) |
| Redis | Longhorn, 2 replicas | 20Gi (cluster) |
| Total | - | 60Gi (40% savings) |

Key insight: Avoiding double redundancy (CNPG replication + Longhorn replication) saves 40Gi cluster storage.