Storage Architecture for GitLab BDA¶
This document explains GitLab BDA’s specific storage tier assignments and S3 bucket strategy. For general storage architecture, tier selection criteria, and Longhorn storage classes, see Storage Architecture and Tiers.
Overview¶
GitLab BDA uses all three storage tiers to optimize cost, performance, and reliability for a small team (2-5 users):
| Component | Storage Tier | Storage Class | Size | Rationale |
|---|---|---|---|---|
| Gitaly (Git repos) | Hetzner Cloud Volumes | `hcloud-volumes` | 20Gi | Managed storage, daily S3 backups |
| PostgreSQL (2 instances) | Longhorn | `longhorn-redundant-app` | 10Gi each | CNPG provides replication |
| Redis | Longhorn | `longhorn` | 10Gi | Single instance needs storage HA |
| GitLab artifacts/uploads/etc. | Hetzner S3 | N/A | Variable | 8 buckets for different purposes |
Total cluster storage: 40Gi Longhorn + 20Gi Hetzner Volumes = 60Gi
Tier 1: Hetzner Cloud Volumes¶
Gitaly (Git Repository Storage)¶
Storage tier: Hetzner Cloud Volumes
Size: 20Gi
Storage class: hcloud-volumes
Why Hetzner Volumes instead of Longhorn?
Decision rationale:
Simplicity - Hetzner handles replication, no cluster overhead
Network-attached - Gitaly pod can reschedule freely between nodes
Backup-based redundancy - Daily GitLab backups to S3 (see below)
Avoid triple redundancy - Hetzner replication + Longhorn replication + S3 backups would be wasteful
Cost-effective - €0.05/GB/month managed storage
Key insight: GitLab’s daily backup job uploads all .git data to the `gitlab-backups-gitlabbda-kup6s` S3 bucket. Combined with Hetzner’s built-in volume redundancy, this provides adequate protection without needing Longhorn replication.
PVC configuration (from GitLab Helm chart):
```yaml
storageClass: hcloud-volumes
size: 20Gi
accessModes: [ReadWriteOnce]
```
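For orientation, here is a minimal cdk8s sketch of how these values could be passed to the upstream GitLab Helm chart. The value paths (`gitlab.gitaly.persistence.*`) follow the chart's conventions but are not copied from this repository's actual construct, so treat them as assumptions.

```typescript
// Minimal sketch (not the repo's actual code): rendering the GitLab chart with
// Gitaly persistence on Hetzner Cloud Volumes. Value paths are assumed from
// the upstream chart's conventions; verify against the deployed chart version.
import { App, Chart, Helm } from 'cdk8s';

const app = new App();
const chart = new Chart(app, 'gitlab');

new Helm(chart, 'gitlab-helm', {
  chart: 'gitlab',
  repo: 'https://charts.gitlab.io/',
  releaseName: 'gitlab',
  values: {
    gitlab: {
      gitaly: {
        persistence: {
          storageClass: 'hcloud-volumes', // Tier 1: network-attached Hetzner volume
          size: '20Gi',
        },
      },
    },
  },
});

app.synth();
```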
Tier 2: Longhorn Distributed Storage¶
PostgreSQL (CloudNativePG Cluster)¶
Storage tier: Longhorn
Size: 10Gi per instance × 2 instances = 20Gi total (logical)
Storage class: longhorn-redundant-app (1 Longhorn replica)
Why 1 replica when PostgreSQL is critical?
Decision rationale:
CNPG provides replication: 2 PostgreSQL instances with streaming replication
App-level redundancy: Primary instance replicates to standby instance
Avoid double redundancy: Each CNPG instance has its own PVC with 1 Longhorn replica
Result: 2 total copies of data (CNPG replication) + S3 backups (Barman)
Formula:
2 CNPG instances × 1 Longhorn replica each = 2 total copies
vs.
2 CNPG instances × 2 Longhorn replicas each = 4 total copies (wasteful!)
PVC configuration (from DatabaseConstruct):
```typescript
storage: {
  storageClass: 'longhorn-redundant-app', // 1 replica
  size: '10Gi', // Per instance
}
```
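For readers unfamiliar with CNPG, the sketch below shows the Cluster resource this configuration maps to, rendered with a plain cdk8s `ApiObject`. Only `instances` and `storage` are taken from the configuration above; the metadata name is hypothetical and the real DatabaseConstruct sets additional fields (bootstrap, resources, backups).

```typescript
// Minimal sketch of the CNPG Cluster corresponding to the DatabaseConstruct
// configuration above: 2 instances, each with its own 10Gi single-replica
// Longhorn PVC. The metadata.name is illustrative, not the real resource name.
import { App, Chart, ApiObject } from 'cdk8s';

const app = new App();
const chart = new Chart(app, 'gitlab-db');

new ApiObject(chart, 'postgres', {
  apiVersion: 'postgresql.cnpg.io/v1',
  kind: 'Cluster',
  metadata: { name: 'gitlab-postgres' }, // hypothetical name
  spec: {
    instances: 2, // primary + streaming-replication standby (app-level redundancy)
    storage: {
      storageClass: 'longhorn-redundant-app', // 1 Longhorn replica per instance PVC
      size: '10Gi',
    },
  },
});

app.synth();
```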
Backup strategy:
CNPG Barman Cloud Plugin - WAL archiving + base backups to the `gitlab-postgresbackups-gitlabbda-kup6s` S3 bucket
Longhorn volume backups - Snapshots to Hetzner Storage Box (CIFS)
See Storage Tiers for a detailed explanation of this pattern.
Redis Cache¶
Storage tier: Longhorn
Size: 10Gi
Storage class: longhorn (2 Longhorn replicas)
Why 2 replicas for Redis?
Decision rationale:
Single instance: Redis not clustered for 2-5 users (adequate performance)
Storage-level HA: 2 Longhorn replicas provide redundancy
Data locality: Best-effort placement (one replica on same node as pod for performance)
Tolerable data loss: Redis is cache layer, can be rebuilt if needed
PVC configuration (from RedisConstruct):
```yaml
storageClass: longhorn  # 2 replicas (default)
size: 10Gi
accessModes: [ReadWriteOnce]
```
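For context, the sketch below shows what a Longhorn StorageClass with this behaviour looks like, using parameters Longhorn documents (`numberOfReplicas`, `dataLocality`). The cluster's actual `longhorn` class may carry additional parameters; this is illustrative, not copied from the repo.

```typescript
// Illustrative Longhorn StorageClass with 2 replicas and best-effort data
// locality, matching the behaviour described above. The real class may set
// further parameters (e.g. staleReplicaTimeout).
import { App, Chart, ApiObject } from 'cdk8s';

const app = new App();
const chart = new Chart(app, 'storage');

new ApiObject(chart, 'longhorn-default', {
  apiVersion: 'storage.k8s.io/v1',
  kind: 'StorageClass',
  metadata: { name: 'longhorn' },
  provisioner: 'driver.longhorn.io',
  parameters: {
    numberOfReplicas: '2',       // storage-level HA for the single Redis instance
    dataLocality: 'best-effort', // keep one replica on the pod's node when possible
  },
  reclaimPolicy: 'Delete',
  allowVolumeExpansion: true,
});

app.synth();
```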
Tier 3: Hetzner S3 Object Storage¶
Eight S3 Buckets¶
All buckets in fsn1 region (Falkenstein - same as cluster for low latency):
| Bucket Name | Purpose | Typical Size |
|---|---|---|
| `gitlab-artifacts-gitlabbda-kup6s` | CI/CD artifacts (build outputs, test results) | Variable |
| | User uploads (images, attachments) | 1-5 GB |
| | Git LFS objects (large files tracked in Git) | 5-20 GB |
| | GitLab Pages static sites | 1-10 GB |
| | Harbor OCI container images | 10-50 GB |
| `gitlab-backups-gitlabbda-kup6s` | GitLab application backups (repos, DB, uploads) | 20-100 GB |
| `gitlab-postgresbackups-gitlabbda-kup6s` | PostgreSQL CNPG Barman WAL/base backups | 10-50 GB |
| | GitLab Runner build cache | Variable |
Total estimated: 50-200 GB (grows with usage)
For complete bucket specifications, see S3 Buckets Reference.
Why fsn1 Region for All Buckets?¶
Decision: All buckets in production region (fsn1) instead of spreading across regions.
Rationale:
Latency: Same datacenter as cluster = lowest upload/download latency
Bandwidth: No cross-region egress fees (Hetzner internal network)
Simplicity: Single Crossplane ProviderConfig endpoint
Trade-off: Backup buckets (`gitlab-backups-gitlabbda-kup6s`, `gitlab-postgresbackups-gitlabbda-kup6s`) are also in fsn1.
Alternative considered: Move backup buckets to hel1 (Helsinki) for geographic redundancy.
Why not multi-region for backups?
Hetzner S3 already replicates within region
Current scale (2-5 users) doesn’t justify complexity
Future improvement: Move backup buckets to hel1 when implementing DR strategy
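As a rough sketch of that future change, the definition below reuses the Crossplane `Bucket` shape shown under Bucket Provisioning in the next section; only the region differs, and given the "single ProviderConfig" point above, a second ProviderConfig targeting the hel1 endpoint would likely be needed as well.

```typescript
// Hypothetical future variant (not current state): the GitLab backup bucket in
// hel1 for geographic redundancy. Shape mirrors the Bucket example in the next
// section; a hel1-specific ProviderConfig would likely be required too.
new Bucket(this, 'backups', {
  metadata: { name: 'gitlab-backups-gitlabbda-kup6s' },
  spec: {
    forProvider: { region: 'hel1' }, // Helsinki instead of fsn1
    providerConfigRef: { name: 'hetzner-s3' }, // would need a hel1 endpoint variant
    managementPolicies: ['Observe', 'Create', 'Delete'],
    deletionPolicy: 'Orphan',
  },
});
```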
Bucket Provisioning¶
Via Crossplane (GitOps-managed):
```typescript
// charts/constructs/s3-buckets.ts
new Bucket(this, 'artifacts', {
  metadata: {
    name: 'gitlab-artifacts-gitlabbda-kup6s',
    annotations: { 'argocd.argoproj.io/sync-wave': '1' },
  },
  spec: {
    forProvider: { region: 'fsn1' },
    providerConfigRef: { name: 'hetzner-s3' },
    managementPolicies: ['Observe', 'Create', 'Delete'], // Skip Update
    deletionPolicy: 'Orphan', // Safety
  },
});
```
Why `managementPolicies: ['Observe', 'Create', 'Delete']`?
Skips Update operations (Hetzner S3 doesn’t support tagging)
Without this, buckets show `SYNCED=False` due to 501 Not Implemented errors
See S3 Bucket Architecture for details.
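Since all eight buckets share the same spec, the construct can be reduced to a small factory. The sketch below shows the idea; the helper name and import path are illustrative, not taken from the repo's actual code.

```typescript
// Sketch of a shared-spec factory for the eight buckets. Only metadata.name and
// the construct id vary per bucket. The real Bucket class comes from the repo's
// generated Crossplane imports; the path here is a placeholder.
import { Construct } from 'constructs';
import { Bucket } from '../imports/bucket'; // illustrative path

function hetznerBucket(scope: Construct, id: string, bucketName: string): Bucket {
  return new Bucket(scope, id, {
    metadata: {
      name: bucketName,
      annotations: { 'argocd.argoproj.io/sync-wave': '1' },
    },
    spec: {
      forProvider: { region: 'fsn1' },
      providerConfigRef: { name: 'hetzner-s3' },
      managementPolicies: ['Observe', 'Create', 'Delete'], // no Update: avoids 501s
      deletionPolicy: 'Orphan',
    },
  });
}

// Example: hetznerBucket(this, 'artifacts', 'gitlab-artifacts-gitlabbda-kup6s');
```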
Storage Allocation Summary¶
Block Storage¶
| Component | Hetzner Volumes | Longhorn | Replicas | Total |
|---|---|---|---|---|
| Gitaly | 20Gi | - | Hetzner-managed | 20Gi (billed) |
| PostgreSQL (2×) | - | 20Gi (logical) | 1 per instance | 20Gi (cluster) |
| Redis | - | 10Gi | 2 | 20Gi (cluster) |
| Subtotal | 20Gi | 40Gi | | 60Gi total |
Object Storage (S3)¶
| Purpose | Buckets | Estimated Size |
|---|---|---|
| Application data | 5 buckets (artifacts, uploads, LFS, pages, cache) | 20-100 GB |
| Container registry | 1 bucket | 10-50 GB |
| Backups | 2 buckets (GitLab, PostgreSQL) | 30-150 GB |
| Subtotal | 8 buckets | 60-300 GB |
Total storage cost estimate (monthly):
Hetzner Volumes: 20Gi × €0.05/GB = €1.00
Longhorn: Cluster overhead (included in node costs)
Hetzner S3: ~100GB × €0.005/GB = €0.50
Total: ~€1.50/month for storage
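As a quick sanity check of the arithmetic above (prices as quoted in this document; actual Hetzner billing granularity may differ):

```typescript
// Recomputes the monthly storage estimate from the quoted unit prices.
const hetznerVolumeGi = 20;    // Gitaly volume
const volumePricePerGb = 0.05; // EUR/GB/month
const s3UsageGb = 100;         // rough midpoint of current S3 usage
const s3PricePerGb = 0.005;    // EUR/GB/month

const monthly = hetznerVolumeGi * volumePricePerGb + s3UsageGb * s3PricePerGb;
console.log(`~€${monthly.toFixed(2)}/month`); // ~€1.50/month
```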
Backup Strategy: Defense in Depth¶
GitLab BDA has four backup layers:
1. Application-Level Replication¶
PostgreSQL: CNPG streaming replication (2 instances)
Redis: Single instance (no clustering needed for 2-5 users)
2. Storage-Level Redundancy¶
Gitaly: Hetzner Cloud Volumes (provider-managed)
PostgreSQL: Longhorn 1 replica per CNPG instance (2 total copies)
Redis: Longhorn 2 replicas
S3: Hetzner multi-datacenter replication
3. Snapshot Backups¶
Longhorn PVCs: Daily snapshots to Hetzner Storage Box (CIFS)
PostgreSQL: Continuous WAL archiving to S3 (Barman Cloud Plugin)
4. Application Backups¶
GitLab Toolbox: Daily full backup to the `gitlab-backups-gitlabbda-kup6s` S3 bucket
Includes: Git repositories, database dump, uploads, LFS, artifacts
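For reference, a hedged sketch of the Helm values that drive this daily backup. The value paths (`global.appConfig.backups.*`, `gitlab.toolbox.backups.cron.*`) follow the upstream chart's documented structure and the schedule shown is illustrative, so verify both against the deployed chart version.

```typescript
// Sketch of GitLab chart values for the daily toolbox backup. Value paths are
// assumed from upstream chart conventions; the schedule is illustrative.
export const backupValues = {
  global: {
    appConfig: {
      backups: {
        bucket: 'gitlab-backups-gitlabbda-kup6s', // daily full backups land here
      },
    },
  },
  gitlab: {
    toolbox: {
      backups: {
        cron: {
          enabled: true,
          schedule: '0 2 * * *', // illustrative: daily at 02:00
        },
      },
    },
  },
};
```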
Recovery Scenarios¶
| Scenario | Recovery Method | RTO | RPO |
|---|---|---|---|
| Pod crash | Kubernetes restart, same PVC | < 1 min | 0 |
| Node failure | Pod reschedule, Longhorn/CNPG replica | < 5 min | 0 |
| PVC corruption | Restore from Longhorn backup (CIFS) | < 30 min | 24h (daily) |
| Database corruption | Restore from Barman backup (S3, PITR) | < 1 hour | Minutes (WAL-based) |
| GitLab data loss | Restore from GitLab backup (S3) | < 4 hours | 24h (daily) |
| Cluster destroyed | Rebuild cluster, restore from S3/CIFS | < 1 day | 24h (daily) |
RTO: Recovery Time Objective
RPO: Recovery Point Objective
Resource Efficiency¶
Storage Optimization Comparison¶
Without optimization (naive 2-replica Longhorn for everything):
| Component | Size | Replicas | Total |
|---|---|---|---|
| Gitaly | 20Gi | 2 | 40Gi |
| PostgreSQL (2×) | 20Gi | 2 | 40Gi |
| Redis | 10Gi | 2 | 20Gi |
| Total | | | 100Gi |
With optimization (tiered storage + avoid double redundancy):
| Component | Storage | Total |
|---|---|---|
| Gitaly | Hetzner Volumes | 20Gi (billed) |
| PostgreSQL (2×) | Longhorn, 1 replica each | 20Gi (cluster) |
| Redis | Longhorn, 2 replicas | 20Gi (cluster) |
| Total | | 60Gi (40% savings) |
Key insight: Avoiding double redundancy (CNPG replication + Longhorn replication) saves 40Gi cluster storage.