# Storage Architecture and Tiers

This document explains kup6s's multi-tier storage architecture: why we use different storage technologies for different workloads, and how to choose the right storage tier for your data.

## Overview: Why Multiple Storage Tiers?

Kubernetes workloads have vastly different storage requirements:
| Data Type | Access Pattern | Needs | Best Fit |
|---|---|---|---|
| Databases | Random I/O, high IOPS | Performance, HA | Replicated block storage |
| Git repositories | Sequential, append-mostly | Reliability, backups | Simple block storage |
| Logs/Metrics | Write-heavy, time-series | Long retention, cost | Object storage |
| Artifacts/Uploads | Large files, read-heavy | Scale, low cost/GB | Object storage |
A single storage solution would be suboptimal for all of these use cases. kup6s therefore uses a three-tier architecture where each tier is optimized for specific workloads:

1. **Hetzner Cloud Volumes** - Managed network block storage (simple, reliable)
2. **Longhorn** - Self-managed replicated block storage (flexible, HA)
3. **Hetzner S3 Object Storage** - Scalable object storage (unlimited, cost-effective)
## Tier 1: Hetzner Cloud Volumes

### What It Is

Managed network block storage provided by Hetzner Cloud infrastructure:

- Provisioned via the Hetzner CSI driver
- Network-attached (not tied to specific nodes)
- Hetzner handles redundancy, snapshots, and availability
- Storage class: `hcloud-volumes`
### When to Use

Perfect for:

- **Workloads with built-in backup/redundancy** - Don't need Longhorn replication
- **Simple persistent storage needs** - Want Hetzner to handle availability
- **Network-attached storage** - Pods need to move freely between nodes

Example: Git repositories (Gitaly)

- Git repos are backed up daily to S3
- Hetzner Cloud Volumes provide reliability
- No need for Longhorn replication (that would be triple redundancy!)
### Characteristics

Pros:

- ✅ Managed by Hetzner (no cluster overhead)
- ✅ Network-attached (pod rescheduling works seamlessly)
- ✅ Hetzner SLA guarantees
- ✅ Cost-effective (€0.05/GB/month)

Cons:

- ❌ Less flexible than Longhorn (can't tune replica count)
- ❌ Hetzner-specific (vendor lock-in)
### Example PVC

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gitaly-data
  namespace: gitlabbda
spec:
  storageClassName: hcloud-volumes
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
```
## Tier 2: Longhorn Distributed Block Storage

### What It Is

Self-managed distributed block storage running on cluster nodes:

- Runs on all agent nodes (5 nodes across 2 regions)
- Replicates data across nodes for high availability
- Provides snapshot, backup, and restore capabilities
- Backs up to Hetzner Storage Box (CIFS)
### Longhorn Resilience Configuration

Since 2025-11-09, Longhorn manager pods use a custom health probe configuration to prevent CrashLoopBackOff during cluster operations.

The problem (default Helm chart):

- The default readiness probe allows only a 30-second startup window (too short for network initialization)
- No startup probe for slow webhook initialization during K3s upgrades
- No liveness probe, so pods never auto-recover from stuck states
- Pods can enter permanent CrashLoopBackOff during network disruptions

Custom probe solution:

- `startupProbe`: 5-minute grace period for webhook initialization during cluster disruptions
- `livenessProbe`: Automatic recovery from stuck states (90-second tolerance)
- `readinessProbe`: Enhanced timeout (5s) to handle network latency
**Why this matters:** During K3s upgrades or network disruptions, Longhorn manager pods can experience a "chicken-and-egg" problem where the pod's readiness probe checks its own admission webhook endpoint, but network timing issues prevent the endpoint from becoming available within the default 30-second window. This permanently marks the pod as NotReady, preventing it from joining the Service that would make it reachable.
**Implementation:**

- Applied via strategic merge patch: `extra-manifests/40-G-longhorn-manager-probes-patch.yaml.tpl` (an illustrative sketch follows below)
- Automatically deployed with the infrastructure
- See Longhorn Resilience Configuration for a technical deep-dive
- See Troubleshoot Longhorn Manager CrashLoopBackOff for diagnostics
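For illustration, a strategic merge patch along these lines could express the probe settings described above. This is a minimal sketch, not the contents of the actual `.tpl` file; the probe endpoint, port, scheme, and exact threshold values are assumptions.

```yaml
# Illustrative sketch only - the real patch lives in
# extra-manifests/40-G-longhorn-manager-probes-patch.yaml.tpl.
# Endpoint path/port/scheme and threshold values are assumptions.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: longhorn-manager
  namespace: longhorn-system
spec:
  template:
    spec:
      containers:
        - name: longhorn-manager
          startupProbe:             # 30 x 10s = up to 5 minutes of grace
            httpGet:
              path: /v1/healthz
              port: 9501
              scheme: HTTPS
            periodSeconds: 10
            failureThreshold: 30
          livenessProbe:            # 3 x 30s = roughly 90-second tolerance
            httpGet:
              path: /v1/healthz
              port: 9501
              scheme: HTTPS
            periodSeconds: 30
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /v1/healthz
              port: 9501
              scheme: HTTPS
            timeoutSeconds: 5       # enhanced timeout for network latency
```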
### Three Storage Classes

Longhorn provides three storage classes with different replica strategies:

#### 1. `longhorn-redundant-app` (1 replica)

When to use:

- Applications with built-in replication (PostgreSQL clusters, Redis clusters, Kafka)
- App-level redundancy eliminates the need for storage redundancy
- Minimize cost for already-replicated data

Example: PostgreSQL with CNPG (3 replicas)

- PostgreSQL: 3 database replicas = 3x data redundancy
- Storage: 1 Longhorn replica = 1x storage
- Total: 3 copies of data (efficient!)

Without this optimization:

- PostgreSQL: 3 database replicas
- Storage: 2 Longhorn replicas per DB
- Total: 6 copies of data (wasteful!)

Characteristics:

- Data locality: `best-effort` (replica on the same node as the pod for performance)
- Cost: Lowest (1x storage per PVC)
- Protection: App-level replication provides redundancy
#### 2. `longhorn` (2 replicas) - DEFAULT

When to use:

- General-purpose storage for single-instance applications
- Good balance of cost vs. reliability
- Most common use case

Example: Prometheus (no built-in replication)

- Prometheus: 1 instance (or 2 for HA, but with separate data)
- Storage: 2 Longhorn replicas
- Protection: Survives 1 node failure

Characteristics:

- Data locality: `best-effort` (one replica local, one remote)
- Cost: Medium (2x storage per PVC)
- Protection: Survives 1 node failure
- Default: Used when no `storageClassName` is specified
#### 3. `longhorn-ha` (3 replicas)

When to use:

- Mission-critical data requiring maximum availability
- Data that absolutely cannot be lost
- Willing to pay 3x storage cost

Examples:

- Critical backups
- Audit logs
- Compliance data

Characteristics:

- Data locality: `disabled` (maximize scheduling flexibility)
- Cost: Highest (3x storage per PVC)
- Protection: Survives 2 simultaneous node failures
- Use sparingly: Most apps don't need this level of protection
### Data Locality Explained

`best-effort` (used by the 1- and 2-replica classes):

- Longhorn tries to place one replica on the same node as the pod
- Faster reads/writes (local disk access)
- Falls back to remote placement if local placement is impossible

`disabled` (used by the 3-replica class):

- Replicas are distributed purely by available space
- Maximum scheduling flexibility
- Better for HA scenarios (replicas spread across more nodes)
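Both knobs are ordinary Longhorn StorageClass parameters. A minimal sketch of how the `longhorn-redundant-app` class could be defined; `numberOfReplicas` and `dataLocality` are standard Longhorn CSI parameters, but the exact values in the cluster's class definitions may differ:

```yaml
# Sketch of a 1-replica Longhorn StorageClass with best-effort locality.
# The reclaim/expansion settings shown here are assumptions.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-redundant-app
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "1"         # app-level replication provides redundancy
  dataLocality: "best-effort"   # keep one replica on the pod's node when possible
```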
### Choosing the Right Storage Class

```text
Does your app replicate data internally?
├─ Yes (DB cluster, Redis cluster, Kafka)
│   └─ Use: longhorn-redundant-app (1 replica)
│       Reason: App replication = redundancy already
│
└─ No (single instance or stateless)
    ├─ Is the data mission-critical? Cannot be lost under any circumstances?
    │   └─ Yes → Use: longhorn-ha (3 replicas)
    │       Reason: Maximum protection
    │
    └─ No → Use: longhorn (2 replicas, default)
        Reason: Good balance of cost vs. reliability
```
### Storage Cost Calculation

Example: An app requests a 10Gi PVC.

| Storage Class | Replicas | Actual Storage Used | Cost Factor |
|---|---|---|---|
| `longhorn-redundant-app` | 1 | 10Gi | 1x |
| `longhorn` | 2 | 20Gi | 2x |
| `longhorn-ha` | 3 | 30Gi | 3x |
Cluster-wide impact:

- 10 PVCs × 10Gi each = 100Gi requested
- With `longhorn` (default): 200Gi of actual storage used
- With `longhorn-redundant-app` where appropriate: 100-150Gi (savings!)
### Example PVCs

```yaml
# PostgreSQL with CNPG (3 replicas = built-in redundancy)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data-myapp
  namespace: myapp
spec:
  storageClassName: longhorn-redundant-app # 1 replica
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
# Prometheus (single instance, needs storage HA)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-data
  namespace: monitoring
spec:
  storageClassName: longhorn # 2 replicas (default)
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
---
# Critical audit logs (cannot lose)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: audit-logs
  namespace: compliance
spec:
  storageClassName: longhorn-ha # 3 replicas
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```
### Longhorn Backup Strategy

Backup target: Hetzner Storage Box (CIFS/SMB)

- Daily automatic backups via a RecurringJob
- Backup target: `cifs://u233601-sub2.your-storagebox.de/u233601-sub2`
- Credentials stored in `TF_VAR_longhorn_cifs_*` environment variables

**CRITICAL configuration notes:**

- The backup target URL MUST include the `cifs://` protocol prefix
- Example: `cifs://u233601-sub2.your-storagebox.de/u233601-sub2`
- Without `cifs://`: the Longhorn manager goes into CrashLoopBackOff
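A daily backup RecurringJob might look roughly like this. This is a sketch only; the actual job name, schedule, retention, and concurrency used in the cluster are assumptions:

```yaml
# Sketch of a daily Longhorn backup RecurringJob.
# Name, cron schedule, retention, and concurrency values are assumptions.
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: daily-backup
  namespace: longhorn-system
spec:
  cron: "0 2 * * *"    # every day at 02:00
  task: backup         # write a backup to the configured backup target
  groups:
    - default          # applies to volumes in the default group
  retain: 7            # keep the last 7 backups per volume
  concurrency: 2       # back up at most 2 volumes at a time
```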
## Tier 3: Hetzner S3 Object Storage

### What It Is

Managed S3-compatible object storage provided by Hetzner:

- S3-compatible API (works with AWS SDKs and tools)
- Regional endpoints (fsn1, nbg1, hel1)
- Lifecycle policies for automatic expiration
- Cost-effective for large-scale data
### Regional Strategy

Two S3 regions serve different purposes:

**Disaster recovery region (hel1 - Helsinki)**

- Purpose: etcd backup storage
- Bucket: `kup6s-etcd-backups`
- Endpoint: `hel1.your-objectstorage.com` (no `https://`)
- Rationale: Geographic redundancy - separate from the cluster region

**Production region (fsn1 - Falkenstein)**

- Purpose: Application data (Loki logs, Thanos metrics, GitLab artifacts)
- Endpoint: `https://fsn1.your-objectstorage.com`
- Rationale: Same region as the cluster for lower latency
### When to Use S3

Perfect for:

- **Large files** - Artifacts, uploads, container images
- **Long-term data** - Logs, metrics, backups
- **Unlimited scale** - No capacity planning needed
- **Infrequent access** - Cheaper per GB than block storage

Examples:

- GitLab artifacts, uploads, LFS, pages, registry, backups
- Prometheus/Thanos long-term metrics (730-day retention)
- Loki log chunks (90-day retention)
- PostgreSQL Barman backups
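As an example of how an application consumes this tier, Loki's storage section could point at the production-region bucket roughly like this. A sketch only, not the cluster's actual Loki configuration; the credential placeholders assume environment-variable expansion is enabled:

```yaml
# Sketch of Loki storage_config targeting Hetzner S3 (fsn1).
# Credential placeholders (expanded with -config.expand-env=true)
# and the exact bucket wiring are assumptions.
storage_config:
  aws:
    endpoint: fsn1.your-objectstorage.com
    bucketnames: logs-loki-kup6s
    access_key_id: ${S3_ACCESS_KEY_ID}
    secret_access_key: ${S3_SECRET_ACCESS_KEY}
    s3forcepathstyle: true   # path-style addressing for S3-compatible stores
```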
### Bucket Naming Convention

**CRITICAL:** Hetzner S3 bucket names are globally unique across all Hetzner customers.

Naming schema: `{localpart}-{namespace}-kup6s`

Examples:

- Infrastructure: `backup-etcd-kup6s`, `logs-loki-kup6s`, `metrics-thanos-kup6s`
- GitLab BDA: `gitlab-artifacts-gitlabbda-kup6s`, `gitlab-uploads-gitlabbda-kup6s`

Rule: Always suffix bucket names with `-kup6s` to avoid collisions.

See S3 Bucket Naming for the complete patterns.
### Bucket Provisioning

Via Crossplane (automated, GitOps-friendly):

```yaml
apiVersion: s3.aws.upbound.io/v1beta1
kind: Bucket
metadata:
  name: data-myapp-kup6s
spec:
  deletionPolicy: Delete
  managementPolicies:
    - Observe
    - Create
    - Delete
  # Skip Update to avoid tagging operations (Hetzner doesn't support them)
  forProvider:
    region: fsn1 # Use Hetzner region codes
  providerConfigRef:
    name: hetzner-s3
```

**CRITICAL management policies:**

- Always include `managementPolicies: [Observe, Create, Delete]`
- This skips Update operations (Hetzner doesn't support tagging)
- Without it, buckets show `SYNCED=False` due to tagging errors
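The `hetzner-s3` provider config referenced above is what points the AWS provider at Hetzner's endpoint. A sketch of what that could look like; the secret name/key wiring and the endpoint options shown are assumptions about this cluster's setup:

```yaml
# Sketch of a Crossplane AWS ProviderConfig aimed at Hetzner S3.
# Secret name/key and the endpoint settings are assumptions.
apiVersion: aws.upbound.io/v1beta1
kind: ProviderConfig
metadata:
  name: hetzner-s3
spec:
  credentials:
    source: Secret
    secretRef:
      namespace: crossplane-system
      name: hetzner-s3-credentials
      key: creds
  endpoint:
    hostnameImmutable: true          # don't rewrite the hostname per service
    url:
      type: Static
      static: https://fsn1.your-objectstorage.com
```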
## Storage Decision Matrix

| Workload Type | Hetzner Volumes | Longhorn | S3 | Rationale |
|---|---|---|---|---|
| Git repositories | ✅ | ❌ | Backups only | Simple, backed up to S3 |
| PostgreSQL (clustered) | ❌ | ✅ (1 replica) | Backups | App has replication |
| PostgreSQL (single) | ❌ | ✅ (2 replicas) | Backups | Needs storage HA |
| Redis (clustered) | ❌ | ✅ (1 replica) | ❌ | App has replication |
| Redis (single) | ❌ | ✅ (2 replicas) | ❌ | Needs storage HA |
| Prometheus | ❌ | ✅ (2 replicas) | ✅ Long-term | Short-term local, long-term in S3 |
| Loki | ❌ | ✅ (2 replicas) | ✅ Chunks | WAL local, chunks in S3 |
| GitLab artifacts | ❌ | ❌ | ✅ | Large files, unlimited scale |
| Container registry | ❌ | ❌ | ✅ | Image layers, blob storage |
| Backups | ❌ | ❌ | ✅ | Long-term, cost-effective |
## Performance Characteristics

| Storage Tier | Read Latency | Write Latency | IOPS | Throughput | Cost/GB/month |
|---|---|---|---|---|---|
| Longhorn (local) | <5ms | <10ms | Thousands | 160-320 MB/s | ~€0.10 (cluster overhead) |
| Hetzner Volumes | ~10ms | ~10ms | Hundreds | 100 MB/s | €0.05 |
| Hetzner S3 | 50-200ms | 100-300ms | Unlimited | 1-10 Gbps | €0.005 |

Takeaway: Use block storage (Longhorn/Volumes) for performance, S3 for scale and cost.
## Backup Philosophy

### Defense in Depth

1. **Application-level backups** - GitLab backups, PostgreSQL Barman
2. **Storage-level redundancy** - Longhorn replication across nodes
3. **Off-cluster backups** - Longhorn to CIFS, PostgreSQL/GitLab to S3
4. **Geographic separation** - etcd backups in a different region (hel1)

### Avoid Double Redundancy

Anti-pattern:

- PostgreSQL: 3 CNPG replicas (3x data)
- Storage: 2 Longhorn replicas per instance (2x)
- Total: 6 copies of data (wasteful!)

Better:

- PostgreSQL: 3 CNPG replicas (3x data)
- Storage: 1 Longhorn replica per instance (1x)
- Barman backups to S3 (off-cluster protection)
- Total: 3 copies + S3 backup (efficient!)
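In CNPG terms, the "better" pattern combines the 1-replica storage class with Barman backups to S3. A minimal sketch; the cluster name, bucket path, and secret wiring are assumptions:

```yaml
# Sketch of a CNPG Cluster using longhorn-redundant-app plus Barman-to-S3.
# Cluster name, bucket path, and secret names are assumptions.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: myapp-db
  namespace: myapp
spec:
  instances: 3                            # app-level replication (3x data)
  storage:
    size: 10Gi
    storageClass: longhorn-redundant-app  # 1 Longhorn replica per instance
  backup:
    barmanObjectStore:
      destinationPath: s3://postgres-myapp-kup6s/barman
      endpointURL: https://fsn1.your-objectstorage.com
      s3Credentials:
        accessKeyId:
          name: myapp-db-s3
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: myapp-db-s3
          key: SECRET_ACCESS_KEY
```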
## Cost Optimization Strategies

### 1. Use Appropriate Storage Classes

Example cluster with 5 PostgreSQL databases (10Gi each):

| Approach | Storage Class | Total Longhorn | Savings |
|---|---|---|---|
| Default (all 2 replicas) | `longhorn` | 100Gi | Baseline |
| Optimized (1 replica) | `longhorn-redundant-app` | 50Gi | 50% savings |
### 2. Reduce Retention Where Possible

Prometheus example:

- Before Thanos: 7-day local retention = 6Gi PVC
- After Thanos: 3-day local retention = 3Gi PVC
- Long-term data offloaded to S3 (cheaper)
- Savings: 50% of Longhorn storage
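If Prometheus is deployed via the kube-prometheus-stack Helm chart, these knobs live in the chart values, roughly like this. A sketch only; the chart in use and the exact values are assumptions:

```yaml
# Sketch of kube-prometheus-stack values: short local retention,
# long-term storage handled by Thanos. Values are assumptions.
prometheus:
  prometheusSpec:
    retention: 3d                     # short local window; Thanos keeps the rest
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: longhorn  # 2 replicas (default)
          resources:
            requests:
              storage: 3Gi
```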
### 3. Use S3 for Large/Infrequent Data

| Storage | Cost/GB/month | Use Case |
|---|---|---|
| Longhorn | ~€0.10 | Active databases, hot data |
| S3 | ~€0.005 | Archives, backups, artifacts |
| **Savings** | **20x cheaper** | Move appropriate data to S3 |
### 4. Right-Size PVCs

Don't overprovision:

- Monitor actual usage (e.g., via the Longhorn UI, which reports per-volume usage)
- Start small, grow as needed
- Longhorn PVCs can be expanded (but not shrunk), as shown below
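Expansion works by raising the request on an existing PVC; because Longhorn's storage classes allow volume expansion, Kubernetes resizes the volume in place. A sketch, reusing the Prometheus PVC from the earlier examples:

```yaml
# Growing an existing PVC: re-apply it with a larger request.
# Requires allowVolumeExpansion: true on the StorageClass
# (Longhorn supports this). Requests can grow but never shrink.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-data
  namespace: monitoring
spec:
  storageClassName: longhorn
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 6Gi   # was 3Gi
```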