Storage Architecture for Monitoring Stack¶
This document explains the monitoring stack’s specific storage allocations and S3 bucket strategy. For general storage architecture, tier selection criteria, and Longhorn storage classes, see Storage Architecture and Tiers.
Overview¶
The monitoring stack uses Longhorn PVCs for short-term data and S3 for long-term storage, optimizing cost and performance:
| Component | Storage Tier | Storage Class | Size | Purpose |
|---|---|---|---|---|
| Prometheus | Longhorn + S3 | `longhorn` | 3Gi × 2 | 3-day local, 2-year S3 (via Thanos) |
| Thanos Store | Longhorn + S3 | `longhorn` | 10Gi × 2 | Index cache for S3 queries |
| Thanos Compactor | Longhorn + S3 | `longhorn` | 20Gi × 1 | Compaction workspace |
| Loki (Write/Read/Backend) | Longhorn + S3 | `longhorn` | 5Gi each × 6 | WAL + cache, chunks in S3 |
| Grafana | Longhorn | `longhorn` | 5Gi × 1 | Dashboards, settings |
Total Longhorn storage: 81Gi logical (162Gi actual with 2× Longhorn replication)
S3 buckets: 2 buckets (metrics-thanos-kup6s, logs-loki-kup6s)
Storage Allocation by Component¶
Prometheus (StatefulSet, 2 replicas)¶
Local Storage (Longhorn):
- Size: 3Gi per replica
- Total: 6Gi Longhorn (2 replicas)
- Retention: 3 days local
- Storage class: `longhorn` (2 Longhorn replicas for HA)
Why 3Gi?
- Metrics ingestion: ~100k samples/sec
- Storage rate: ~500MB/day compressed (TSDB 3:1 compression)
- 3 days × 500MB = 1.5GB, plus 100% headroom = 3GB
- Optimization: Originally 6Gi with 7-day retention; reduced once the Thanos integration began offloading historical data to S3.
S3 Offload (via Thanos Sidecar):
- Bucket: `metrics-thanos-kup6s`
- Upload frequency: Every 2 hours (2-hour blocks)
- Retention: 730 days (2 years, see S3 section below)
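The sidecar's connection to the bucket is defined by a Thanos objstore configuration file. A minimal sketch, assuming a Hetzner-style S3 endpoint (the endpoint and credential placeholders are not taken from the cluster):

```yaml
# objstore.yml mounted into the Thanos sidecar (sketch; only the bucket
# name and region come from this document, the rest are placeholders)
type: S3
config:
  bucket: metrics-thanos-kup6s
  region: fsn1
  endpoint: <fsn1-object-storage-endpoint>
  access_key: <access-key>
  secret_key: <secret-key>
```

The same file is typically shared by the sidecar, Thanos Store, and Thanos Compactor, since all three talk to the same bucket.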
Thanos Query (Deployment, 2 replicas)¶
- Storage: None (stateless)
- No PVCs required
- Queries data from Prometheus sidecars and Thanos Store
- Temporary cache in memory only
Thanos Store (StatefulSet, 2 replicas)¶
Local Storage (Longhorn):
- Size: 10Gi per replica
- Total: 20Gi Longhorn (2 replicas)
- Purpose: Index and chunk caching for S3 queries
- Storage class: `longhorn` (2 Longhorn replicas for HA)
Why 10Gi?
- Index cache: 500MB (configured max)
- Chunk cache: 500MB (configured max)
- Metadata: ~100MB (block metadata, labels)
- Headroom: ~9× on the ~1.1GB above, rounded to 10GB
S3 Access: Reads blocks from metrics-thanos-kup6s bucket (read-only)
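The cache limits above map directly to Thanos Store flags. A sketch of the container args (the cache sizes come from this document; the paths are assumptions):

```yaml
# Thanos Store container args (sketch; data-dir and objstore paths assumed)
args:
  - store
  - --data-dir=/var/thanos/store            # on the 10Gi Longhorn PVC
  - --index-cache-size=500MB                # configured index cache max
  - --chunk-pool-size=500MB                 # configured chunk cache max
  - --objstore.config-file=/etc/thanos/objstore.yml
```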
Thanos Compactor (StatefulSet, 1 replica)¶
Local Storage (Longhorn):
- Size: 20Gi
- Total: 20Gi Longhorn (1 replica, single instance)
- Purpose: Compaction workspace for merging and downsampling blocks
- Storage class: `longhorn` (2 Longhorn replicas)
Why 20Gi?
- Downloads multiple 2-hour blocks from S3
- Merges, downsamples, and uploads the results back to S3
- Peak usage during compaction: ~10GB
- Headroom: 2x = 20GB
Downsampling Strategy:
- Raw data: 30 days retention
- 5-minute resolution: 180 days (6 months)
- 1-hour resolution: 730 days (2 years)
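These retention windows correspond one-to-one to Thanos Compactor retention flags. A minimal sketch of the container args (retention values from this document; the paths are assumptions):

```yaml
# Thanos Compactor args implementing the downsampling/retention policy above
args:
  - compact
  - --wait                                   # run continuously
  - --data-dir=/var/thanos/compact           # workspace on the 20Gi PVC (path assumed)
  - --retention.resolution-raw=30d
  - --retention.resolution-5m=180d
  - --retention.resolution-1h=730d
  - --objstore.config-file=/etc/thanos/objstore.yml
```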
Loki Components¶
All Loki components use Longhorn + S3 architecture:
| Component | Replicas | Size per Replica | Total Longhorn | Purpose |
|---|---|---|---|---|
| Loki Write | 2 | 5Gi | 10Gi | WAL (Write-Ahead Log) |
| Loki Read | 2 | 5Gi | 10Gi | Query result cache |
| Loki Backend | 2 | 5Gi | 10Gi | Index and metadata |
Why 5Gi for each?
- Write: WAL holds unflushed chunks (buffered before S3)
- Read: Cache for frequently queried log chunks
- Backend: Index metadata and chunk references
S3 Storage:
- Bucket: `logs-loki-kup6s`
- Chunk flush: Every 15 minutes or at 1.5MB, whichever comes first
- Retention: 744h (31 days)
Storage class: longhorn (2 Longhorn replicas for all components)
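The flush and retention behaviour above maps to a handful of standard Loki configuration keys. A sketch (values from this document; the endpoint is a placeholder):

```yaml
# Loki config excerpt (sketch)
ingester:
  chunk_idle_period: 15m        # flush chunks after 15 minutes of inactivity
  chunk_target_size: 1572864    # ~1.5MB target chunk size, in bytes
limits_config:
  retention_period: 744h        # 31 days
storage_config:
  aws:
    bucketnames: logs-loki-kup6s
    region: fsn1
    endpoint: <s3-endpoint>     # placeholder
```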
Grafana (Deployment, 1 replica)¶
Local Storage (Longhorn):
- Size: 5Gi
- Total: 5Gi Longhorn (1 replica, UI component)
- Purpose: Dashboards, datasources, settings, plugins
- Storage class: `longhorn` (2 Longhorn replicas)
Why 5Gi?
- Dashboards JSON: ~10MB
- Plugins: ~100MB
- SQLite database: ~50MB (users, settings)
- Headroom: ~30× on the ~160MB above, rounded to 5GB
Note: Not critical data (can be recreated from code/backups)
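With the upstream Grafana Helm chart, this allocation is a few persistence values. A sketch (key names follow the chart's standard values schema):

```yaml
# Grafana Helm values excerpt (sketch)
persistence:
  enabled: true
  storageClassName: longhorn
  size: 5Gi
```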
S3 Object Storage¶
Two S3 Buckets¶
Both buckets in fsn1 region (Falkenstein - same as cluster):
1. metrics-thanos-kup6s¶
Purpose: Long-term Prometheus metrics storage
Content:
- 2-hour TSDB blocks uploaded by the Thanos sidecar
- Downsampled 5-minute resolution blocks (180 days)
- Downsampled 1-hour resolution blocks (730 days)

Retention Strategy (managed by Thanos Compactor):
- Raw data (no downsampling): 30 days
- 5-minute resolution: 180 days (6 months)
- 1-hour resolution: 730 days (2 years)
Lifecycle Configuration:
```yaml
apiVersion: s3.aws.upbound.io/v1beta1
kind: BucketLifecycleConfiguration
spec:
  forProvider:
    rule:
      - id: delete-raw-data-after-30d
        status: Enabled
        expiration:
          days: 30
        filter:
          prefix: "01" # Raw data prefix
      - id: delete-5m-data-after-180d
        status: Enabled
        expiration:
          days: 180
        filter:
          prefix: "02" # 5m resolution prefix
      - id: delete-1h-data-after-730d
        status: Enabled
        expiration:
          days: 730
        filter:
          prefix: "03" # 1h resolution prefix
```
Estimated size: 50-200 GB (grows with time)
2. logs-loki-kup6s¶
Purpose: Long-term log storage
Content:
- Compressed log chunks (~1.5MB each)
- Index files (TSDB format)
Retention: 744h (31 days)
Lifecycle Configuration:
```yaml
apiVersion: s3.aws.upbound.io/v1beta1
kind: BucketLifecycleConfiguration
spec:
  forProvider:
    rule:
      - id: expire-logs-after-31d
        status: Enabled
        expiration:
          days: 31
```
Estimated size: 10-50 GB (varies by log volume)
Why fsn1 Region for Both Buckets?¶
Decision: Both buckets in production region (fsn1) for performance.
Rationale:
- Latency: Same datacenter as the cluster means the lowest latency for uploads and queries
- Bandwidth: No cross-region egress fees
- Simplicity: A single Crossplane ProviderConfig endpoint
- Trade-off: No geographic redundancy (acceptable at current scale)
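The single-endpoint setup can be expressed as one Crossplane ProviderConfig with a static S3 endpoint. A sketch, assuming the Upbound AWS provider; the endpoint URL and secret names are placeholders, not taken from the cluster:

```yaml
# Crossplane ProviderConfig pointing both buckets at the fsn1 endpoint (sketch)
apiVersion: aws.upbound.io/v1beta1
kind: ProviderConfig
metadata:
  name: s3-fsn1
spec:
  credentials:
    source: Secret
    secretRef:
      namespace: crossplane-system   # assumed namespace
      name: s3-credentials           # assumed secret name
      key: creds
  endpoint:
    hostnameImmutable: true
    url:
      type: Static
      static: https://<fsn1-object-storage-endpoint>   # placeholder
```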
See Storage Tiers for general S3 strategy.
Storage Allocation Summary¶
Longhorn PVC Breakdown¶
| Component | Replicas | Size per Replica | Longhorn Storage (logical) | Actual Storage (with 2× Longhorn replication) |
|---|---|---|---|---|
| Prometheus | 2 | 3Gi | 6Gi | 12Gi |
| Thanos Store | 2 | 10Gi | 20Gi | 40Gi |
| Thanos Compactor | 1 | 20Gi | 20Gi | 40Gi |
| Loki Write | 2 | 5Gi | 10Gi | 20Gi |
| Loki Read | 2 | 5Gi | 10Gi | 20Gi |
| Loki Backend | 2 | 5Gi | 10Gi | 20Gi |
| Grafana | 1 | 5Gi | 5Gi | 10Gi |
| **Total** | | | **81Gi** | **162Gi** |
Note: All components use longhorn storage class (2 Longhorn replicas), so actual cluster storage is ~2x logical PVC size.
S3 Storage¶
| Bucket | Retention | Estimated Size |
|---|---|---|
| metrics-thanos-kup6s | 30d raw + 180d 5m + 730d 1h | 50-200 GB |
| logs-loki-kup6s | 31d | 10-50 GB |
| **Total** | | **60-250 GB** |
Cost: ~€0.50/month (€0.005/GB × ~100GB average)
Hybrid Storage Strategy¶
The monitoring stack demonstrates tiered storage optimization:
Short-Term (Longhorn PVCs)¶
Use case: Active queries, recent data
- Prometheus: 3-day local retention (fast queries)
- Loki: WAL + cache (writes buffered before S3)
- Thanos Store: Index cache (accelerates S3 queries)
Characteristics:
- Low latency: <5ms (local disk)
- High cost: ~€0.10/GB/month (cluster overhead)
- Limited capacity: 81Gi total
Long-Term (S3)¶
Use case: Historical data, infrequent queries
- Prometheus: 2-year metrics (via Thanos sidecar)
- Loki: 31-day logs (chunked and compressed)
Characteristics:
- Higher latency: 50-200ms (network + S3 API)
- Low cost: ~€0.005/GB/month (20x cheaper)
- Unlimited capacity: Scales automatically
Result: Best of both worlds - fast recent queries + cheap long-term storage.
Retention Policy Rationale¶
Prometheus/Thanos Metrics¶
Why 3 days local?
- Most queries target the last 24-48 hours (recent alerts, dashboards)
- Longer-range queries automatically go through Thanos Query to S3

Why 2 years in S3?
- Capacity planning: Year-over-year comparisons
- Long-term trends: Resource usage over time
- Compliance: Some metrics are retained for audit

Why downsample?
- Raw data (30d): Full resolution for detailed analysis
- 5-minute (180d): Sufficient for most dashboards
- 1-hour (730d): Adequate for long-term trends
- Storage savings: 1h resolution is ~10x smaller than raw
Loki Logs¶
Why 31 days?
- Debugging: Most issues are discovered within a week
- Compliance: Short-term audit trail
- Cost: Logs are high volume; longer retention is expensive
Future: Could extend to 90 days with lifecycle policies (older logs = cheaper storage tier)