Explanation

Storage Architecture for Monitoring Stack

This document explains the monitoring stack’s specific storage allocations and S3 bucket strategy. For general storage architecture, tier selection criteria, and Longhorn storage classes, see Storage Architecture and Tiers.

Overview

The monitoring stack uses Longhorn PVCs for short-term data and S3 for long-term storage, optimizing cost and performance:

| Component | Storage Tier | Storage Class | Size | Purpose |
|---|---|---|---|---|
| Prometheus | Longhorn + S3 | longhorn | 3Gi × 2 | 3-day local, 2-year S3 (via Thanos) |
| Thanos Store | Longhorn + S3 | longhorn | 10Gi × 2 | Index cache for S3 queries |
| Thanos Compactor | Longhorn + S3 | longhorn | 20Gi × 1 | Compaction workspace |
| Loki (Write/Read/Backend) | Longhorn + S3 | longhorn | 5Gi each × 6 | WAL + cache, chunks in S3 |
| Grafana | Longhorn | longhorn | 5Gi × 1 | Dashboards, settings |

Total Longhorn storage: 81Gi logical (~162Gi actual with 2× Longhorn volume replication)

S3 buckets: 2 (metrics-thanos-kup6s, logs-loki-kup6s)

Storage Allocation by Component

Prometheus (StatefulSet, 2 replicas)

Local Storage (Longhorn):

  • Size: 3Gi per replica

  • Total: 6Gi Longhorn (2 replicas)

  • Retention: 3 days local

  • Storage class: longhorn (2 Longhorn replicas for HA)

Why 3Gi?

  • Metrics ingestion: ~100k samples/sec

  • Storage rate: ~500MB/day compressed (TSDB 3:1 compression)

  • 3 days × 500MB = 1.5GB + 100% headroom = 3GB

Optimization: Originally 6Gi with 7-day retention. Reduced after Thanos integration offloads historical data to S3.

S3 Offload (via Thanos Sidecar):

  • Bucket: metrics-thanos-kup6s

  • Upload frequency: Every 2 hours (2-hour blocks)

  • Retention: 730 days (2 years, see S3 section below)
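
Configuration sketch (assuming the Prometheus Operator manages the StatefulSet; the namespace and secret names are illustrative):

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: kube-prometheus
  namespace: monitoring          # assumed namespace
spec:
  replicas: 2
  retention: 3d                  # short local retention; history is served from S3
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: longhorn
        resources:
          requests:
            storage: 3Gi
  thanos:
    objectStorageConfig:         # Thanos sidecar uploads 2-hour TSDB blocks
      name: thanos-objstore      # hypothetical Secret containing objstore.yml
      key: objstore.yml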

Thanos Query (Deployment, 2 replicas)

Storage: None (stateless)

  • No PVCs required

  • Queries data from Prometheus sidecars and Thanos Store

  • Temporary cache in memory only
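
Because Query is stateless, its configuration is only a fan-out list of gRPC endpoints. Container args sketch (service names are illustrative, assuming in-cluster DNS service discovery):

args:
  - query
  - --query.replica-label=prometheus_replica                               # de-duplicate the HA Prometheus pair
  - --endpoint=dnssrv+_grpc._tcp.prometheus-thanos-sidecar.monitoring.svc  # recent data from sidecars
  - --endpoint=dnssrv+_grpc._tcp.thanos-store.monitoring.svc               # historical data from Thanos Store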

Thanos Store (StatefulSet, 2 replicas)

Local Storage (Longhorn):

  • Size: 10Gi per replica

  • Total: 20Gi Longhorn (2 replicas)

  • Purpose: Index and chunk caching for S3 queries

  • Storage class: longhorn (2 Longhorn replicas for HA)

Why 10Gi?

  • Index cache: 500MB (configured max)

  • Chunk cache: 500MB (configured max)

  • Metadata: ~100MB (block metadata, labels)

  • Headroom: ~9x for growth = 10GB

S3 Access: Reads blocks from metrics-thanos-kup6s bucket (read-only)
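
Container args sketch matching the cache sizes above (paths are illustrative):

args:
  - store
  - --data-dir=/var/thanos/store                      # cache directory on the 10Gi PVC
  - --index-cache-size=500MB
  - --chunk-pool-size=500MB
  - --objstore.config-file=/etc/thanos/objstore.yml   # read-only access to metrics-thanos-kup6s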

Thanos Compactor (StatefulSet, 1 replica)

Local Storage (Longhorn):

  • Size: 20Gi

  • Total: 20Gi Longhorn (1 replica, single instance)

  • Purpose: Compaction workspace for merging and downsampling blocks

  • Storage class: longhorn (2 Longhorn replicas)

Why 20Gi?

  • Downloads multiple 2-hour blocks from S3

  • Merges, downsamples, and uploads back to S3

  • Peak usage during compaction: ~10GB

  • Headroom: 2x = 20GB

Downsampling Strategy:

  • Raw data: 30 days retention

  • 5-minute resolution: 180 days (6 months)

  • 1-hour resolution: 730 days (2 years)
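
The downsampling strategy maps directly onto the Compactor's retention flags. Container args sketch (paths are illustrative):

args:
  - compact
  - --wait                                            # run continuously instead of one-shot
  - --data-dir=/var/thanos/compact                    # workspace on the 20Gi PVC
  - --retention.resolution-raw=30d
  - --retention.resolution-5m=180d
  - --retention.resolution-1h=730d
  - --objstore.config-file=/etc/thanos/objstore.yml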

Loki Components

All Loki components use Longhorn + S3 architecture:

| Component | Replicas | Size per Replica | Total Longhorn | Purpose |
|---|---|---|---|---|
| Loki Write | 2 | 5Gi | 10Gi | WAL (Write-Ahead Log) |
| Loki Read | 2 | 5Gi | 10Gi | Query result cache |
| Loki Backend | 2 | 5Gi | 10Gi | Index and metadata |

Why 5Gi for each?

  • Write: WAL holds unflushed chunks (buffered before S3)

  • Read: Cache for frequently queried log chunks

  • Backend: Index metadata and chunk references

S3 Storage:

  • Bucket: logs-loki-kup6s

  • Chunk flush: Every 15 minutes or 1.5MB (whichever comes first)

  • Retention: 744h (31 days)

Storage class: longhorn (2 Longhorn replicas for all components)
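
The flush and retention behavior above corresponds to a handful of Loki settings. Minimal loki.yaml excerpt (assuming the 15-minute/1.5MB flush maps to chunk_idle_period and chunk_target_size; the S3 endpoint is a placeholder):

ingester:
  chunk_idle_period: 15m          # flush a chunk after 15 minutes of inactivity...
  chunk_target_size: 1572864      # ...or once it reaches ~1.5MB, whichever comes first
limits_config:
  retention_period: 744h          # 31 days
common:
  storage:
    s3:
      bucketnames: logs-loki-kup6s
      endpoint: <fsn1-s3-endpoint>   # placeholder for the object storage endpoint
      s3forcepathstyle: true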

Grafana (Deployment, 1 replica)

Local Storage (Longhorn):

  • Size: 5Gi

  • Total: 5Gi Longhorn (1 replica, UI component)

  • Purpose: Dashboards, datasources, settings, plugins

  • Storage class: longhorn (2 Longhorn replicas)

Why 5Gi?

  • Dashboards JSON: ~10MB

  • Plugins: ~100MB

  • SQLite database: ~50MB (users, settings)

  • Headroom: ~30x = 5GB

Note: Not critical data (can be recreated from code/backups)
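
Persistence values sketch (assuming the upstream Grafana Helm chart):

persistence:
  enabled: true
  storageClassName: longhorn
  size: 5Gi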

S3 Object Storage

Two S3 Buckets

Both buckets in fsn1 region (Falkenstein - same as cluster):

1. metrics-thanos-kup6s

Purpose: Long-term Prometheus metrics storage

Content:

  • 2-hour TSDB blocks uploaded by Thanos sidecar

  • Downsampled 5-minute resolution blocks (180 days)

  • Downsampled 1-hour resolution blocks (730 days)

Retention Strategy (managed by Thanos Compactor):

Raw data (no downsampling):   30 days
5-minute resolution:          180 days (6 months)
1-hour resolution:            730 days (2 years)

Lifecycle Configuration:

apiVersion: s3.aws.upbound.io/v1beta1
kind: BucketLifecycleConfiguration
spec:
  forProvider:
    rule:
      - id: delete-raw-data-after-30d
        status: Enabled
        expiration:
          days: 30
        filter:
          prefix: "01"  # Raw data prefix
      - id: delete-5m-data-after-180d
        status: Enabled
        expiration:
          days: 180
        filter:
          prefix: "02"  # 5m resolution prefix
      - id: delete-1h-data-after-730d
        status: Enabled
        expiration:
          days: 730
        filter:
          prefix: "03"  # 1h resolution prefix

Estimated size: 50-200 GB (grows with time)
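
For reference, a sketch of the objstore.yml consumed by the Thanos sidecar, Store, and Compactor to reach this bucket (endpoint and credentials are placeholders):

type: S3
config:
  bucket: metrics-thanos-kup6s
  endpoint: <fsn1-s3-endpoint>
  region: fsn1
  access_key: <from-secret>
  secret_key: <from-secret>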

2. logs-loki-kup6s

Purpose: Long-term log storage

Content:

  • Compressed log chunks (1.5MB each)

  • Index files (TSDB format)

Retention: 744h (31 days)

Lifecycle Configuration:

apiVersion: s3.aws.upbound.io/v1beta1
kind: BucketLifecycleConfiguration
spec:
  forProvider:
    rule:
      - id: expire-logs-after-31d
        status: Enabled
        expiration:
          days: 31

Estimated size: 10-50 GB (varies by log volume)

Why fsn1 Region for Both Buckets?

Decision: Both buckets in production region (fsn1) for performance.

Rationale:

  • Latency: Same datacenter as cluster = lowest latency for uploads/queries

  • Bandwidth: No cross-region egress fees

  • Simplicity: Single Crossplane ProviderConfig endpoint

Trade-off: No geographic redundancy (acceptable for current scale)

See Storage Tiers for general S3 strategy.

Storage Allocation Summary

Longhorn PVC Breakdown

| Component | Replicas | Size per Replica | Longhorn Storage (logical) | Actual Storage (with 2x Longhorn replication) |
|---|---|---|---|---|
| Prometheus | 2 | 3Gi | 6Gi | 12Gi |
| Thanos Store | 2 | 10Gi | 20Gi | 40Gi |
| Thanos Compactor | 1 | 20Gi | 20Gi | 40Gi |
| Loki Write | 2 | 5Gi | 10Gi | 20Gi |
| Loki Read | 2 | 5Gi | 10Gi | 20Gi |
| Loki Backend | 2 | 5Gi | 10Gi | 20Gi |
| Grafana | 1 | 5Gi | 5Gi | 10Gi |
| Total | | | 81Gi | 162Gi |

Note: All components use longhorn storage class (2 Longhorn replicas), so actual cluster storage is ~2x logical PVC size.

S3 Storage

| Bucket | Retention | Estimated Size |
|---|---|---|
| metrics-thanos-kup6s | 30d raw + 180d 5m + 730d 1h | 50-200 GB |
| logs-loki-kup6s | 31d | 10-50 GB |
| Total | | 60-250 GB |

Cost: ~€0.50/month (€0.005/GB × ~100GB average)

Hybrid Storage Strategy

The monitoring stack demonstrates tiered storage optimization:

Short-Term (Longhorn PVCs)

Use case: Active queries, recent data

  • Prometheus: 3-day local retention (fast queries)

  • Loki: WAL + cache (writes buffer before S3)

  • Thanos Store: Index cache (accelerate S3 queries)

Characteristics:

  • Low latency: <5ms (local disk)

  • High cost: ~€0.10/GB/month (cluster overhead)

  • Limited capacity: 81Gi total

Long-Term (S3)

Use case: Historical data, infrequent queries

  • Prometheus: 2-year metrics (via Thanos sidecar)

  • Loki: 31-day logs (chunked and compressed)

Characteristics:

  • Higher latency: 50-200ms (network + S3 API)

  • Low cost: ~€0.005/GB/month (20x cheaper)

  • Unlimited capacity: Scales automatically

Result: Best of both worlds - fast recent queries + cheap long-term storage.

Retention Policy Rationale

Prometheus/Thanos Metrics

Why 3 days local?

  • Most queries are for last 24-48 hours (recent alerts, dashboards)

  • Longer queries automatically use Thanos Query → S3

Why 2 years in S3?

  • Capacity planning: Year-over-year comparisons

  • Long-term trends: Resource usage over time

  • Compliance: Some metrics retained for audit

Why downsample?

  • Raw data (30d): Full resolution for detailed analysis

  • 5-minute (180d): Sufficient for most dashboards

  • 1-hour (730d): Adequate for long-term trends

  • Storage savings: 1h resolution is ~10x smaller than raw

Loki Logs

Why 31 days?

  • Debugging: Most issues discovered within 1 week

  • Compliance: Short-term audit trail

  • Cost: Logs are high volume; longer retention is expensive

Future: Could extend to 90 days with lifecycle policies (older logs = cheaper storage tier)