Storage Strategy

This document explains the storage architecture decisions for Nextcloud on kup6s.

Storage Layers

Nextcloud uses three distinct storage layers:

Layer          Purpose             Backend        Access Mode           Size
-------------  ------------------  -------------  --------------------  ----------------
User Files     File content        Hetzner S3     Object storage        Unlimited
Database       File metadata       CloudNativePG  RWO block (Longhorn)  10Gi per replica
Local Storage  Config, apps, temp  PVC            RWO block (Longhorn)  5Gi

S3 Primary Storage

Why S3 for User Files?

Scalability

  • No need to provision or resize volumes

  • Automatic capacity scaling

  • No manual intervention for storage growth

Cost Efficiency

Hetzner S3: €0.005/GB/month
Longhorn (Hetzner volumes): ~€0.12/GB/month
Savings: 96% cheaper for file storage
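
The savings figure follows directly from the two per-GB prices. A minimal sketch of the arithmetic, using only the prices quoted above (the function name is illustrative):

```python
# Prices per GB per month, taken from the comparison above.
S3_EUR_PER_GB_MONTH = 0.005
LONGHORN_EUR_PER_GB_MONTH = 0.12


def savings_percent(cheap: float, expensive: float) -> float:
    """Return how much cheaper `cheap` is than `expensive`, in percent."""
    return (1 - cheap / expensive) * 100


if __name__ == "__main__":
    pct = savings_percent(S3_EUR_PER_GB_MONTH, LONGHORN_EUR_PER_GB_MONTH)
    print(f"S3 is about {pct:.0f}% cheaper per GB")
```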

Durability

  • Hetzner provides 11 nines (99.999999999%) durability

  • Automatic replication across multiple availability zones

  • No manual backup management for files

Performance

  • Direct S3 access from PHP (no FUSE overhead)

  • Parallel chunk uploads for large files

  • CDN integration possible for static assets

Portability

  • Easy cluster migration (just update S3 credentials)

  • No data in cluster to migrate

  • Restore to new cluster in minutes

S3 Configuration

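Bucket Structure (logical view as presented by Nextcloud; the bucket itself holds flat objects, named as described under Object Naming):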

data-nextcloudkup-kup6s/
├── admin/               # User admin's files
├── jensens/             # User jensens's files
├── appdata_oc{instanceid}/  # App data (thumbnails, previews)
└── files_external/      # External storage cache

Object Naming:

urn:oid:{fileid}

Nextcloud maps each file path to a numeric fileid in the database, then stores the object under that ID in S3.
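
When debugging, it can help to translate between a fileid and its object key by hand. The following is a small sketch of the naming scheme only; the helper names are hypothetical and are not part of Nextcloud. Finding the fileid for a given path still requires a lookup in the Nextcloud database.

```python
OBJECT_PREFIX = "urn:oid:"


def object_key(fileid: int) -> str:
    """Build the S3 object key for a Nextcloud file ID (urn:oid:{fileid})."""
    if fileid < 0:
        raise ValueError("fileid must be non-negative")
    return f"{OBJECT_PREFIX}{fileid}"


def fileid_from_key(key: str) -> int:
    """Recover the numeric file ID from a key such as 'urn:oid:1234'."""
    if not key.startswith(OBJECT_PREFIX):
        raise ValueError(f"not a Nextcloud object key: {key!r}")
    return int(key[len(OBJECT_PREFIX):])
```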

Credentials Injection:

Environment variables are injected from a Kubernetes secret:

- name: OBJECTSTORE_S3_KEY
  valueFrom:
    secretKeyRef:
      name: nextcloud-s3-credentials
      key: AWS_ACCESS_KEY_ID
- name: OBJECTSTORE_S3_SECRET
  valueFrom:
    secretKeyRef:
      name: nextcloud-s3-credentials
      key: AWS_SECRET_ACCESS_KEY

Nextcloud auto-configures config.php on startup with these credentials.
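
If a pod starts without these variables, the object store configuration will be incomplete. A hypothetical preflight check (not part of Nextcloud or the deployment described here) could verify that both variables named above are present and non-empty:

```python
import os
from typing import Mapping

# Variable names as used in the pod spec above.
REQUIRED_VARS = ("OBJECTSTORE_S3_KEY", "OBJECTSTORE_S3_SECRET")


def missing_s3_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_s3_env()
    if missing:
        raise SystemExit(f"Missing S3 credentials: {', '.join(missing)}")
    print("S3 credentials present")
```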

S3 Trade-offs

Advantages:

  • ✅ Unlimited, auto-scaling storage

  • ✅ Cost-efficient for large files

  • ✅ High durability (very low data loss risk)

  • ✅ Easy backup (S3 versioning)

  • ✅ Cluster migration simplified

Disadvantages:

  • ❌ Network latency for file access (~10-50ms vs <1ms local)

  • ❌ Egress costs for downloads (€0.01/GB from Hetzner S3)

  • ❌ Dependency on external service

  • ❌ Debugging complexity (can’t ls files)

S3 Performance Optimization

Chunked Uploads:

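'objectstore' => [
  'arguments' => [
    // Nextcloud's S3 backend reads the multipart part size from
    // 'uploadPartSize'; verify the key against your Nextcloud version.
    'uploadPartSize' => 104857600, // 100MB chunks
  ],
],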

Large files are split into 100MB chunks for parallel upload.
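
To get a feel for what this part size means in practice, the sketch below counts the parts for a given file size. It only models the splitting arithmetic; how Nextcloud schedules the uploads is not covered, and the function names are illustrative.

```python
PART_SIZE = 104857600  # bytes, the part size configured above


def part_count(file_size: int, part_size: int = PART_SIZE) -> int:
    """Number of parts a file of `file_size` bytes is split into."""
    if file_size <= 0:
        return 0
    # Integer ceiling division avoids float rounding on very large files.
    return -(-file_size // part_size)


def last_part_size(file_size: int, part_size: int = PART_SIZE) -> int:
    """Size in bytes of the final (possibly shorter) part."""
    if file_size <= 0:
        return 0
    return file_size % part_size or part_size
```

For example, a 1 GiB file (1073741824 bytes) yields 11 parts: ten full parts and a final part of 25165824 bytes.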

Local Caching:

# config/config.php
'cache_path' => '/var/www/tmp/',  # Local SSD for temp files

Frequently accessed files are cached in the pod’s local storage.

Redis Caching:

  • File locks cached in Redis

  • Thumbnails metadata cached

  • Reduces S3 API calls

PostgreSQL Storage

Why Longhorn for PostgreSQL?

Performance Requirements:

  • Database needs low-latency, high-IOPS storage

  • CloudNativePG requires block storage (no object storage support)

  • Longhorn provides SSD-backed storage on cluster nodes

Configuration:

storage:
  storageClass: longhorn
  postgresSize: 10Gi
replicas:
  postgres: 2

Data Protection:

  • CNPG continuous archiving of WAL segments to S3

  • Automated full backups every 6 hours to S3

  • 30-day backup retention

  • Point-in-time recovery available

Database Growth

Typical Growth:

  • 17,596 files = ~500MB database

  • ~30KB metadata per file (shares, versions, comments)

  • Grows linearly with file count, not file size
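
The per-file figure can be derived from the two observed numbers, and the same ratio gives a rough projection for larger file counts. This sketch assumes growth stays linear, as stated above, and treats 500MB as 500 × 10^6 bytes:

```python
FILES_OBSERVED = 17_596
DB_BYTES_OBSERVED = 500 * 1000**2  # ~500MB, as measured above


def bytes_per_file() -> float:
    """Average database bytes per file, from the observed data point."""
    return DB_BYTES_OBSERVED / FILES_OBSERVED


def projected_db_bytes(file_count: int) -> float:
    """Linear projection of database size for `file_count` files."""
    return file_count * bytes_per_file()


if __name__ == "__main__":
    print(f"{bytes_per_file() / 1000:.1f} KB per file")
    print(f"{projected_db_bytes(100_000) / 1000**3:.2f} GB for 100,000 files")
```

The derived ratio is a little over 28KB per file, consistent with the ~30KB estimate.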

Monitoring:

# Check database size
kubectl exec -n nextcloudkup nextcloud-postgres-1 -- \
  psql -U postgres -d nextcloud -c \
  "SELECT pg_size_pretty(pg_database_size('nextcloud'));"

Resize if Needed:

# In config.yaml
storage:
  postgresSize: 20Gi  # Increase from 10Gi

Longhorn supports online volume expansion.
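
A simple rule of thumb for when to resize is to compare the reported database size against the PVC size. The 80% threshold below is an assumption for illustration, not a value from this deployment, and the check ignores WAL files and other data that share the volume.

```python
GIB = 1024**3


def should_resize(db_bytes: int, pvc_gib: int, threshold: float = 0.8) -> bool:
    """True when the database fills at least `threshold` of the PVC."""
    if pvc_gib <= 0:
        raise ValueError("pvc_gib must be positive")
    return db_bytes >= threshold * pvc_gib * GIB
```

With the current ~500MB database on a 10Gi volume, `should_resize(500 * 1000**2, 10)` is `False`.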

Local Storage (Config/Apps)

Why RWO Block Storage?

Nextcloud’s config and app directories need filesystem semantics:

  • File locking for config.php updates

  • Executable permissions for apps

  • Symlink support

  • Random access for app updates

Not Suitable for S3:

  • S3 is object storage (no POSIX filesystem)

  • S3FS FUSE too slow and unreliable

  • Config updates need immediate consistency

Content in Local Storage

/var/www/html/
├── config/            # config.php, theme configs
├── custom_apps/       # Installed apps
├── themes/            # Custom themes
└── tmp/               # Temporary files, local cache

Not Stored Here:

  • User data (in S3)

  • Database (separate PostgreSQL PVC)

RWO vs RWX Challenge

Problem:

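  • Hetzner Cloud Volumes only support RWO, and the Longhorn volumes here are provisioned as RWO block volumes (Longhorn's RWX mode, which exports a volume over NFS, is not used in this setup)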

  • Multiple Nextcloud replicas need shared access to config/apps

Current Solution (nextcloudaffenstall):

  • Run 1 replica only

  • Accept brief downtime during pod restart

Alternative Solutions:

1. SMB CSI (Hetzner Storage Box)

storage:
  storageClass: smb-csi  # RWX support
replicas:
  nextcloud: 3  # Multiple pods possible

Concerns:

  • Network latency to storage box (~10-30ms)

  • SMB protocol overhead

  • Performance testing required

2. Separate StatefulSet per Pod

# Each pod gets own RWO volume
StatefulSet: nextcloud-0 → PVC-0 (RWO)
StatefulSet: nextcloud-1 → PVC-1 (RWO)
StatefulSet: nextcloud-2 → PVC-2 (RWO)

Concerns:

  • Config sync complexity

  • Manual load balancing

  • Upgrade coordination

3. Config in S3 + Local Cache

  • Store config.php in S3

  • Cache locally in pod

  • Reload on change notification

Concerns:

  • Complex implementation

  • Race conditions on updates

  • Not officially supported by Nextcloud

Storage Evolution Path

Current (January 2026)

nextcloudkup: 3 replicas (RWX not needed - read-only apps)
nextcloudaffenstall: 1 replica (RWO limitation)

Phase 2: SMB CSI Testing

  1. Deploy test Nextcloud with SMB CSI storage

  2. Benchmark config/app access performance

  3. Test multi-replica behavior

  4. Validate app installations/updates

Phase 3: Migration to SMB (If Viable)

If SMB performance is acceptable:

  1. Migrate config/apps to SMB-backed PVC

  2. Scale to 3 replicas

  3. Improved availability (no downtime on restarts)

Phase 4: Optimization

  • Tune SMB mount options

  • Implement local caching layer

  • Profile and optimize hot paths

Backup Strategy

User Files (S3)

Built-in Versioning:

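# Hetzner S3 bucket versioning
# S3_ENDPOINT is a placeholder for the Hetzner endpoint URL. Unless an
# endpoint is set in the CLI profile, the AWS CLI targets AWS by default.
aws s3api put-bucket-versioning \
  --endpoint-url "$S3_ENDPOINT" \
  --bucket data-nextcloudkup-kup6s \
  --versioning-configuration Status=Enabled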

Lifecycle Policies:

  • Keep all versions for 30 days

  • Transition old versions to a colder storage class after 90 days (Glacier is an AWS storage class; check what the provider offers)

  • Delete after 1 year
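
Read together, these rules assign each non-current version one of three states by age. The sketch below is one possible interpretation, not a lifecycle configuration: it assumes the 30-day guarantee is already covered because nothing is moved or deleted before day 90, and the state names are illustrative.

```python
TRANSITION_AFTER_DAYS = 90
DELETE_AFTER_DAYS = 365


def version_action(age_days: int) -> str:
    """Classify a non-current object version by its age in days."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days > DELETE_AFTER_DAYS:
        return "delete"
    if age_days > TRANSITION_AFTER_DAYS:
        return "transition"
    return "keep"
```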

Database (PostgreSQL)

CNPG Automated Backups:

  • Continuous WAL archiving to S3

  • Full backup every 6 hours

  • 30-day retention

  • Point-in-time recovery

Manual Backup:

# Create on-demand backup
kubectl cnpg backup nextcloud-postgres -n nextcloudkup

Config/Apps (Local Storage)

Included in S3 Data Bucket:

  • Nextcloud auto-backs up config to appdata_oc{instanceid}/config/

  • Apps can be reinstalled from app store

  • Custom themes should be in Git

Manual Backup:

# Backup config.php
kubectl exec -n nextcloudkup deploy/nextcloud -- \
  cat /var/www/html/config/config.php > config-backup.php

Disaster Recovery

Scenario 1: Cluster Total Loss

Recovery Steps:

  1. Deploy new cluster

  2. Restore CNPG backup to new PostgreSQL cluster

  3. Point Nextcloud at existing S3 buckets (same credentials)

  4. Deploy Nextcloud with same config.yaml

  5. Verify file access

RTO (Recovery Time Objective): ~30 minutes
RPO (Recovery Point Objective): ~6 hours (last backup)

Scenario 2: S3 Data Loss

Unlikely (11 nines durability), but if it happens:

  1. User data unrecoverable (S3 is source of truth)

  2. Restore database from CNPG backup

  3. Database shows files, but content missing

  4. Inform users of data loss

Prevention:

  • Enable S3 bucket versioning

  • Cross-region replication (if critical)

  • Regular backup validation

Scenario 3: Database Corruption

Recovery Steps:

  1. CNPG detects corruption during replication

  2. Auto-promote standby replica

  3. Re-sync corrupted primary from backup

  4. No user impact (automatic)

RTO: ~1 minute (failover time)
RPO: 0 (synchronous replication)

Capacity Planning

Storage Growth Estimation

nextcloudkup (internal team):

  • Current: ~200GB in S3

  • Growth: ~50GB/year (documents, images)

  • 5-year projection: ~450GB

nextcloudaffenstall (production):

  • Current: ~13GB in S3

  • Growth: ~30GB/year (7 users)

  • 5-year projection: ~163GB
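
Both projections use the same linear model: current usage plus yearly growth times the number of years. A minimal sketch, with an illustrative function name:

```python
def projected_gb(current_gb: float, growth_gb_per_year: float, years: int) -> float:
    """Linear storage projection: current usage plus constant yearly growth."""
    return current_gb + growth_gb_per_year * years


if __name__ == "__main__":
    print("nextcloudkup:", projected_gb(200, 50, 5), "GB")
    print("nextcloudaffenstall:", projected_gb(13, 30, 5), "GB")
```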

Cost Projection (5 years)

S3 Storage:

nextcloudkup: 450GB × €0.005/GB/month × 12 months × 5 years = €135
nextcloudaffenstall: 163GB × €0.005/GB/month × 12 months × 5 years = €49
Total: €184 (5 years)

PostgreSQL (Longhorn):

2 instances × 10Gi × €0.12/GB/month × 12 months × 5 years = €144

Total Storage Cost (5 years): €328
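
The figures above can be reproduced with a few lines of arithmetic. The sketch below uses the prices and sizes from this section. `flat_cost` mirrors the calculation above, which bills the 5-year size for all 60 months. `linear_growth_cost` is an added assumption that bills the average size under linear growth instead.

```python
S3_EUR_PER_GB_MONTH = 0.005
LONGHORN_EUR_PER_GB_MONTH = 0.12
MONTHS = 12 * 5


def flat_cost(size_gb: float, eur_per_gb_month: float, months: int = MONTHS) -> float:
    """Cost when `size_gb` is billed for every month of the period."""
    return size_gb * eur_per_gb_month * months


def linear_growth_cost(start_gb: float, end_gb: float,
                       eur_per_gb_month: float, months: int = MONTHS) -> float:
    """Cost when usage grows linearly, billed on the average size."""
    return flat_cost((start_gb + end_gb) / 2, eur_per_gb_month, months)


if __name__ == "__main__":
    kup = flat_cost(450, S3_EUR_PER_GB_MONTH)
    affenstall = flat_cost(163, S3_EUR_PER_GB_MONTH)
    postgres = flat_cost(2 * 10, LONGHORN_EUR_PER_GB_MONTH)
    print(f"S3 total: EUR {kup + affenstall:.0f}")
    print(f"PostgreSQL: EUR {postgres:.0f}")
    print(f"Total: EUR {kup + affenstall + postgres:.0f}")
```

Because usage starts well below the 5-year size, the flat figures are an upper bound. Under linear growth, nextcloudkup's S3 cost comes to about €97.50 rather than €135.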