Explanation

High Availability Configuration

How CloudNativePG (CNPG) approaches PostgreSQL high availability.

Cluster Architecture

CNPG provides HA through:

Multi-Instance Clusters

  • Primary instance (read-write)

  • Standby instances (read-only, streaming replication)

  • Automatic promotion on primary failure
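A minimal sketch of such a cluster, assuming the CloudNativePG `Cluster` resource (`postgresql.cnpg.io/v1`). The name `myapp-postgres` is taken from the service names used elsewhere on this page; the storage size is a placeholder, not a recommendation.

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: myapp-postgres   # assumed name, matches the myapp-postgres-rw service
spec:
  instances: 2           # one primary plus one streaming-replication standby
  storage:
    size: 10Gi           # placeholder size, adjust to the workload
```

With `instances: 1` there is no standby to promote, so automatic promotion only applies from two instances upward.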

Automatic Failover

  • Operator monitors primary health

  • Promotes standby to primary on failure

  • Updates services to point to new primary

  • Typical failover time: 30-60 seconds

Pod Anti-Affinity

Spread instances across nodes for resilience:

affinity:
  podAntiAffinityType: preferred  # Best-effort spread

Options:

  • preferred - Soft anti-affinity (best-effort)

  • required - Hard anti-affinity (enforce spread)
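For comparison, the hard variant might look like the fragment below. This is a sketch that assumes the `enablePodAntiAffinity` and `topologyKey` fields of the `Cluster` affinity section; the topology key shown spreads instances per node and is an illustrative choice.

```yaml
spec:
  affinity:
    enablePodAntiAffinity: true
    topologyKey: kubernetes.io/hostname   # one instance per node (illustrative)
    podAntiAffinityType: required         # scheduler refuses to co-locate instances
```

With `required`, an instance pod stays Pending when no node satisfies the rule, so the cluster needs at least as many schedulable nodes as instances. `preferred` trades that guarantee for the ability to schedule on a crowded cluster.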

Replication

Streaming Replication

  • Real-time data replication from primary to standbys

  • Asynchronous by default (better performance)

  • Synchronous replication optional (stronger consistency)
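As an illustration of opting in to synchronous replication, the fragment below assumes an operator release that supports the `postgresql.synchronous` stanza; older releases expose `minSyncReplicas` and `maxSyncReplicas` instead, so check the API reference for the installed version. The instance count is an example value.

```yaml
spec:
  instances: 3          # example value: leaves a second standby if one is down
  postgresql:
    synchronous:
      method: any       # quorum-based: any standby may acknowledge
      number: 1         # commits wait for one standby acknowledgement
```

The trade-off is write latency: each commit waits for a standby to confirm, which is why asynchronous replication remains the default.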

Read-Only Replicas

Standby instances can serve read queries:

# Read-write service (primary only)
myapp-postgres-rw

# Read-only service (standby replicas only)
myapp-postgres-ro

Service Endpoints

CNPG creates multiple services:

  • <cluster>-rw - Read-write (primary only)

  • <cluster>-ro - Read-only (standby replicas only, load balanced)

  • <cluster>-r - Read (primary + standbys, load balanced)
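An application can pick a service per workload. The fragment below is a sketch of a container `env` section for a hypothetical application running in the same namespace as the cluster; the variable names are invented for illustration.

```yaml
# Fragment of a Deployment's container spec (hypothetical application)
env:
  - name: DB_WRITE_HOST        # hypothetical variable name
    value: myapp-postgres-rw   # follows the primary across failovers
  - name: DB_READ_HOST         # hypothetical variable name
    value: myapp-postgres-ro   # read-only traffic that tolerates replication lag
  - name: DB_PORT
    value: "5432"              # PostgreSQL default port
```

Because the application addresses the service name rather than a pod, it does not need reconfiguring after a failover, although open connections to the old primary still have to be re-established.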

Storage Strategy

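Clusters use the longhorn-redundant-app storage class (1 replica).

Rationale: PostgreSQL streaming replication already keeps a full copy of the data on every instance, so storage-level replication only multiplies the copies: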

  • 2 PostgreSQL instances × 1 storage replica = 2 copies ✅

  • 2 PostgreSQL instances × 2 storage replicas = 4 copies ❌ (wasteful)
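In the cluster manifest, this choice could be expressed as the following fragment. The storage class name comes from this page; the size is a placeholder.

```yaml
spec:
  instances: 2                             # 2 PostgreSQL copies of the data
  storage:
    storageClass: longhorn-redundant-app   # 1 storage replica per volume
    size: 10Gi                             # placeholder size
```

Each instance gets its own volume, so the total number of data copies is instances multiplied by storage replicas: 2 × 1 = 2 here.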

See Storage Tiers for details.