Reference

kube.tf Configuration

Complete reference for the kube.tf file that defines the KUP6S Kubernetes cluster infrastructure.

Overview

The kube.tf file is the main OpenTofu configuration file for the KUP6S cluster. It uses the kube-hetzner community module to provision a K3s cluster on Hetzner Cloud.

Location: kube-hetzner/kube.tf

Purpose:

  • Define cluster topology (control planes, agent nodes)

  • Configure infrastructure components (storage, networking)

  • Specify extra Kubernetes manifests to deploy

  • Manage cluster-wide settings

Module Source

module "kube-hetzner" {
  source = "kube-hetzner/kube-hetzner/hcloud"
  # Version managed by kube-hetzner project
}

Uses the official kube-hetzner Terraform/OpenTofu module from the Terraform Registry.

Cluster Identity

Cluster Name

cluster_name = "kup6s"

Purpose: Identifier for the cluster, used in:

  • Node naming (kup6s-control-fsn1-xyz)

  • Network names

  • Resource tagging

Base Domain

base_domain = "cluster.kup6s.com"

Purpose: Base domain for reverse DNS entries in Hetzner Cloud. Node FQDNs take the form nodename.cluster.kup6s.com.

Control Plane Node Pools

The KUP6S cluster runs 3 control plane nodes across 3 data centers for high availability.

control_plane_nodepools = [
  {
    name        = "control-fsn1"
    server_type = "cax21"
    location    = "fsn1"
    labels      = []
    taints      = []
    count       = 1
    swap_size   = "2G"
    longhorn_volume_size = 0
  },
  {
    name        = "control-nbg1"
    server_type = "cax21"
    location    = "nbg1"
    labels      = []
    taints      = []
    count       = 1
    swap_size   = "2G"
    longhorn_volume_size = 0
  },
  {
    name        = "control-hel1"
    server_type = "cax21"
    location    = "hel1"
    labels      = []
    taints      = []
    count       = 1
    swap_size   = "2G"
    longhorn_volume_size = 0
  }
]

Control Plane Configuration

| Parameter | Value | Notes |
| --- | --- | --- |
| Server Type | CAX21 | ARM64, 4 vCPU, 8GB RAM, 80GB SSD |
| Locations | fsn1, nbg1, hel1 | 3 data centers for HA |
| Count per Pool | 1 | Total: 3 control planes |
| Swap | 2GB | Reduces OOM risk |
| Longhorn Volume | 0 | Control planes don't run Longhorn |

Rationale:

  • Multi-DC: Survives data center failure

  • ARM64: Cost-effective (CAX21 at €6.49/month vs €9.49/month for the closest AMD64 option, CPX21)

  • CAX21: Sufficient for control plane workload

Agent Node Pools

The KUP6S cluster uses ARM64 nodes as the primary infrastructure, distributed across fsn1 and nbg1 regions for geographic redundancy. One AMD64 node is maintained for specific workloads requiring x86-64 architecture.

agent_nodepools = [
  # ARM64 Workers - Primary for web applications
  # Distributed across fsn1 (3 nodes) and nbg1 (2 nodes) for geographic redundancy

  # fsn1 region (3 nodes)
  {
    name        = "agent-cax31-fsn1"
    server_type = "cax31"
    location    = "fsn1"
    labels      = []
    taints      = []
    count       = 1
    swap_size   = "2G"
    longhorn_volume_size = 0
  },
  {
    name        = "agent-cax21-fsn1"
    server_type = "cax21"
    location    = "fsn1"
    labels      = []
    taints      = []
    count       = 1
    swap_size   = "2G"
    longhorn_volume_size = 0
  },
  {
    name        = "agent-cpx21-fsn1"
    server_type = "cpx21"
    location    = "fsn1"
    labels      = []
    taints      = ["kubernetes.io/arch=amd64:NoSchedule"]
    count       = 1
    swap_size   = "2G"
    longhorn_volume_size = 0
  },

  # nbg1 region (2 nodes)
  {
    name        = "agent-cax31-nbg1"
    server_type = "cax31"
    location    = "nbg1"
    labels      = []
    taints      = []
    count       = 1
    swap_size   = "2G"
    longhorn_volume_size = 0
  },
  {
    name        = "agent-cax21-nbg1"
    server_type = "cax21"
    location    = "nbg1"
    labels      = []
    taints      = []
    count       = 1
    swap_size   = "2G"
    longhorn_volume_size = 0
  },
]

Agent Pool Summary

| Pool | Type | Arch | vCPU | RAM | Disk | Location | Cost/month |
| --- | --- | --- | --- | --- | --- | --- | --- |
| agent-cax31-fsn1 | CAX31 | ARM64 | 8 | 16GB | 160GB | fsn1 | €12.49 |
| agent-cax21-fsn1 | CAX21 | ARM64 | 4 | 8GB | 80GB | fsn1 | €6.49 |
| agent-cpx21-fsn1 | CPX21 | AMD64 | 3 | 4GB | 80GB | fsn1 | €9.49 |
| agent-cax31-nbg1 | CAX31 | ARM64 | 8 | 16GB | 160GB | nbg1 | €12.49 |
| agent-cax21-nbg1 | CAX21 | ARM64 | 4 | 8GB | 80GB | nbg1 | €6.49 |

Total Capacity:

  • ARM64: 24 vCPU, 48GB RAM (distributed: fsn1=12 vCPU/24GB, nbg1=12 vCPU/24GB)

  • AMD64: 3 vCPU, 4GB RAM (fsn1 only)

  • Combined: 27 vCPU, 52GB RAM

  • Total Storage: ~457GB across 5 Longhorn nodes

Strategy:

  • Geographic Distribution: 3 nodes in fsn1, 2 nodes in nbg1 for redundancy

  • ARM64 Primary: Most workloads run on cost-effective ARM nodes

  • AMD64 Tainted: Only pods that tolerate the kubernetes.io/arch=amd64 taint are scheduled on the cpx21 node (see the sketch after this list)

  • Balanced Resources: Equal capacity split between regions for better resilience
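
Pods only land on the tainted CPX21 node when they tolerate the taint and, typically, also select the amd64 architecture explicitly. A minimal sketch of such a workload (the Deployment name and image are illustrative, not part of the cluster configuration):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: amd64-only-app          # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: amd64-only-app
  template:
    metadata:
      labels:
        app: amd64-only-app
    spec:
      # Pin the pod to x86-64 nodes via the well-known architecture label
      nodeSelector:
        kubernetes.io/arch: amd64
      # Tolerate the taint applied to the cpx21 pool in kube.tf
      tolerations:
        - key: kubernetes.io/arch
          operator: Equal
          value: amd64
          effect: NoSchedule
      containers:
        - name: app
          image: example.org/amd64-only-app:latest   # illustrative image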

Key Configuration Variables

Security

WireGuard Encryption

enable_wireguard = true

Effect: Encrypts all pod-to-pod communication across nodes using WireGuard.

Rationale: Security requirement for sensitive data in transit.

Storage

Longhorn Distributed Storage

enable_longhorn = true

Effect: Deploys Longhorn distributed block storage for persistent volumes.

Components:

  • Longhorn manager on each node

  • CSI driver for volume provisioning

  • Backup functionality (CIFS)
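
A quick way to confirm these components after provisioning, assuming Longhorn's default longhorn-system namespace:

# Longhorn manager, CSI, and instance-manager pods should all be Running
kubectl -n longhorn-system get pods

# The "longhorn" StorageClass should exist once the deployment settles
kubectl get storageclass longhorn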

Hetzner CSI Driver

# disable_hetzner_csi = true

Status: Commented out (the Hetzner CSI driver remains enabled, which is the module default)

Effect: Hetzner Cloud CSI driver provides block storage volumes.

Note: Longhorn is preferred for replication and backups.

Scheduling

Control Plane Scheduling

# allow_scheduling_on_control_plane = true

Status: Commented out (non-system scheduling on control planes remains disabled, which is the module default)

Effect: When enabled, allows non-system pods to schedule on control plane nodes.

Current: Control planes run only system components.

Credential Management

CRITICAL: All sensitive values are provided via environment variables, not hardcoded in kube.tf.

Required Environment Variables

| Variable | Purpose |
| --- | --- |
| TF_VAR_hcloud_token | Hetzner Cloud API access |
| TF_VAR_hetzner_s3_access_key | Hetzner Object Storage access |
| TF_VAR_hetzner_s3_secret_key | Hetzner Object Storage secret |
| TF_VAR_longhorn_cifs_url | Longhorn backup target URL |
| TF_VAR_longhorn_cifs_username | Backup storage username |
| TF_VAR_longhorn_cifs_password | Backup storage password |
| TF_VAR_traefik_basicauth_user | Traefik dashboard user |
| TF_VAR_traefik_basicauth_password | Traefik dashboard password |
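
A .env file supplying these variables might look like the sketch below; all values are placeholders, and the real file must never be committed:

# Hetzner Cloud API
TF_VAR_hcloud_token=<hcloud-api-token>

# Hetzner Object Storage (S3-compatible)
TF_VAR_hetzner_s3_access_key=<access-key>
TF_VAR_hetzner_s3_secret_key=<secret-key>

# Longhorn CIFS backup target
TF_VAR_longhorn_cifs_url=<cifs-url>
TF_VAR_longhorn_cifs_username=<username>
TF_VAR_longhorn_cifs_password=<password>

# Traefik dashboard basic auth
TF_VAR_traefik_basicauth_user=<user>
TF_VAR_traefik_basicauth_password=<password>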

Loading Credentials

# Create .env file from template
cp .env.example .env
# Edit .env with your credentials

# Load credentials (bash)
set -a
source .env
set +a

# Or use fish shell with dotenv plugin
fish -c "dotenv .env; and tofu plan"

Variable Declaration

In kube.tf:

variable "hcloud_token" {
  description = "Hetzner Cloud API token"
  type        = string
  sensitive   = true
}

provider "hcloud" {
  token = var.hcloud_token  # From TF_VAR_hcloud_token
}

Anti-Pattern (NEVER):

# ❌ WRONG - hardcoded secret
provider "hcloud" {
  token = "abc123secrettoken"
}

Extra Manifests

extra_manifests = ["kube-hetzner/extra-manifests/kustomization.yaml"]

Purpose: Deploys additional Kubernetes manifests via Kustomize during cluster creation.

Location: kube-hetzner/extra-manifests/

Contents: Infrastructure-tier components (ArgoCD, Crossplane, ESO, CNPG, monitoring)

See: Extra Manifests
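
The kustomization.yaml is a standard Kustomize entry point listing the manifests to apply. A sketch of its shape, with illustrative file names (the real resource list lives in the repository):

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  # File names are illustrative - see kube-hetzner/extra-manifests/ for the real list
  - argocd.yaml
  - crossplane.yaml
  - external-secrets.yaml
  - cloudnative-pg.yaml
  - monitoring.yaml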

Node Pool Parameters Reference

Common Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| name | string | Unique identifier for the node pool |
| server_type | string | Hetzner server type (cax21, cpx31, etc.) |
| location | string | Hetzner data center (fsn1, nbg1, hel1) |
| labels | list(string) | Kubernetes labels applied to nodes |
| taints | list(string) | Kubernetes taints (e.g., arch=amd64:NoSchedule) |
| count | number | Number of nodes in this pool |
| swap_size | string | Swap space size (e.g., "2G") |
| longhorn_volume_size | number | Longhorn storage reservation (0 = disabled) |
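
Putting the parameters together, a fully annotated pool entry might look like this (all values are illustrative):

{
  name                 = "agent-example"      # unique pool identifier
  server_type          = "cax21"              # Hetzner server type
  location             = "fsn1"               # Hetzner data center
  labels               = ["workload=general"] # applied as Kubernetes node labels
  taints               = []                   # e.g. ["kubernetes.io/arch=amd64:NoSchedule"]
  count                = 1                    # number of nodes in the pool
  swap_size            = "2G"                 # swap space per node
  longhorn_volume_size = 0                    # Longhorn storage reservation (0 = disabled)
}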

Available Server Types

ARM64 (CAX series):

  • cax11: 2 vCPU, 4GB RAM, 40GB SSD - €3.79/month

  • cax21: 4 vCPU, 8GB RAM, 80GB SSD - €6.49/month

  • cax31: 8 vCPU, 16GB RAM, 160GB SSD - €12.49/month

  • cax41: 16 vCPU, 32GB RAM, 320GB SSD - €24.49/month

AMD64 (CPX series):

  • cpx11: 2 vCPU, 2GB RAM, 40GB SSD - €4.15/month

  • cpx21: 3 vCPU, 4GB RAM, 80GB SSD - €9.49/month

  • cpx31: 4 vCPU, 8GB RAM, 160GB SSD - €16.49/month

  • cpx41: 8 vCPU, 16GB RAM, 240GB SSD - €32.90/month

Locations:

  • fsn1: Falkenstein, Germany

  • nbg1: Nuremberg, Germany

  • hel1: Helsinki, Finland

Modifying the Configuration

Adding a Node Pool

  1. Edit kube.tf:

    agent_nodepools = [
      # ...existing pools...
      {
        name        = "agent-arm-4"
        server_type = "cax41"
        location    = "fsn1"
        labels      = ["workload=heavy"]
        taints      = []
        count       = 2
        swap_size   = "4G"
        longhorn_volume_size = 0
      }
    ]
    
  2. Preview changes:

    cd kube-hetzner
    source .env
    tofu plan
    
  3. Apply:

    tofu apply
    

Changing Node Count

Scaling a pool up or down:

{
  name  = "agent-arm-2"
  count = 5  # Changed from 3 to 5
  # ...other parameters...
}

Note: Scaling down removes nodes. Ensure workloads are drained first (see Dangerous Operations below for the drain command).

Modifying Taints

Remove AMD64 taint to allow multi-arch scheduling:

{
  name   = "agent-amd-3"
  taints = []  # Removed: ["kubernetes.io/arch=amd64:NoSchedule"]
  # ...
}

Safety Guidelines

Before Modifying kube.tf

  1. Backup: Ensure recent cluster backup exists

  2. Review: Understand impact of changes

  3. Plan: Always run tofu plan first (see the sketch after this list)

  4. Test: Test in dev/staging if possible

  5. Monitor: Watch cluster during apply
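
One way to make the plan and apply steps auditable is to write the plan to a file, review it, and apply exactly that reviewed plan (a sketch; the plan file name is arbitrary):

cd kube-hetzner
set -a; source .env; set +a

# Save the plan to a file, inspect it, then apply exactly what was reviewed
tofu plan -out=kube.tfplan
tofu show kube.tfplan
tofu apply kube.tfplan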

Dangerous Operations

Changing Control Plane Configuration:

  • Can cause control plane downtime

  • May require cluster recreation

  • Test in staging first

Removing Agent Nodes:

  • Ensure no critical workloads on nodes

  • Drain nodes before scaling down:

    kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
    

Changing Storage Settings:

  • May cause Longhorn data loss

  • Backup all PVs first
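
Before changing storage settings, it helps to list the volumes that would be affected and confirm that Longhorn backups exist; a sketch using standard kubectl output (backup status can also be checked in the Longhorn UI):

# PVs using the "longhorn" StorageClass are the ones at risk
kubectl get pv -o custom-columns=NAME:.metadata.name,STORAGECLASS:.spec.storageClassName,CLAIM:.spec.claimRef.name

# Longhorn records completed backups as custom resources
kubectl -n longhorn-system get backups.longhorn.io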

Further Reading