Reference

kube.tf Configuration

Complete reference for the kube.tf file that defines the KUP6S Kubernetes cluster infrastructure.

Overview

The kube.tf file is the main OpenTofu configuration file for the KUP6S cluster. It uses the kube-hetzner community module to provision a K3s cluster on Hetzner Cloud.

Location: kube-hetzner/kube.tf

Purpose:

  • Define cluster topology (control planes, agent nodes)

  • Configure infrastructure components (storage, networking)

  • Specify extra Kubernetes manifests to deploy

  • Manage cluster-wide settings

Module Source

module "kube-hetzner" {
  source = "kube-hetzner/kube-hetzner/hcloud"
  # Version managed by kube-hetzner project
}

Uses the official kube-hetzner Terraform/OpenTofu module from the Terraform Registry.

Cluster Identity

Cluster Name

cluster_name = "kup6s"

Purpose: Identifier for the cluster, used in:

  • Node naming (kup6s-control-fsn1-xyz)

  • Network names

  • Resource tagging

Base Domain

base_domain = "cluster.kup6s.com"

Purpose: Base domain for reverse DNS entries in Hetzner Cloud. Node FQDNs take the form nodename.cluster.kup6s.com.

Control Plane Node Pools

The KUP6S cluster runs 3 control plane nodes across 3 data centers for high availability.

control_plane_nodepools = [
  {
    name        = "control-fsn1"
    server_type = "cax21"
    location    = "fsn1"
    labels      = []
    taints      = []
    count       = 1
    swap_size   = "2G"
    longhorn_volume_size = 0
  },
  {
    name        = "control-nbg1"
    server_type = "cax21"
    location    = "nbg1"
    labels      = []
    taints      = []
    count       = 1
    swap_size   = "2G"
    longhorn_volume_size = 0
  },
  {
    name        = "control-hel1"
    server_type = "cax21"
    location    = "hel1"
    labels      = []
    taints      = []
    count       = 1
    swap_size   = "2G"
    longhorn_volume_size = 0
  }
]

Control Plane Configuration

| Parameter | Value | Notes |
| --- | --- | --- |
| Server Type | CAX21 | ARM64, 4 vCPU, 8GB RAM, 80GB SSD |
| Locations | fsn1, nbg1, hel1 | 3 data centers for HA |
| Count per Pool | 1 | Total: 3 control planes |
| Swap | 2GB | Reduces OOM risk |
| Longhorn Volume | 0 | Control planes don't run Longhorn |

Rationale:

  • Multi-DC: Survives data center failure

  • ARM64: Cost-effective (CAX21 at €6.49/month vs €9.49/month for the closest AMD64 option, CPX21)

  • CAX21: Sufficient for control plane workload

Agent Node Pools

The KUP6S cluster uses ARM64 nodes as the primary infrastructure, distributed across fsn1 and nbg1 regions for geographic redundancy. One AMD64 node is maintained for specific workloads requiring x86-64 architecture.

agent_nodepools = [
  # ARM64 Workers - Primary for web applications
  # Distributed across fsn1 (3 nodes) and nbg1 (2 nodes) for geographic redundancy

  # fsn1 region (3 nodes)
  {
    name        = "agent-cax31-fsn1"
    server_type = "cax31"
    location    = "fsn1"
    labels      = []
    taints      = []
    count       = 1
    swap_size   = "2G"
    longhorn_volume_size = 0
  },
  {
    name        = "agent-cax21-fsn1"
    server_type = "cax21"
    location    = "fsn1"
    labels      = []
    taints      = []
    count       = 1
    swap_size   = "2G"
    longhorn_volume_size = 0
  },
  {
    name        = "agent-cpx21-fsn1"
    server_type = "cpx21"
    location    = "fsn1"
    labels      = []
    taints      = ["kubernetes.io/arch=amd64:NoSchedule"]
    count       = 1
    swap_size   = "2G"
    longhorn_volume_size = 0
  },

  # nbg1 region (2 nodes)
  {
    name        = "agent-cax31-nbg1"
    server_type = "cax31"
    location    = "nbg1"
    labels      = []
    taints      = []
    count       = 1
    swap_size   = "2G"
    longhorn_volume_size = 0
  },
  {
    name        = "agent-cax21-nbg1"
    server_type = "cax21"
    location    = "nbg1"
    labels      = []
    taints      = []
    count       = 1
    swap_size   = "2G"
    longhorn_volume_size = 0
  },
]

Agent Pool Summary

| Pool | Type | Arch | vCPU | RAM | Disk | Location | Cost/month |
| --- | --- | --- | --- | --- | --- | --- | --- |
| agent-cax31-fsn1 | CAX31 | ARM64 | 8 | 16GB | 160GB | fsn1 | €12.49 |
| agent-cax21-fsn1 | CAX21 | ARM64 | 4 | 8GB | 80GB | fsn1 | €6.49 |
| agent-cpx21-fsn1 | CPX21 | AMD64 | 3 | 4GB | 80GB | fsn1 | €9.49 |
| agent-cax31-nbg1 | CAX31 | ARM64 | 8 | 16GB | 160GB | nbg1 | €12.49 |
| agent-cax21-nbg1 | CAX21 | ARM64 | 4 | 8GB | 80GB | nbg1 | €6.49 |

Total Capacity:

  • ARM64: 24 vCPU, 48GB RAM (distributed: fsn1=12 vCPU/24GB, nbg1=12 vCPU/24GB)

  • AMD64: 3 vCPU, 4GB RAM (fsn1 only)

  • Combined: 27 vCPU, 52GB RAM

  • Total Storage: ~457GB across 5 Longhorn nodes

Strategy:

  • Geographic Distribution: 3 nodes in fsn1, 2 nodes in nbg1 for redundancy

  • ARM64 Primary: Most workloads run on cost-effective ARM nodes

  • AMD64 Tainted: Only pods that tolerate the kubernetes.io/arch=amd64 taint are scheduled on the cpx21 node (see the sketch after this list)

  • Balanced Resources: Equal capacity split between regions for better resilience
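
Pods only land on the tainted CPX21 node when they tolerate the taint and, typically, also select the amd64 architecture explicitly. A minimal sketch of such a workload (the Deployment name and image are illustrative, not part of the cluster configuration):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: amd64-only-app          # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: amd64-only-app
  template:
    metadata:
      labels:
        app: amd64-only-app
    spec:
      # Pin the pod to x86-64 nodes via the well-known architecture label
      nodeSelector:
        kubernetes.io/arch: amd64
      # Tolerate the taint applied to the cpx21 pool in kube.tf
      tolerations:
        - key: kubernetes.io/arch
          operator: Equal
          value: amd64
          effect: NoSchedule
      containers:
        - name: app
          image: example.org/amd64-only-app:latest   # illustrative image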

Key Configuration Variables

Security

WireGuard Encryption

enable_wireguard = true

Effect: Encrypts all pod-to-pod communication across nodes using WireGuard.

Rationale: Security requirement for sensitive data in transit.

Storage

Longhorn Distributed Storage

enable_longhorn = true

Effect: Deploys Longhorn distributed block storage for persistent volumes.

Components:

  • Longhorn manager on each node

  • CSI driver for volume provisioning

  • Backup functionality (CIFS)
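
A quick way to confirm these components after provisioning, assuming Longhorn's default longhorn-system namespace:

# Longhorn manager, CSI, and instance-manager pods should all be Running
kubectl -n longhorn-system get pods

# The "longhorn" StorageClass should exist once the deployment settles
kubectl get storageclass longhorn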

Hetzner CSI Driver

# disable_hetzner_csi = true

Status: Commented out (the Hetzner CSI driver remains enabled, which is the module default)

Effect: Hetzner Cloud CSI driver provides block storage volumes.

Note: Longhorn is preferred for replication and backups.

Scheduling

Control Plane Scheduling

# allow_scheduling_on_control_plane = true

Status: Commented out (non-system scheduling on control planes remains disabled, which is the module default)

Effect: When enabled, allows non-system pods to schedule on control plane nodes.

Current: Control planes run only system components.

Credential Management

CRITICAL: All sensitive values are provided via environment variables, not hardcoded in kube.tf.

Required Environment Variables

| Variable | Purpose |
| --- | --- |
| TF_VAR_hcloud_token | Hetzner Cloud API access |
| TF_VAR_hetzner_s3_access_key | Hetzner Object Storage access |
| TF_VAR_hetzner_s3_secret_key | Hetzner Object Storage secret |
| TF_VAR_longhorn_cifs_url | Longhorn backup target URL |
| TF_VAR_longhorn_cifs_username | Backup storage username |
| TF_VAR_longhorn_cifs_password | Backup storage password |
| TF_VAR_traefik_basicauth_user | Traefik dashboard user |
| TF_VAR_traefik_basicauth_password | Traefik dashboard password |
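
A .env file supplying these variables might look like the sketch below; all values are placeholders, and the real file must never be committed:

# Hetzner Cloud API
TF_VAR_hcloud_token=<hcloud-api-token>

# Hetzner Object Storage (S3-compatible)
TF_VAR_hetzner_s3_access_key=<access-key>
TF_VAR_hetzner_s3_secret_key=<secret-key>

# Longhorn CIFS backup target
TF_VAR_longhorn_cifs_url=<cifs-url>
TF_VAR_longhorn_cifs_username=<username>
TF_VAR_longhorn_cifs_password=<password>

# Traefik dashboard basic auth
TF_VAR_traefik_basicauth_user=<user>
TF_VAR_traefik_basicauth_password=<password>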

Loading Credentials

# Create .env file from template
cp .env.example .env
# Edit .env with your credentials

# Load credentials (bash)
set -a
source .env
set +a

# Or use fish shell with dotenv plugin
fish -c "dotenv .env; and tofu plan"

Variable Declaration

In kube.tf:

variable "hcloud_token" {
  description = "Hetzner Cloud API token"
  type        = string
  sensitive   = true
}

provider "hcloud" {
  token = var.hcloud_token  # From TF_VAR_hcloud_token
}

Anti-Pattern (NEVER):

# ❌ WRONG - hardcoded secret
provider "hcloud" {
  token = "abc123secrettoken"
}

Extra Manifests

extra_manifests = ["kube-hetzner/extra-manifests/kustomization.yaml"]

Purpose: Deploys additional Kubernetes manifests via Kustomize during cluster creation.

Location: kube-hetzner/extra-manifests/

Contents: Infrastructure-tier components (ArgoCD, Crossplane, ESO, CNPG, monitoring)

See: Extra Manifests
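
The kustomization.yaml is a standard Kustomize entry point listing the manifests to apply. A sketch of its shape, with illustrative file names (the real resource list lives in the repository):

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  # File names are illustrative - see kube-hetzner/extra-manifests/ for the real list
  - argocd.yaml
  - crossplane.yaml
  - external-secrets.yaml
  - cloudnative-pg.yaml
  - monitoring.yaml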

Node Pool Parameters Reference

Common Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| name | string | Unique identifier for the node pool |
| server_type | string | Hetzner server type (cax21, cpx31, etc.) |
| location | string | Hetzner data center (fsn1, nbg1, hel1) |
| labels | list(string) | Kubernetes labels applied to nodes |
| taints | list(string) | Kubernetes taints (e.g., arch=amd64:NoSchedule) |
| count | number | Number of nodes in this pool |
| swap_size | string | Swap space size (e.g., "2G") |
| longhorn_volume_size | number | Longhorn storage reservation (0 = disabled) |
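
Putting the parameters together, a fully annotated pool entry might look like this (all values are illustrative):

{
  name                 = "agent-example"      # unique pool identifier
  server_type          = "cax21"              # Hetzner server type
  location             = "fsn1"               # Hetzner data center
  labels               = ["workload=general"] # applied as Kubernetes node labels
  taints               = []                   # e.g. ["kubernetes.io/arch=amd64:NoSchedule"]
  count                = 1                    # number of nodes in the pool
  swap_size            = "2G"                 # swap space per node
  longhorn_volume_size = 0                    # Longhorn storage reservation (0 = disabled)
}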

Available Server Types

ARM64 (CAX series):

  • cax11: 2 vCPU, 4GB RAM, 40GB SSD - €3.79/month

  • cax21: 4 vCPU, 8GB RAM, 80GB SSD - €6.49/month

  • cax31: 8 vCPU, 16GB RAM, 160GB SSD - €12.49/month

  • cax41: 16 vCPU, 32GB RAM, 320GB SSD - €24.49/month

AMD64 (CPX series):

  • cpx11: 2 vCPU, 2GB RAM, 40GB SSD - €4.15/month

  • cpx21: 3 vCPU, 4GB RAM, 80GB SSD - €9.49/month

  • cpx31: 4 vCPU, 8GB RAM, 160GB SSD - €16.49/month

  • cpx41: 8 vCPU, 16GB RAM, 240GB SSD - €32.90/month

Locations:

  • fsn1: Falkenstein, Germany

  • nbg1: Nuremberg, Germany

  • hel1: Helsinki, Finland

Modifying the Configuration

Adding a Node Pool

  1. Edit kube.tf:

    agent_nodepools = [
      # ...existing pools...
      {
        name        = "agent-arm-4"
        server_type = "cax41"
        location    = "fsn1"
        labels      = ["workload=heavy"]
        taints      = []
        count       = 2
        swap_size   = "4G"
        longhorn_volume_size = 0
      }
    ]
    
  2. Preview changes:

    cd kube-hetzner
    source .env
    tofu plan
    
  3. Apply:

    tofu apply
    

Changing Node Count

Scaling a pool up or down:

{
  name  = "agent-arm-2"
  count = 5  # Changed from 3 to 5
  # ...other parameters...
}

Note: Scaling down removes nodes. Ensure workloads are drained first (see Dangerous Operations below for the drain command).

Modifying Taints

Remove AMD64 taint to allow multi-arch scheduling:

{
  name   = "agent-amd-3"
  taints = []  # Removed: ["kubernetes.io/arch=amd64:NoSchedule"]
  # ...
}

Safety Guidelines

Before Modifying kube.tf

  1. Backup: Ensure recent cluster backup exists

  2. Review: Understand impact of changes

  3. Plan: Always run tofu plan first (see the sketch after this list)

  4. Test: Test in dev/staging if possible

  5. Monitor: Watch cluster during apply
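
One way to make the plan and apply steps auditable is to write the plan to a file, review it, and apply exactly that reviewed plan (a sketch; the plan file name is arbitrary):

cd kube-hetzner
set -a; source .env; set +a

# Save the plan to a file, inspect it, then apply exactly what was reviewed
tofu plan -out=kube.tfplan
tofu show kube.tfplan
tofu apply kube.tfplan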

Dangerous Operations

Changing Control Plane Configuration:

  • Can cause control plane downtime

  • May require cluster recreation

  • Test in staging first

Removing Agent Nodes:

  • Ensure no critical workloads on nodes

  • Drain nodes before scaling down:

    kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
    

Changing Storage Settings:

  • May cause Longhorn data loss

  • Backup all PVs first
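
Before changing storage settings, it helps to list the volumes that would be affected and confirm that Longhorn backups exist; a sketch using standard kubectl output (backup status can also be checked in the Longhorn UI):

# PVs using the "longhorn" StorageClass are the ones at risk
kubectl get pv -o custom-columns=NAME:.metadata.name,STORAGECLASS:.spec.storageClassName,CLAIM:.spec.claimRef.name

# Longhorn records completed backups as custom resources
kubectl -n longhorn-system get backups.longhorn.io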

Further Reading