# kube.tf Configuration

Complete reference for the `kube.tf` file that defines the KUP6S Kubernetes cluster infrastructure.
## Overview

The `kube.tf` file is the main OpenTofu configuration file for the KUP6S cluster. It uses the community kube-hetzner module to provision a K3s cluster on Hetzner Cloud.
**Location:** `kube-hetzner/kube.tf`

**Purpose:**

- Define cluster topology (control planes, agent nodes)
- Configure infrastructure components (storage, networking)
- Specify extra Kubernetes manifests to deploy
- Manage cluster-wide settings
## Module Source

```hcl
module "kube-hetzner" {
  source = "kube-hetzner/kube-hetzner/hcloud"
  # Version managed by kube-hetzner project
}
```

Uses the official kube-hetzner Terraform/OpenTofu module from the Terraform Registry.
## Cluster Identity

### Cluster Name

```hcl
cluster_name = "kup6s"
```

**Purpose:** Identifier for the cluster, used in:

- Node naming (e.g., `kup6s-control-fsn1-xyz`)
- Network names
- Resource tagging

### Base Domain

```hcl
base_domain = "cluster.kup6s.com"
```

**Purpose:** Base domain for reverse DNS entries in Hetzner Cloud. Node FQDNs take the form `nodename.cluster.kup6s.com`.
## Control Plane Node Pools

The KUP6S cluster runs 3 control plane nodes across 3 data centers for high availability.

```hcl
control_plane_nodepools = [
  {
    name                 = "control-fsn1"
    server_type          = "cax21"
    location             = "fsn1"
    labels               = []
    taints               = []
    count                = 1
    swap_size            = "2G"
    longhorn_volume_size = 0
  },
  {
    name                 = "control-nbg1"
    server_type          = "cax21"
    location             = "nbg1"
    labels               = []
    taints               = []
    count                = 1
    swap_size            = "2G"
    longhorn_volume_size = 0
  },
  {
    name                 = "control-hel1"
    server_type          = "cax21"
    location             = "hel1"
    labels               = []
    taints               = []
    count                = 1
    swap_size            = "2G"
    longhorn_volume_size = 0
  }
]
```
### Control Plane Configuration

| Parameter | Value | Notes |
|---|---|---|
| Server Type | CAX21 | ARM64, 4 vCPU, 8GB RAM, 80GB SSD |
| Locations | fsn1, nbg1, hel1 | 3 data centers for HA |
| Count per Pool | 1 | Total: 3 control planes |
| Swap | 2GB | Reduces OOM risk |
| Longhorn Volume | 0 | Control planes don't run Longhorn |
**Rationale:**

- **Multi-DC:** Survives a data center failure
- **ARM64:** Cost-effective (€6.49/month vs €9.49 for the AMD64 equivalent)
- **CAX21:** Sufficient for the control plane workload
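The failure tolerance behind the 3-node layout follows from etcd's quorum rule: a write needs a majority of members. A minimal sketch of that arithmetic:

```python
def quorum(members: int) -> int:
    """Minimum members that must be reachable for etcd to accept writes."""
    return members // 2 + 1

def tolerated_failures(members: int) -> int:
    """Members (or, here, data centers) that can be lost while keeping quorum."""
    return members - quorum(members)

# 3 control planes: quorum of 2, so one node (one data center) can fail.
print(quorum(3), tolerated_failures(3))  # 2 1
```

This is also why 2 control planes would be worse than 1: quorum(2) is 2, so losing either node halts the cluster.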
## Agent Node Pools

The KUP6S cluster uses ARM64 nodes as the primary infrastructure, distributed across the fsn1 and nbg1 regions for geographic redundancy. One AMD64 node is maintained for workloads requiring the x86-64 architecture.

```hcl
agent_nodepools = [
  # ARM64 workers - primary for web applications
  # Distributed across fsn1 (3 nodes) and nbg1 (2 nodes) for geographic redundancy

  # fsn1 region (3 nodes)
  {
    name                 = "agent-cax31-fsn1"
    server_type          = "cax31"
    location             = "fsn1"
    labels               = []
    taints               = []
    count                = 1
    swap_size            = "2G"
    longhorn_volume_size = 0
  },
  {
    name                 = "agent-cax21-fsn1"
    server_type          = "cax21"
    location             = "fsn1"
    labels               = []
    taints               = []
    count                = 1
    swap_size            = "2G"
    longhorn_volume_size = 0
  },
  {
    name                 = "agent-cpx21-fsn1"
    server_type          = "cpx21"
    location             = "fsn1"
    labels               = []
    taints               = ["kubernetes.io/arch=amd64:NoSchedule"]
    count                = 1
    swap_size            = "2G"
    longhorn_volume_size = 0
  },

  # nbg1 region (2 nodes)
  {
    name                 = "agent-cax31-nbg1"
    server_type          = "cax31"
    location             = "nbg1"
    labels               = []
    taints               = []
    count                = 1
    swap_size            = "2G"
    longhorn_volume_size = 0
  },
  {
    name                 = "agent-cax21-nbg1"
    server_type          = "cax21"
    location             = "nbg1"
    labels               = []
    taints               = []
    count                = 1
    swap_size            = "2G"
    longhorn_volume_size = 0
  },
]
```
### Agent Pool Summary

| Pool | Type | Arch | vCPU | RAM | Disk | Location | Cost/month |
|---|---|---|---|---|---|---|---|
| agent-cax31-fsn1 | CAX31 | ARM64 | 8 | 16GB | 160GB | fsn1 | €12.49 |
| agent-cax21-fsn1 | CAX21 | ARM64 | 4 | 8GB | 80GB | fsn1 | €6.49 |
| agent-cpx21-fsn1 | CPX21 | AMD64 | 3 | 4GB | 80GB | fsn1 | €9.49 |
| agent-cax31-nbg1 | CAX31 | ARM64 | 8 | 16GB | 160GB | nbg1 | €12.49 |
| agent-cax21-nbg1 | CAX21 | ARM64 | 4 | 8GB | 80GB | nbg1 | €6.49 |
**Total Capacity:**

- ARM64: 24 vCPU, 48GB RAM (fsn1: 12 vCPU/24GB, nbg1: 12 vCPU/24GB)
- AMD64: 3 vCPU, 4GB RAM (fsn1 only)
- Combined: 27 vCPU, 52GB RAM
- Total storage: ~457GB across 5 Longhorn nodes
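The capacity and cost totals can be reproduced from the pool table (prices as listed in this document; verify against current Hetzner pricing):

```python
# (type, arch, vcpu, ram_gb, eur_per_month) per agent pool, from the table above
pools = {
    "agent-cax31-fsn1": ("cax31", "arm64", 8, 16, 12.49),
    "agent-cax21-fsn1": ("cax21", "arm64", 4, 8, 6.49),
    "agent-cpx21-fsn1": ("cpx21", "amd64", 3, 4, 9.49),
    "agent-cax31-nbg1": ("cax31", "arm64", 8, 16, 12.49),
    "agent-cax21-nbg1": ("cax21", "arm64", 4, 8, 6.49),
}

vcpu = sum(p[2] for p in pools.values())
ram = sum(p[3] for p in pools.values())
cost = round(sum(p[4] for p in pools.values()), 2)
print(vcpu, ram, cost)  # 27 52 47.45
```

Agent nodes therefore cost about €47.45/month on top of the 3 control planes (3 × €6.49 = €19.47).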
**Strategy:**

- **Geographic distribution:** 3 nodes in fsn1, 2 nodes in nbg1 for redundancy
- **ARM64 primary:** Most workloads run on cost-effective ARM nodes
- **AMD64 tainted:** Only pods with a `kubernetes.io/arch=amd64` toleration are scheduled on the cpx21 node
- **Balanced resources:** Equal capacity split between regions for better resilience
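The cpx21 taint works because the scheduler only places a pod on a tainted node when the pod carries a matching toleration. A simplified sketch of that match, reduced to the `Equal`/`NoSchedule` case used here (the real scheduler also handles `Exists` operators and empty keys):

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Simplified check: does this toleration cover this taint?"""
    return (toleration.get("key") == taint["key"]
            and toleration.get("operator", "Equal") == "Equal"
            and toleration.get("value") == taint["value"]
            and toleration.get("effect") == taint["effect"])

amd64_taint = {"key": "kubernetes.io/arch", "value": "amd64", "effect": "NoSchedule"}

# A pod that explicitly opts in to the AMD64 node:
pod_toleration = {"key": "kubernetes.io/arch", "operator": "Equal",
                  "value": "amd64", "effect": "NoSchedule"}

print(tolerates(pod_toleration, amd64_taint))  # True
print(tolerates({}, amd64_taint))              # False
```

A pod with no toleration (the default) is repelled, which is what keeps ARM64-capable workloads off the more expensive x86 node.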
## Key Configuration Variables

### Security

#### WireGuard Encryption

```hcl
enable_wireguard = true
```

**Effect:** Encrypts all pod-to-pod communication across nodes using WireGuard.

**Rationale:** Security requirement for sensitive data in transit.
### Storage

#### Longhorn Distributed Storage

```hcl
enable_longhorn = true
```

**Effect:** Deploys Longhorn distributed block storage for persistent volumes.

**Components:**

- Longhorn manager on each node
- CSI driver for volume provisioning
- Backup functionality (CIFS)
#### Hetzner CSI Driver

```hcl
# disable_hetzner_csi = true
```

**Status:** Commented out (the driver is enabled by default)

**Effect:** The Hetzner Cloud CSI driver provides block storage volumes.

**Note:** Longhorn is preferred for replication and backups.
### Scheduling

#### Control Plane Scheduling

```hcl
# allow_scheduling_on_control_plane = true
```

**Status:** Commented out (disabled by default)

**Effect:** When enabled, allows non-system pods to schedule on control plane nodes.

**Current:** Control planes run only system components.
Credential Management¶
CRITICAL: All sensitive values are provided via environment variables, not hardcoded in kube.tf.
Required Environment Variables¶
Variable |
Purpose |
|---|---|
|
Hetzner Cloud API access |
|
Hetzner Object Storage access |
|
Hetzner Object Storage secret |
|
Longhorn backup target URL |
|
Backup storage username |
|
Backup storage password |
|
Traefik dashboard user |
|
Traefik dashboard password |
### Loading Credentials

```bash
# Create .env file from template
cp .env.example .env
# Edit .env with your credentials

# Load credentials (bash)
set -a
source .env
set +a

# Or use fish shell with the dotenv plugin
fish -c "dotenv .env; and tofu plan"
```
### Variable Declaration

In `kube.tf`:

```hcl
variable "hcloud_token" {
  description = "Hetzner Cloud API token"
  type        = string
  sensitive   = true
}

provider "hcloud" {
  token = var.hcloud_token # From TF_VAR_hcloud_token
}
```

**Anti-pattern (NEVER):**

```hcl
# ❌ WRONG - hardcoded secret
provider "hcloud" {
  token = "abc123secrettoken"
}
```
## Extra Manifests

```hcl
extra_manifests = ["kube-hetzner/extra-manifests/kustomization.yaml"]
```

**Purpose:** Deploys additional Kubernetes manifests via Kustomize during cluster creation.

**Location:** `kube-hetzner/extra-manifests/`

**Contents:** Infrastructure-tier components (ArgoCD, Crossplane, ESO, CNPG, monitoring)

**See:** Extra Manifests
## Node Pool Parameters Reference

### Common Parameters

| Parameter | Type | Description |
|---|---|---|
| `name` | string | Unique identifier for the node pool |
| `server_type` | string | Hetzner server type (cax21, cpx31, etc.) |
| `location` | string | Hetzner data center (fsn1, nbg1, hel1) |
| `labels` | list(string) | Kubernetes labels applied to nodes |
| `taints` | list(string) | Kubernetes taints (e.g., `kubernetes.io/arch=amd64:NoSchedule`) |
| `count` | number | Number of nodes in this pool |
| `swap_size` | string | Swap space size (e.g., "2G") |
| `longhorn_volume_size` | number | Longhorn storage reservation (0 = disabled) |
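When editing pools by hand, a small pre-flight check can catch a missing key or malformed taint before `tofu plan` does. A hypothetical helper (the key names mirror the parameters above; the taint pattern is an assumption matching the `key=value:Effect` shape used in this file, not the full Kubernetes grammar):

```python
import re

# Keys every pool entry in this kube.tf carries
REQUIRED = {"name", "server_type", "location", "labels", "taints",
            "count", "swap_size", "longhorn_volume_size"}
# key=value:Effect, e.g. kubernetes.io/arch=amd64:NoSchedule
TAINT_RE = re.compile(r"^[\w./-]+=[\w.-]*:(NoSchedule|PreferNoSchedule|NoExecute)$")

def validate_pool(pool: dict) -> list:
    """Return a list of problems; an empty list means the pool looks well-formed."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED - pool.keys())]
    for t in pool.get("taints", []):
        if not TAINT_RE.match(t):
            problems.append(f"malformed taint: {t}")
    return problems

pool = {"name": "agent-cpx21-fsn1", "server_type": "cpx21", "location": "fsn1",
        "labels": [], "taints": ["kubernetes.io/arch=amd64:NoSchedule"],
        "count": 1, "swap_size": "2G", "longhorn_volume_size": 0}
print(validate_pool(pool))  # []
```

This is only a lint-style sketch; the authoritative validation is the module's own variable types, enforced at plan time.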
### Available Server Types

**ARM64 (CAX series):**

- `cax11`: 2 vCPU, 4GB RAM, 40GB SSD - €3.79/month
- `cax21`: 4 vCPU, 8GB RAM, 80GB SSD - €6.49/month
- `cax31`: 8 vCPU, 16GB RAM, 160GB SSD - €12.49/month
- `cax41`: 16 vCPU, 32GB RAM, 320GB SSD - €24.49/month

**AMD64 (CPX series):**

- `cpx11`: 2 vCPU, 2GB RAM, 40GB SSD - €4.15/month
- `cpx21`: 3 vCPU, 4GB RAM, 80GB SSD - €9.49/month
- `cpx31`: 4 vCPU, 8GB RAM, 160GB SSD - €16.49/month
- `cpx41`: 8 vCPU, 16GB RAM, 240GB SSD - €32.90/month

**Locations:**

- `fsn1`: Falkenstein, Germany
- `nbg1`: Nuremberg, Germany
- `hel1`: Helsinki, Finland
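Picking a type for a new pool is a small optimization over the price list above (prices as listed in this document; check Hetzner's site for current figures):

```python
# (vcpu, ram_gb, eur_per_month) from the lists above
SERVERS = {
    "cax11": (2, 4, 3.79),   "cax21": (4, 8, 6.49),
    "cax31": (8, 16, 12.49), "cax41": (16, 32, 24.49),
    "cpx11": (2, 2, 4.15),   "cpx21": (3, 4, 9.49),
    "cpx31": (4, 8, 16.49),  "cpx41": (8, 16, 32.90),
}

def cheapest(min_vcpu: int, min_ram: int, arch: str = "") -> str:
    """Cheapest type meeting the resource floor (cax = ARM64, cpx = AMD64)."""
    candidates = [
        (price, name) for name, (cpu, ram, price) in SERVERS.items()
        if cpu >= min_vcpu and ram >= min_ram
        and (not arch or (arch == "arm64") == name.startswith("cax"))
    ]
    return min(candidates)[1]

print(cheapest(4, 8))           # cax21
print(cheapest(4, 8, "amd64"))  # cpx31
```

The second call illustrates the cluster's cost rationale: the same 4 vCPU / 8GB floor costs €6.49 on ARM64 but €16.49 on AMD64.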
## Modifying the Configuration

### Adding a Node Pool

1. Edit `kube.tf`:

   ```hcl
   agent_nodepools = [
     # ...existing pools...
     {
       name                 = "agent-arm-4"
       server_type          = "cax41"
       location             = "fsn1"
       labels               = ["workload=heavy"]
       taints               = []
       count                = 2
       swap_size            = "4G"
       longhorn_volume_size = 0
     }
   ]
   ```

2. Preview changes:

   ```bash
   cd kube-hetzner
   source .env
   tofu plan
   ```

3. Apply:

   ```bash
   tofu apply
   ```
### Changing Node Count

Scaling a pool up or down:

```hcl
{
  name  = "agent-arm-2"
  count = 5 # Changed from 3 to 5
  # ...other parameters...
}
```

**Note:** Scaling down removes nodes. Ensure workloads are drained first.
### Modifying Taints

Remove the AMD64 taint to allow multi-arch scheduling:

```hcl
{
  name   = "agent-amd-3"
  taints = [] # Removed: ["kubernetes.io/arch=amd64:NoSchedule"]
  # ...
}
```
## Safety Guidelines

### Before Modifying kube.tf

1. **Backup:** Ensure a recent cluster backup exists
2. **Review:** Understand the impact of the change
3. **Plan:** Always run `tofu plan` first
4. **Test:** Test in dev/staging if possible
5. **Monitor:** Watch the cluster during apply
### Dangerous Operations

**Changing control plane configuration:**

- Can cause control plane downtime
- May require cluster recreation
- Test in staging first

**Removing agent nodes:**

- Ensure no critical workloads remain on the nodes
- Drain nodes before scaling down:

  ```bash
  kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
  ```

**Changing storage settings:**

- May cause Longhorn data loss
- Back up all PVs first
## Further Reading

- Explanation: Infrastructure as Code - IaC philosophy
- How-To: Apply Infrastructure Changes - Safe modification procedures
- Tutorial: Deploy Your First Cluster - Hands-on cluster creation
- Reference: Extra Manifests - Infrastructure components