Explanation

Infrastructure as Code

This document explains the Infrastructure-as-Code (IaC) approach used to manage the KUP6S Kubernetes cluster.

What is Infrastructure as Code?

Infrastructure as Code is the practice of managing and provisioning infrastructure through machine-readable definition files, rather than through interactive configuration tools or manual processes.

Key Principles:

  • Infrastructure is declared in code (what you want), not scripted as a sequence of procedures (how to build it)

  • Version controlled like application code

  • Changes are reproducible and auditable

  • Infrastructure can be tested and validated

Why Infrastructure as Code for Kubernetes?

The Alternative: Imperative Management

Without IaC, cluster setup typically involves:

# Manual, error-prone steps
hcloud server create --name control-1 --type cax21 --location fsn1
ssh root@control-1 "curl -sfL https://get.k3s.io | sh"
kubectl apply -f some-manifest.yaml
# ...repeat for each component

Problems:

  • Not Reproducible: Can’t easily rebuild the cluster

  • No History: No record of what changed and when

  • Drift: Manual changes cause configuration drift

  • Documentation Gaps: Undocumented manual steps

  • No Testing: Changes applied directly to production

The IaC Approach with OpenTofu

With IaC, the entire cluster is defined in code:

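# kube.tf (excerpt)
module "kube-hetzner" {
  source = "kube-hetzner/kube-hetzner/hcloud"
  # ...other module arguments omitted

  control_plane_nodepools = [
    {
      name        = "control-fsn1"
      server_type = "cax21"
      location    = "fsn1"
      count       = 1
    }
  ]
}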

Benefits:

  • Reproducible: Rebuild cluster from code at any time

  • Version Controlled: Git tracks all changes

  • Documented: Code is the documentation

  • Reviewable: Changes go through pull requests

  • Testable: Validate with tofu plan before applying

Why OpenTofu?

OpenTofu is an open-source fork of Terraform, chosen for several reasons:

1. Community-Driven Development

  • Stewarded by the Linux Foundation rather than controlled by a single vendor

  • Open governance model

  • Compatible with existing Terraform modules

2. Declarative Language (HCL)

  • Human-readable configuration language

  • Clear separation between desired state and current state

  • Easy to understand infrastructure layout

3. State Management

  • Tracks actual infrastructure state

  • Detects drift between code and reality

  • Safely updates only what changed

4. Provider Ecosystem

  • Hetzner Cloud provider (hcloud)

  • Kube-hetzner module abstracts Kubernetes complexity

  • Extensible with custom providers

5. Plan Before Apply

  • Preview changes before execution

  • Catch errors early

  • Safe infrastructure updates
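
A minimal sketch of this plan-then-apply habit, assuming tofu is on the PATH and the working directory holds kube.tf. The function name plan_and_apply and the plan file name tfplan are illustrative choices, not part of the KUP6S repository. Saving the plan with -out means the later apply performs exactly the changes that were reviewed:

```shell
#!/bin/sh
# Sketch: save a plan, review it, and apply exactly that plan.
# PLAN_FILE is a hypothetical file name, not something KUP6S defines.
PLAN_FILE="${PLAN_FILE:-tfplan}"

plan_and_apply() {
  # Write the proposed changes to a plan file instead of only printing them.
  tofu plan -out="$PLAN_FILE" || return 1
  # Render the saved plan so it can be reviewed.
  tofu show "$PLAN_FILE" || return 1
  printf 'Apply this plan? [y/N] ' >&2
  read -r answer
  if [ "$answer" = "y" ]; then
    # Applying a saved plan performs only the changes recorded in it.
    tofu apply "$PLAN_FILE"
  else
    echo "Aborted; nothing was changed."
  fi
}
```

Call plan_and_apply from the kube-hetzner directory after loading credentials. A saved plan file can contain sensitive values, so it is best kept out of Git.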

How kube.tf Fits In

The kube.tf file is the heart of the KUP6S cluster definition. It specifies:

1. Module Source

module "kube-hetzner" {
  source = "kube-hetzner/kube-hetzner/hcloud"
  # Uses community module for K3S on Hetzner Cloud
}

2. Cluster Identity

  • Cluster name

  • Base domain

  • Network configuration

3. Node Pools

  • Control plane nodes (type, count, location)

  • Agent nodes (type, count, location, labels, taints)

4. Infrastructure Components

  • Storage (Longhorn, CSI drivers)

  • Networking (Cilium, WireGuard)

  • Ingress (Traefik, cert-manager)

5. Extra Manifests

  • Additional Kubernetes manifests via Kustomize

  • Helm charts via K3S HelmChart CRD

  • Infrastructure-tier components (ArgoCD, Crossplane, ESO, CNPG)

Declarative vs Imperative

Imperative (How)

# Step-by-step commands
hcloud server create ...
ssh ... "install k3s"
kubectl create namespace argocd
helm install argocd ...

Characteristics:

  • Describes how to achieve a state

  • Order matters

  • Requires scripting logic

  • Hard to reverse

Declarative (What)

# Desired state, declared as arguments of the kube-hetzner module
control_plane_nodepools = [...]
agent_nodepools = [...]
extra_manifests = [...]

Characteristics:

  • Describes what the end state should be

  • Declaration order doesn’t matter (OpenTofu derives the execution order from a dependency graph)

  • No procedural logic needed

  • Reversible (remove from code, apply)

OpenTofu figures out the “how” automatically.

Credential Management Philosophy

CRITICAL: The KUP6S cluster follows a strict credential management approach.

Environment Variables, Not Hardcoded Secrets

All sensitive values are provided via environment variables:

# .env file (git-ignored)
export TF_VAR_hcloud_token="your-hetzner-api-token"
export TF_VAR_hetzner_s3_access_key="your-s3-key"
export TF_VAR_hetzner_s3_secret_key="your-s3-secret"
# ...more credentials

Referenced in kube.tf:

variable "hcloud_token" {
  type      = string
  sensitive = true  # Redacted in plan and apply output
}

provider "hcloud" {
  token = var.hcloud_token  # Set via TF_VAR_hcloud_token
}
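
When a variable is missing, tofu either prompts for it or fails partway through, so a small pre-flight check can help. The sketch below is one assumption about how such a check might look. require_vars is a hypothetical helper, and the variable names in the usage comment are the ones from the .env example above:

```shell
#!/bin/sh
# Hypothetical helper: fail early if any named variable is not exported.
require_vars() {
  missing=0
  for name in "$@"; do
    # printenv only sees exported variables, which is also all that tofu sees.
    if [ -z "$(printenv "$name")" ]; then
      echo "missing or empty: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (after `source .env`):
# require_vars TF_VAR_hcloud_token TF_VAR_hetzner_s3_access_key \
#   TF_VAR_hetzner_s3_secret_key && tofu plan
```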

Why This Matters

Security:

  • ✅ No secrets committed to Git

  • ✅ Secrets isolated to local environment

  • ✅ Different credentials per developer/environment

  • ✅ Easy credential rotation

Compliance:

  • ✅ Secrets never in version history

  • ✅ Narrow access: only holders of the .env file can use the credentials

  • ✅ No accidental disclosure via Git

Anti-Pattern (NEVER DO THIS):

# ❌ WRONG - hardcoded secret
provider "hcloud" {
  token = "abc123supersecret"  # Committed to Git!
}

This would expose the credential to anyone with repository access, and it would remain in Git history even after the line is deleted.
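
A rough guard against this anti-pattern can be sketched with grep. This is a heuristic under stated assumptions, not a complete secret scanner. find_hardcoded_secrets is a hypothetical name, the keyword list is illustrative, and --include is an extension available in GNU and BSD grep:

```shell
#!/bin/sh
# Rough heuristic, not a full secret scanner: list .tf lines that assign a
# quoted literal to a name containing token, secret, password, or access_key.
# References such as var.hcloud_token are unquoted and therefore not reported.
find_hardcoded_secrets() {
  grep -rnE --include='*.tf' \
    '(token|secret|password|access_key)[a-z_]*[[:space:]]*=[[:space:]]*"[^"]+"' \
    "${1:-.}"
}

# Example: block a commit when something suspicious is found.
# if find_hardcoded_secrets kube-hetzner; then exit 1; fi
```

The function follows grep's exit status: 0 when at least one suspicious line is found, non-zero otherwise. False positives are possible, for example an argument whose name merely contains "secret".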

The Infrastructure-as-Code Workflow

1. Modify Code

Edit kube.tf or manifests in extra-manifests/:

vim kube-hetzner/kube.tf

2. Preview Changes

See what will happen:

cd kube-hetzner
source .env  # Load credentials
tofu plan   # Shows proposed changes

3. Review in Pull Request

  • Team reviews the changes

  • Catches mistakes early

  • Documents the “why” in commit messages

4. Apply Changes

Execute the plan:

tofu apply  # Re-runs the plan and asks for confirmation before changing infrastructure

5. Preserve State

After each apply, OpenTofu updates its state, the record of the actual infrastructure. The state is:

  • Stored locally or in a remote backend

  • Used to detect drift

  • Required for future updates
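
Since every later plan depends on the state, it can be worth keeping a copy before risky changes. The following is a sketch under assumptions: backup_state and the state-backups directory are hypothetical, and that directory would need to be git-ignored because state can hold secrets in plain text. tofu state pull prints the current state to standard output, whether it is stored locally or in a remote backend:

```shell
#!/bin/sh
# Hypothetical helper: write a timestamped copy of the current state.
backup_state() {
  backup_dir="${1:-state-backups}"
  mkdir -p "$backup_dir" || return 1
  backup_file="$backup_dir/$(date +%Y%m%dT%H%M%S).tfstate"
  # umask 077 in a subshell so the copy is readable by its owner only.
  ( umask 077; tofu state pull > "$backup_file" ) || return 1
  # Print the path so callers can log or archive it.
  echo "$backup_file"
}
```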

Benefits Realized

Since adopting IaC for KUP6S, we’ve achieved:

Reproducibility:

  • Can rebuild entire cluster from code

  • Disaster recovery starts from git clone, tofu init, and tofu apply (plus the credentials in .env)

  • Test infrastructure changes in dev environment

Version Control:

  • Full history of infrastructure changes

  • Attribution of every modification (git blame)

  • Easy rollback to previous configurations

Documentation:

  • Code documents current state

  • No “tribal knowledge” required

  • New team members onboard faster

Confidence:

  • Preview changes before applying

  • Catch errors in plan phase

  • Rollback capability

Automation (made possible by IaC):

  • CI/CD integration

  • Scheduled compliance checks

  • Automated drift detection
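
One way the drift-detection idea could be automated, sketched under assumptions: check_drift is a hypothetical function name, and the exit codes rely on the -detailed-exitcode flag of tofu plan (0 for no changes, 1 for an error, 2 for pending changes). Pending changes can also mean that the code was edited but not yet applied, so the message is a hint rather than proof of drift:

```shell
#!/bin/sh
# Hypothetical scheduled check: compare the code with the real infrastructure.
check_drift() {
  # -input=false keeps the command from prompting when run unattended.
  tofu plan -detailed-exitcode -input=false >/dev/null 2>&1
  case $? in
    0) echo "in sync"; return 0 ;;
    2) echo "drift detected"; return 2 ;;
    *) echo "plan failed"; return 1 ;;
  esac
}
```

A scheduler such as cron or a CI job could run check_drift from the kube-hetzner directory with credentials loaded and send an alert on a non-zero exit status.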

Trade-offs and Considerations

Advantages of IaC

  • Predictability: Same input → same output

  • Testability: Can test in staging

  • Collaboration: Team reviews changes

  • Auditability: Git log tracks all modifications

Challenges

  • Learning Curve: HCL syntax, OpenTofu concepts

  • State Management: The state file must be protected and backed up; it can contain sensitive values in plain text

  • Initial Setup: More effort than manual approach

  • Tool Dependency: Requires OpenTofu to make changes

When IaC Makes Sense

IaC is ideal when you:

  • Manage multiple environments (dev, staging, prod)

  • Need infrastructure reproducibility

  • Want version-controlled infrastructure

  • Have team collaboration on infrastructure

When to Reconsider

Manual management might suffice if:

  • Single-server setup

  • One-time deployment

  • No team collaboration

  • Experimental/throwaway infrastructure

For KUP6S, a production Kubernetes cluster, IaC is the right choice.

OpenTofu vs Other IaC Tools

OpenTofu vs Ansible

  • OpenTofu: Declarative, state-based, infrastructure provisioning

  • Ansible: Procedural task lists, no persistent state file, configuration management

  • KUP6S uses: OpenTofu for infrastructure, Kustomize/Helm for app config

OpenTofu vs Pulumi

  • OpenTofu: HCL, with the established Terraform provider and module ecosystem

  • Pulumi: General-purpose programming languages (TypeScript, Python)

  • KUP6S uses: OpenTofu for consistency with kube-hetzner module

OpenTofu vs kubectl apply

  • OpenTofu: Manages Hetzner Cloud resources + Kubernetes

  • kubectl: Manages Kubernetes resources only

  • KUP6S uses: Both (OpenTofu for cluster, kubectl/ArgoCD for apps)

Further Reading