Infrastructure as Code¶

Type: Explanation (Understanding-oriented)

Related Concepts: Extra Manifests Organization | Apply Infrastructure Changes

This document explains the Infrastructure-as-Code (IaC) approach used to manage the KUP6S Kubernetes cluster.

What is Infrastructure as Code?¶

Infrastructure as Code is the practice of managing and provisioning infrastructure through machine-readable definition files, rather than through interactive configuration tools or manual processes.

Key Principles:

Infrastructure is declared in code (what you want), not scripted as procedures (how to build it)
Version controlled like application code
Changes are reproducible and auditable
Infrastructure can be tested and validated
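
As one illustration of the last principle, OpenTofu can check inputs before any resource is created. The sketch below assumes a hypothetical `control_plane_count` variable (it is not part of the KUP6S configuration described here) and rejects even values, on the assumption that the control plane runs an etcd-style quorum that works best with an odd number of members.

```hcl
# Hypothetical input variable, for illustration only.
variable "control_plane_count" {
  type        = number
  description = "Number of control plane nodes to create."
  default     = 1

  validation {
    # Reject zero, negative, fractional, and even values.
    condition     = var.control_plane_count >= 1 && var.control_plane_count % 2 == 1
    error_message = "control_plane_count must be a positive odd whole number."
  }
}
```

With a block like this, `tofu plan` fails with the error message above when someone passes `2`, so the mistake is caught in review rather than in production.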

Why Infrastructure as Code for Kubernetes?¶

The Alternative: Imperative Management¶

Without IaC, cluster setup typically involves:

# Manual, error-prone steps
hcloud server create --name control-1 --type cax21 --image ubuntu-24.04 --location fsn1
ssh root@control-1 "curl -sfL https://get.k3s.io | sh -"
kubectl apply -f some-manifest.yaml
# ...repeat for each component

Problems:

Not Reproducible: Can’t easily rebuild the cluster
No History: No record of what changed and when
Drift: Manual changes cause configuration drift
Documentation Gaps: Undocumented manual steps
No Testing: Changes applied directly to production

The IaC Approach with OpenTofu¶

With IaC, the entire cluster is defined in code:

# kube.tf
control_plane_nodepools = [
  {
    name        = "control-fsn1"
    server_type = "cax21"
    location    = "fsn1"
    count       = 1
  }
]

Benefits:

✅ Reproducible: Rebuild cluster from code at any time
✅ Version Controlled: Git tracks all changes
✅ Documented: Code is the documentation
✅ Reviewable: Changes go through pull requests
✅ Testable: Validate with tofu plan before applying

Why OpenTofu?¶

OpenTofu is an open-source fork of Terraform, chosen for several reasons:

1. Community-Driven Development

Not controlled by a single vendor
Open governance under the Linux Foundation
Compatible with most existing Terraform modules and providers

2. Declarative Language (HCL)

Human-readable configuration language
Clear separation between desired state and current state
Easy to understand infrastructure layout

3. State Management

Tracks actual infrastructure state
Detects drift between code and reality
Safely updates only what changed

4. Provider Ecosystem

Hetzner Cloud provider (hcloud)
Kube-hetzner module abstracts Kubernetes complexity
Extensible with custom providers

5. Plan Before Apply

Preview changes before execution
Catch errors early
Safe infrastructure updates
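
The provider ecosystem point can be made concrete with a version-pinning block. This is a minimal sketch: `hetznercloud/hcloud` is the registry address of the Hetzner Cloud provider, but both version constraints are assumptions and should be replaced with the versions the project actually uses.

```hcl
terraform {
  # Assumed minimum; set this to the OpenTofu release the project runs.
  required_version = ">= 1.6.0"

  required_providers {
    hcloud = {
      source = "hetznercloud/hcloud"
      # Assumed constraint; check the kube-hetzner module's own requirement.
      version = ">= 1.43.0"
    }
  }
}
```

Pinning versions keeps `tofu plan` output stable across machines, which supports the reproducibility goal described earlier.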

How kube.tf Fits In¶

The kube.tf file is the heart of the KUP6S cluster definition. It specifies:

1. Module Source¶

module "kube-hetzner" {
  source = "kube-hetzner/kube-hetzner/hcloud"
  # Uses community module for K3S on Hetzner Cloud
}

2. Cluster Identity¶

Cluster name
Base domain
Network configuration

3. Node Pools¶

Control plane nodes (type, count, location)
Agent nodes (type, count, location, labels, taints)

4. Infrastructure Components¶

Storage (Longhorn, CSI drivers)
Networking (Cilium, WireGuard)
Ingress (Traefik, cert-manager)

5. Extra Manifests¶

Additional Kubernetes manifests via Kustomize
Helm charts via K3S HelmChart CRD
Infrastructure-tier components (ArgoCD, Crossplane, ESO, CNPG)
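
Put together, the parts above might look roughly like the following. This is a sketch, not the actual KUP6S `kube.tf`: the input names follow the kube-hetzner module's published example as best understood here and should be checked against the module documentation, and every value (cluster name, domain, key paths, pool sizes, load balancer type) is a placeholder.

```hcl
module "kube-hetzner" {
  # 1. Module source
  source = "kube-hetzner/kube-hetzner/hcloud"

  providers = {
    hcloud = hcloud
  }

  hcloud_token = var.hcloud_token

  # Placeholder key paths; the module uses them to provision nodes over SSH.
  ssh_public_key  = file("~/.ssh/id_ed25519.pub")
  ssh_private_key = file("~/.ssh/id_ed25519")

  # 2. Cluster identity (placeholder values)
  cluster_name   = "kup6s"
  base_domain    = "cluster.example.com"
  network_region = "eu-central"

  # 3. Node pools
  control_plane_nodepools = [
    {
      name        = "control-fsn1"
      server_type = "cax21"
      location    = "fsn1"
      labels      = []
      taints      = []
      count       = 1
    }
  ]

  agent_nodepools = [
    {
      name        = "agent-fsn1" # hypothetical pool
      server_type = "cax21"
      location    = "fsn1"
      labels      = []
      taints      = []
      count       = 2
    }
  ]

  load_balancer_type     = "lb11"
  load_balancer_location = "fsn1"

  # 4. Infrastructure components
  enable_longhorn     = true
  cni_plugin          = "cilium"
  ingress_controller  = "traefik"
  enable_cert_manager = true
}
```

Extra manifests (part 5) are not shown, because the source does not say how KUP6S wires them into the module.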

Declarative vs Imperative¶

Imperative (How)¶

# Step-by-step commands
hcloud server create ...
ssh ... "install k3s"
kubectl create namespace argocd
helm install argocd ...

Characteristics:

Describes how to achieve a state
Order matters
Requires scripting logic
Hard to reverse

Declarative (What)¶

# Desired state
control_plane_nodepools = [...]
agent_nodepools = [...]
extra_manifests = [...]

Characteristics:

Describes what the end state should be
Declaration order doesn’t matter (OpenTofu derives the execution order from resource dependencies)
No procedural logic needed
Reversible (remove from code, apply)

OpenTofu figures out the “how” automatically.
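
A small sketch of what "figuring out the how" means in practice. The two resources below are standalone examples with hypothetical names and address ranges (the kube-hetzner module creates its own network internally). The subnet is written first, yet OpenTofu creates the network first because the subnet references it.

```hcl
# Written first, created second: it references the network below.
resource "hcloud_network_subnet" "nodes" {
  network_id   = hcloud_network.cluster.id # implicit dependency
  type         = "cloud"
  network_zone = "eu-central"
  ip_range     = "10.0.1.0/24"
}

# Written second, created first: nothing here depends on the subnet.
resource "hcloud_network" "cluster" {
  name     = "example-network"
  ip_range = "10.0.0.0/16"
}
```

Removing both blocks and applying again destroys the resources in the reverse order, which is the reversibility mentioned in the list above.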

Credential Management Philosophy¶

CRITICAL: The KUP6S cluster follows a strict credential management approach.

Environment Variables, Not Hardcoded Secrets¶

All sensitive values are provided via environment variables:

# .env file (git-ignored)
export TF_VAR_hcloud_token="your-hetzner-api-token"
export TF_VAR_hetzner_s3_access_key="your-s3-key"
export TF_VAR_hetzner_s3_secret_key="your-s3-secret"
# ...more credentials

Referenced in kube.tf:

variable "hcloud_token" {
  type      = string
  sensitive = true  # Redacted in plan/apply output
}

provider "hcloud" {
  token = var.hcloud_token  # Set from TF_VAR_hcloud_token
}
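
The other exports in the `.env` file map to variables the same way: OpenTofu reads `TF_VAR_<name>` into the input variable called `<name>`. The sketch below declares the two S3 variables named in the `.env` example. The `validation` blocks are an optional addition, not something the source describes; they make an empty value fail early with a clear message.

```hcl
# Filled from TF_VAR_hetzner_s3_access_key
variable "hetzner_s3_access_key" {
  type      = string
  sensitive = true
  nullable  = false

  validation {
    condition     = length(var.hetzner_s3_access_key) > 0
    error_message = "hetzner_s3_access_key is empty; load the .env file first."
  }
}

# Filled from TF_VAR_hetzner_s3_secret_key
variable "hetzner_s3_secret_key" {
  type      = string
  sensitive = true
  nullable  = false

  validation {
    condition     = length(var.hetzner_s3_secret_key) > 0
    error_message = "hetzner_s3_secret_key is empty; load the .env file first."
  }
}
```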

Why This Matters¶

Security:

✅ No secrets committed to Git
✅ Secrets isolated to local environment
✅ Different credentials per developer/environment
✅ Easy credential rotation

Compliance:

✅ Secrets never in version history
✅ Auditable access (who has .env file)
✅ No accidental disclosure via Git

Anti-Pattern (NEVER DO THIS):

# ❌ WRONG - hardcoded secret
provider "hcloud" {
  token = "abc123supersecret"  # Committed to Git!
}

This exposes the credential to anyone with repository access, and the value stays in Git history even after the line is deleted. A token leaked this way must be rotated.

The Infrastructure-as-Code Workflow¶

1. Modify Code¶

Edit kube.tf or manifests in extra-manifests/:

vim kube-hetzner/kube.tf

2. Preview Changes¶

See what will happen:

cd kube-hetzner
source .env  # Load credentials
tofu plan   # Shows proposed changes

3. Review in Pull Request¶

Team reviews the changes
Catches mistakes early
Documents the “why” in commit messages

4. Apply Changes¶

Execute the plan:

tofu apply  # Applies changes to infrastructure

5. Commit State¶

OpenTofu state tracks actual infrastructure:

Stored locally or in a remote backend
Used to detect drift
Required for future updates
May contain sensitive values in plain text, so protect it like the .env file
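
For the remote option, a backend block along these lines is one possibility. The source does not say which backend KUP6S uses, so this is only a sketch for an S3-compatible object store: the bucket name, state key, region, and endpoint URL are all placeholders, and the credentials are expected in the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables rather than in the file.

```hcl
terraform {
  backend "s3" {
    bucket = "example-tofu-state"              # placeholder bucket
    key    = "kube-hetzner/terraform.tfstate"  # placeholder object key
    region = "eu-central"                      # placeholder; required by the backend

    # Placeholder endpoint of the S3-compatible service.
    endpoints = {
      s3 = "https://s3.example.com"
    }

    # Skip AWS-specific checks that non-AWS stores typically do not support.
    skip_credentials_validation = true
    skip_region_validation      = true
    skip_requesting_account_id  = true
    skip_metadata_api_check     = true
    use_path_style              = true
  }
}
```

A remote backend gives every team member the same view of the infrastructure and keeps the state file out of the Git working tree.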

Benefits Realized¶

Since adopting IaC for KUP6S, we’ve achieved:

Reproducibility:

Can rebuild entire cluster from code
Infrastructure disaster recovery is git clone + tofu apply (credentials and application data backups are handled separately)
Test infrastructure changes in dev environment

Version Control:

Full history of infrastructure changes
Attribution for every modification (git blame)
Easy rollback to previous configurations

Documentation:

Code documents current state
No “tribal knowledge” required
New team members onboard faster

Confidence:

Preview changes before applying
Catch errors in plan phase
Rollback capability

Automation (now possible):

CI/CD integration
Scheduled compliance checks
Automated drift detection

Trade-offs and Considerations¶

Advantages of IaC¶

Predictability: Same input → same output
Testability: Can test in staging
Collaboration: Team reviews changes
Auditability: Git log tracks all modifications

Challenges¶

Learning Curve: HCL syntax, OpenTofu concepts
State Management: State file must be protected
Initial Setup: More effort than manual approach
Tool Dependency: Requires OpenTofu to make changes

When IaC Makes Sense¶

IaC is ideal when you:

Manage multiple environments (dev, staging, prod)
Need infrastructure reproducibility
Want version-controlled infrastructure
Have team collaboration on infrastructure

When to Reconsider¶

Manual management might suffice for:

A single-server setup
A one-time deployment
Work with no team collaboration
Experimental or throwaway infrastructure

For KUP6S (production Kubernetes cluster), IaC is the right choice.

OpenTofu vs Other IaC Tools¶

OpenTofu vs Ansible¶

OpenTofu: Declarative, state-based, focused on infrastructure provisioning
Ansible: Procedural (ordered tasks), no persistent state file, focused on configuration management
KUP6S uses: OpenTofu for infrastructure, Kustomize/Helm for app config

OpenTofu vs Pulumi¶

OpenTofu: HCL language, proven ecosystem
Pulumi: General-purpose programming languages (TypeScript, Python)
KUP6S uses: OpenTofu for consistency with kube-hetzner module

OpenTofu vs kubectl apply¶

OpenTofu: Manages Hetzner Cloud resources + Kubernetes
kubectl: Manages Kubernetes resources only
KUP6S uses: Both (OpenTofu for cluster, kubectl/ArgoCD for apps)