Explanation

ArgoCD and GitOps Deployment

This document explains how kup6s uses ArgoCD to implement GitOps deployment - turning git into the single source of truth for all infrastructure and applications.

What is GitOps?

GitOps is a deployment paradigm where:

  1. Git is the source of truth - All configuration lives in version control

  2. Declarative configuration - Describe desired state, not steps to achieve it

  3. Automated synchronization - Controller watches git and applies changes

  4. Self-healing - Cluster state continuously reconciled with git

Traditional Approach (kubectl):

# Manual, imperative, no audit trail
kubectl apply -f deployment.yaml
kubectl set image deployment/app app=v2
kubectl scale deployment/app --replicas=3

GitOps Approach (ArgoCD):

# Declarative, version-controlled, auditable
git commit -m "Update app to v2, scale to 3"
git push
# ArgoCD automatically applies changes

Why ArgoCD?

1. No Cluster Credentials on Laptops

Traditional deployment requires kubectl configured with cluster credentials on every developer’s laptop:

# Developer's laptop needs cluster access
export KUBECONFIG=~/.kube/production-cluster
kubectl apply -f manifests/

Security risks:

  • Credentials on many laptops

  • Laptop theft/loss = cluster compromise

  • Hard to audit who deployed what

  • Credentials don’t expire automatically

With ArgoCD:

  • Developers only need git access

  • ArgoCD runs inside the cluster

  • No credentials leave the cluster

  • Laptop compromise doesn’t expose cluster

  • Deploy by git push instead of kubectl apply

2. Complete Audit Trail

Traditional:

  • Who ran kubectl apply? (maybe logs, maybe not)

  • What was deployed? (no record unless you save files)

  • Why was it deployed? (tribal knowledge)

  • When did it change? (check kubectl events, if still available)

With ArgoCD:

  • Git commit shows exactly what changed

  • Git author shows who made the change

  • Commit message explains why

  • Full history available via git log

  • Can trace back months/years

# View deployment history
git log --oneline manifests/app.k8s.yaml

# See who changed database config
git blame charts/constructs/database.ts

# Revert a bad deployment
git revert abc123
git push
# ArgoCD automatically rolls back

3. Self-Healing and Drift Prevention

Traditional:

  • Someone runs kubectl edit to “quickly fix” something

  • Change works, gets forgotten

  • Next deployment overwrites the fix

  • Production breaks

  • No record of what was changed

With ArgoCD:

  • Manual changes detected as drift

  • ArgoCD automatically reverts to git state (if selfHeal enabled)

  • Or: ArgoCD alerts but doesn’t auto-fix (configurable)

  • Forces all changes through git

  • Git remains single source of truth

# Scenario: Someone manually scales replicas
kubectl scale deployment/app --replicas=5

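# ArgoCD detects drift (git says 2, cluster says 5)
# With selfHeal enabled, ArgoCD scales back to 2, typically within seconds
# (live resources are watched; the 3-minute interval applies to git polling)
# Git remains source of truth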

4. Easy Rollback

Traditional rollback:

# Find old version (where is it?)
# Hope you saved the YAML file
kubectl apply -f deployment.yaml.backup
# Or use kubectl rollout undo (only Deployments, DaemonSets, and StatefulSets)
# Limited history (revisionHistoryLimit defaults to 10)

GitOps rollback:

# Instant rollback to any point in history
git revert HEAD
git push
# Or: git reset --hard abc123 && git push --force (rewrites history, so prefer git revert on shared branches)
# ArgoCD applies old config automatically
# Full git history available (years of changes)

5. Declarative State Management

Imperative (kubectl):

# Must run commands in specific order
kubectl create namespace myapp
kubectl create secret generic db-creds --from-literal=password=xyz
kubectl apply -f database.yaml
kubectl apply -f app.yaml
# Order matters! Database must exist before app

Declarative (GitOps):

# Just declare desired state, ArgoCD handles ordering
# manifests/app.k8s.yaml contains:
# - Namespace
# - Secrets
# - Database
# - Application
# ArgoCD applies in correct order using sync waves

ArgoCD Architecture

┌──────────────────┐              ┌─────────────────┐              ┌──────────────────┐
│  Git Repository  │              │  ArgoCD Server  │              │  Kubernetes API  │
│                  │              │  (in cluster)   │              │                  │
│  manifests/      │◄─────Poll────│                 │              │                  │
│  app.k8s.yaml    │   every 3min │  Compare        │──Apply───────►  Deployments     │
│                  │              │  Desired ←→     │              │  Services        │
│  (git push)      │              │  Actual State   │              │  ConfigMaps      │
└──────────────────┘              └─────────────────┘              └──────────────────┘
                                   ┌─────────────┐
                                   │  ArgoCD UI  │
                                   │  Dashboard  │
                                   └─────────────┘

How it works:

  1. Developer pushes to git

  2. ArgoCD polls git every 3 minutes (or webhook triggers immediately)

  3. ArgoCD compares git state vs cluster state

  4. If different, ArgoCD applies changes to cluster (automatically when automated sync is enabled, otherwise after a manual sync)

  5. Status visible in ArgoCD UI and CLI
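
The compare-and-apply steps above can be sketched as a small planning function. This is only an illustration of the idea, not ArgoCD's implementation: the Resource and Action types, the planSync name, and the string-valued spec are all assumptions made for the example.

```typescript
// Minimal sketch of one reconcile pass: compare the desired state (from git)
// with the actual state (from the cluster) and list the actions needed.
// All types and names here are illustrative, not ArgoCD internals.

type Resource = { kind: string; name: string; spec: string };

type Action = { op: "create" | "update" | "prune"; key: string };

const keyOf = (r: Resource): string => `${r.kind}/${r.name}`;

function planSync(desired: Resource[], actual: Resource[], prune: boolean): Action[] {
  // Index the live resources so each desired resource can be looked up by key.
  const live = new Map<string, Resource>();
  for (const r of actual) live.set(keyOf(r), r);
  const wanted = new Set<string>(desired.map(keyOf));

  const actions: Action[] = [];
  for (const r of desired) {
    const current = live.get(keyOf(r));
    if (current === undefined) {
      actions.push({ op: "create", key: keyOf(r) });
    } else if (current.spec !== r.spec) {
      actions.push({ op: "update", key: keyOf(r) });
    }
  }
  // Resources that exist in the cluster but not in git are removed only
  // when pruning is enabled.
  if (prune) {
    for (const r of actual) {
      if (!wanted.has(keyOf(r))) actions.push({ op: "prune", key: keyOf(r) });
    }
  }
  return actions;
}

const desired: Resource[] = [
  { kind: "Deployment", name: "app", spec: "replicas=2" },
  { kind: "Service", name: "app", spec: "port=80" },
];
const actual: Resource[] = [
  { kind: "Deployment", name: "app", spec: "replicas=5" },
  { kind: "ConfigMap", name: "old", spec: "x=1" },
];

console.log(planSync(desired, actual, true));
```

With this sample data the plan updates Deployment/app, creates Service/app, and prunes ConfigMap/old. Passing prune as false leaves the stale ConfigMap in place.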

Sync Waves: Ordering Dependencies

ArgoCD applies resources in waves (numbered order) to respect dependencies. Resources are deployed in ascending wave order, waiting for each wave to be healthy before proceeding.

Example sync wave ordering:

  1. Wave 0 or 1: Infrastructure (namespace, RBAC, CRDs, S3 buckets)

  2. Wave 2: Secrets (ExternalSecrets, credentials)

  3. Wave 3: Databases (PostgreSQL, Redis)

  4. Wave 4: Applications (web services, workers)

Why this matters:

  • Without waves: PostgreSQL starts before secrets exist → CrashLoopBackOff

  • With waves: Secrets created first, then PostgreSQL starts successfully

How to set sync waves:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: app-postgres
  annotations:
    argocd.argoproj.io/sync-wave: "3"  # Wait for secrets (wave 2) first

In CDK8S:

const database = new cnpg.Cluster(this, 'postgres', {
  metadata: {
    annotations: {
      'argocd.argoproj.io/sync-wave': '3',
    },
  },
  // ...
});

Best practices:

  • Start at wave 1 (not 0) for main infrastructure

  • Use wave 0 for cluster-wide resources (CRDs)

  • Use whole numbers only (1, 2, 3, never 2.5); leaving gaps between waves (for example 10, 20, 30) makes it easier to insert one later

  • Don’t overuse waves - only when dependency exists
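
To make the ordering concrete, here is a rough sketch of how manifests could be grouped by their sync-wave annotation. The Manifest type and the waveOf and groupByWave helpers are hypothetical names for this example. Rejecting non-integer values is a choice of the sketch, not a statement about how ArgoCD handles them. Resources without the annotation are treated as wave 0.

```typescript
// Sketch: group manifests by sync-wave annotation and order the waves.
// Manifest, waveOf and groupByWave are illustrative helpers only.

type Manifest = {
  kind: string;
  metadata: { name: string; annotations?: Record<string, string> };
};

const SYNC_WAVE = "argocd.argoproj.io/sync-wave";

function waveOf(m: Manifest): number {
  const raw = m.metadata.annotations?.[SYNC_WAVE];
  if (raw === undefined) return 0; // no annotation: default wave 0
  const wave = Number(raw);
  if (!Number.isInteger(wave)) {
    // Strictness chosen for this sketch so typos such as "2.5" surface early.
    throw new Error(`sync-wave must be an integer, got "${raw}" on ${m.kind}/${m.metadata.name}`);
  }
  return wave;
}

function groupByWave(manifests: Manifest[]): { wave: number; names: string[] }[] {
  // Sort a copy in ascending wave order, then collect runs of equal waves.
  const sorted = [...manifests].sort((a, b) => waveOf(a) - waveOf(b));
  const groups: { wave: number; names: string[] }[] = [];
  for (const m of sorted) {
    const wave = waveOf(m);
    const name = `${m.kind}/${m.metadata.name}`;
    const last = groups[groups.length - 1];
    if (last !== undefined && last.wave === wave) last.names.push(name);
    else groups.push({ wave, names: [name] });
  }
  return groups;
}

const wave = (n: string) => ({ [SYNC_WAVE]: n });
const manifests: Manifest[] = [
  { kind: "Deployment", metadata: { name: "web", annotations: wave("4") } },
  { kind: "Cluster", metadata: { name: "app-postgres", annotations: wave("3") } },
  { kind: "ExternalSecret", metadata: { name: "db-creds", annotations: wave("2") } },
  { kind: "Namespace", metadata: { name: "myapp", annotations: wave("1") } },
  { kind: "CustomResourceDefinition", metadata: { name: "widgets.example.com" } },
  { kind: "Deployment", metadata: { name: "worker", annotations: wave("4") } },
];

for (const g of groupByWave(manifests)) {
  console.log(`wave ${g.wave}: ${g.names.join(", ")}`);
}
```

The sample names (db-creds, widgets.example.com, web, worker) are made up. The point is only that the database in wave 3 is listed after the secret in wave 2, whatever order the manifests appear in the file.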

ArgoCD Application Resource

The ArgoCD Application resource defines how a deployment is managed:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
spec:
  # Project (RBAC boundary for applications)
  project: default

  # Source: Where to read manifests from
  source:
    repoURL: https://git.example.com/org/myapp.git
    targetRevision: main
    path: manifests  # ArgoCD reads from this directory

  # Destination: Where to deploy to
  destination:
    server: https://kubernetes.default.svc  # This cluster
    namespace: myapp

  # Sync Policy: How to apply changes
  syncPolicy:
    automated:
      prune: true      # Delete resources removed from git
      selfHeal: true   # Auto-revert manual kubectl changes
    syncOptions:
      - CreateNamespace=true        # Create namespace if missing
      - ApplyOutOfSyncOnly=true     # Only update changed resources

Key Fields:

  • source.repoURL - Git repository URL

  • source.path - Directory containing manifests

  • destination.namespace - Target Kubernetes namespace

  • automated.prune - Delete resources removed from git

  • automated.selfHeal - Revert manual changes

  • syncOptions - Additional sync behaviors
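
Since kup6s manifests are generated from TypeScript, the same Application could be produced by a small helper. The sketch below builds it as a plain object and prints it as JSON. buildApplication and AppOptions are hypothetical names, and the defaults (project "default", the argocd namespace, the in-cluster server URL) are simply copied from the YAML above.

```typescript
// Sketch: build an ArgoCD Application manifest as a plain object.
// buildApplication and AppOptions are hypothetical helpers for illustration.

type AppOptions = {
  name: string;
  repoURL: string;
  path: string;
  namespace: string;
  targetRevision?: string; // defaults to "main"
  automated?: boolean; // defaults to true
};

function buildApplication(opts: AppOptions): Record<string, unknown> {
  const syncPolicy: Record<string, unknown> = {
    syncOptions: ["CreateNamespace=true", "ApplyOutOfSyncOnly=true"],
  };
  // Leaving out "automated" means changes are detected but wait for a manual sync.
  if (opts.automated ?? true) {
    syncPolicy["automated"] = { prune: true, selfHeal: true };
  }
  return {
    apiVersion: "argoproj.io/v1alpha1",
    kind: "Application",
    metadata: { name: opts.name, namespace: "argocd" },
    spec: {
      project: "default",
      source: {
        repoURL: opts.repoURL,
        targetRevision: opts.targetRevision ?? "main",
        path: opts.path,
      },
      destination: {
        server: "https://kubernetes.default.svc",
        namespace: opts.namespace,
      },
      syncPolicy,
    },
  };
}

const app = buildApplication({
  name: "myapp",
  repoURL: "https://git.example.com/org/myapp.git",
  path: "manifests",
  namespace: "myapp",
});

console.log(JSON.stringify(app, null, 2));
```

Passing automated: false yields the manual-gate variant, which may suit a production Application.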

Sync Policies

Automated Sync

With automated sync:

syncPolicy:
  automated: {}

  • ArgoCD applies changes automatically (no manual sync needed)

  • Git changes reach the cluster within about 3 minutes (the default polling interval), or sooner with a webhook

  • No human approval required

Without automated sync:

  • Changes detected but not applied

  • Operator must click “Sync” in ArgoCD UI

  • Or: run argocd app sync myapp

  • Useful for production (manual gate)

Prune Policy

With prune enabled:

syncPolicy:
  automated:
    prune: true

  • Resources deleted from git are deleted from cluster

  • Example: Delete a ConfigMap from manifests → ArgoCD deletes it from cluster

  • Keeps cluster state synchronized with git

Without prune:

  • Resources deleted from git remain in cluster

  • Can cause drift (git says gone, cluster still has it)

  • Must manually delete with kubectl

Self-Heal Policy

With self-heal enabled:

syncPolicy:
  automated:
    selfHeal: true

  • Manual kubectl changes automatically reverted

  • Enforces git as source of truth

  • Prevents configuration drift

Example:

# Operator manually changes replicas
kubectl scale deployment/app --replicas=5

# ArgoCD detects drift (git: 2, cluster: 5)
# ArgoCD scales back to 2, typically within seconds of detecting the change

Without self-heal:

  • ArgoCD detects drift but doesn’t fix it

  • Shows “OutOfSync” status

  • Requires manual sync to fix
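
The interaction of the three policies can be summarised as a small decision function. This is a sketch of the behaviour described in this section, not ArgoCD code. The Change and Outcome names and the decide function are invented for the example.

```typescript
// Sketch: what happens for each kind of change under a given sync policy.
// Change, Outcome and decide are illustrative names, not ArgoCD APIs.

type SyncPolicy = { automated?: { prune?: boolean; selfHeal?: boolean } };
type Change = "git-changed" | "removed-from-git" | "live-drift";
type Outcome = "apply" | "delete" | "revert" | "report-out-of-sync";

function decide(policy: SyncPolicy, change: Change): Outcome {
  const auto = policy.automated;
  // Without automated sync, every difference is only reported.
  if (auto === undefined) return "report-out-of-sync";
  if (change === "git-changed") return "apply";
  if (change === "removed-from-git") {
    return auto.prune === true ? "delete" : "report-out-of-sync";
  }
  // Remaining case: someone changed the live resource by hand.
  return auto.selfHeal === true ? "revert" : "report-out-of-sync";
}

const manual: SyncPolicy = {};
const autoOnly: SyncPolicy = { automated: {} };
const strict: SyncPolicy = { automated: { prune: true, selfHeal: true } };

console.log(decide(manual, "git-changed"));
console.log(decide(autoOnly, "live-drift"));
console.log(decide(strict, "removed-from-git"));
```

The three calls print report-out-of-sync, report-out-of-sync, and delete: a manual policy never acts on its own, automated: {} applies git changes but leaves manual drift alone, and the strict policy also prunes and reverts.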

CDK8S Integration

CDK8S generates manifests that ArgoCD deploys:

Developer                     Git                    ArgoCD                Cluster
┌────────────┐              ┌─────┐                ┌────────┐            ┌─────────┐
│ TypeScript │   npm run    │     │    git push    │        │   kubectl  │         │
│ .ts files  │────build────>│ Git │───────────────>│ ArgoCD │───apply───>│  Pods   │
│            │              │     │                │        │            │         │
└────────────┘              └─────┘                └────────┘            └─────────┘
      │                        │                        │
      │                        │                        │
      ▼                        ▼                        ▼
  Type-safe                Auditable                Automated
  Validated                Versioned                Self-healing

Benefits of CDK8S + ArgoCD:

  1. Type safety - Catch errors at compile-time (TypeScript)

  2. Audit trail - Git history shows what changed and why

  3. Automation - No manual kubectl commands needed

  4. Rollback - git revert restores a known-good state on the next sync

See CDK8S Infrastructure as Code for detailed explanation.

Monitoring ArgoCD

Check Application Status

# Get high-level status
kubectl get application myapp -n argocd

# Expected output:
# NAME    SYNC STATUS   HEALTH STATUS
# myapp   Synced        Healthy

View Sync History

# Describe application (shows recent syncs)
kubectl describe application myapp -n argocd

# Look for:
# - Last Sync Time
# - Sync Status
# - Health Status

Check Drift

# Compare git vs cluster
argocd app diff myapp

# Shows resources that differ between git and cluster
# Empty output = no drift

View Logs

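# ArgoCD controller logs (the controller runs as a StatefulSet in current
# ArgoCD releases; older installs use deployment/argocd-application-controller)
kubectl logs -n argocd statefulset/argocd-application-controller

# Filter for specific application
kubectl logs -n argocd statefulset/argocd-application-controller | grep myapp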

Troubleshooting

Application Stuck “OutOfSync”

Symptoms:

kubectl get application myapp -n argocd
# SYNC STATUS: OutOfSync

Possible causes:

  1. Dependency not ready (sync wave blocked)

    # Check which resources are blocked
    argocd app get myapp --show-operation
    
    # Example: ExternalSecret waiting for ClusterSecretStore
    # Solution: Check ClusterSecretStore is ready
    kubectl get clustersecretstore
    
  2. Resource validation failed

    # Check sync error messages
    kubectl describe application myapp -n argocd
    
    # Example: Invalid YAML, missing CRD
    # Solution: Fix manifest and rebuild
    
  3. Sync wave annotations missing

    # Resources applied out of order
    # Solution: Add sync-wave annotations to constructs
    

Application Shows “Degraded”

Symptoms:

kubectl get application myapp -n argocd
# HEALTH STATUS: Degraded

Possible causes:

  1. Pod CrashLoopBackOff

    # Find failing pods
    kubectl get pods -n myapp | grep -v Running
    
    # Check pod logs
    kubectl logs -n myapp <pod-name>
    
  2. Resource not ready (PVC Pending, Job failed)

    # Check all resources
    kubectl get all,pvc,secret,configmap -n myapp
    

Manual Sync Not Working

Symptoms:

argocd app sync myapp
# Error: operation already in progress

Solution:

# Wait for current operation to complete
watch kubectl get application myapp -n argocd

# Or: Terminate stuck operation
argocd app terminate-op myapp

Prune Not Deleting Resources

Symptoms:

  • Deleted resource from git

  • ArgoCD synced

  • Resource still exists in cluster

Possible causes:

  1. Prune disabled

    # Check prune policy
    kubectl get application myapp -n argocd -o yaml | grep prune
    
    # If false: Enable prune in ArgoCD Application
    
  2. Resource has finalizer

    # Check resource
    kubectl get <resource-type> <name> -n myapp -o yaml | grep finalizers
    
    # Solution: Remove finalizer or fix dependency
    
  3. Resource not managed by ArgoCD

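    # Check tracking metadata (label app.kubernetes.io/instance or annotation
    # argocd.argoproj.io/tracking-id, depending on the tracking method)
    kubectl get <resource-type> <name> -n myapp -o yaml | grep -E "argocd.argoproj.io|app.kubernetes.io/instance"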
    
    # If missing: Resource not tracked by ArgoCD (created manually?)
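
The three checks above can be folded into one helper that inspects a resource's metadata. This is a sketch under assumptions: diagnosePrune and LiveResource are hypothetical names, and the tracking label and annotation it looks for depend on how resource tracking is configured in your ArgoCD installation.

```typescript
// Sketch: walk through the likely reasons a resource was not pruned.
// diagnosePrune and LiveResource are illustrative helpers only.

type LiveResource = {
  kind: string;
  metadata: {
    name: string;
    finalizers?: string[];
    deletionTimestamp?: string;
    labels?: Record<string, string>;
    annotations?: Record<string, string>;
  };
};

function diagnosePrune(pruneEnabled: boolean, r: LiveResource): string {
  if (!pruneEnabled) return "prune is disabled on the Application";

  // A deletionTimestamp plus remaining finalizers means deletion was
  // requested but something has not yet released the resource.
  const meta = r.metadata;
  const finalizers = meta.finalizers ?? [];
  if (meta.deletionTimestamp !== undefined && finalizers.length > 0) {
    return `deletion is blocked by finalizers: ${finalizers.join(", ")}`;
  }

  // Assumed tracking markers: either the instance label or the tracking-id annotation.
  const tracked =
    meta.labels?.["app.kubernetes.io/instance"] !== undefined ||
    meta.annotations?.["argocd.argoproj.io/tracking-id"] !== undefined;
  if (!tracked) return "resource is not tracked by ArgoCD";

  return "no obvious cause, check the sync result";
}

const orphan: LiveResource = { kind: "ConfigMap", metadata: { name: "manual-fix" } };
console.log(diagnosePrune(true, orphan));
```

For the manually created ConfigMap in the example, the helper reports that the resource is not tracked by ArgoCD.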
    

Best Practices

1. Always Commit Manifests

Good:

# Build and commit together
npm run build
git add charts/ manifests/
git commit -m "Update PostgreSQL storage"
git push

Bad:

# Build but don't commit manifests
npm run build
git add charts/
git commit -m "Update PostgreSQL storage"
git push
# manifests/ out of sync with charts/!

Why: ArgoCD reads manifests, not charts. If manifests aren’t committed, changes won’t deploy.

2. Use Descriptive Commit Messages

Good:

feat: Increase PostgreSQL storage to 20Gi

Anticipating growth from 5 to 10 users over next quarter.
Storage usage currently at 7Gi, will hit limit in ~2 months.

See: #123

Bad:

Update config

Why: Git commit = deployment audit trail. Future you (or teammates) need context.

3. Test Changes in Staging First

# Staging branch (auto-deployed to staging cluster)
git checkout staging
# ... make changes ...
git push origin staging

# Verify in staging
kubectl get pods -n myapp

# Production branch (manual sync in ArgoCD)
git checkout main
git merge staging
git push origin main

4. Use Sync Waves Correctly

Dependency order:

  1. Infrastructure (namespace, RBAC, CRDs)

  2. Configuration (ConfigMaps, Secrets, ESO resources)

  3. Storage (PVCs, Buckets)

  4. Databases (PostgreSQL, Redis)

  5. Applications (web services, workers)

Anti-pattern: All resources in same wave → race conditions.

5. Monitor ArgoCD Health

# Daily check
kubectl get application -n argocd | grep -v Healthy

# Alert on OutOfSync for > 10 minutes
# (indicates deployment failure)
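
The 10-minute alert rule could look roughly like the following. It is a sketch, not a ready-made integration: AppStatus is a flattened summary invented for the example, and outOfSyncSince is a hypothetical timestamp that your own monitoring would have to record.

```typescript
// Sketch: pick the applications that have been OutOfSync for too long.
// AppStatus is an illustrative summary; outOfSyncSince is a hypothetical
// field maintained by whatever collects the statuses.

type AppStatus = {
  name: string;
  syncStatus: string; // e.g. "Synced" or "OutOfSync"
  healthStatus: string; // e.g. "Healthy" or "Degraded"
  outOfSyncSince?: Date;
};

function appsToAlert(apps: AppStatus[], now: Date, thresholdMinutes: number = 10): string[] {
  const limitMs = thresholdMinutes * 60 * 1000;
  const names: string[] = [];
  for (const a of apps) {
    if (a.syncStatus !== "OutOfSync" || a.outOfSyncSince === undefined) continue;
    // Alert only once the application has been out of sync longer than the threshold.
    if (now.getTime() - a.outOfSyncSince.getTime() > limitMs) names.push(a.name);
  }
  return names;
}

const now = new Date("2024-01-01T12:00:00Z");
const statuses: AppStatus[] = [
  { name: "web", syncStatus: "OutOfSync", healthStatus: "Healthy", outOfSyncSince: new Date("2024-01-01T11:45:00Z") },
  { name: "worker", syncStatus: "OutOfSync", healthStatus: "Healthy", outOfSyncSince: new Date("2024-01-01T11:55:00Z") },
  { name: "db", syncStatus: "Synced", healthStatus: "Healthy" },
];

console.log(appsToAlert(statuses, now));
```

Only web is reported here: it has been out of sync for 15 minutes, while worker is still inside the 10-minute window.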

Advantages Over kubectl apply

Aspect            kubectl apply            ArgoCD GitOps
----------------  -----------------------  -------------------------
Audit Trail       None (unless logged)     Full git history
Rollback          Manual (save old files)  git revert
Drift Prevention  Manual detection         Automatic self-heal
Credentials       On every laptop          Only in cluster
Automation        Scripts + CI/CD          Built-in
Dependencies      Manual ordering          Sync waves
Visibility        kubectl get              ArgoCD UI + CLI
Security          Credentials leak risk    No credentials on laptops