Runtime Zero

Argo CD Best Practices for Production Clusters

Argo CD is one of the most widely adopted GitOps tools in the Kubernetes ecosystem. This post covers multi-cluster patterns, ApplicationSet strategies, RBAC, and the operational habits that prevent GitOps chaos.

JW

Argo CD is deceptively easy to install and devastatingly complex to operate at scale. After running it across twelve clusters serving forty application teams, here are the patterns that work.

Hub and Spoke vs. Per-Cluster Argo CD

There are two deployment topologies:

Hub and Spoke: One Argo CD instance manages all clusters. Simpler to operate, with a single pane of glass, but an outage or compromise of the control plane has a blast radius that covers every cluster.

Per-Cluster: Each cluster has its own Argo CD. Better isolation, but multiplies the operational burden of upgrades and configuration drift.

For most teams: hub and spoke with a dedicated management cluster. The Argo CD instance lives on a cluster that doesn't host production workloads; it only deploys to them. If the hub goes down, workloads already running on the spokes are unaffected; only new syncs (and automated self-healing) are blocked.
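In a hub-and-spoke setup, each spoke has to be registered with the hub. One way to keep that in Git is a declarative cluster Secret. This is a minimal sketch: the Secret name, the cluster name prod-eu-1, and the placeholder credentials are assumptions for illustration, and a real setup would source the token from a secret manager rather than commit it.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cluster-prod-eu-1            # hypothetical name
  namespace: argocd
  labels:
    # This label is what tells Argo CD to treat the Secret as a cluster definition
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: prod-eu-1                    # hypothetical display name for the spoke
  server: https://prod-cluster.k8s.example.com
  config: |
    {
      "bearerToken": "<service-account-token>",
      "tlsClientConfig": {
        "insecure": false,
        "caData": "<base64-encoded-ca-certificate>"
      }
    }
```

Applications on the hub then target the spoke by setting destination.server to the same URL.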

ApplicationSets: Manage Applications at Scale

If you're manually creating Application objects for each service, you're not doing GitOps — you're doing click-ops in YAML. ApplicationSet generates Application objects dynamically:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: team-services
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/my-org/gitops-config
        revision: main
        directories:
          - path: "clusters/production/apps/*"
  template:
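    metadata:
      name: "{{path.basename}}"        # last segment of the matched directory
    spec:
      project: production
      source:
        repoURL: https://github.com/my-org/gitops-config
        targetRevision: main
        path: "{{path}}"               # full path of the matched directory
      destination:
        server: https://kubernetes.default.svc
        namespace: "{{path.basename}}" # assumes one namespace per app directory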
      syncPolicy:
        automated:
          prune: true
          selfHeal: true

Drop a new directory under clusters/production/apps/ and the ApplicationSet controller creates an Application for it. Remove the directory and the Application is deleted, by default along with the resources it deployed. No manual Argo CD configuration needed.
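Deleting workloads because a directory was removed or renamed can be more than you want in production. A hedged sketch of one safeguard: the ApplicationSet-level syncPolicy below asks the controller to leave deployed resources in place when a generated Application is removed. Only the relevant fields are shown; generators and template stay as they are.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: team-services
  namespace: argocd
spec:
  syncPolicy:
    # Keep the deployed resources when a generated Application is deleted
    preserveResourcesOnDeletion: true
  # generators: and template: unchanged
```

The trade-off is that orphaned resources then need manual cleanup, so this suits clusters where an accidental delete costs more than some leftover objects.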

RBAC: Applications Belong to Projects

Every Application should live in an Argo CD Project. Projects scope repository access, destination clusters/namespaces, and resource types. A team that owns the payments project cannot accidentally sync their app to the infra namespace.

apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: payments
  namespace: argocd  # AppProjects live in the Argo CD namespace
spec:
  sourceRepos:
    - 'https://github.com/my-org/payments-*'
  destinations:
    - namespace: payments-*
      server: https://prod-cluster.k8s.example.com
  clusterResourceWhitelist: []  # No cluster-scoped resources
  namespaceResourceBlacklist:
    - group: ''
      kind: ResourceQuota  # Prevent teams from removing their own quotas
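Projects can also carry their own roles, which is where the RBAC part comes in. The fragment below is a sketch that would sit under spec in the payments project; the role name deployer and the group my-org:payments-team are assumptions, and the group has to match whatever your SSO provider actually emits.

```yaml
spec:
  roles:
    - name: deployer                 # hypothetical role name
      description: View and sync payments applications, nothing else
      policies:
        # Format: p, <subject>, <resource>, <action>, <project>/<application>, <effect>
        - p, proj:payments:deployer, applications, get, payments/*, allow
        - p, proj:payments:deployer, applications, sync, payments/*, allow
      groups:
        - my-org:payments-team       # hypothetical SSO group
```

Members of that group can see and sync applications in the payments project, but cannot create, delete, or override them.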

The Sync Order Problem

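Argo CD applies all resources in an Application in a single wave by default: within that wave it orders them by kind and name, but it does not wait for one resource to become healthy before applying the next. For apps with dependencies (CRDs that must be established before CRs, a Namespace that must exist before everything else), this can fail. Use sync waves: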

metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "1"  # Lower numbers sync first

A convention that works well: -5 for CRDs, 0 for Namespaces/RBAC, 5 for workloads, 10 for post-deploy jobs. Leaving gaps between the numbers makes it easy to slot in a new wave later.
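Applied to real manifests, that numbering might look like the sketch below. Only the metadata is shown; the resource names (widgets.example.com, widget-controller, smoke-test) are made up for illustration. Note that the annotation value must be a quoted string, including negative numbers.

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com          # hypothetical CRD
  annotations:
    argocd.argoproj.io/sync-wave: "-5"   # applied first
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: widget-controller            # hypothetical workload
  annotations:
    argocd.argoproj.io/sync-wave: "5"    # applied once earlier waves are healthy
---
apiVersion: batch/v1
kind: Job
metadata:
  name: smoke-test                   # hypothetical post-deploy job
  annotations:
    argocd.argoproj.io/sync-wave: "10"   # applied last
```

Resources without the annotation default to wave 0, so anything you forget to annotate lands between the CRDs and the workloads.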

Drift Detection Without Automated Sync

Not every team wants automated sync. Some want GitOps for auditability but manual approval for deploys. Configure the Application without syncPolicy.automated and use Argo CD notifications to alert on drift:

# In the argocd-notifications-cm ConfigMap, under data:
trigger.on-out-of-sync: |
  - when: app.status.sync.status == 'OutOfSync'
    send: [app-out-of-sync]

Pair this with a Slack notification and you get "someone changed production without going through git" alerts — essentially free change management.
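The trigger on its own sends nothing: it needs a Slack service, a template named app-out-of-sync, and a subscription on each Application. A minimal sketch of those three pieces follows. The message wording, the application name payments-api, and the channel payments-alerts are assumptions, and $slack-token refers to a key you would store in the argocd-notifications-secret Secret.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-notifications-cm
  namespace: argocd
data:
  service.slack: |
    token: $slack-token              # resolved from argocd-notifications-secret
  template.app-out-of-sync: |
    message: |
      Application {{.app.metadata.name}} is OutOfSync. Someone may have changed it outside of Git.
---
# Fragment: only the subscription annotation is shown, spec omitted
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payments-api                 # hypothetical application
  namespace: argocd
  annotations:
    # subscribe.<trigger>.<service>: <recipient>
    notifications.argoproj.io/subscribe.on-out-of-sync.slack: payments-alerts
```

With manual sync, an app is also OutOfSync between a merge and its approval, so expect these alerts for pending deploys as well as for out-of-band changes.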