So far you've deployed your FastAPI agent to a single Kubernetes cluster. That works for development. But production systems need redundancy: if one cluster fails, your agent keeps running on another. If you need to test a new version before rolling out to all users, you deploy to a staging cluster first. This chapter teaches you to manage multiple clusters from one ArgoCD instance using a hub-spoke architecture.
In hub-spoke, ArgoCD (the hub) manages deployment to many Kubernetes clusters (the spokes). You define your application once in Git. ArgoCD syncs that same application to cluster 1, cluster 2, cluster 3—each with different configurations. One Git repository becomes the source of truth for your entire infrastructure.
A hub-spoke topology has one control point (ArgoCD hub) managing many execution points (Kubernetes clusters as spokes). This is different from decentralized approaches where each cluster runs its own ArgoCD instance.
Single pane of glass: One ArgoCD UI/CLI shows status across all clusters
Cost of a unified approach: Secrets containing cluster credentials must be stored securely in ArgoCD, not in Git. We'll address this in Chapter 15 (Secrets Management).
Alternative: cluster-local ArgoCD (not hub-spoke), where each cluster runs its own ArgoCD instance syncing from the same Git repository. This approach works for teams with a separate infra team per cluster, but it loses the unified deployment view. We'll focus on hub-spoke because it's more common for AI agents.
ArgoCD starts with one cluster: the one it's installed in (the hub). To deploy to other clusters (spokes), you must register those clusters with ArgoCD first.
When you install ArgoCD on a cluster, it automatically registers itself:
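You can confirm this with the ArgoCD CLI (assuming you have already run `argocd login` against the hub):

```shell
# List clusters known to ArgoCD; a fresh install shows only the local
# cluster, registered under the in-cluster API address
# (https://kubernetes.default.svc)
argocd cluster list
```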
Output:
To register an external cluster (e.g., your staging environment), you need:
Step 1: Create a service account on the external cluster
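A sketch of the service account and RBAC ArgoCD expects, applied on the external cluster (the `staging` context name is an assumption; `argocd cluster add` can also create these objects for you, as shown in Step 3):

```yaml
# Apply on the external cluster, e.g.:
#   kubectl --context staging apply -f argocd-manager.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: argocd-manager
  namespace: kube-system
---
# Grant the service account cluster-admin so ArgoCD can manage any resource
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: argocd-manager
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: argocd-manager
    namespace: kube-system
```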
Output:
Step 2: Get the external cluster's kubeconfig
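The external cluster must be reachable as a context in your local kubeconfig, because `argocd cluster add` reads credentials from that file. A quick check (the `staging` context name is an assumption):

```shell
# Verify the context exists and inspect its API server address
kubectl config get-contexts
kubectl config view --minify --context staging
```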
Output:
Step 3: Register the cluster with ArgoCD
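Registration is one CLI call, naming the kubeconfig context from the previous step:

```shell
# Register the external cluster with ArgoCD; this wires up the service
# account credentials and creates a cluster Secret in the hub
argocd cluster add staging --name staging
```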
Output:
When you register an external cluster, ArgoCD stores the cluster's API server URL and authentication credentials as a Kubernetes Secret in the hub cluster.
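The stored Secret has a recognizable shape. A sketch with placeholder values (the name and server URL are assumptions):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cluster-staging
  namespace: argocd
  labels:
    # This label is what marks the Secret as a cluster registration
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: staging
  server: https://staging.example.com:6443
  config: |
    {
      "bearerToken": "<token from the argocd-manager service account>",
      "tlsClientConfig": {
        "caData": "<base64-encoded cluster CA certificate>"
      }
    }
```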
Output:
The config field in the secret contains authentication details. For external clusters, it typically includes a bearer token for the remote API server and TLS settings such as the cluster's CA certificate.
The bearer token comes from the argocd-manager service account on the external cluster:
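On Kubernetes 1.24+, token Secrets are no longer auto-created for service accounts, so you request one explicitly (context name and duration are assumptions; older clusters expose a long-lived token Secret instead):

```shell
# Mint a token for the argocd-manager service account on the external cluster
kubectl --context staging -n kube-system create token argocd-manager --duration=8760h
```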
Output:
ArgoCD periodically verifies cluster connectivity:
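You can inspect a cluster's connection state by its API server URL (the URL here is an assumption):

```shell
# Shows server version, connection status, and last connection attempt
argocd cluster get https://staging.example.com:6443
```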
Output:
If a cluster becomes unreachable, ArgoCD marks it as unhealthy but continues managing other clusters.
You've already learned ApplicationSets in Chapter 11. Now you'll use the Cluster generator to deploy an application to multiple registered clusters with cluster-specific configurations.
Instead of creating three near-identical Application manifests by hand for prod, staging, and DR, use a Cluster generator to create one Application per registered cluster:
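A minimal sketch (the repo URL, chart path, and namespace are assumptions):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: agent
  namespace: argocd
spec:
  generators:
    - clusters: {}            # one Application per registered cluster
  template:
    metadata:
      name: 'agent-{{name}}'  # cluster name fills in the Application name
    spec:
      project: default
      source:
        repoURL: https://github.com/your-org/agent-repo
        targetRevision: main
        path: helm
      destination:
        server: '{{server}}'  # each Application targets its own cluster
        namespace: agent
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```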
The clusters: {} generator exposes template parameters for every registered cluster: {{name}} (the cluster's registered name), {{server}} (its API server URL), and {{metadata.labels.<key>}} / {{metadata.annotations.<key>}} for labels and annotations on the cluster's registration Secret.
Real deployments need different configs per cluster. You might want more replicas and higher resource limits in prod, verbose logging and a smaller footprint in staging, and prod-equivalent sizing in DR.
Use Helm values overrides to customize per cluster:
Step 1: Add labels to clusters
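Labels go on the cluster registration Secrets in the hub's argocd namespace (the Secret names below are assumptions; list them with `kubectl -n argocd get secrets -l argocd.argoproj.io/secret-type=cluster`):

```shell
# The ApplicationSet cluster generator exposes these labels
# as {{metadata.labels.<key>}} template parameters
kubectl -n argocd label secret cluster-prod    env=prod    deploy=true
kubectl -n argocd label secret cluster-staging env=staging deploy=true
kubectl -n argocd label secret cluster-dr      env=dr      deploy=true
```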
Output:
Step 2: Create values-per-cluster in your Git repository
Create these files in your agent repository:
helm/values.yaml (default values)
helm/values-prod.yaml (prod-specific overrides)
helm/values-staging.yaml (staging-specific overrides)
helm/values-dr.yaml (DR cluster, same as prod)
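An illustrative override file, assuming the chart exposes these values (your chart's value names may differ):

```yaml
# helm/values-prod.yaml — prod-specific overrides (illustrative)
replicaCount: 3
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: "1"
    memory: 1Gi
logLevel: info
```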
Verify the files exist:
Output:
Step 3: Create ApplicationSet with per-cluster values
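A sketch combining the label selector with per-cluster values files (repo URL, chart path, and namespace are assumptions):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: agent-multi-cluster
  namespace: argocd
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            deploy: "true"    # only clusters labeled deploy=true
  template:
    metadata:
      name: 'agent-{{name}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/your-org/agent-repo
        targetRevision: main
        path: helm
        helm:
          valueFiles:
            - values.yaml
            - 'values-{{metadata.labels.env}}.yaml'  # per-cluster override
      destination:
        server: '{{server}}'
        namespace: agent
      syncPolicy:
        automated:
          selfHeal: true
```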
Apply the ApplicationSet:
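Applying it is one command against the hub (the manifest path is an assumption):

```shell
# ApplicationSets live in the hub's argocd namespace
kubectl apply -n argocd -f argocd/agent-multi-cluster-appset.yaml
```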
Output:
Multi-cluster deployments raise networking questions: can a service in one cluster reach a service in another, and does your workload actually need that?
For your AI agent, if each cluster is independent (data doesn't flow between clusters), you don't need cross-cluster communication. Each cluster runs a complete copy of your agent with its own database.
Each Kubernetes cluster has its own DNS domain (cluster.local by default), so an in-cluster name like agent.default.svc.cluster.local only resolves inside that cluster.
To expose a service to other clusters, use an external DNS name:
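One common pattern, assuming the external-dns controller is installed and manages your DNS zone (hostname, selector, and port are assumptions):

```yaml
# A LoadBalancer Service that external-dns publishes under a public hostname
apiVersion: v1
kind: Service
metadata:
  name: agent
  namespace: agent
  annotations:
    external-dns.alpha.kubernetes.io/hostname: agent-prod.example.com
spec:
  type: LoadBalancer
  selector:
    app: agent
  ports:
    - port: 80
      targetPort: 8000   # assumed FastAPI container port
```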
Output:
With multiple clusters, you need resilience at two levels: ArgoCD itself must be HA, and your clusters must be capable of failover.
If your ArgoCD hub cluster goes down, you cannot deploy to spoke clusters; workloads that are already running are unaffected, but no new syncs happen until the hub recovers. Make ArgoCD highly available:
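The ArgoCD project publishes dedicated HA manifests that run multiple replicas of each component:

```shell
# Install (or upgrade to) the HA variant of ArgoCD in the hub cluster
kubectl apply -n argocd -f \
  https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/ha/install.yaml
```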
Output:
The HA manifests run multiple replicas of the Server, Repo Server, and Redis (and the Application Controller can be sharded across replicas). If one pod crashes, others take over.
Your agent runs on three clusters (staging, prod, DR). If the prod cluster fails, user traffic must move to the DR cluster.
Scenario: User Traffic Shifting
For your agent, implement health checks against each cluster's public endpoint, plus DNS or load-balancer failover that shifts traffic from prod to DR when prod stops responding.
Output:
Here's a production-ready example:
Directory structure:
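One possible layout, consistent with the Helm files created earlier (the repo name and chart contents are assumptions):

```
agent-repo/
├── argocd/
│   └── agent-multi-cluster-appset.yaml
└── helm/
    ├── Chart.yaml
    ├── templates/
    ├── values.yaml
    ├── values-prod.yaml
    ├── values-staging.yaml
    └── values-dr.yaml
```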
argocd/agent-multi-cluster-appset.yaml:
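A production-leaning sketch, adding namespace creation and sync retries on top of the cluster selector and per-cluster values (repo URL, chart path, namespace, and retry settings are assumptions):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: agent-multi-cluster
  namespace: argocd
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            deploy: "true"          # opt clusters in via this label
  template:
    metadata:
      name: 'agent-{{name}}'
      labels:
        env: '{{metadata.labels.env}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/your-org/agent-repo
        targetRevision: main
        path: helm
        helm:
          valueFiles:
            - values.yaml
            - 'values-{{metadata.labels.env}}.yaml'
      destination:
        server: '{{server}}'
        namespace: agent
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
          - CreateNamespace=true    # create the agent namespace if absent
        retry:
          limit: 5                  # retry transient sync failures
          backoff:
            duration: 30s
            factor: 2
            maxDuration: 5m
```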
Deploy the ApplicationSet:
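Apply it on the hub and confirm the generated Applications:

```shell
kubectl apply -n argocd -f argocd/agent-multi-cluster-appset.yaml

# One Application per cluster labeled deploy=true should now appear
argocd app list
```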
Output:
Setup: Use the same FastAPI agent from previous chapters. You now have three Kubernetes clusters available (or can simulate with three Minikube instances).
Part 1: Design Your Multi-Cluster Strategy
Ask AI: "I have a FastAPI agent that I want to deploy to three clusters: staging, prod, and DR. Each should have different resource allocations. Design a multi-cluster deployment strategy using ArgoCD that supports: (1) Separate configurations per cluster, (2) Secrets management outside of Git, (3) Automatic failover if one cluster becomes unhealthy."
Part 2: Refine Secret Handling
"How would I configure External Secrets to pull database passwords from HashiCorp Vault for my prod cluster, while the staging cluster gets test credentials from a different secret location?"
Part 3: Test with One Cluster First
"I want to set up a test ApplicationSet with just my staging cluster to verify the approach works before adding prod and DR. Give me a minimal ApplicationSet that deploys to a single cluster with custom values."
Part 4: Scaling to Three Clusters
"Now add the prod and dr clusters to the ApplicationSet. How do I ensure the cluster selector only deploys to clusters with the deploy=true label?"
Part 5: Design Failover
"If my prod cluster becomes unreachable, how does ArgoCD detect this and how would my users be notified? What monitoring should I add to alert when a cluster is unhealthy?"
You built a gitops-deployment skill in Chapter 0. Test and improve it based on what you learned.
Ask yourself: does the skill cover registering external clusters, Cluster-generator ApplicationSets, and per-cluster values overrides?
If you found gaps, update the skill with the multi-cluster patterns from this chapter.