Your FastAPI agent from Module 6 is containerized and pushed to a registry. Now deploy it to a production-like environment using Kubernetes.
This capstone differs from earlier lessons: you start with a specification, not a command. Write out what you're deploying and why. Then have AI generate the manifests that implement your specification. Finally, deploy your agent, configure it with secrets and environment variables, and validate that Kubernetes is managing it correctly (scaling, self-healing, logging).
By the end, your agent will run on a Kubernetes cluster, surviving Pod crashes and responding to scaling demands—all orchestrated by Kubernetes' declarative model.
Before you write a single line of YAML, specify what you're building.
A specification answers these questions:

- What container image are you deploying, and from which registry?
- What port does the agent listen on?
- What environment variables and secrets does it need?
- How does Kubernetes know the agent is healthy (readiness and liveness probes)?
- How many replicas should run?
- How much CPU and memory does each replica need?
- How is the Service exposed (NodePort, LoadBalancer, Ingress)?
Here's a template to guide your thinking:
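One way to structure it is a simple fill-in-the-blanks form — this is a sketch whose fields mirror the questions above, not a required format:

```text
Deployment specification for: [agent name]

Container image:   [registry/username/image:tag]
Container port:    [port]
Replicas:          [count]
Environment:       [variable -> source (Secret or ConfigMap)]
Health checks:     [readiness path, liveness path]
Resources:         [CPU/memory requests and limits]
Service exposure:  [NodePort / LoadBalancer / Ingress]
```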
This specification becomes the contract between you and Kubernetes. AI will read it and generate manifests that fulfill the contract.
Before proceeding to AI, write your specification in a text editor or document.
Container image: This is the image you pushed in Module 6. Remember the format: `registry/username/image:tag`.
Example: ghcr.io/yourusername/module6-agent:latest
Port: Your FastAPI agent listens on port 8000 by default (unless you configured differently in Module 6).
Environment variables: Your agent needs:

- `OPENAI_API_KEY` — supplied from a Kubernetes Secret, never hard-coded in the manifest
- `LOG_LEVEL` — supplied from a ConfigMap, so you can change it without rebuilding the image
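In the Deployment's container spec, that injection might look like the following sketch (the Secret name `agent-secrets`, ConfigMap name `agent-config`, and key names are assumptions — match them to whatever your manifests define):

```yaml
env:
  - name: OPENAI_API_KEY
    valueFrom:
      secretKeyRef:
        name: agent-secrets      # assumed Secret name
        key: openai-api-key
  - name: LOG_LEVEL
    valueFrom:
      configMapKeyRef:
        name: agent-config       # assumed ConfigMap name
        key: log-level
```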
Readiness and liveness probes: FastAPI's health check endpoint is typically /health; alternatively, use the root endpoint / and check for HTTP 200.
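A probe configuration consistent with that endpoint might look like this sketch (paths, port, and delay values are illustrative — tune the delays to your agent's startup time):

```yaml
readinessProbe:
  httpGet:
    path: /health
    port: 8000
  initialDelaySeconds: 5
  periodSeconds: 10
livenessProbe:
  httpGet:
    path: /health
    port: 8000
  initialDelaySeconds: 15
  periodSeconds: 20
```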
Replicas: For learning purposes, 2-3 replicas is appropriate (shows redundancy). For production load, you'd scale higher.
Resources: A FastAPI agent with OpenAI calls is lightweight:
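A reasonable starting point for such a workload — these numbers are an assumption to iterate on, not a benchmark — is:

```yaml
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi
```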
Service exposure: For Docker Desktop development, NodePort is practical (access via localhost). For cloud clusters, LoadBalancer or Ingress.
Here's what a completed specification looks like:
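For instance (a sketch with illustrative values — substitute your own image, names, and limits):

```text
Deployment specification for: module6-agent

Container image:   ghcr.io/yourusername/module6-agent:latest
Container port:    8000
Replicas:          2
Environment:       OPENAI_API_KEY -> Secret agent-secrets
                   LOG_LEVEL -> ConfigMap agent-config (e.g. "info")
Health checks:     readiness GET /health -> 200, liveness GET /health -> 200
Resources:         requests 100m CPU / 128Mi, limits 500m CPU / 256Mi
Service exposure:  NodePort (Docker Desktop)
```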
Use this as a model, then customize it for YOUR agent and deployment preferences.
Once you've written your specification, you have two approaches:
Approach 1: Manual (Educational) — write the Deployment, Service, ConfigMap, and Secret YAML yourself, line by line, to internalize how each field maps back to your specification.
Approach 2: AI-Assisted (Practical) — hand your specification to AI, have it generate the manifests, then review and validate them yourself.
For this capstone, use Approach 2—this demonstrates the core AI-native workflow: Specification → AI Generation → Validation.
When asking AI to generate manifests, provide at minimum: your full specification, the target cluster type (Docker Desktop vs. cloud), and how you want the Service exposed.
Here's a template:
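One possible prompt shape (a sketch — adjust the cluster type and exposure to match your specification):

```text
Generate Kubernetes manifests (Deployment, Service, ConfigMap, Secret)
that implement this specification:

[paste your specification here]

Target cluster: Docker Desktop Kubernetes.
Expose the Service as a NodePort.
Inject OPENAI_API_KEY from a Secret and LOG_LEVEL from a ConfigMap.
```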
Once you have your manifests (either written by hand or generated by AI), deploy them to your Kubernetes cluster.
If using Docker Desktop, ensure Kubernetes is enabled (green indicator in Docker Desktop).
Create a dedicated namespace for the capstone. This isolates your deployment and makes cleanup easier.
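For example (the namespace name is illustrative):

```bash
# Create a namespace for the capstone and make it the default for this context
kubectl create namespace agent-capstone
kubectl config set-context --current --namespace=agent-capstone
```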
Or apply all at once:
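Assuming your manifests live in a single directory (the directory name here is an assumption):

```bash
# Apply every manifest in the directory in one command
kubectl apply -f manifests/
```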
For Docker Desktop + NodePort:
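A sketch, assuming your Service is named `agent-service` as elsewhere in this lesson:

```bash
# Find the NodePort assigned to the Service (e.g. 8000:3XXXX/TCP)
kubectl get service agent-service
# Then hit the agent via localhost on that node port
curl http://localhost:[node-port]/health
```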
For kubectl port-forward (alternative):
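Port-forwarding avoids NodePort entirely (Service name and ports assume the specification above):

```bash
# Forward local port 8000 to the Service's port 8000
kubectl port-forward service/agent-service 8000:8000
# In another terminal:
curl http://localhost:8000/health
```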
For cloud cluster + LoadBalancer:
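On a cloud cluster, wait for the cloud provider to assign an external address:

```bash
# EXTERNAL-IP shows <pending> until the load balancer is provisioned
kubectl get service agent-service
curl http://[external-ip]:[port]/health
```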
Run through this checklist to confirm your deployment succeeded:
All Pods are Running
```bash
kubectl get pods
# All should show STATUS: Running
```
Desired Replicas Match Actual
```bash
kubectl get deployment
# READY should show all replicas ready (e.g. 2/2),
# and UP-TO-DATE and AVAILABLE should match the desired count
```
Readiness Probes Pass
```bash
kubectl get pods
# All Pods should show READY 1/1
```
Environment Variables are Set
```bash
kubectl exec [pod-name] -- env | grep OPENAI_API_KEY
# Should show the key is populated
```
ConfigMap Values are Injected
```bash
kubectl exec [pod-name] -- env | grep LOG_LEVEL
# Should show your configured value
```
Service Exists and Routes to Pods
```bash
kubectl get endpoints agent-service
# Should show the IP addresses of the Pods
```
Agent Responds to Health Check
```bash
curl http://[service-ip]:[port]/health
# Should return HTTP 200
```
External Access Works (NodePort or LoadBalancer)
```bash
# Test from outside the cluster
curl http://[external-ip]:[port]/health
```
Pod Recovery Works
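A sketch for this check (substitute a real Pod name from `kubectl get pods`):

```bash
# Delete one Pod and watch the Deployment replace it
kubectl delete pod [pod-name]
kubectl get pods --watch
# A new Pod should appear and reach Running within seconds
```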
Desired State Maintains Replicas
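Once the recovery test from the previous check settles:

```bash
kubectl get pods
# The Pod count should be back at the replica count in your
# specification — Kubernetes reconciles it automatically
```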
Logs are Accessible
```bash
kubectl logs [pod-name]
# Should show your agent's startup logs
```
Recent Requests are Logged
```bash
# After making a request to your agent:
kubectl logs [pod-name] --tail=20
# The request should appear in the logs
```
Use AI to help debug issues and explore advanced deployment scenarios. This section demonstrates the three-role collaboration that makes AI-native development effective.
Ask AI:
What to evaluate:
If your deployment fails, use AI to diagnose:
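Before prompting, gather evidence. A typical set of diagnostic commands (a sketch — substitute your Pod name):

```bash
# Overall status: look for CrashLoopBackOff, ImagePullBackOff, Pending
kubectl get pods
# Events for one Pod: scheduling, image pull, and probe failures
kubectl describe pod [pod-name]
# Logs from the previous (crashed) container instance
kubectl logs [pod-name] --previous
# Cluster events in chronological order
kubectl get events --sort-by=.metadata.creationTimestamp
```

Paste the relevant output into your prompt along with the manifest that produced it — AI diagnoses far better with evidence than with a bare "it doesn't work."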
Ask AI:
What to evaluate:
Then refine: Based on AI's analysis, describe what you tried:
Ask AI:
What to evaluate:
Ask AI:
What to evaluate:
Ask AI:
What to evaluate:
Remember: The goal of this capstone is not just to deploy your agent, but to understand how specification-first development works. Your specification is the contract. The manifests implement it. Kubernetes ensures the contract is maintained (desired state = observed state, always).
When self-healing works (Pod dies, new one starts), you're seeing the declarative model in action.