Your FastAPI agent from Module 6 is now containerized. In Sub-module 1, you packaged it into a Docker image—portable, reproducible, ready to run anywhere. But "anywhere" in production means dozens of servers, not your laptop.
Here's the problem: You run docker run my-agent:v1 on one server. Great. Now what happens when:
- the container crashes at 3 a.m.?
- traffic spikes and one copy isn't enough?
- the server itself dies?
- you need to ship a new version without downtime?
With Docker alone, YOU are the "someone" who handles all of this manually. With Kubernetes, the cluster handles it automatically.
This lesson builds the mental model you need before running any kubectl commands: why orchestration exists, how Kubernetes thinks declaratively about state, and what happens inside a cluster when things go wrong.
In Chapter 79, you learned Docker's core value: package your application into an image, run it as a container anywhere. Docker solved the "it works on my machine" problem.
But Docker alone doesn't solve:
1. Multi-machine deployment: docker run starts one container on one machine. Production needs containers across 10, 100, or 1000 machines.
2. Self-healing: If a container crashes, Docker doesn't restart it automatically (unless you use restart policies—and even those are limited). Your application stays down until someone notices.
3. Scaling: When traffic increases, you need more container copies. With Docker alone, you'd manually run docker run on multiple machines and somehow balance traffic between them.
4. Updates without downtime: Deploying a new version means stopping the old container, then starting the new one. Users see an error during that gap.
5. Service discovery: Your frontend needs to talk to your backend. Both are containers, but their IP addresses change every time they restart. How do they find each other?
Container orchestration fills this gap. Kubernetes is the industry standard for orchestrating containers at scale.
The most important concept in Kubernetes is the declarative model.
Imperative (Docker-style): You give step-by-step instructions.
Declarative (Kubernetes-style): You describe the end state you want.
You declare: "I want 3 nginx containers running." Kubernetes makes it happen. If one crashes, Kubernetes notices the mismatch (desired: 3, observed: 2) and creates a new one.
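That declaration is a short YAML manifest. A minimal sketch (nginx is the public Docker Hub image; the version tag is arbitrary):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3            # desired state: 3 Pods, always
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
```

You apply it once with kubectl apply -f; from then on, Kubernetes owns the gap between this file and reality.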
This is the core loop:
1. Read the desired state (stored in etcd).
2. Observe the actual state of the cluster.
3. Compare the two.
4. Take action to close any gap.
5. Repeat.
This loop runs continuously. Kubernetes controllers are always watching, comparing desired to observed, and taking action to close the gap.
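A toy model of one pass of that loop in Python (this is the control-flow idea only, not actual Kubernetes code):

```python
def reconcile(desired_replicas: int, running_pods: list[str]) -> list[str]:
    """One pass of a toy reconciliation loop: compare desired vs observed,
    then create or delete pods until they match."""
    pods = list(running_pods)
    while len(pods) < desired_replicas:   # too few: create replacements
        pods.append(f"pod-{len(pods)}")
    while len(pods) > desired_replicas:   # too many: remove extras
        pods.pop()
    return pods

# A pod "crashes": observed drops to 2 while desired stays 3,
# so the loop creates a replacement.
print(reconcile(3, ["pod-0", "pod-1"]))
```

Real controllers never stop: they run this comparison every time the watched state changes, which is why a crashed Pod gets replaced without anyone issuing a command.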
The declarative model means:
- You describe outcomes, not procedures.
- Recovery is automatic: any drift from desired state triggers correction.
- Your manifests are plain files you can version-control and review.
- The same manifest produces the same state in any cluster.
A Kubernetes cluster has two types of machines:
Control Plane: The brains. Makes decisions about what should run where.
Worker Nodes: The muscles. Actually run your containers.
The control plane runs on dedicated machines (or VMs) and manages the entire cluster.
The API server is the front door to Kubernetes. Every operation—creating a deployment, scaling replicas, checking status—goes through the API server.
When you run kubectl apply, you're sending a request to the API server. When a dashboard shows your Pods, it's querying the API server.
The API server:
- Validates and authenticates every request
- Is the only component that reads from and writes to etcd
- Exposes the REST API that kubectl, the controllers, and the kubelets all use
etcd is a distributed key-value store that holds the cluster's state:
- Every object you've created: Deployments, Services, ConfigMaps, Secrets
- The desired state from your manifests
- The observed status reported back by the kubelets
etcd is the single source of truth. If etcd loses data, you lose your cluster configuration.
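For a concrete picture, Kubernetes stores objects in etcd under a /registry key prefix. A sketch of typical keys (the names are placeholders, and the exact layout is an internal implementation detail that can vary by version):

```
/registry/deployments/default/my-agent
/registry/pods/default/my-agent-7d4b9c-xk2p9
/registry/services/default/my-agent
```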
When you create a Pod, someone needs to decide which worker node runs it. That's the scheduler.
The scheduler considers:
- Resource requests (CPU, memory) versus each node's available capacity
- Node selectors and affinity/anti-affinity rules
- Taints and tolerations that restrict which Pods a node accepts
It doesn't start the Pod—it just assigns the Pod to a node. The kubelet on that node does the actual starting.
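A Pod spec can carry hints the scheduler uses when picking a node. A sketch (the disktype label is a hypothetical node label, and my-agent:v1 is a placeholder image):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-agent
spec:
  nodeSelector:
    disktype: ssd          # only nodes labeled disktype=ssd qualify
  containers:
  - name: agent
    image: my-agent:v1     # placeholder; use your image
    resources:
      requests:
        cpu: "500m"        # scheduler only picks nodes with this much spare CPU
        memory: "256Mi"
```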
Controllers implement the reconciliation loop. Each controller watches specific resources:
- Deployment controller: manages ReplicaSets to roll out new versions
- ReplicaSet controller: keeps the specified number of Pods running
- Node controller: notices when nodes stop responding
When a Pod crashes, the ReplicaSet controller detects the mismatch (desired vs observed) and creates a replacement.
Worker nodes run your actual containers. Each node has three critical components.
The kubelet is the agent running on every worker node. It:
- Registers its node with the API server
- Watches for Pods the scheduler has assigned to its node
- Tells the container runtime to start and stop containers
- Reports Pod and node status back to the API server
When you deploy a Pod, the scheduler assigns it to a node, and the kubelet on that node makes it happen.
Handles network routing on each node. When your frontend Pod needs to reach your backend Pod, kube-proxy manages the routing.
It implements Kubernetes Services—stable network endpoints that route to the right Pods even when Pod IPs change.
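A Service gives a set of Pods a stable name and virtual IP; kube-proxy programs each node so that traffic to that name reaches a healthy backend Pod, whichever IP it currently has. A sketch (names and ports are placeholders):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend            # frontend Pods can reach this as http://backend
spec:
  selector:
    app: backend           # routes to any Pod carrying this label
  ports:
  - port: 80               # the Service's stable port
    targetPort: 8000       # the port the container listens on
```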
The container runtime is the software that actually runs containers. Kubernetes supports multiple runtimes:
- containerd: the most common default today
- CRI-O: a lightweight runtime built specifically for Kubernetes
- Docker Engine: supported historically via dockershim (removed in Kubernetes 1.24); today it requires the cri-dockerd adapter
The kubelet talks to the container runtime through a standard interface (CRI—Container Runtime Interface), so Kubernetes doesn't care which runtime you use.
Everything in Kubernetes operates through reconciliation loops: watch the desired state, compare it to reality, act on the difference, repeat.
This loop runs constantly for every resource type. Here's what happens when you deploy:
1. You run kubectl apply with a Deployment manifest.
2. The API server validates it and writes it to etcd.
3. The Deployment controller sees the new Deployment and creates a ReplicaSet.
4. The ReplicaSet controller sees it needs N Pods and creates Pod objects.
5. The scheduler assigns each pending Pod to a node.
6. The kubelet on each node starts the containers.
7. Status flows back to the API server, and the loop keeps watching.
Let's trace through a failure scenario:
1. A worker node loses power; its kubelet stops reporting in.
2. The node controller marks the node NotReady.
3. After a timeout, the Pods on that node are marked for eviction.
4. The ReplicaSet controller sees fewer Pods than desired and creates replacements.
5. The scheduler assigns the new Pods to healthy nodes.
6. Those nodes' kubelets start the containers.
All of this happens without human intervention. The reconciliation loop handles it.
Cluster: A set of machines (nodes) running Kubernetes.
Control Plane: The components that manage the cluster (API server, scheduler, etcd, controller manager).
Worker Node: A machine that runs your containers.
Pod: The smallest deployable unit. Contains one or more containers.
Deployment: A higher-level resource that manages Pods through ReplicaSets.
Desired State: What you specify in YAML manifests.
Observed State: What actually exists in the cluster right now.
Reconciliation: The process of detecting mismatches and taking action.
Controller: A loop that watches resources and ensures desired state matches observed state.
Think of Kubernetes as an operating system for containers:
- The scheduler is like the OS process scheduler, deciding where work runs.
- The kubelet is like init/systemd, keeping processes alive on one machine.
- etcd is like the filesystem that holds system state.
- The API server is like the system call interface: the one way in.
The key difference: a traditional OS manages one machine. Kubernetes manages a fleet of machines as a single system.
Before moving to hands-on work, solidify your understanding with these prompts:
Prompt 1: Test your declarative model understanding
Ask AI: "I have a Kubernetes Deployment with replicas: 3. One Pod crashes and gets stuck in CrashLoopBackOff. Explain what each Kubernetes component does in response."
Expected answer should mention: kubelet detects crash, restarts container, ReplicaSet controller monitors Pod status, scheduler not involved unless Pod is fully terminated.
Prompt 2: Control plane vs worker nodes
Ask AI: "If I have 3 worker nodes and the control plane node goes down, what happens to my running containers? What can't I do anymore?"
Expected answer: Running containers continue (kubelet keeps them alive), but you can't deploy new workloads, scale, or make changes (no API server).
Prompt 3: Why etcd matters
Ask AI: "If etcd loses all its data in a Kubernetes cluster, what happens? What's lost?"
Expected answer: Cluster state is lost—all Deployments, Services, ConfigMaps, Secrets. Running containers might continue but can't be managed. This is why etcd backups are critical.
Prompt 4: Reconciliation in practice
Ask AI: "I change my Deployment from replicas: 3 to replicas: 5. Walk me through exactly which controllers notice and what actions they take."
Expected answer should trace: Deployment controller creates/updates ReplicaSet, ReplicaSet controller sees 3 Pods but wants 5, creates 2 Pod objects, scheduler assigns them, kubelets start containers.
Prompt 5: Kubernetes vs Docker
Ask AI: "My teammate says 'just use Docker Compose for production.' What's missing compared to Kubernetes?"
Expected answer: Multi-machine orchestration, automatic failover across nodes, rolling updates across a fleet, service discovery at scale, built-in load balancing.
You built a kubernetes-deployment skill in Lesson 0. Test and improve it based on what you learned.
Ask yourself:
- Does the skill explain the declarative model and reconciliation?
- Does it distinguish control plane components from worker node components?
- Does it describe what happens when a Pod or a node fails?
If you found gaps, revise the skill to cover them before moving on to the hands-on work.