Your Docker Desktop Kubernetes cluster is running. Now let's deploy something to it.
In Docker, you ran docker run my-agent:v1 to start a container. In Kubernetes, you don't run containers directly—you create Pods. A Pod wraps one or more containers and adds the production features Kubernetes needs: shared networking, health management, resource guarantees, and co-location for tightly coupled processes.
This chapter teaches you to write Pod manifests by hand, deploy them with kubectl apply, and inspect them with kubectl describe and kubectl logs. By the end, you'll have deployed your first workload to Kubernetes and understand why Pods (not containers) are the atomic unit of deployment.
When you worked with Docker, you thought in containers:
When you work with Kubernetes, you think in Pods:
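Side by side, the mental shift looks like this (the image name and manifest filename are illustrative):

```shell
# Docker: imperative — run this container now
docker run -d --name my-agent my-agent:v1

# Kubernetes: declarative — apply a manifest describing a Pod that wraps the container
kubectl apply -f my-agent-pod.yaml
```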
Think of a Pod like an apartment: the containers are roommates. They share one address (the Pod's IP), one set of utilities (the network namespace and any shared volumes), and they move in and get evicted together.
Key Insight: Roommates can't coordinate at 2 AM from separate apartments. They're in the same apartment for a reason. Similarly, containers in a Pod are co-located because they need tight coordination.
The most counterintuitive feature of Pods: Containers in the same Pod share localhost.
If you have two containers in one Pod, the second can reach the first at localhost plus its port, with no extra networking at all.
This works because containers share the Pod's network namespace. There's no bridge or service discovery needed for intra-Pod communication—they're literally on the same network interface.
Kubernetes uses declarative YAML manifests instead of imperative Docker commands. Rather than typing a docker run command with all its flags, you write a manifest describing what you want:
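A minimal Pod manifest of that shape might look like this (the image tag and resource values are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.25
    ports:
    - containerPort: 80
    resources:
      requests:
        memory: "64Mi"
        cpu: "100m"
      limits:
        memory: "128Mi"
        cpu: "250m"
```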
apiVersion & kind: Kubernetes API version and resource type
metadata: Identification and labeling
spec.containers: What actually runs
resources: CPU and memory guarantees
YAML enables version control, code review, and repeatable deployments: the same file produces the same Pod, every time.
This is the shift from imperative (Docker: "run this") to declarative (Kubernetes: "this should exist").
Create a file nginx-pod.yaml with the manifest above, then apply it:
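A sketch of the apply step, assuming the manifest is saved as nginx-pod.yaml:

```shell
kubectl apply -f nginx-pod.yaml
# pod/nginx created
```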
This tells Kubernetes: "Make sure this Pod exists. If it doesn't, create it. If it does but the spec changed, update it."
Check that the Pod was created:
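The command and typical output (the age and timing values here are illustrative):

```shell
kubectl get pods
# NAME    READY   STATUS    RESTARTS   AGE
# nginx   1/1     Running   0          15s
```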
Columns explained: NAME is the Pod name from metadata.name; READY shows ready containers out of total containers (1/1); STATUS is the current lifecycle phase; RESTARTS counts container restarts; AGE is the time since creation.
When you create a Pod, it doesn't run instantly. Kubernetes goes through several states:
Pending: Pod created but not yet running
Running: At least one container is running
Succeeded: All containers completed successfully
Failed: At least one container failed
Watch a Pod's journey from creation to running with kubectl get pods -w. The -w flag means "watch": it streams updates as they happen.
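Abbreviated, illustrative output while a fresh Pod starts up:

```shell
kubectl get pods -w
# NAME    READY   STATUS              RESTARTS   AGE
# nginx   0/1     Pending             0          0s
# nginx   0/1     ContainerCreating   0          1s
# nginx   1/1     Running             0          4s
```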
Key information: the STATUS column moves from Pending through ContainerCreating to Running as Kubernetes schedules the Pod, pulls the image, and starts the container.
See what your application is writing to stdout:
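For the nginx Pod above, the startup logs look roughly like this (abbreviated; exact lines and timestamps vary by version):

```shell
kubectl logs nginx
# /docker-entrypoint.sh: Configuration complete; ready for start up
# [notice] 1#1: nginx/1.25.3
# [notice] 1#1: start worker processes
```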
For multi-container Pods, view logs from a specific container:
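Name the container with the -c flag (the Pod and container names here are hypothetical):

```shell
kubectl logs web-with-logger -c log-shipper
```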
When Kubernetes creates your Pod, it assigns an IP address from the cluster's internal network (a private range, often within 10.0.0.0/8):
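You can see the assigned IP with the wide output format (values illustrative, trailing columns trimmed):

```shell
kubectl get pod nginx -o wide
# NAME    READY   STATUS    RESTARTS   AGE   IP          NODE
# nginx   1/1     Running   0          2m    10.1.0.23   docker-desktop
```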
These IPs are ephemeral—they change when the Pod restarts. This is critical:
❌ WRONG: Store Pod IPs in configuration files
✅ RIGHT: Use Kubernetes Services (next lesson) for stable networking
Create a Pod with two containers (web app and log shipper):
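A sketch of such a manifest (both image names are hypothetical placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-logger
spec:
  containers:
  - name: web
    image: my-web-app:v1       # hypothetical image, listens on 8080
    ports:
    - containerPort: 8080
  - name: log-shipper
    image: my-log-shipper:v1   # hypothetical image
```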
The log-shipper can reach the web container at localhost:8080 because they share the Pod's network namespace.
Both containers see the same network interfaces, the same Pod IP, and the same hostname.
Most Pods contain a single container. This is the common case: one container, one concern.
Put two containers in one Pod only when they need tight coupling. Two common patterns:
Use case 1: Log Shipper Sidecar
The main container writes logs to a shared volume; the sidecar ships them somewhere:
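A sketch of the pattern, using an emptyDir volume as the shared log directory (image names are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-shipper
spec:
  containers:
  - name: app
    image: my-app:v1          # hypothetical image; writes logs to /var/log
    volumeMounts:
    - name: logs
      mountPath: /var/log
  - name: log-shipper
    image: my-log-shipper:v1  # hypothetical image; reads from /var/log
    volumeMounts:
    - name: logs
      mountPath: /var/log
  volumes:
  - name: logs
    emptyDir: {}              # shared scratch space, lives as long as the Pod
```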
Both containers share the logs volume. App writes to /var/log, log-shipper reads from /var/log.
Use case 2: Init Container (Setup)
An init container runs before main containers:
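A sketch of an init container that blocks until a (hypothetical) database service answers:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-init
spec:
  initContainers:
  - name: wait-for-db
    image: busybox:1.36
    # Poll until the hypothetical "my-database" host accepts connections on 5432
    command: ["sh", "-c", "until nc -z my-database 5432; do sleep 2; done"]
  containers:
  - name: app
    image: my-app:v1   # hypothetical image; starts only after the init container succeeds
```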
Init containers always complete before app containers start.
Don't force unrelated containers into one Pod:
❌ WRONG: Packing a web server, a database, and a cache into one Pod just because they belong to the same application.
Each should be its own Pod. Multi-container is for tightly coupled responsibilities (logging, monitoring, security).
When a container exits, Kubernetes decides what to do based on RestartPolicy:
RestartPolicy options: Always (the default: restart the container whenever it exits), OnFailure (restart only on a non-zero exit code), and Never.
Critical insight: Pods are NOT pets. They're cattle.
Delete the Pod and watch what happens:
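Assuming the nginx Pod from earlier, in the default namespace:

```shell
kubectl delete pod nginx
# pod "nginx" deleted

kubectl get pods
# No resources found in default namespace.
```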
The Pod is gone. It doesn't come back unless you re-apply the manifest yourself or a controller (like a Deployment) recreates it for you.
❌ WRONG: Running bare Pods in production and expecting them to survive failures.
✅ RIGHT: Managing Pods through a controller such as a Deployment, which recreates them automatically.
You saw this in the manifest earlier. It's critical for production:
requests tells Kubernetes: "This Pod needs at least this much."
Kubernetes uses requests to schedule: it places a Pod only on a node with enough unreserved capacity to honor them.
Without requests, Kubernetes can overcommit (overload a node).
limits tells Kubernetes: "This Pod can use at most this much."
If a container exceeds its memory limit, Kubernetes kills it (OOMKilled); if it exceeds its CPU limit, it is throttled instead.
For most applications: set requests to typical steady-state usage and limits somewhat higher, so normal spikes don't get the container killed.
Example for a Python FastAPI app:
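Illustrative values only; tune them from real measurements of your app:

```yaml
resources:
  requests:
    memory: "256Mi"   # typical steady-state usage plus headroom
    cpu: "250m"       # a quarter of a CPU core
  limits:
    memory: "512Mi"   # hard ceiling; exceeding this gets the container OOM-killed
    cpu: "500m"       # CPU above this is throttled, not killed
```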
Create file hello-api.yaml:
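A sketch of what the manifest might contain; the image name is a hypothetical placeholder for your own API image:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hello-api
  labels:
    app: hello-api
spec:
  containers:
  - name: hello-api
    image: my-registry/hello-api:v1   # hypothetical image
    ports:
    - containerPort: 8000
    resources:
      requests:
        memory: "128Mi"
        cpu: "100m"
      limits:
        memory: "256Mi"
        cpu: "250m"
```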
Apply the manifest, then check the Pod status:
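Typical commands and output (ages will differ):

```shell
kubectl apply -f hello-api.yaml
# pod/hello-api created

kubectl get pods
# NAME        READY   STATUS    RESTARTS   AGE
# hello-api   1/1     Running   0          10s
```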
Look for: STATUS showing Running and READY showing 1/1.
Check the application's logs:
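What you see depends on what your app prints at startup; for a FastAPI app it's typically the uvicorn startup banner:

```shell
kubectl logs hello-api
```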
Stop the Pod and confirm it's gone:
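The delete and the follow-up check:

```shell
kubectl delete pod hello-api
# pod "hello-api" deleted

kubectl get pods
# No resources found in default namespace.
```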
Notice: No Pod respawns. It's gone permanently.
Why? Because we created it directly. In real deployments, you'd use a Deployment (next lesson) that automatically respawns Pods when they fail.
❌ WRONG: Storing important data on a Pod's local filesystem; it disappears when the Pod does.
✅ RIGHT: Use persistent volumes or external databases (PostgreSQL in Cloud SQL, etc.)
❌ WRONG: Deploying Pods with no resource requests or limits.
✅ RIGHT: Always set requests and limits based on actual application needs.
❌ WRONG: Hardcoding Pod IPs in application configuration.
✅ RIGHT: Use Kubernetes Services (next lesson) for stable DNS names.
❌ WRONG: Bundling several unrelated services into one Pod.
✅ RIGHT: Create separate Pods for separate services. Only use multi-container for tight coupling (sidecars).
Now that you understand Pods manually, explore deeper questions with AI:
Part 1: Pod Architecture
Ask AI: "Why would I run multiple containers in the same Pod instead of creating separate Pods? Give me 3 real-world examples."
Expected: AI should explain sidecar patterns (logging, monitoring, security sidecar) and why tight network coupling matters.
Part 2: Networking Implications
Ask AI: "In a Pod with two containers, Container A listens on port 8080 and Container B tries to reach it—what's the address that Container B should use?"
Expected: AI should explain that localhost:8080 works because they share the network namespace.
Part 3: Lifecycle and Persistence
Ask AI: "I deployed a Pod that crashed. The Pod is gone. How is this different from a Docker container that exited? What solves the 'Pod keeps crashing but doesn't respawn' problem?"
Expected: AI should explain ephemeral nature of Pods, mention Deployments as the solution for automatic respawning.
Part 4: Resource Limits
Ask AI: "My application uses about 200Mi of memory under normal load. What values should I set for memory requests and limits? Why are they different?"
Expected: AI should explain the distinction (requests for scheduling, limits for hard ceiling) and suggest reasonable safety margins.
You built a kubernetes-deployment skill in Chapter 1. Test and improve it based on what you learned.
Ask yourself: does the skill cover writing Pod manifests, setting resource requests and limits, and choosing between single- and multi-container Pods? If you found gaps, update the skill with what you learned in this chapter.