You've deployed your agent to Kubernetes. Requests arrive. Your agent handles them. But where do the logs go? Where's the monitoring data? How do you trace how long each request took?
Traditional single-container deployments mix your application logic with logging, monitoring, and debugging concerns. Your agent code becomes cluttered. Operational concerns become tangled with business logic. Testing gets harder.
Sidecars solve this. A sidecar is a helper container that runs alongside your main application container—in the same Pod, sharing the same network and storage. Your agent focuses on its job. The sidecar handles logging, metrics, proxying, or security concerns independently. They coordinate through shared volumes and localhost networking.
This lesson teaches you to design and deploy sidecars using Kubernetes' native sidecar support (available since Kubernetes 1.28). By the end, you'll deploy an agent with a logging sidecar and a metrics sidecar, keeping operational concerns separated from application logic.
Imagine your FastAPI agent needs to:

- Handle inference requests (its actual job)
- Write structured request logs to a file
- Track and expose metrics like request counts and latencies
If all this logic lives in one container, your code becomes messy:
Your application code is entangled with operational concerns. Testing requires mocking file I/O and metrics. Scaling requires coordinating all concerns together.
Instead, separate concerns into independent containers:
Your agent writes logs to /var/log/agent/requests.log. The logging sidecar watches that file and streams it to a central logging service. Your agent exposes metrics on localhost:8001/metrics. A metrics sidecar scrapes and forwards those metrics.
Each container has one job. Testing is simpler. Scaling is predictable.
Before Kubernetes 1.28, sidecars were regular containers in the containers field, but Kubernetes couldn't distinguish them from the main application container. This created ambiguity about startup ordering and lifecycle.
Kubernetes 1.28 added a new restartPolicy: Always option to the existing initContainers field, specifically for sidecars:
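The relevant fragment of a Pod spec looks roughly like this (container names and the image are placeholders):

```yaml
spec:
  initContainers:
    - name: logging-sidecar      # a sidecar, despite living under initContainers
      image: busybox:1.36
      restartPolicy: Always      # this marks it as a native sidecar
      command: ["sh", "-c", "tail -F /var/log/agent/requests.log"]
  containers:
    - name: agent
      image: my-agent:latest     # placeholder for your application image
```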
This guarantees:

- Sidecars start before the main application container, in the order they're declared
- Sidecars are restarted automatically if they crash, independent of the main container
- Sidecars keep running for the Pod's entire lifetime and are shut down only after the main containers stop
Your agent and sidecars must communicate. The most reliable method: shared volumes.
Create a Pod with:

- An agent container that writes logs to a shared volume
- A logging sidecar that reads from the same volume
Create agent-with-logging.yaml:
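One way the manifest might look (the image name is a placeholder for your FastAPI agent image, and busybox stands in for a real log shipper):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: agent-with-logging
spec:
  volumes:
    - name: log-volume
      emptyDir: {}                 # shared scratch space, lives as long as the Pod
  initContainers:
    - name: logging-sidecar
      image: busybox:1.36
      restartPolicy: Always        # native sidecar (Kubernetes 1.28+)
      command: ["sh", "-c", "touch /var/log/agent/requests.log && tail -F /var/log/agent/requests.log"]
      volumeMounts:
        - name: log-volume
          mountPath: /var/log/agent
  containers:
    - name: agent
      image: my-agent:latest       # placeholder for your FastAPI agent image
      volumeMounts:
        - name: log-volume
          mountPath: /var/log/agent
```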
Key Implementation Points:
Deploy it:
Output:
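Assuming the Pod in the manifest is named agent-with-logging (adjust to match your manifest), the command and its typical output look like:

```bash
kubectl apply -f agent-with-logging.yaml
# pod/agent-with-logging created
```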
Verify the Pod is running:
Output:
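With the assumed Pod name, you should see output along these lines (AGE will vary):

```bash
kubectl get pod agent-with-logging
# NAME                 READY   STATUS    RESTARTS   AGE
# agent-with-logging   2/2     Running   0          15s
```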
The 2/2 shows both containers are ready. Without the sidecar, you'd see 1/1.
Exec into the agent container and create a test log:
Inside the container:
Output:
Exit:
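Putting those steps together (the Pod name agent-with-logging and container name agent are assumptions; substitute your own):

```bash
kubectl exec -it agent-with-logging -c agent -- sh

# Inside the container:
echo '{"event": "test", "message": "hello from agent"}' >> /var/log/agent/requests.log
cat /var/log/agent/requests.log
# {"event": "test", "message": "hello from agent"}

exit
```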
Now exec into the logging sidecar and read the same file:
Inside the sidecar:
Output:
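Again assuming the names above, the sidecar sees exactly the same file through the shared volume:

```bash
kubectl exec -it agent-with-logging -c logging-sidecar -- sh

# Inside the sidecar:
cat /var/log/agent/requests.log
# {"event": "test", "message": "hello from agent"}

exit
```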
Both containers see the same logs. The sidecar can now stream this data to a centralized logging service (e.g., Elasticsearch, Splunk, or Stackdriver).
Most production Pods run multiple sidecars. Let's add a metrics sidecar alongside logging.
Create agent-with-logging-and-metrics.yaml:
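A sketch of how the manifest might look. The images are placeholders (busybox stands in for real log-shipping and metrics-scraping sidecars), and the metrics port 8001 is assumed:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: agent-with-logging-and-metrics
spec:
  volumes:
    - name: log-volume
      emptyDir: {}
  initContainers:
    - name: logging-sidecar
      image: busybox:1.36
      restartPolicy: Always
      command: ["sh", "-c", "touch /var/log/agent/requests.log && tail -F /var/log/agent/requests.log"]
      volumeMounts:
        - name: log-volume
          mountPath: /var/log/agent
    - name: metrics-sidecar
      image: busybox:1.36
      restartPolicy: Always
      # Polls the agent's metrics endpoint over the Pod's shared network namespace.
      command: ["sh", "-c", "while true; do wget -qO- http://localhost:8001/metrics || true; sleep 30; done"]
  containers:
    - name: agent
      image: my-agent:latest    # placeholder
      ports:
        - containerPort: 8000   # request traffic
        - containerPort: 8001   # metrics endpoint
      volumeMounts:
        - name: log-volume
          mountPath: /var/log/agent
```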
Deploy:
Output:
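Assuming the Pod is named agent-with-logging-and-metrics to match the file:

```bash
kubectl apply -f agent-with-logging-and-metrics.yaml
# pod/agent-with-logging-and-metrics created
```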
Check Pod status:
Output:
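With the assumed Pod name, expect output along these lines:

```bash
kubectl get pod agent-with-logging-and-metrics
# NAME                             READY   STATUS    RESTARTS   AGE
# agent-with-logging-and-metrics   3/3     Running   0          20s
```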
Three containers running in one Pod: agent, logging sidecar, metrics sidecar. All share the Pod's network namespace, so the metrics sidecar can reach the agent's metrics endpoint at localhost:8001/metrics without any network configuration.
When you use initContainers with restartPolicy: Always, Kubernetes guarantees:

- Sidecars start before the main application containers, in the order they are declared
- Each sidecar must start (and pass its startup probe, if one is defined) before the next container begins
- Sidecars that exit are restarted automatically for the lifetime of the Pod
When a Pod terminates:

- The main application containers are stopped first
- Sidecars are stopped afterward, in reverse order of declaration
- This ordering gives sidecars a chance to flush remaining logs and metrics before shutdown
Let's build a realistic logging sidecar for your agent. Your agent logs inference requests and latencies. The sidecar streams these to stdout for collection.
Create agent-inference-logging.yaml:
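A sketch of the manifest. Here busybox stands in for the real agent image and just emits a fixed JSON line every few seconds; swap in your agent image and its real log path:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: agent-inference-logging
spec:
  volumes:
    - name: log-volume
      emptyDir: {}
  initContainers:
    - name: logging-sidecar
      image: busybox:1.36
      restartPolicy: Always
      # Streams the agent's inference log to stdout so `kubectl logs`
      # (and any cluster-level log collector) can pick it up.
      command: ["sh", "-c", "touch /var/log/agent/inference.log && tail -F /var/log/agent/inference.log"]
      volumeMounts:
        - name: log-volume
          mountPath: /var/log/agent
  containers:
    - name: agent
      image: busybox:1.36   # placeholder; use your real agent image
      command: ["sh", "-c", "while true; do echo '{\"event\":\"inference\",\"latency_ms\":120}' >> /var/log/agent/inference.log; sleep 5; done"]
      volumeMounts:
        - name: log-volume
          mountPath: /var/log/agent
```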
Deploy:
Output:
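Assuming the Pod is named agent-inference-logging to match the file:

```bash
kubectl apply -f agent-inference-logging.yaml
# pod/agent-inference-logging created
```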
View logs from the sidecar:
Output:
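Assuming the sidecar container is named logging-sidecar, you'll see whatever JSON lines the agent appends to the shared log file:

```bash
kubectl logs agent-inference-logging -c logging-sidecar
# {"event":"inference","latency_ms":120}
# {"event":"inference","latency_ms":120}
# ...
```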
Open a terminal and work through these scenarios with an AI assistant's help:
Your task: Design a Pod manifest where:
AI Prompting Guide:
Your task: Your agent now needs:
AI Prompting Guide:
Your task: You deployed a Pod with an agent and logging sidecar. The Pod shows 2/2 READY, but logs aren't being collected.
AI Prompting Guide: