Module 7 takes the agent you built in Module 6 and turns it into a production cloud service. You'll containerize the stack, orchestrate it on Kubernetes, automate delivery, and operate it with observability, security, and cost controls. The goal: a reliable Digital FTE that runs 24/7 for real users.
Prerequisites: Modules 4-6. You need a working agent service to deploy.
You've deployed version 2 of your Task API. It passed all tests in staging. But staging is not production. Real users behave differently than test suites. Network conditions vary. Edge cases appear that no one anticipated. Shipping a new version directly to 100% of users is gambling—if something breaks, everyone breaks.
Traffic splitting lets you deploy with confidence. Send 5% of traffic to the new version. Watch the error rates. If errors spike, roll back instantly. If metrics look good, increase to 25%, then 50%, then 100%. This is progressive delivery—reducing risk by validating in production with real traffic, while keeping most users safe on the proven version.
This lesson teaches three deployment patterns: canary (gradual rollout), blue-green (instant switch), and A/B testing (feature experiments). You'll implement each using HTTPRoute traffic weights and header-based routing. By the end, you'll deploy new versions without the 3 AM panic of a full cutover gone wrong.
Traffic splitting divides incoming requests between multiple backend services. Gateway API implements this through HTTPRoute backendRefs with weight fields.
Weights are relative: Gateway API gives each backend a share equal to its weight divided by the sum of all weights. Weights of 90 and 10 send 90% of requests to the first backend and 10% to the second.
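A minimal sketch of such a route (the Gateway, route, and Service names here are illustrative, not from your cluster):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: task-api-route
spec:
  parentRefs:
    - name: task-api-gateway   # assumed Gateway name
  rules:
    - backendRefs:
        - name: task-api-stable   # receives 90/(90+10) = 90% of traffic
          port: 80
          weight: 90
        - name: task-api-canary   # receives 10/(90+10) = 10% of traffic
          port: 80
          weight: 10
```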
Weights don't need to sum to 100:
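The same split with smaller numbers, shown as just the `backendRefs` fragment of the rule (service names illustrative):

```yaml
backendRefs:
  - name: task-api-stable
    port: 80
    weight: 9    # 9/(9+1) = 90%
  - name: task-api-canary
    port: 80
    weight: 1    # 1/(9+1) = 10%
```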
This produces the same 90/10 split (9/10 = 90%, 1/10 = 10%).
Canary deployment releases changes to a small subset of users first. If the canary version fails, only that subset is affected. The name comes from coal mining—canaries detected toxic gases before miners were harmed.
Deploy version 2 alongside version 1:
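One way to sketch this, assuming a v1 Deployment already exists: a second Deployment and Service selecting on a `version` label (image registry, names, and ports are all assumptions for illustration):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: task-api-canary
spec:
  replicas: 1
  selector:
    matchLabels:
      app: task-api
      version: v2
  template:
    metadata:
      labels:
        app: task-api
        version: v2
    spec:
      containers:
        - name: task-api
          image: registry.example.com/task-api:v2   # hypothetical image tag
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: task-api-canary
spec:
  selector:
    app: task-api
    version: v2
  ports:
    - port: 80
      targetPort: 8080
```

The `version` label in the selector keeps the canary Service from accidentally picking up v1 pods.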
Output:
Create HTTPRoute with 5% canary traffic:
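A sketch, assuming the stable and canary Services from the previous step (Gateway and route names are illustrative):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: task-api-route
spec:
  parentRefs:
    - name: task-api-gateway
  rules:
    - backendRefs:
        - name: task-api-stable
          port: 80
          weight: 95
        - name: task-api-canary
          port: 80
          weight: 5
```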
Apply:
Output:
Send requests and check distribution:
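A sketch of the counting pipeline. In a real cluster you would feed it with requests against the gateway, e.g. `for i in $(seq 100); do curl -s http://GATEWAY_IP/api/tasks | jq -r .version; done` (the endpoint and a `version` field in the response are assumptions about your service). Here the requests are simulated at a 95/5 split so the pipeline itself is runnable:

```shell
# Simulate 100 version strings (replace with real curl responses),
# then count how often each version appears.
{ printf 'v1\n%.0s' $(seq 95); printf 'v2\n%.0s' $(seq 5); } | sort | uniq -c
```

`sort | uniq -c` is the whole trick: group identical lines, prefix each group with its count.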
Output:
Approximately 5% of requests reach the canary.
As confidence grows, increase canary traffic:
Stage 1: 5% (initial validation)
Stage 2: 25% (broader validation)
Stage 3: 50% (confidence building)
Stage 4: 100% (full rollout)
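Each stage is just a weight change in the same HTTPRoute. Moving from stage 1 to stage 2, for example, only the two numbers change (service names illustrative):

```yaml
backendRefs:
  - name: task-api-stable
    port: 80
    weight: 75   # was 95 in stage 1
  - name: task-api-canary
    port: 80
    weight: 25   # was 5 in stage 1
```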
After full rollout, rename canary to stable and remove the old deployment.
Watch error rates between versions:
Output:
If the canary shows significantly higher errors, roll back immediately.
When canary metrics look bad, rollback is one YAML change away.
Shift all traffic back to stable:
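The rollback is the same route with the canary weight zeroed out (names illustrative):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: task-api-route
spec:
  parentRefs:
    - name: task-api-gateway
  rules:
    - backendRefs:
        - name: task-api-stable
          port: 80
          weight: 100
        - name: task-api-canary
          port: 80
          weight: 0    # weight 0 removes the canary from rotation
```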
Apply:
Output:
Traffic instantly stops going to canary. Requests in flight complete, but no new requests reach the problematic version.
Output:
All traffic now goes to stable.
After rollback, clean up the failed deployment:
Output:
Blue-green deployment maintains two identical environments. One serves production traffic (blue), the other sits idle with the new version (green). When ready, you switch all traffic instantly.
Deploy both environments:
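A compact sketch of the two environments (image registry and names are illustrative; matching `task-api-blue` and `task-api-green` Services selecting on the `env` label are assumed):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: task-api-blue
spec:
  replicas: 3
  selector:
    matchLabels: {app: task-api, env: blue}
  template:
    metadata:
      labels: {app: task-api, env: blue}
    spec:
      containers:
        - name: task-api
          image: registry.example.com/task-api:v1   # current version
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: task-api-green
spec:
  replicas: 3
  selector:
    matchLabels: {app: task-api, env: green}
  template:
    metadata:
      labels: {app: task-api, env: green}
    spec:
      containers:
        - name: task-api
          image: registry.example.com/task-api:v2   # new version
```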
Output:
Route all traffic to blue:
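With blue-green there is no gradual split, only 100/0. The `rules` fragment of the HTTPRoute might look like this (service names illustrative):

```yaml
rules:
  - backendRefs:
      - name: task-api-blue
        port: 80
        weight: 100   # all production traffic
      - name: task-api-green
        port: 80
        weight: 0     # idle, awaiting validation
```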
When green is validated and ready:
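The switch is the same fragment with the weights swapped:

```yaml
rules:
  - backendRefs:
      - name: task-api-blue
        port: 80
        weight: 0     # now idle
      - name: task-api-green
        port: 80
        weight: 100   # now serving production
```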
Apply:
Output:
Traffic instantly switches to green. To roll back, swap the weights back.
If green has issues, instant rollback:
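One way to keep the rollback ready is a patch file, applied in one command. This is a sketch; the route and service names are illustrative:

```yaml
# rollback-to-blue.yaml — apply with:
#   kubectl patch httproute task-api-route --type merge --patch-file rollback-to-blue.yaml
# (a JSON merge patch replaces the whole rules list, so include both backends)
spec:
  rules:
    - backendRefs:
        - name: task-api-blue
          port: 80
          weight: 100
        - name: task-api-green
          port: 80
          weight: 0
```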
The old environment is still running—just flip the switch.
A/B testing routes specific users to different versions. Unlike canary (random percentage), A/B testing is deterministic—users with certain attributes always see the test version.
Route beta users to the experimental version:
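A sketch using an exact-match header rule; the `x-user-tier` header is an assumption about what your auth layer sets, and service names are illustrative:

```yaml
rules:
  - matches:
      - headers:
          - name: x-user-tier     # assumed header from your auth layer
            value: beta
    backendRefs:
      - name: task-api-experimental
        port: 80
  - backendRefs:                  # everyone else stays on stable
      - name: task-api-stable
        port: 80
```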
Apply:
Output:
Output:
Output:
Route premium users to dedicated infrastructure:
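The same header-match technique, pointed at a dedicated backend (header name and services are illustrative assumptions):

```yaml
- matches:
    - headers:
        - name: x-user-tier
          value: premium
  backendRefs:
    - name: task-api-premium    # dedicated deployment for premium users
      port: 80
```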
Premium users get:
Real deployments often combine patterns. Canary the new version to 10% of standard users while all premium users stay on stable.
Premium users are protected—they never see the canary. Standard users get the gradual rollout.
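The combined pattern might be sketched as one route with two rules; the premium rule is listed first so it takes precedence over the general split (header name and services are assumptions):

```yaml
rules:
  # Rule 1: premium users always hit stable
  - matches:
      - headers:
          - name: x-user-tier
            value: premium
    backendRefs:
      - name: task-api-stable
        port: 80
  # Rule 2: everyone else gets the 90/10 canary split
  - backendRefs:
      - name: task-api-stable
        port: 80
        weight: 90
      - name: task-api-canary
        port: 80
        weight: 10
```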
Traffic splitting without monitoring is flying blind. You need to see error rates, latency, and throughput per version.
Error rate by version:
Latency by version (p99):
Requests per second by version:
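Metric names depend on your gateway implementation and instrumentation; assuming your services export a counter like `http_requests_total` and a histogram like `http_request_duration_seconds` with `version` and `status` labels (these names are assumptions), the three queries might look like:

```promql
# Error rate by version: fraction of 5xx responses over 5 minutes
sum(rate(http_requests_total{status=~"5.."}[5m])) by (version)
  / sum(rate(http_requests_total[5m])) by (version)

# p99 latency by version
histogram_quantile(0.99,
  sum(rate(http_request_duration_seconds_bucket[5m])) by (le, version))

# Requests per second by version
sum(rate(http_requests_total[5m])) by (version)
```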
Run:
Output:
Canary looks healthy—proceed with rollout.
Set up 10% canary traffic:
Verify:
Expected Output:
Route beta users to a test version:
Test (if services exist):
Create blue-green configuration and switch environments:
Switch to green (modify and reapply):
Expected Output:
Practice rollback by setting canary weight to 0:
Verify:
Expected Output:
You built a traffic-engineer skill in Lesson 0. Based on what you learned about traffic splitting:
Your skill should generate configuration for each pattern:
Canary template:
Blue-green template:
A/B testing template:
Your skill should include rollback commands:
Ask your traffic-engineer skill:
What you're learning: AI generates traffic splitting configurations. Review the output—are weights correct? Is the path matching appropriate? Do the service names match your request?
Check AI's output:
If you need progressive stages, ask:
What you're learning: AI adapts configurations through iteration. Compare the stage 1 and stage 2 outputs—only weights should change.
Extend for premium user protection:
What you're learning: Rule ordering matters. AI should place the premium rule first so it matches before the general canary rule.
Ask for operational commands:
What you're learning: AI generates operational commands, not just YAML. Review the patch syntax—is it correct for your HTTPRoute?
Traffic splitting affects real users in production. Always test configurations with kubectl apply --dry-run=client first. Start canary deployments with small percentages (1-5%) and monitor error rates before increasing. Keep rollback commands ready—you may need them at 3 AM.