Container images that work are good. Container images that work AND are small are great. This lesson teaches you why small images matter and how to achieve up to a 90% size reduction through iterative optimization.
When you build a Docker image for a Python service, you typically need compilers and development libraries during the build process. But in production, you only need the installed packages themselves. A naive Dockerfile includes everything---build tools, development headers, cache files---adding hundreds of megabytes of unnecessary weight.
Multi-stage builds solve this elegantly. You perform dependency installation in a large build image, then copy only the artifacts you need into a minimal production image. Combined with UV (a Rust-based package manager that's 10-100x faster than pip) and Alpine base images, you can reduce a 1.2GB image to under 120MB.
In this lesson, you'll containerize a Task API service---the same pattern you'll use for your Module 6 FastAPI agent. You'll start with a bloated Dockerfile and progressively optimize it through four iterations, measuring the size reduction at each step.
Create a fresh project with UV (30 seconds):
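Assuming UV is already installed, something like the following works (the project name `task-api` is my choice, not fixed by the course):

```shell
mkdir task-api && cd task-api
uv init --python 3.12
uv add fastapi "uvicorn[standard]"
```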
Now add your application code. Open main.py and replace its contents with:
UV automatically created pyproject.toml with your dependencies. For Docker, we need a requirements.txt. Export it:
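Recent UV releases can export the lockfile directly (flag names may vary by version; check `uv export --help`):

```shell
uv export --no-hashes -o requirements.txt
```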
Verify your setup works:
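Run the app locally through UV:

```shell
uv run uvicorn main:app --port 8000
```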
Visit http://localhost:8000/health to confirm the API responds. Press Ctrl+C to stop.
Let's start with a Dockerfile that works but doesn't consider image size at all.
Create Dockerfile.naive:
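A straightforward single-stage Dockerfile along these lines (a sketch, not the only possible version):

```dockerfile
FROM python:3.12
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```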
Build the image, then check its size with `docker images`:
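Assuming the tag `task-api:naive` (a name chosen for this lesson), the build and size check look like this:

```shell
docker build -f Dockerfile.naive -t task-api:naive .
docker images task-api:naive --format "{{.Repository}}:{{.Tag}}  {{.Size}}"
# task-api:naive  1.21GB
```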
1.21GB for a simple Task API. That bloat comes from:

- The full python:3.12 base, a complete Debian system with compilers, development headers, and assorted tooling
- The pip download cache left behind by the install step
- Everything in the build context swept up by the broad COPY . .
None of that is needed to RUN the application. You only need the installed Python packages---maybe 100MB total.
The python:3.12 image is the full-featured version. Docker provides leaner alternatives:

- python:3.12-slim: Debian with the compilers, documentation, and most development tooling stripped out (roughly 120MB base)
- python:3.12-alpine: built on Alpine Linux with musl libc (roughly 50MB base)
Let's try slim first, since it's the lowest-risk change.
Create Dockerfile.v1-slim:
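The same structure as the naive version, with only the base image and the pip flags changed (a sketch; tag and layout are assumptions):

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```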
Note: We added --no-cache-dir to pip to avoid storing the download cache.
Build and measure:
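Assuming the tag `task-api:v1-slim`; the size shown is an approximation derived from the 62% reduction, and your exact number will vary:

```shell
docker build -f Dockerfile.v1-slim -t task-api:v1-slim .
docker images task-api:v1-slim --format "{{.Size}}"
# ~460MB
```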
Progress: 62% reduction from a single change (base image). But we're still carrying pip overhead and could do better.
Multi-stage builds use multiple FROM instructions in a single Dockerfile. Each stage can use a different base image. You build dependencies in a large stage, then copy only what you need into a small stage.
We'll also introduce UV, a Rust-based Python package manager that's 10-100x faster than pip.
Create Dockerfile.v2-multistage:
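One way to write it, installing UV from its official container image (the image path, stage names, and venv location are assumptions; check the UV Docker docs for current recommendations):

```dockerfile
# Stage 1: install dependencies with UV into a virtual environment
FROM python:3.12-slim AS builder
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
WORKDIR /app
COPY requirements.txt .
RUN uv venv /opt/venv \
    && uv pip install --python /opt/venv/bin/python -r requirements.txt

# Stage 2: copy only the installed packages into a fresh base
FROM python:3.12-slim
WORKDIR /app
COPY --from=builder /opt/venv /opt/venv
COPY main.py .
ENV PATH="/opt/venv/bin:$PATH"
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```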
Let's understand what's happening:

Stage 1 (builder):

- Starts from a base image where installs reliably succeed
- Installs UV and uses it to resolve and install dependencies into a virtual environment

Stage 2 (runtime):

- Starts from a fresh, minimal base
- Copies only the virtual environment and application code from the builder
- Contains no UV, no pip, and no build tools
Build and measure:
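Assuming the tag `task-api:v2-multistage`; the size is an approximation derived from the 85% reduction:

```shell
docker build -f Dockerfile.v2-multistage -t task-api:v2-multistage .
docker images task-api:v2-multistage --format "{{.Size}}"
# ~180MB
```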
Notice how UV installed dependencies in 1.8 seconds vs pip's 8+ seconds.
Progress: 85% reduction. The runtime image has no UV, no pip, no build tools---only the installed packages.
Alpine Linux is a minimal distribution (~5MB base) designed for containers. Combined with multi-stage builds and UV, we can achieve maximum size reduction.
Create Dockerfile.v3-alpine:
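The same two-stage structure with Alpine bases (a sketch; if the default UV binary doesn't run on musl, use one of UV's Alpine-tagged images instead):

```dockerfile
# Stage 1: build dependencies on Alpine
FROM python:3.12-alpine AS builder
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
WORKDIR /app
COPY requirements.txt .
RUN uv venv /opt/venv \
    && uv pip install --python /opt/venv/bin/python -r requirements.txt

# Stage 2: minimal Alpine runtime
FROM python:3.12-alpine
WORKDIR /app
COPY --from=builder /opt/venv /opt/venv
COPY main.py .
ENV PATH="/opt/venv/bin:$PATH"
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```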
Build and measure:
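Assuming the tag `task-api:v3-alpine`:

```shell
docker build -f Dockerfile.v3-alpine -t task-api:v3-alpine .
docker images task-api:v3-alpine --format "{{.Size}}"
# 118MB
```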
Progress: 90% reduction. From 1.21GB to 118MB.
Docker builds images in layers. Each RUN instruction creates a new layer. By combining commands and cleaning up in the same layer, we can squeeze out a few more megabytes.
Create Dockerfile.v4-optimized:
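A sketch of the final version; the cleanup commands are one reasonable choice, not the only one:

```dockerfile
# Stage 1: install and clean up in a single layer
FROM python:3.12-alpine AS builder
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
WORKDIR /app
COPY requirements.txt .
RUN uv venv /opt/venv \
    && uv pip install --python /opt/venv/bin/python -r requirements.txt \
    && find /opt/venv -type d -name "__pycache__" -exec rm -rf {} + \
    && find /opt/venv -name "*.pyc" -delete

# Stage 2: minimal runtime, copying only what the app needs
FROM python:3.12-alpine
WORKDIR /app
COPY --from=builder /opt/venv /opt/venv
COPY main.py .
ENV PATH="/opt/venv/bin:$PATH"
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```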
Optimizations applied:

- Dependency install and cache cleanup run in a single RUN instruction, so the deleted files never persist in any layer
- Python bytecode caches (__pycache__ directories, .pyc files) are stripped from the virtual environment
- Only the application file is copied into the runtime stage, not the whole build context
Build and measure:
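Assuming the tag `task-api:v4-optimized`:

```shell
docker build -f Dockerfile.v4-optimized -t task-api:v4-optimized .
docker images task-api:v4-optimized --format "{{.Size}}"
# 115MB
```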
Let's see all versions side by side:
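One way to compare them, assuming every version was tagged under the `task-api` repository (the slim and multi-stage sizes are approximations from the percentages above):

```shell
docker images task-api --format "table {{.Tag}}\t{{.Size}}"
# TAG            SIZE
# naive          1.21GB
# v1-slim        ~460MB
# v2-multistage  ~180MB
# v3-alpine      118MB
# v4-optimized   115MB
```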
Result: 1.21GB reduced to 115MB (90.5% reduction)
The docker history command shows what each layer contains:
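For example, against the final image:

```shell
docker history task-api:v4-optimized
```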
The SIZE column shows the contribution of each layer: the base image layers dominate, followed by the layer containing the installed packages, with only small layers for your application code and configuration.
If you see an unexpectedly large layer, that's where to focus optimization efforts.
For this chapter: Alpine is excellent for AI services that don't require complex C dependencies. Your Task API (and most FastAPI services) work perfectly with Alpine.
When Alpine fails: If you need numpy, pandas, or other packages with C extensions that aren't available as Alpine wheels, fall back to slim.
A critical consideration for AI services: never embed large model files in Docker images.
Wrong approach (image becomes 4GB+):
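For example (trimmed to a single stage for brevity; the model filename is hypothetical):

```dockerfile
FROM python:3.12-alpine
WORKDIR /app
# DON'T: this bakes a multi-gigabyte file into every image build and push
COPY models/model.bin /app/models/model.bin
COPY main.py .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```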
Correct approach (use volume mounts):
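Instead, keep the model out of the image and point the app at a directory that will be mounted at runtime (MODEL_DIR is a convention chosen for this lesson):

```dockerfile
FROM python:3.12-alpine
WORKDIR /app
COPY main.py .
# The model directory is supplied at runtime via a volume mount
ENV MODEL_DIR=/models
VOLUME /models
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```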
Run with a volume mount:
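Assuming your models live in a local `models/` directory:

```shell
docker run -p 8000:8000 -v "$(pwd)/models:/models" task-api:v4-optimized
```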
Your application code loads models from the mounted directory:
Benefits:

- The image stays small, so builds, pushes, and pulls remain fast
- Models can be updated or swapped without rebuilding the image
- Multiple containers on one host can share a single copy of the model files
Here's the pattern to apply to any Python AI service:
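A sketch of that pattern as a reusable template; adjust the names, cleanup steps, and base images to your service:

```dockerfile
# Stage 1: resolve and install dependencies with UV
FROM python:3.12-alpine AS builder
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
WORKDIR /app
COPY requirements.txt .
RUN uv venv /opt/venv \
    && uv pip install --python /opt/venv/bin/python -r requirements.txt \
    && find /opt/venv -type d -name "__pycache__" -exec rm -rf {} +

# Stage 2: minimal runtime; models arrive via volume mount, never COPY
FROM python:3.12-alpine
WORKDIR /app
COPY --from=builder /opt/venv /opt/venv
COPY main.py .
ENV PATH="/opt/venv/bin:$PATH" \
    MODEL_DIR=/models
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```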
When to deviate:

- Packages with C extensions (numpy, pandas) that lack musl-compatible wheels: switch both stages from alpine to slim
- Simple internal services where build simplicity matters more than the last few megabytes: a single-stage slim Dockerfile is fine