The Shielded Agent: Hardening Your AI Against Real-World Threats


© 2026 Muhammad Usman Akbar. All rights reserved.


Production Hardening

Your container builds and runs. The image is optimized with multi-stage builds. But production environments demand more than a working container—they require security, observability, and resilience.

Consider what happens when your containerized FastAPI agent service goes to production. Kubernetes needs to know if your service is healthy before routing traffic to it. If your container runs as root, a security vulnerability could give an attacker full control of the host. If configuration is hardcoded, you'll need to rebuild images for every environment (dev, staging, production).

Production hardening addresses these concerns through three patterns: environment variable configuration (flexibility), health checks (observability), and non-root users (security). These aren't optional extras—they're requirements for any serious deployment. In this lesson, you'll add each pattern to your Dockerfile, understand why it matters, and end up with a production-ready container template you'll use for every AI service you build.


The Three Pillars of Production Hardening

Before diving into implementation, understand what we're solving:

| Pillar | Problem | Solution |
| --- | --- | --- |
| Configuration | Hardcoded values require image rebuilds | Environment variables at runtime |
| Observability | Orchestrators can't detect unhealthy containers | Health check endpoints + HEALTHCHECK instruction |
| Security | Root containers enable privilege escalation | Non-root user execution |

Each pillar is independent but together they form the foundation of production-ready containers. Let's implement each one.


Environment Variables for Configuration

Hardcoded configuration creates fragile containers. If your Dockerfile specifies LOG_LEVEL=INFO, you need a new image for debug logging. If API_HOST=production.example.com, you can't run locally.

Docker provides two instructions for configuration:

  • ARG: Build-time variables (available during docker build, NOT in running container)
  • ENV: Runtime variables (available when container runs)

Understanding ENV

The ENV instruction sets environment variables that persist into the running container:

dockerfile
ENV PYTHONUNBUFFERED=1
ENV LOG_LEVEL=INFO
ENV API_HOST=0.0.0.0
ENV API_PORT=8000

Your application reads these values at runtime.

Update main.py:

python
import os

from fastapi import FastAPI

app = FastAPI()

LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
API_HOST = os.getenv("API_HOST", "0.0.0.0")

@app.get("/")
def read_root():
    return {"message": "Hello from Docker!", "log_level": LOG_LEVEL}

@app.get("/health")
def health_check():
    return {"status": "healthy"}

Output:

bash
$ docker run -p 8000:8000 my-app:latest
INFO:     Uvicorn running on http://0.0.0.0:8000
$ curl http://localhost:8000/
{"message":"Hello from Docker!","log_level":"INFO"}

Overriding ENV at Runtime

The -e flag overrides environment variables when starting a container:

bash
docker run -e LOG_LEVEL=DEBUG -p 8000:8000 my-app:latest

Output:

bash
$ docker run -e LOG_LEVEL=DEBUG -p 8000:8000 my-app:latest
INFO:     Uvicorn running on http://0.0.0.0:8000
$ curl http://localhost:8000/
{"message":"Hello from Docker!","log_level":"DEBUG"}

The container uses DEBUG instead of the default INFO without rebuilding the image.
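Keep in mind that os.getenv always returns strings, so numeric or boolean settings like API_PORT or ENABLE_CACHING need explicit coercion in your application code. A minimal sketch of two helpers (the names env_int and env_bool are illustrative, not a standard API):

```python
import os

def env_int(name: str, default: int) -> int:
    """Read an integer environment variable, falling back to a default."""
    raw = os.getenv(name)
    return int(raw) if raw is not None else default

def env_bool(name: str, default: bool = False) -> bool:
    """Interpret common truthy strings ('1', 'true', 'yes', 'on') as True."""
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}

API_PORT = env_int("API_PORT", 8000)
ENABLE_CACHING = env_bool("ENABLE_CACHING")
```

Centralizing coercion like this keeps a typo such as API_PORT=808O from surfacing as a confusing failure deep inside the app: int() raises immediately at startup.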

Understanding ARG

ARG defines build-time variables. They're available during docker build but not when the container runs:

dockerfile
ARG PYTHON_VERSION=3.12
FROM python:${PYTHON_VERSION}-alpine

# An ARG declared before FROM is only in scope for the FROM line;
# re-declare it to use the value in later build steps
ARG PYTHON_VERSION
RUN echo "Building with Python ${PYTHON_VERSION}"

# But NOT accessible at runtime
# The following would fail because ARG is gone after build:
# CMD ["echo", "${PYTHON_VERSION}"]

Build with different Python versions:

bash
docker build --build-arg PYTHON_VERSION=3.11 -t my-app:py311 .
docker build --build-arg PYTHON_VERSION=3.12 -t my-app:py312 .

Output:

text
$ docker build --build-arg PYTHON_VERSION=3.11 -t my-app:py311 .
[+] Building 2.1s (8/8) FINISHED
 => [1/4] FROM docker.io/library/python:3.11-alpine
 => [2/4] RUN echo "Building with Python 3.11"
Building with Python 3.11
...

When to Use ARG vs ENV

| Use Case | Instruction | Example |
| --- | --- | --- |
| Python version for base image | ARG | ARG PYTHON_VERSION=3.12 |
| Log level for running app | ENV | ENV LOG_LEVEL=INFO |
| Git commit hash for image tag | ARG | ARG GIT_SHA |
| API keys (runtime secrets) | ENV (via -e) | -e API_KEY=abc123 |
| Package versions during build | ARG | ARG UV_VERSION=0.4.0 |
| Feature flags in running container | ENV | ENV ENABLE_CACHING=true |

Key distinction: If you need the value when the container RUNS, use ENV. If you only need it during BUILD, use ARG.
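When the same value is needed in both phases — a Git commit hash baked in at build time that the running app should also report — a common pattern is to promote the ARG into an ENV. A sketch using the GIT_SHA example from the table above:

```dockerfile
# Build-time argument, supplied via:
#   docker build --build-arg GIT_SHA=$(git rev-parse HEAD) .
ARG GIT_SHA=unknown
# Promote to a runtime variable so the container can read it
ENV GIT_SHA=${GIT_SHA}
```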


Health Check Implementation

Orchestrators like Kubernetes need to know if your container is healthy. A container can be "running" but completely broken—the process exists but crashes on every request. Health checks detect this.
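The rule Docker applies is a failing streak: a container is marked unhealthy only after several consecutive failed checks, so a single slow response doesn't trigger a restart. An illustrative model of that rule in plain Python (not Docker's actual implementation):

```python
def evaluate_health(checks: list[bool], retries: int = 3) -> str:
    """Mirror Docker's failing-streak rule: 'unhealthy' only after
    `retries` consecutive failed checks; any success resets the streak."""
    streak = 0
    for ok in checks:
        streak = 0 if ok else streak + 1
        if streak >= retries:
            return "unhealthy"
    return "healthy"
```

With the default of three retries, the sequence fail, fail, success, fail stays healthy, while three failures in a row flips the container to unhealthy.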

Adding a Health Endpoint to FastAPI

First, ensure your FastAPI service has a health endpoint.

Update main.py:

python
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def read_root():
    return {"message": "Hello from Docker!"}

@app.get("/health")
def health_check():
    """Health check endpoint for Docker HEALTHCHECK and Kubernetes probes."""
    return {"status": "healthy"}

Test the endpoint:

bash
uvicorn main:app --host 0.0.0.0 --port 8000 &
curl http://localhost:8000/health

Output:

bash
$ curl http://localhost:8000/health
{"status":"healthy"}

Docker HEALTHCHECK Instruction

The HEALTHCHECK instruction tells Docker how to verify container health:

dockerfile
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD wget --no-verbose --tries=1 --spider http://localhost:8000/health || exit 1

Let's break down each component:

| Parameter | Value | Meaning |
| --- | --- | --- |
| --interval=30s | 30 seconds | How often to run the check |
| --timeout=10s | 10 seconds | How long to wait for a response |
| --start-period=5s | 5 seconds | Grace period for container startup |
| --retries=3 | 3 attempts | Failed checks before marking unhealthy |
| CMD | wget command | The actual health check command |

Why wget instead of curl?

Alpine images include wget (via BusyBox) by default but not curl, so using wget avoids adding a dependency. Debian-based slim images ship with neither tool; install curl in the build if you want to use it:

dockerfile
# For Alpine-based images (BusyBox wget built-in):
HEALTHCHECK CMD wget --no-verbose --tries=1 --spider http://localhost:8000/health || exit 1

# For slim-based images (after RUN apt-get update && apt-get install -y curl):
HEALTHCHECK CMD curl --fail http://localhost:8000/health || exit 1
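A third option avoids installing anything: the Python interpreter is already in the image, so the standard library can serve as the probe. A sketch — urlopen raises on connection failures and HTTP errors, which makes the CMD exit non-zero:

```dockerfile
# No wget or curl required: reuse the interpreter already in the image
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1
```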

Verifying Health Check Status

Build and run a container with the health check:

bash
docker build -t health-app:latest .
docker run -d --name health-test -p 8000:8000 health-app:latest

Wait 30 seconds for the first health check, then inspect:

bash
docker inspect --format='{{json .State.Health}}' health-test | python -m json.tool

Output:

json
{
    "Status": "healthy",
    "FailingStreak": 0,
    "Log": [
        {
            "Start": "2024-12-27T10:30:00.123456Z",
            "End": "2024-12-27T10:30:00.234567Z",
            "ExitCode": 0,
            "Output": ""
        }
    ]
}

The "Status": "healthy" confirms the health check passed. If the endpoint fails, status becomes "unhealthy" and FailingStreak increments.
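If you script around this output — say, in a deployment pipeline that waits for a service to come up — the record can be parsed with the standard library. A small sketch assuming the JSON shape shown above (the helper name is illustrative):

```python
import json

def health_status(inspect_output: str) -> tuple[str, int]:
    """Extract (Status, FailingStreak) from the output of
    docker inspect --format='{{json .State.Health}}' <container>."""
    health = json.loads(inspect_output)
    return health["Status"], health["FailingStreak"]

# Example: feed in the JSON captured from docker inspect
status, streak = health_status('{"Status": "healthy", "FailingStreak": 0, "Log": []}')
```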

Health Check Status in docker ps

bash
docker ps

Output:

text
CONTAINER ID   IMAGE               COMMAND                 STATUS                   PORTS
a1b2c3d4e5f6   health-app:latest   "uvicorn main:app..."   Up 2 minutes (healthy)   0.0.0.0:8000->8000/tcp

Notice (healthy) in the STATUS column. This is how you quickly verify container health.

Clean up:

bash
docker stop health-test && docker rm health-test

Non-Root User Security

By default, Docker containers run as root. If an attacker exploits a vulnerability in your application, they have root access to the container—and potentially the host.

Running as a non-root user limits damage. Even if compromised, the attacker has limited privileges.

Creating a Non-Root User

Add a dedicated user in your Dockerfile:

dockerfile
# Create non-root user with specific UID
RUN adduser -D -u 1000 appuser

The flags:

  • -D: Don't assign a password (non-interactive)
  • -u 1000: Assign user ID 1000 (conventional for app users)
  • appuser: Username
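Note that these are BusyBox adduser flags, which is why they work on Alpine. On a Debian-based slim image the equivalent tool is useradd; a sketch of the corresponding instruction:

```dockerfile
# Debian/slim equivalent of Alpine's `adduser -D -u 1000 appuser`
RUN useradd --uid 1000 --create-home appuser
```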

Switching to Non-Root User

After creating the user, switch to it with USER:

dockerfile
# Create user
RUN adduser -D -u 1000 appuser

# ... copy files with ownership ...

# Switch to non-root user BEFORE CMD
USER appuser

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

Copying Files with Correct Ownership

Files copied into the container are owned by root by default. Use --chown to set ownership:

dockerfile
# Copy with ownership set to appuser
COPY --chown=appuser:appuser main.py .

Without --chown, the non-root user can't read the files:

bash
# Without --chown:
$ docker exec my-container ls -la /app/main.py
-rw-r--r--    1 root     root     237 Dec 27 10:00 main.py   # Owned by root

# With --chown:
$ docker exec my-container ls -la /app/main.py
-rw-r--r--    1 appuser  appuser  237 Dec 27 10:00 main.py   # Owned by appuser

Verifying Non-Root Execution

Build and run, then check who's running the process:

bash
docker build -t secure-app:latest .
docker run -d --name secure-test secure-app:latest
docker exec secure-test whoami

Output:

text
$ docker exec secure-test whoami
appuser

Not root. The container runs with limited privileges.

Clean up:

bash
docker stop secure-test && docker rm secure-test

The Production Dockerfile Template

Combining all three pillars—configuration, health checks, and non-root user—creates a production-ready template:

dockerfile
# Stage 1: Build
FROM python:3.12-alpine AS builder

WORKDIR /app

# Install UV for fast dependency installation
RUN pip install uv

COPY requirements.txt .

# Install dependencies with UV
RUN uv pip install --system --no-cache -r requirements.txt

# Stage 2: Runtime
FROM python:3.12-alpine

# Create non-root user
RUN adduser -D -u 1000 appuser

WORKDIR /app

# Copy dependencies from builder
COPY --from=builder /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin

# Copy application code with ownership
COPY --chown=appuser:appuser main.py .

# Environment configuration
ENV PYTHONUNBUFFERED=1
ENV LOG_LEVEL=INFO

# Switch to non-root user
USER appuser

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD wget --no-verbose --tries=1 --spider http://localhost:8000/health || exit 1

EXPOSE 8000

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

This template includes:

  • Multi-stage build (from Chapter 6)
  • UV for fast dependency installation
  • Non-root user (appuser)
  • Correct file ownership (--chown)
  • Environment variable defaults
  • Health check with appropriate timing
  • Exposed port documentation

Building and Testing the Complete Template

Create the required files.

Create requirements.txt:

text
fastapi==0.115.0
uvicorn==0.30.0

Create main.py:

python
import os

from fastapi import FastAPI

app = FastAPI()

LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")

@app.get("/")
def read_root():
    return {"message": "Production hardened!", "log_level": LOG_LEVEL}

@app.get("/health")
def health_check():
    return {"status": "healthy"}

Build and run:

bash
docker build -t production-app:latest .
docker run -d --name prod-test -p 8000:8000 production-app:latest

Output:

text
$ docker build -t production-app:latest .
[+] Building 12.3s (14/14) FINISHED
 => [builder 1/4] FROM python:3.12-alpine
 => [builder 2/4] WORKDIR /app
 => [builder 3/4] RUN pip install uv
 => [builder 4/4] RUN uv pip install --system --no-cache -r requirements.txt
 => [stage-1 1/6] FROM python:3.12-alpine
 => [stage-1 2/6] RUN adduser -D -u 1000 appuser
 => [stage-1 3/6] WORKDIR /app
 => [stage-1 4/6] COPY --from=builder /usr/local/lib/python3.12/site-packages...
 => [stage-1 5/6] COPY --from=builder /usr/local/bin /usr/local/bin
 => [stage-1 6/6] COPY --chown=appuser:appuser main.py .
 => exporting to image
Successfully tagged production-app:latest

Verify all three pillars:

bash
# 1. Configuration: Override LOG_LEVEL
docker run --rm -e LOG_LEVEL=DEBUG production-app:latest sh -c 'echo $LOG_LEVEL'

Output:

text
DEBUG
bash
# 2. Health check: Wait 30s, then check status
sleep 35
docker inspect --format='{{.State.Health.Status}}' prod-test

Output:

text
healthy
bash
# 3. Non-root user: Verify process owner
docker exec prod-test whoami

Output:

text
appuser

All three pillars verified. Clean up:

bash
docker stop prod-test && docker rm prod-test

Common Hardening Mistakes

Mistake 1: USER Before COPY

Switching to USER appuser before your COPY instructions doesn't block the copy itself (COPY always executes as root), but the files land owned by root, and later RUN steps execute as appuser without permission to modify them:

dockerfile
# RISKY: files copied after USER are still owned by root
USER appuser
COPY main.py .              # Succeeds, but main.py is owned by root
RUN chmod +x main.py        # Fails: appuser can't modify a root-owned file

# CORRECT: copy first with ownership, then switch user
COPY --chown=appuser:appuser main.py .
USER appuser

Mistake 2: Missing Health Endpoint

Adding HEALTHCHECK without implementing the endpoint:

dockerfile
# Docker checks /health, but your app doesn't have that route
HEALTHCHECK CMD wget --spider http://localhost:8000/health || exit 1

Result: The container is marked unhealthy as soon as the failed checks exhaust the retry count.

Fix: Always implement the health endpoint in your application code.

Mistake 3: Wrong Port in HEALTHCHECK

The health check runs INSIDE the container. Use the container's internal port:

dockerfile
# WRONG: 9000 is the host port, not the container port
HEALTHCHECK CMD wget --spider http://localhost:9000/health || exit 1

# CORRECT: 8000 is the port your app listens on inside the container
HEALTHCHECK CMD wget --spider http://localhost:8000/health || exit 1

Mistake 4: Secrets in ENV

Never put sensitive data in Dockerfile ENV instructions:

dockerfile
# WRONG: secret visible in image history and docker inspect
ENV API_KEY=sk-abc123secret

# CORRECT: pass at runtime, never store in the image
# docker run -e API_KEY=sk-abc123secret my-app:latest

Use -e flag at runtime or Docker secrets for sensitive configuration.
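Docker secrets (and Kubernetes mounted secrets) arrive as files rather than environment variables: Compose and Swarm mount each secret at /run/secrets/&lt;name&gt;. A hedged sketch of application code that prefers a mounted file and falls back to an environment variable (the helper name read_secret is illustrative):

```python
import os
from pathlib import Path
from typing import Optional

def read_secret(name: str, env_var: str,
                secrets_dir: str = "/run/secrets") -> Optional[str]:
    """Prefer a mounted secret file (the Docker/Kubernetes convention)
    over an environment variable; return None if neither is present."""
    secret_file = Path(secrets_dir) / name
    if secret_file.is_file():
        return secret_file.read_text().strip()
    return os.getenv(env_var)

# Compose/Swarm secrets appear under /run/secrets/<name>
API_KEY = read_secret("api_key", "API_KEY")
```

This keeps the secret out of the image and out of `docker inspect`, and the same code works unchanged whether the deployment injects secrets as files or as runtime environment variables.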


Production Hardening Checklist

Before deploying any container to production, verify:

| Check | Command | Expected |
| --- | --- | --- |
| Non-root user | docker exec <container> whoami | Not root |
| Health check exists | docker inspect --format='{{.Config.Healthcheck}}' <image> | Non-empty |
| Health status | docker inspect --format='{{.State.Health.Status}}' <container> | healthy |
| No hardcoded secrets | docker history <image> | No API keys visible |
| Config via ENV | docker run --rm -e LOG_LEVEL=DEBUG <image> env | Variable overridable |