Your Task API works. Users create tasks via POST requests, the database stores them, and responses return in milliseconds. But now you need to notify other services when tasks are created. The notification service, audit logger, and reminder scheduler all need to know about new tasks---and they shouldn't slow down your API response.
This is where event-driven architecture shines. Instead of your API calling each service directly (and waiting for responses), you publish a task.created event to Kafka. Services consume at their own pace. Your API stays fast. Consumers process independently.
But integrating Kafka with FastAPI raises architectural questions. Kafka's Python libraries use blocking I/O, but FastAPI is async. How do you initialize producers at startup and clean them up at shutdown? How do you consume events in the background without blocking your API? This chapter answers these questions with production patterns you'll use repeatedly.
FastAPI is built on asyncio. When you write async def create_task(), you're telling Python this function can pause and let other requests run while waiting for I/O. This is why FastAPI handles thousands of concurrent requests efficiently.
But here's the problem: confluent-kafka-python, the most robust Kafka client, is not async. It wraps librdkafka, a C library that uses blocking calls internally. When you call producer.produce(), it doesn't block---it queues the message. But producer.flush() blocks until messages are delivered. And consumers are entirely blocking: consumer.poll() waits for messages.
There are two approaches to async Kafka in Python: run the blocking confluent-kafka client in background threads alongside the event loop, or use aiokafka, a natively asynchronous client.
For production systems, we'll use confluent-kafka-python because of its reliability and feature completeness. The threading patterns you'll learn work with any blocking library. We'll show where aiokafka makes sense as an alternative.
FastAPI's lifespan events let you run code at startup and shutdown. This is perfect for Kafka producers: initialize once at startup, share across requests, flush and close at shutdown.
Why lifespan instead of startup/shutdown events? FastAPI deprecated @app.on_event("startup") in favor of lifespan context managers. The lifespan pattern guarantees cleanup runs even if startup fails partially.
With the producer initialized, endpoints can publish events without blocking:
Critical pattern: producer.poll(0)
The produce() call queues the message but doesn't wait for delivery confirmation. The delivery callback won't fire until you call poll(). By calling poll(0) (zero timeout), you process any pending callbacks without blocking. This pattern keeps the request path fast while still surfacing delivery results, so failures are reported through the callback instead of silently accumulating in the queue.
The confluent-kafka Producer is thread-safe, so all request handlers (and threads) within a single process can share one instance. Uvicorn workers are separate processes, however, and each needs its own producer, because Python processes don't share memory.
Consuming events is trickier. The consumer poll loop is blocking---it waits for messages. You can't run it in an async function without blocking the event loop. The solution: run the consumer in a background thread.
Update the lifespan to start and stop the consumer thread:
You might wonder: "FastAPI is async, shouldn't the consumer be async too?"
The answer involves understanding what confluent-kafka does internally: it wraps librdkafka, which runs its own background threads for network I/O, and consumer.poll() is a blocking call into that C library. Awaiting it directly would stall the event loop and every in-flight request, so the blocking loop belongs on a thread of its own.
The threading approach is recommended by Confluent for asyncio applications. The consumer thread runs independently of the event loop, processing messages at its own pace. This matches how production Kafka consumers typically run---as separate processes or threads from the API.
If you prefer fully async code and can accept the trade-offs, aiokafka provides native asyncio support:
When to choose aiokafka: you want fully async code end to end, with no background threads to manage, and you can accept its trade-offs in maturity and features relative to librdkafka.
When to choose confluent-kafka + threading: you need librdkafka's reliability and feature completeness, and you want the pattern Confluent itself recommends for asyncio applications.
Here's the full pattern combining producer, consumer, and proper lifecycle management:
You've built a working FastAPI + Kafka integration. Let's explore how to refine it for your specific requirements.
Initial design question:
"Should the API wait for Kafka acknowledgment before responding to the client?"
Exploring the trade-offs:
The current pattern returns immediately after produce() without waiting for delivery confirmation. This means responses are fast, but if delivery later fails, the failure surfaces only in the delivery callback, after the client has already received a success response.
Alternative: Wait for acknowledgment
This adds ~20-50ms latency but guarantees the client knows if publishing failed.
What emerged from this exploration: a latency-versus-certainty trade-off. Fire-and-forget keeps responses fast; waiting for acknowledgment adds tens of milliseconds but guarantees the client learns about publishing failures.
You built a kafka-events skill in Chapter 1. Test and improve it based on what you learned.
Ask yourself whether the skill captures the patterns from this chapter: lifespan-managed producers, the poll(0) pattern, and threaded consumers with graceful shutdown. If you found gaps, update the skill so your next project starts from these patterns.
Apply what you've learned by designing FastAPI + Kafka integrations for your domain.
Setup: Open your AI assistant with the FastAPI project context.
Prompt 1: Analyze your async architecture
What you're learning: AI helps you think through architectural decisions specific to your scale and deployment model. The answer depends on your failure modes and operational preferences.
Prompt 2: Debug a consumer issue
What you're learning: AI walks you through common causes (session timeout, max.poll.interval exceeded, unhandled exceptions) and debugging strategies (logging, health endpoint enhancement, thread monitoring).
Prompt 3: Design for your domain
What you're learning: AI collaborates on translating generic patterns to your specific domain, helping you decide what to publish, how to structure events, and where consumers should run.
Safety note: When testing FastAPI + Kafka integration locally, ensure your Kafka cluster is running before starting the app. The lifespan will fail at startup if it can't connect, which is the correct behavior---you want fast failure rather than an app that starts but can't publish events.