You understand why events beat direct API calls. You know the difference between events and commands. Now you need a mental model for how Kafka actually works - one you can carry into production debugging sessions and architecture discussions.
This chapter builds that model through a familiar analogy: the newspaper industry. By the end, you'll be able to sketch Kafka's architecture on a whiteboard, explain consumer groups to a teammate, and trace exactly what happens when your agent publishes a "task.created" event.
No code yet. First, the concepts. Then, in Chapters 5-9, you'll deploy and code against a real Kafka cluster.
Imagine a major newspaper operation. Every day, the newspaper:

- Organizes stories into named sections (sports, business, local news)
- Prints copies at multiple facilities to handle the volume
- Delivers papers to subscribers, each of whom reads at their own pace
Kafka works the same way. Let's map each concept.
A topic is a named stream of events - like a newspaper section.
When your Task API creates a task, it publishes to the task-events topic. When a user signs up, the auth service publishes to user-events. Each topic is independent - consumers subscribe to the topics they care about.
Key insight: Topics are categories, not destinations. Unlike a traditional message queue where messages go to one consumer, Kafka topics are logs that multiple consumers can read independently.
Events are appended to the end. They're never modified or deleted (until retention expires). This append-only log model is what makes Kafka different from traditional queues.
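A toy sketch can make the append-only model concrete. This is plain Python, not a real Kafka client - just a minimal model of one partition's log, where events only ever go on the end and reading never removes anything:

```python
class ToyTopicLog:
    """A minimal append-only log, like one partition of a Kafka topic."""

    def __init__(self):
        self._events = []

    def append(self, event):
        """Append an event to the end and return its offset."""
        self._events.append(event)
        return len(self._events) - 1  # offsets are sequential, starting at 0

    def read(self, offset):
        """Read the event at an offset; reading never deletes it."""
        return self._events[offset]

log = ToyTopicLog()
first = log.append({"type": "task.created", "task_id": "t-1"})
second = log.append({"type": "task.completed", "task_id": "t-1"})
print(first, second)        # 0 1
print(log.read(0)["type"])  # task.created - still there after being read
```

Contrast this with a queue's `pop`: here, reading offset 0 twice returns the same event twice.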
A topic with millions of events per second can't run on a single machine. Kafka solves this with partitions - independent segments of a topic that can live on different machines.
Think of partitions as newspaper printing facilities in different cities:

- Each facility prints the papers for its own region
- No single facility handles the entire print run
- Together, the facilities cover the full subscriber base
Each partition has the following characteristics:

- It is an ordered, append-only sequence of events
- Each event within it gets its own sequential offset
- It can live on a different machine than the topic's other partitions
- It can be written to and read from independently of other partitions
Critical concept: Ordering is guaranteed within a partition, but not across partitions. If you need events for a specific task to be processed in order, they must go to the same partition. Kafka uses the message key to determine partition assignment - events with the same key always go to the same partition.
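The key-to-partition mapping can be sketched in a few lines. Kafka's default partitioner actually uses a murmur2 hash of the key; `zlib.crc32` stands in here purely to illustrate the idea - same key, same hash, same partition, every time:

```python
import zlib

def pick_partition(key: str, num_partitions: int) -> int:
    """Map a message key to a partition. Kafka's default partitioner
    uses a murmur2 hash; zlib.crc32 is a stand-in to show the principle:
    hashing is deterministic, so equal keys always collide."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# All events keyed by task "t-42" land in the same partition, so they stay ordered.
p1 = pick_partition("t-42", 6)
p2 = pick_partition("t-42", 6)
print(p1 == p2)  # True

# A different key may or may not share that partition - only same-key ordering is promised.
print(pick_partition("t-43", 6))
```

One consequence worth noticing: changing the partition count changes the mapping, which is why repartitioning a live topic breaks same-key ordering guarantees for in-flight data.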
When you read a book, you use a bookmark. Kafka uses offsets.
An offset is a sequential number assigned to each event within a partition. It tells consumers "where they are" in the stream:

- A consumer at offset 42 has processed events 0 through 41
- Its next read returns the event at offset 42
- After processing, it commits its new offset so it can resume from there after a restart
Unlike traditional queues that delete messages after delivery, Kafka retains events based on time or size limits (default: 7 days). Multiple consumers can read the same events independently, each tracking their own offset.
Producers write events to topics. They're like journalists filing stories:

- They choose which topic (section) an event belongs to
- They can attach a key so related events land in the same partition
- They don't know or care who will eventually read the event
Consumers read events from topics. They're like subscribers reading the newspaper:

- They subscribe only to the topics (sections) they care about
- They read at their own pace
- Their reading never affects other subscribers
Each consumer maintains its own offset. The notification service might be at offset 1000 while the analytics service is at offset 500 - they're independent.
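That independence is easy to show with a toy model (again plain Python, not a client library): two consumers share one immutable log but each holds its own offset:

```python
events = [f"event-{i}" for i in range(1500)]  # one partition's log

class ToyConsumer:
    """Tracks its own position in a shared, immutable log."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self, max_events):
        """Return the next batch and advance this consumer's offset only."""
        batch = self.log[self.offset:self.offset + max_events]
        self.offset += len(batch)
        return batch

notifications = ToyConsumer(events)
analytics = ToyConsumer(events)

notifications.poll(1000)  # the notification service races ahead
analytics.poll(500)       # the analytics service lags behind

print(notifications.offset, analytics.offset)  # 1000 500 - fully independent
```

Neither consumer's progress changes the log or the other consumer's position, which is exactly why a slow analytics job never blocks notifications.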
Here's where Kafka gets powerful. A consumer group is a team of consumers that share the work of reading a topic.
Imagine home delivery for a large city. One delivery person can't cover all routes. So you assign:

- Routes 1-3 to delivery person 1
- Routes 4-6 to delivery person 2
- Routes 7-9 to delivery person 3
Each route is covered by exactly one person. If person 2 calls in sick, their route gets reassigned to someone else.
Kafka consumer groups work identically:

- Partitions are the routes
- Consumers in the group are the delivery people
- Kafka's group coordinator assigns partitions and reassigns them when a consumer joins or fails (a rebalance)
The rules:

- Within a group, each partition is read by exactly one consumer
- One consumer can read multiple partitions
- Different groups are independent - every group receives every event
Scaling insight: Want more parallelism? Add partitions. Want to process faster? Add consumers (up to partition count). Have 10 partitions but 15 consumers? 5 consumers will be idle.
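The arithmetic behind that insight can be sketched as a simple round-robin assignment. Kafka's real assignment strategies (range, round-robin, sticky) are configurable, but all obey the same one-consumer-per-partition rule this toy function does:

```python
def assign_partitions(num_partitions, consumers):
    """Round-robin partitions across a consumer group - a simplified
    stand-in for Kafka's configurable assignors. The invariant it
    preserves is the real one: each partition goes to exactly one consumer."""
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment

consumers = [f"consumer-{i}" for i in range(15)]
assignment = assign_partitions(10, consumers)

idle = [c for c, parts in assignment.items() if not parts]
print(len(idle))  # 5 - with 10 partitions, 5 of 15 consumers sit idle
```

Run it the other way (3 consumers, 10 partitions) and each consumer simply picks up several partitions - which is why adding consumers helps only up to the partition count.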
A broker is a Kafka server that stores partitions and serves producers/consumers. A Kafka cluster is a group of brokers working together.
KRaft mode (Kafka Raft): In Kafka 4.0+, cluster metadata is managed by a built-in Raft consensus protocol. No external ZooKeeper needed. This simplifies deployment and reduces operational complexity.
Each partition has:

- One leader replica, on a single broker, which handles all reads and writes
- Zero or more follower replicas on other brokers, which copy the leader's log
If a broker dies, another broker's replica becomes the new leader. Producers and consumers automatically reconnect to the new leader.
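A minimal sketch of that failover, under a big simplifying assumption: real Kafka only promotes replicas that are in-sync (in the ISR), which this toy function glosses over by treating every listed replica as eligible:

```python
def elect_leader(replicas, failed_brokers):
    """Pick a partition's new leader: the first replica whose broker is
    still alive. Simplified - real Kafka also requires the replica to be
    in-sync (a member of the ISR) before it can become leader."""
    for broker in replicas:
        if broker not in failed_brokers:
            return broker
    raise RuntimeError("no live replica - partition is offline")

replicas = ["broker-1", "broker-2", "broker-3"]  # broker-1 currently leads
print(elect_leader(replicas, failed_brokers=set()))         # broker-1
print(elect_leader(replicas, failed_brokers={"broker-1"}))  # broker-2
```

The error branch mirrors a real failure mode too: lose every replica of a partition and that partition is simply unavailable until a broker returns.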
Let's trace what happens when your Task API publishes a "task.created" event:
What happens at each step:

1. The producer uses the event's key (the task ID) to pick a partition
2. The producer sends the event to the broker that leads that partition
3. The leader appends the event to the partition's log and assigns it an offset
4. Follower replicas copy the event from the leader
5. The leader acknowledges the write back to the producer
6. Consumers in each subscribed group poll the event and advance their offsets
Key observations:

- The producer never talks to consumers - Kafka fully decouples them
- Every event with the same task ID lands in the same partition, preserving order
- Once replicated, the event survives the failure of a single broker
- Each consumer group processes the event independently, at its own pace
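The whole trace fits in one toy model. The class below is illustrative only - its names and structure are invented for this sketch, not a real client API - but it combines the pieces from this chapter: keyed partition selection, append-only logs, and per-group offsets:

```python
import zlib

class ToyCluster:
    """A toy single-topic cluster: a few partition logs plus per-group
    offsets. Illustrative names, not a real Kafka client API."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]
        self.group_offsets = {}  # (group, partition) -> next offset to read

    def publish(self, key, event):
        """Key picks the partition; the event is appended and gets an offset."""
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append(event)
        return partition, len(self.partitions[partition]) - 1

    def poll(self, group, partition):
        """Each group reads from its own committed offset, independently."""
        offset = self.group_offsets.get((group, partition), 0)
        batch = self.partitions[partition][offset:]
        self.group_offsets[(group, partition)] = offset + len(batch)
        return batch

cluster = ToyCluster(num_partitions=3)
partition, offset = cluster.publish(
    "task-42", {"type": "task.created", "task_id": "task-42"}
)

# Two groups each receive the same event, tracking their offsets separately.
print(cluster.poll("notifications", partition))
print(cluster.poll("analytics", partition))
print(cluster.poll("notifications", partition))  # [] - this group already caught up
```

Note what's missing: no producer-to-consumer connection exists anywhere, and neither group's poll affects the other's position.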
Understanding Kafka's mental model helps you:

- Debug production issues like consumer lag and rebalance storms
- Choose sensible partition counts and message keys when designing topics
- Explain system behavior - ordering, scaling, failover - to teammates
In the next chapter, you'll deploy a real Kafka cluster with Strimzi and see these concepts in action.
You built a kafka-events skill in Chapter 1. Test and improve it based on what you learned.
Ask yourself:

- Does your skill explain how topics differ from traditional queues?
- Does it cover partitions, keys, and ordering guarantees?
- Does it describe consumer groups and what happens during a rebalance?
If you found gaps:

- Update the skill with the concepts from this chapter
- Re-test it by explaining consumer groups in your own words, without notes
Use your AI companion to reinforce and extend this mental model.
What you're learning: Active recall strengthens mental models. Your AI partner acts as an expert interviewer, testing edge cases you might not have considered.
What you're learning: Applying abstract concepts to your specific domain. The AI helps you think through trade-offs rather than prescribing a solution.
What you're learning: Failure modes reveal how well you understand the system. Understanding what Kafka guarantees versus what your code must handle is crucial for production reliability.
When exploring Kafka with AI, verify configuration recommendations against official documentation. Default settings differ between development and production, and incorrect settings (like acks=0 for critical data) can cause data loss. Always test failure scenarios in a non-production environment first.