Kafka for AI Services
You build the kafka-events skill first, then use each lesson to test and deepen it, from EDA fundamentals to production-grade operations. Kafka 4.0+ runs in KRaft mode only; ZooKeeper has been removed.
Goals
- Understand event-driven architecture and Kafka’s core model (topics, partitions, consumer groups)
- Deploy Kafka with Strimzi in KRaft mode
- Build reliable producers/consumers with delivery guarantees and transactions
- Integrate Kafka with FastAPI using schemas (Avro + schema registry)
- Apply advanced patterns: Connect, CDC with Debezium, agent events, saga
- Operate Kafka: production config, monitoring, debugging
- Capture everything in a reusable Kafka skill
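The core model named in the goals above (topics, partitions, consumer groups) can be sketched without a broker. This is a dependency-free illustration, not Kafka's actual implementation: `partition_for` and `assign_partitions` are hypothetical names, and md5 stands in for Kafka's murmur2 partitioner.

```python
import hashlib

NUM_PARTITIONS = 6

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition (Kafka uses murmur2; md5 here for illustration)."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

def assign_partitions(consumers: list[str],
                      num_partitions: int = NUM_PARTITIONS) -> dict[str, list[int]]:
    """Split partitions round-robin across a consumer group (the spirit of Kafka's assignors)."""
    assignment: dict[str, list[int]] = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment

# Same key -> same partition, so per-key ordering is preserved.
assert partition_for("order-42") == partition_for("order-42")

# Two consumers in one group divide the six partitions between them.
print(assign_partitions(["c1", "c2"]))  # → {'c1': [0, 2, 4], 'c2': [1, 3, 5]}
```

The two invariants shown here, stable key-to-partition mapping and exclusive partition ownership within a group, are what make per-key ordering and horizontal consumer scaling work.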
Chapter Progression
Each lesson ends with a skill reflection: test the skill, find gaps, and improve it.
Outcome & Method
You finish with a production-ready Kafka deployment, reliable producer/consumer code integrated with FastAPI, and a Kafka skill for future projects. The chapter follows the 4-Layer approach: foundations → production patterns → AI-assisted authoring → spec-driven capstone.
Prerequisites
- Chapters 79-81: familiarity with container images, Kubernetes, and Helm
- Ability to run a local Kubernetes cluster (e.g., Docker Desktop) for Strimzi
Learning Objectives
- Implement reliable producers: acks semantics, retries, idempotent producer, error handling
- Implement robust consumers: consumer groups, rebalancing, offset management, lag monitoring
- Integrate with FastAPI: async producers/consumers, lifespan events, background tasks
- Design event schemas: Avro with Schema Registry, schema evolution, breaking-change prevention
- Apply delivery guarantees: at-least-once, at-most-once, and exactly-once semantics and their trade-offs
- Use transactions: consume-process-produce pattern, zombie fencing, read_committed isolation
- Build data pipelines: Kafka Connect, Debezium CDC, outbox pattern for microservices
- Implement agent patterns: task events, notification fan-out, audit logs, saga pattern
- Run Kafka on Kubernetes: Strimzi operator, Kafka CRDs, KRaft mode, production configuration
- Debug production issues: consumer lag, under-replicated partitions, rebalancing storms
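One pattern from the list above, the outbox, can be sketched with nothing but the standard library. The service writes its state change and the event to publish in a single database transaction; a separate relay drains unpublished rows to the broker. This is a minimal sketch under stated assumptions: sqlite stands in for the service database, the `publish` callback stands in for a Kafka producer (in practice Debezium or Kafka Connect does the relaying), and the table and function names are illustrative.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT,
                         topic TEXT, payload TEXT, published INTEGER DEFAULT 0);
""")

def place_order(order_id: str) -> None:
    # State change and outgoing event share ONE transaction:
    # both commit together or neither does, so no event is lost or orphaned.
    with db:
        db.execute("INSERT INTO orders VALUES (?, 'PLACED')", (order_id,))
        db.execute("INSERT INTO outbox (topic, payload) VALUES (?, ?)",
                   ("orders.placed", json.dumps({"order_id": order_id})))

def relay_outbox(publish) -> int:
    """Drain unpublished events; at-least-once, since publish may repeat after a crash."""
    rows = db.execute("SELECT id, topic, payload FROM outbox WHERE published = 0").fetchall()
    for row_id, topic, payload in rows:
        publish(topic, json.loads(payload))  # would be producer.send(...) against Kafka
        db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    db.commit()
    return len(rows)

sent = []
place_order("42")
relay_outbox(lambda topic, event: sent.append((topic, event)))
```

Because the relay marks rows published only after the send, a crash mid-relay re-delivers the event on restart, which is why the pattern gives at-least-once delivery and pairs naturally with idempotent consumers.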
Technology Choices
What's NOT Covered
This chapter focuses on developer skills, not SRE operations:
- Docker Compose — we use Kubernetes throughout Module 7
- Multi-datacenter replication (MirrorMaker 2)
- Security deep dive (SASL, SSL, ACLs) — covered at overview level only
- Kafka Streams framework — separate advanced topic
- Broker hardware sizing and tuning
- ZooKeeper — removed in Kafka 4.0
Looking Ahead
This chapter teaches Kafka directly. Chapter 83 (Dapr) shows how to abstract pub/sub behind Dapr's API, making your code portable across message brokers while retaining the concepts you learned here.