You have been building event-driven systems, but there is a dangerous gap hiding in your architecture. When your application writes to the database and then publishes an event to Kafka, what happens if the app crashes between those two operations? The database has the data, but the event never reaches Kafka. Your downstream services never know the change happened.
This is the dual-write problem, and it has caused countless production incidents. Polling the database for changes does not solve it either--polls can miss changes, create duplicate events, and add significant load to your database. The solution is Change Data Capture (CDC): reading changes directly from the database transaction log, where every committed change is guaranteed to appear exactly once.
Debezium is the industry-standard CDC platform for Kafka. It reads the PostgreSQL Write-Ahead Log (WAL), transforms changes into events, and delivers them to Kafka topics with at-least-once semantics--after a restart a change may be re-delivered, so consumers should deduplicate on the event ID. Combined with the transactional outbox pattern, you can guarantee that database writes and event publishing are atomic--they either both succeed or both fail.
In this chapter, you'll deploy Debezium on Kubernetes using Strimzi and implement the outbox pattern to build reliable event-driven agents.
Consider what happens when your Task API creates a task and publishes an event:
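A minimal sketch of the failure, using an in-memory SQLite database and a plain list as a stand-in for a Kafka producer. The names here (create_task, published) are illustrative, not the real Task API:

```python
# Hypothetical sketch of the dual-write gap. create_task is an illustrative
# stand-in for the Task API; the "broker" is just a list.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT)")

published = []  # stand-in for a Kafka producer

def create_task(title, crash_before_publish=False):
    db.execute("INSERT INTO tasks (title) VALUES (?)", (title,))
    db.commit()  # step 1: the database write is now durable
    if crash_before_publish:
        raise RuntimeError("simulated crash between write and publish")
    published.append({"type": "TaskCreated", "title": title})  # step 2: may never run

try:
    create_task("write chapter", crash_before_publish=True)
except RuntimeError:
    pass  # the process "crashed"; on restart, nothing retries the publish

tasks_in_db = db.execute("SELECT COUNT(*) FROM tasks").fetchone()[0]
print(f"tasks in db: {tasks_in_db}, events published: {len(published)}")
# -> tasks in db: 1, events published: 0
```

The database and the broker disagree, and no amount of retry logic in the application can close the gap, because the process that knew about the pending event is gone.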
The problem is fundamental: you cannot make a database write and a Kafka publish atomic using two-phase commit (2PC). Most databases and message brokers do not support it, and even when they do, the coupling creates fragile systems.
Change Data Capture solves this by reading the database's own transaction log--the Write-Ahead Log (WAL) in PostgreSQL. Every committed change appears in the WAL, and Debezium reads it in near real-time.
Debezium acts as a PostgreSQL replication client. It subscribes to the WAL using logical replication, transforms each change into a structured event, and produces it to Kafka. The database handles the hard work of tracking changes; Debezium simply reads and forwards them.
Debezium runs as a Kafka Connect connector. With Strimzi, you deploy it using the KafkaConnector custom resource.
PostgreSQL must be configured to allow logical replication. Add these settings to your PostgreSQL configuration:
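A sketch using ALTER SYSTEM (you can equally edit postgresql.conf directly; changing wal_level requires a restart). The slot and sender counts are illustrative:

```sql
-- Enable logical decoding so Debezium can subscribe to the WAL
ALTER SYSTEM SET wal_level = 'logical';
-- Allow enough replication slots and WAL sender processes for the connector
ALTER SYSTEM SET max_replication_slots = 10;
ALTER SYSTEM SET max_wal_senders = 10;
```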
Strimzi requires a custom Kafka Connect image that includes the Debezium connector. Create a KafkaConnect resource that builds the image:
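A sketch of such a resource, assuming a Strimzi-managed Kafka cluster named my-cluster in the same namespace. The registry, resource names, and version numbers are illustrative and should be adjusted for your environment:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: debezium-connect
  annotations:
    # Let KafkaConnector custom resources manage connectors on this cluster
    strimzi.io/use-connector-resources: "true"
spec:
  replicas: 1
  bootstrapServers: my-cluster-kafka-bootstrap:9092
  config:
    group.id: debezium-connect-cluster
    offset.storage.topic: debezium-connect-offsets
    config.storage.topic: debezium-connect-configs
    status.storage.topic: debezium-connect-status
  build:
    output:
      type: docker
      # Strimzi builds the image and pushes it here; the cluster must be able to pull it
      image: registry.example.com/debezium-connect:latest
    plugins:
      - name: debezium-postgres
        artifacts:
          - type: tgz
            url: https://repo1.maven.org/maven2/io/debezium/debezium-connector-postgres/2.7.0.Final/debezium-connector-postgres-2.7.0.Final-plugin.tar.gz
```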
For local development without a registry, use an ephemeral output:
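One option (an assumption here, not the only approach) is the anonymous ttl.sh registry, which requires no credentials and expires images automatically. Only the build.output section changes:

```yaml
  build:
    output:
      type: docker
      # ttl.sh is a public, anonymous registry; the tag sets the image lifetime
      image: ttl.sh/debezium-connect-demo:24h
```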
Apply the resource:
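Assuming the manifest was saved as kafka-connect.yaml:

```shell
kubectl apply -f kafka-connect.yaml
```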
Wait for the Connect cluster to be ready:
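Strimzi reports readiness through a Ready condition on the resource, so kubectl wait can block until the image build and rollout finish (substitute your Connect cluster's name; the timeout is illustrative):

```shell
kubectl wait kafkaconnect/debezium-connect --for=condition=Ready --timeout=300s
```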
Now deploy the connector itself using a KafkaConnector resource:
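A sketch of the connector, assuming the Connect cluster above and a PostgreSQL service named postgres exposing a tasks_db database. Credentials are shown inline for brevity but belong in a Kubernetes Secret:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: tasks-connector
  labels:
    # Must match the name of the KafkaConnect cluster
    strimzi.io/cluster: debezium-connect
spec:
  class: io.debezium.connector.postgresql.PostgresConnector
  tasksMax: 1
  config:
    database.hostname: postgres
    database.port: "5432"
    database.user: debezium
    database.password: debezium
    database.dbname: tasks_db
    # Prefix for change-event topics: <prefix>.<schema>.<table>
    topic.prefix: taskdb
    table.include.list: public.tasks
    # pgoutput is PostgreSQL's built-in logical decoding plugin
    plugin.name: pgoutput
```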
Apply the connector:
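Assuming the manifest was saved as tasks-connector.yaml:

```shell
kubectl apply -f tasks-connector.yaml
```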
Check the connector status:
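The connector's state is surfaced on the KafkaConnector resource itself (substitute your connector's name):

```shell
# Summary view
kubectl get kafkaconnector tasks-connector

# Detailed state as reported by Kafka Connect
kubectl get kafkaconnector tasks-connector \
  -o jsonpath='{.status.connectorStatus.connector.state}'
```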
Insert a test record and verify it appears in Kafka:
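A sketch of the round trip, assuming a postgres-0 pod for the database, a tasks table with title and status columns, and the Strimzi Kafka image for a throwaway consumer (pod names and image tag are illustrative):

```shell
# 1. Insert a row into the captured table
kubectl exec -it postgres-0 -- psql -U debezium -d tasks_db \
  -c "INSERT INTO tasks (title, status) VALUES ('Test CDC', 'pending');"

# 2. Read the change event from the corresponding Debezium topic
kubectl run cdc-consumer -it --rm --restart=Never \
  --image=quay.io/strimzi/kafka:0.41.0-kafka-3.7.0 -- \
  bin/kafka-console-consumer.sh \
  --bootstrap-server my-cluster-kafka-bootstrap:9092 \
  --topic taskdb.public.tasks --from-beginning --max-messages 1
```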
The "op": "c" field indicates a create operation. Debezium uses single-character operation codes: c (create), u (update), d (delete), and r (read, emitted for rows during the initial snapshot).
CDC captures all table changes, but capturing the tasks table directly has problems: the raw change events expose your internal schema to every consumer, so any column rename ripples downstream, and row-level before/after snapshots are not meaningful domain events.
The transactional outbox pattern solves this. Instead of capturing the business table, you write domain events to an outbox table in the same transaction as your business data. Debezium captures the outbox table and transforms the records into proper events.
Design your outbox table to hold domain events:
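One possible shape, assuming PostgreSQL 13+ for the built-in gen_random_uuid(); column names and types are a starting point, not a prescription:

```sql
CREATE TABLE outbox (
    id             UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    aggregate_type VARCHAR(255) NOT NULL,  -- e.g. 'Task'; used to route events to a topic
    aggregate_id   VARCHAR(255) NOT NULL,  -- becomes the Kafka message key
    event_type     VARCHAR(255) NOT NULL,  -- e.g. 'TaskCreated'
    payload        JSONB        NOT NULL,  -- the domain event body
    created_at     TIMESTAMPTZ  NOT NULL DEFAULT now()
);
```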
When your application creates a task, write to both tables in one transaction:
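In SQL terms (the tasks columns and the UUID value are assumed for illustration):

```sql
BEGIN;

-- Business write
INSERT INTO tasks (id, title, status)
VALUES ('3f2b8c1a-9d4e-4f6a-8b2c-1d5e7a9c3b01', 'Write chapter', 'pending');

-- Domain event, committed atomically with the business write
INSERT INTO outbox (aggregate_type, aggregate_id, event_type, payload)
VALUES (
    'Task',
    '3f2b8c1a-9d4e-4f6a-8b2c-1d5e7a9c3b01',
    'TaskCreated',
    '{"task_id": "3f2b8c1a-9d4e-4f6a-8b2c-1d5e7a9c3b01", "title": "Write chapter", "status": "pending"}'
);

COMMIT;
```

If the transaction rolls back, neither the task nor the event exists; if it commits, the event is guaranteed to reach the WAL and therefore Debezium.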
Debezium includes the Outbox Event Router transformation that converts outbox table records into properly formatted events. Update your connector configuration:
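A sketch of the updated resource. The field settings on the transform map Debezium's default column names (aggregatetype, aggregateid) onto the snake_case columns assumed here, and the topic-routing expression is one choice that yields names like Task.events:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: tasks-connector
  labels:
    strimzi.io/cluster: debezium-connect
spec:
  class: io.debezium.connector.postgresql.PostgresConnector
  tasksMax: 1
  config:
    database.hostname: postgres
    database.port: "5432"
    database.user: debezium
    database.password: debezium
    database.dbname: tasks_db
    topic.prefix: taskdb
    plugin.name: pgoutput
    # Capture only the outbox table, not the business tables
    table.include.list: public.outbox
    transforms: outbox
    transforms.outbox.type: io.debezium.transforms.outbox.EventRouter
    # Map the router onto our snake_case column names
    transforms.outbox.route.by.field: aggregate_type
    transforms.outbox.table.field.event.key: aggregate_id
    transforms.outbox.table.field.event.payload: payload
    # Carry the event type as a message header
    transforms.outbox.table.fields.additional.placement: event_type:header:eventType
    # Route 'Task' records to the Task.events topic
    transforms.outbox.route.topic.replacement: ${routedByValue}.events
```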
Apply the updated connector:
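Assuming the updated manifest replaced tasks-connector.yaml:

```shell
kubectl apply -f tasks-connector.yaml
```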
With this configuration, when you insert a row into the outbox table with aggregate_type: "Task", Debezium produces an event to the Task.events Kafka topic. The event key is the aggregate_id, and the payload is the JSON from the payload column.
Test the complete flow:
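A sketch, again assuming a postgres-0 pod, the assumed tasks and outbox schemas, and the Strimzi Kafka image (names and tags are illustrative):

```shell
# Write the business row and the outbox event in one transaction
kubectl exec -it postgres-0 -- psql -U debezium -d tasks_db -c "
BEGIN;
INSERT INTO tasks (id, title, status)
VALUES ('9c0d2e4f-1111-2222-3333-444455556666', 'Ship feature', 'pending');
INSERT INTO outbox (aggregate_type, aggregate_id, event_type, payload)
VALUES ('Task', '9c0d2e4f-1111-2222-3333-444455556666', 'TaskCreated',
        '{\"task_id\": \"9c0d2e4f-1111-2222-3333-444455556666\", \"title\": \"Ship feature\"}');
COMMIT;"

# The routed event should appear on the Task.events topic
kubectl run outbox-consumer -it --rm --restart=Never \
  --image=quay.io/strimzi/kafka:0.41.0-kafka-3.7.0 -- \
  bin/kafka-console-consumer.sh \
  --bootstrap-server my-cluster-kafka-bootstrap:9092 \
  --topic Task.events --from-beginning --max-messages 1
```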
The event is clean and domain-focused--no database metadata, no before/after snapshots, just the business event payload.
One concern with the outbox pattern: the table grows with every event, so you need a cleanup strategy.
For most applications, scheduled cleanup is simplest:
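For example, deleting events older than a retention window (the seven-day window is arbitrary; rows already read by Debezium from the WAL are safe to remove):

```sql
DELETE FROM outbox
WHERE created_at < now() - INTERVAL '7 days';
```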
Run this as a Kubernetes CronJob or PostgreSQL scheduled job.
Check connector status with:
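For example (substituting your connector's name):

```shell
kubectl get kafkaconnector tasks-connector -o yaml
```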
Look at .status.connectorStatus for error messages.
You built a kafka-events skill in Chapter 1. Test and improve it based on what you learned.
Ask yourself:
If you found gaps:
What you're learning: Translating the outbox pattern from generic knowledge to your specific domain. AI helps you think through the schema decisions and naming conventions that fit your business events.
What you're learning: Systematic debugging of CDC pipelines. AI provides the specific commands and queries for each diagnostic step while you evaluate whether the outputs indicate problems.
What you're learning: Architectural decision-making. AI helps you identify considerations you might miss while you evaluate whether each factor applies to your specific context.
Safety note: Always test CDC configurations in a non-production environment first. Logical replication creates replication slots that consume WAL space--if the connector stops reading, WAL can fill your disk. Monitor replication slot lag in production.