Your Task API is publishing events to Kafka. Today, your task.created event looks like this:
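A plausible shape for that event (the exact field names and values here are illustrative):

```json
{
  "task_id": "a1b2c3d4",
  "title": "Write chapter draft",
  "status": "created",
  "created_at": "2024-01-15T10:30:00Z"
}
```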
Next month, the product team wants to add task priority. You add the field:
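The payload now carries the new field (again, illustrative values):

```json
{
  "task_id": "a1b2c3d4",
  "title": "Write chapter draft",
  "status": "created",
  "priority": 3,
  "created_at": "2024-01-15T10:30:00Z"
}
```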
But the notification service consuming these events wasn't updated. When it receives a message with the new priority field, it crashes. Or worse, it silently ignores the field and loses data. You've just experienced schema drift---the silent killer of event-driven systems.
In production, you'll have dozens of services reading and writing events. Without schema enforcement, any producer can add, remove, or rename fields at will. Consumers break in unpredictable ways. Debugging becomes a forensic investigation: "Which version of the event was this consumer built for?"
This chapter introduces Apache Avro for binary schema-based serialization and Schema Registry for centralized schema management. By the end, you'll design event schemas that evolve safely, enforce contracts between producers and consumers, and prevent the integration failures that plague untyped messaging systems.
In a typical Kafka deployment without schemas, producers and consumers communicate through implicit contracts:
This works until someone changes something:
The core problem: JSON doesn't enforce structure. Any producer can send anything, and you won't discover the mismatch until runtime---often in production.
Apache Avro is a data serialization system that provides:

- Compact binary encoding, significantly smaller than JSON
- Schemas defined in JSON, so they're easy to read and version-control
- Schema evolution rules that let readers and writers use different schema versions
- Rich types: records, enums, arrays, maps, unions, and logical types like timestamps
An Avro schema is JSON that defines your data structure:
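For example, a minimal record (the names here are illustrative):

```json
{
  "type": "record",
  "name": "TaskCreated",
  "namespace": "com.example.tasks",
  "fields": [
    {"name": "task_id", "type": "string"},
    {"name": "title", "type": "string"}
  ]
}
```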
Key components:

- `type`: the Avro type being defined (an event schema is typically a `record`)
- `name` and `namespace`: together form the schema's fully qualified name
- `fields`: the list of fields, each with a name and a type
To make a field optional, use a union type with null:
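A sketch of an optional `priority` field:

```json
{"name": "priority", "type": ["null", "int"], "default": null}
```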
The `["null", "int"]` union means the field can be either null or an integer, and `"default": null` makes it optional: old messages written without `priority` will deserialize with `priority = null`.
Here's a complete schema for Task API events:
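A sketch of such a schema; the exact field set is illustrative, but the structure shows required fields alongside an optional one with a default:

```json
{
  "type": "record",
  "name": "TaskCreated",
  "namespace": "com.example.tasks.events",
  "fields": [
    {"name": "task_id", "type": "string"},
    {"name": "title", "type": "string"},
    {"name": "status", "type": "string"},
    {"name": "priority", "type": ["null", "int"], "default": null},
    {"name": "created_at", "type": {"type": "long", "logicalType": "timestamp-millis"}}
  ]
}
```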
Confluent Schema Registry provides:

- Centralized storage for schemas, versioned per subject
- Compatibility checks when a new schema version is registered
- A REST API for registering, retrieving, and testing schemas
When a producer sends a message, the Avro serializer:

1. Looks up the schema for the topic's subject in Schema Registry, registering it on first use
2. Receives a unique schema ID back
3. Serializes the record to Avro binary and prepends a small header containing that schema ID
4. Sends the resulting bytes to Kafka
The message payload starts with 5 bytes of metadata:
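In the Confluent wire format those 5 bytes are a magic byte (always 0) followed by the schema ID as a 4-byte big-endian integer. A minimal sketch of parsing that header:

```python
import struct

def parse_header(message: bytes) -> int:
    """Extract the schema ID from a Confluent-framed Kafka message.

    Wire format: 1 magic byte (0) + 4-byte big-endian schema ID,
    followed by the Avro-encoded payload.
    """
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != 0:
        raise ValueError(f"Unknown magic byte: {magic}")
    return schema_id

# Example: a message whose schema was registered under ID 42
payload = struct.pack(">bI", 0, 42) + b"...avro data..."
print(parse_header(payload))  # 42
```

Consumers use the extracted ID to fetch the writer's schema from the registry before deserializing the Avro payload that follows the header.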
Add the required packages to your project:
Or with pip:
Important: Strimzi doesn't include Schema Registry. You need to deploy it separately. We'll use Apicurio Registry, which is Confluent Schema Registry-compatible and works well on Kubernetes.
Create schema-registry.yaml:
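A minimal sketch of such a manifest, assuming the `kafka` namespace and the in-memory Apicurio image (fine for development, not for production); verify the image tag and NodePort against your environment:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: schema-registry
  namespace: kafka
spec:
  replicas: 1
  selector:
    matchLabels:
      app: schema-registry
  template:
    metadata:
      labels:
        app: schema-registry
    spec:
      containers:
        - name: registry
          # In-memory storage: loses schemas on restart (dev only)
          image: apicurio/apicurio-registry-mem:2.5.8.Final
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: schema-registry
  namespace: kafka
spec:
  type: NodePort
  selector:
    app: schema-registry
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 30080
```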
Apply the configuration:
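```bash
kubectl apply -f schema-registry.yaml
```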
Output:
Wait for the pod to be ready:
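A sketch of the wait command, assuming the Deployment's pods carry the label `app: schema-registry` in the `kafka` namespace:

```bash
kubectl wait --for=condition=ready pod \
  -l app=schema-registry -n kafka --timeout=120s
```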
For local development, we use the NodePort URLs. For code running inside Kubernetes, use the internal URLs.
Output:
Output:
Output:
The real power of Schema Registry is compatibility enforcement. You can evolve schemas over time without breaking consumers.
With BACKWARD compatibility, consumers using the new schema can read data written with the old schema.
Safe changes (backward compatible):

- Adding a field with a default value
- Removing an existing field
- Widening a type via Avro's promotion rules (e.g., `int` to `long`)
Unsafe changes (breaks compatibility):

- Adding a required field without a default
- Renaming a field (effectively a delete plus an add)
- Changing a field's type incompatibly (e.g., `int` to `string`)
Original schema (v1):
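A sketch of a v1 record (field names illustrative):

```json
{
  "type": "record",
  "name": "TaskCreated",
  "namespace": "com.example.tasks.events",
  "fields": [
    {"name": "task_id", "type": "string"},
    {"name": "title", "type": "string"}
  ]
}
```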
New schema (v2) - Adding priority:
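The v2 record adds `priority` as an optional field with a default:

```json
{
  "type": "record",
  "name": "TaskCreated",
  "namespace": "com.example.tasks.events",
  "fields": [
    {"name": "task_id", "type": "string"},
    {"name": "title", "type": "string"},
    {"name": "priority", "type": ["null", "int"], "default": null}
  ]
}
```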
This is backward compatible because the new `priority` field is optional with a default of `null`: a consumer using v2 can still read v1 data, and the missing field is simply filled in with the default.
Original schema:
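Again starting from a simple record (illustrative):

```json
{
  "type": "record",
  "name": "TaskCreated",
  "namespace": "com.example.tasks.events",
  "fields": [
    {"name": "task_id", "type": "string"},
    {"name": "title", "type": "string"}
  ]
}
```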
Incompatible change - Adding required field:
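A sketch using a hypothetical `assignee` field, added as a required string with no default. A consumer on the new schema cannot read old data, because old messages have no `assignee` and there is no default to fall back on:

```json
{
  "type": "record",
  "name": "TaskCreated",
  "namespace": "com.example.tasks.events",
  "fields": [
    {"name": "task_id", "type": "string"},
    {"name": "title", "type": "string"},
    {"name": "assignee", "type": "string"}
  ]
}
```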
When you try to register this schema:
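For example, registering against Apicurio's Confluent-compatible REST API (the host, port, and `ccompat` path here are assumptions for the NodePort setup; `new-schema.json` must wrap the Avro schema as a string under a `"schema"` key, per the Confluent API):

```bash
curl -X POST \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data @new-schema.json \
  http://localhost:30080/apis/ccompat/v7/subjects/task-created-value/versions
```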
Output (error):
Schema Registry blocks the incompatible change, preventing production breakage.
Always check compatibility before deploying schema changes:
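The Confluent-compatible API exposes a compatibility-check endpoint that tests a candidate schema without registering it (URL details assumed as above); the response indicates whether the candidate is compatible with the latest registered version:

```bash
curl -X POST \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data @candidate-schema.json \
  http://localhost:30080/apis/ccompat/v7/compatibility/subjects/task-created-value/versions/latest
```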
Output:
You've learned the mechanics of Avro schemas. Now let's design a real schema for the Task API.
Your starting point:
"I need to design event schemas for my Task API. I want to publish task lifecycle events."
Identifying requirements:
Consider what information each event needs:
Initial design attempt:
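A first attempt might look like this (illustrative): one generic record for every event type, with the entity serialized into a stringly-typed blob and a string timestamp:

```json
{
  "type": "record",
  "name": "TaskEvent",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "event", "type": "string"},
    {"name": "data", "type": "string"},
    {"name": "time", "type": "string"}
  ]
}
```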
Evaluating the design:
This schema has problems:
Refining based on production requirements:
A better design separates event metadata from entity data:
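A sketch of that shape (field names illustrative): event-level metadata at the top, with the entity as a typed nested record:

```json
{
  "type": "record",
  "name": "TaskCreated",
  "namespace": "com.example.tasks.events",
  "fields": [
    {"name": "event_id", "type": "string"},
    {"name": "event_timestamp", "type": {"type": "long", "logicalType": "timestamp-millis"}},
    {"name": "task", "type": {
      "type": "record",
      "name": "Task",
      "fields": [
        {"name": "task_id", "type": "string"},
        {"name": "title", "type": "string"},
        {"name": "priority", "type": ["null", "int"], "default": null}
      ]
    }}
  ]
}
```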
What emerged from refinement:
Schema Registry organizes schemas by subject. The default naming strategy (TopicNameStrategy) derives the subject from the topic name: `<topic>-value` for the message value and `<topic>-key` for the message key.

For topic task-created, the value schema is registered under the subject `task-created-value`.
Configure in producer:
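A configuration sketch, assuming the confluent-kafka Python client; the URLs, ports, and the `ccompat` path (Apicurio's Confluent-compatible API) are illustrative assumptions for the local NodePort setup:

```python
# Registry client configuration (illustrative URL)
schema_registry_conf = {
    "url": "http://localhost:30080/apis/ccompat/v7",
}

producer_conf = {
    "bootstrap.servers": "localhost:9094",
    # With confluent_kafka.SerializingProducer you would also wire in:
    #   "value.serializer": AvroSerializer(schema_registry_client, schema_str)
    # which registers/looks up the schema under the subject "<topic>-value"
    # (the default TopicNameStrategy).
}

print(sorted(producer_conf))
```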
Wrap all events in a standard envelope:
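A sketch of the envelope fields (names illustrative); each concrete event record would carry these metadata fields alongside its entity-specific fields:

```json
{
  "type": "record",
  "name": "EventEnvelope",
  "namespace": "com.example.events",
  "fields": [
    {"name": "event_id", "type": "string"},
    {"name": "event_type", "type": "string"},
    {"name": "occurred_at", "type": {"type": "long", "logicalType": "timestamp-millis"}},
    {"name": "producer", "type": "string"}
  ]
}
```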
Don't use unions to represent "any type":
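An anti-pattern sketch, a catch-all union field that can hold almost anything:

```json
{"name": "value", "type": ["null", "string", "long", "double", "boolean"]}
```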
If you need this flexibility, you've lost the schema's contract value.
Flatten when possible, or use separate events for different entity states.
You built a kafka-events skill in Chapter 1. Test and improve it based on what you learned.
Ask yourself:
If you found gaps:
Apply schema design and evolution to your Task API events.
Setup: Open Claude Code or your preferred AI assistant in your project directory.
Prompt 1: Design an Event Schema
What you're learning: Schema design decisions---which fields are essential to the event's meaning (required) versus context that might not always be available (optional with defaults).
Prompt 2: Plan a Schema Evolution
What you're learning: Compatibility analysis---understanding which changes are safe and how to work around the restrictions when you need to add required fields to existing schemas.
Prompt 3: Debug a Compatibility Error
What you're learning: Compatibility debugging---understanding that renaming a field is effectively a delete+add operation, and how to handle migrations that require breaking changes (versioning strategies, new topics, dual-writes).
Safety Note: Schema changes affect all producers and consumers. Always test compatibility in a staging environment before production, and coordinate deployment order based on your compatibility mode (BACKWARD = upgrade consumers first).