© 2026 Muhammad Usman Akbar. All rights reserved.

Delivery Semantics Deep Dive

Your Task API publishes a task.created event. The notification service consumes it and sends an email. But what happens if the consumer crashes after sending the email but before committing the offset? When it restarts, it re-reads the same event---and sends a second email. Your user receives duplicate notifications. Or worse: what if the consumer commits the offset before processing, crashes mid-processing, and the email never sends at all?

These aren't edge cases. In production systems handling millions of events daily, they happen regularly. Network partitions, broker restarts, consumer crashes, and deployment rollouts all create situations where the relationship between message consumption and processing outcome becomes uncertain.

This chapter examines the three delivery semantic guarantees Kafka can provide, the trade-offs each involves, and the practical patterns that make at-least-once---the most common choice---work reliably. By the end, you'll have a decision framework for choosing the right semantic for your use case and the implementation knowledge to build consumers that handle duplicates gracefully.

The Three Delivery Semantics

Every distributed messaging system must answer: how many times will a consumer see each message? The answer depends on where failures can occur and how the system handles them.

At-Most-Once: Fast, But Messages Can Be Lost

With at-most-once delivery, a message is delivered zero or one times. If anything fails, the message is lost rather than redelivered.

How it works:

```text
1. Consumer receives message from Kafka
2. Consumer commits offset BEFORE processing
3. Consumer processes message
4. If processing fails: message is lost (offset already committed)
```

```python
from confluent_kafka import Consumer

consumer = Consumer({
    'bootstrap.servers': 'localhost:30092',
    'group.id': 'analytics-service',
    'enable.auto.commit': True,  # Commits on a timer, often before processing completes
    'auto.commit.interval.ms': 1000
})

consumer.subscribe(['task-events'])

while True:
    msg = consumer.poll(1.0)
    if msg is None:
        continue
    if msg.error():
        continue

    # Offset already committed automatically.
    # If we crash HERE, the message is lost.
    process_analytics(msg.value())  # May never complete
```

Output:

```text
>>> # Consumer starts, processes some messages
Processing analytics for task-123
Processing analytics for task-124
>>> # Consumer crashes during task-125 processing
>>> # Consumer restarts
>>> # task-125 is SKIPPED (offset was already committed)
Processing analytics for task-126
```

When at-most-once is acceptable:

| Use Case | Why Loss Is Tolerable |
| --- | --- |
| Real-time metrics | Individual data points don't affect aggregate accuracy |
| Activity logging | Missing log entries don't break business logic |
| Click tracking | Statistical sampling is acceptable |
| Sensor readings | Redundant sensors provide coverage |

When at-most-once is dangerous:

  • Payment processing (lost payment = revenue loss)
  • Order fulfillment (lost order = angry customer)
  • Audit logging (missing audit = compliance violation)
  • Inventory updates (lost update = stock discrepancy)

At-Least-Once: Safe, But Messages May Duplicate

With at-least-once delivery, every message is delivered one or more times. If anything fails, the message is redelivered rather than lost.

How it works:

```text
1. Consumer receives message from Kafka
2. Consumer processes message
3. Consumer commits offset AFTER processing
4. If crash before commit: message is redelivered (duplicate)
```

```python
from confluent_kafka import Consumer

consumer = Consumer({
    'bootstrap.servers': 'localhost:30092',
    'group.id': 'notification-service',
    'enable.auto.commit': False,  # Manual commit after processing
    'auto.offset.reset': 'earliest'
})

consumer.subscribe(['task-events'])

while True:
    msg = consumer.poll(1.0)
    if msg is None:
        continue
    if msg.error():
        continue

    # Process FIRST
    send_notification(msg.value())

    # Commit AFTER processing succeeds.
    # If we crash before this line, the message is redelivered.
    consumer.commit(message=msg)
```

Output (normal operation):

```text
>>> # Normal processing
Sending notification for task-123
Committed offset 42
Sending notification for task-124
Committed offset 43
```

Output (crash and recovery):

```text
>>> # Consumer processes task-125, sends notification
Sending notification for task-125
>>> # CRASH before commit
>>> # Consumer restarts from last committed offset (43)
Sending notification for task-125   # DUPLICATE
Committed offset 44
Sending notification for task-126
Committed offset 45
```

Why at-least-once is the most common choice:

  1. No data loss: Every message is guaranteed to be processed
  2. Simpler than exactly-once: No transaction coordination required
  3. Duplicates are manageable: Consumers can be made idempotent
  4. Lower latency: No transaction overhead

Exactly-Once: Correct, But Complex and Expensive

With exactly-once delivery, every message is delivered exactly one time---no losses, no duplicates. This requires coordinating the consumer's processing with Kafka's offset commits atomically.

How it works:

```text
1. Consumer receives message from Kafka
2. Consumer begins transaction
3. Consumer processes message (writing results to Kafka or a transactional store)
4. Consumer commits offset AND processing results atomically
5. If anything fails: entire transaction rolls back, message is redelivered
```

```python
from confluent_kafka import Consumer, Producer

consumer = Consumer({
    'bootstrap.servers': 'localhost:30092',
    'group.id': 'order-processor',
    'enable.auto.commit': False,
    'isolation.level': 'read_committed'  # Only read committed messages
})

producer = Producer({
    'bootstrap.servers': 'localhost:30092',
    'transactional.id': 'order-processor-1',
    'enable.idempotence': True
})

producer.init_transactions()
consumer.subscribe(['orders'])

while True:
    msg = consumer.poll(1.0)
    if msg is None:
        continue
    if msg.error():
        continue

    try:
        producer.begin_transaction()

        # Process order and produce result
        result = process_order(msg.value())
        producer.produce('order-results', key=msg.key(), value=result)

        # Commit offset AND produced message atomically
        producer.send_offsets_to_transaction(
            consumer.position(consumer.assignment()),
            consumer.consumer_group_metadata()
        )
        producer.commit_transaction()
    except Exception:
        producer.abort_transaction()
        # Message will be redelivered, but no partial results
```

Output:

```text
>>> # Exactly-once processing
Processing order-789
Transaction committed: order-789 processed, offset committed
>>> # Even if crash occurs mid-transaction, no duplicates or losses
```

The hidden costs of exactly-once:

| Cost | Impact |
| --- | --- |
| Latency | Transaction coordination adds 10-50ms per message |
| Throughput | Fewer messages per second due to coordination overhead |
| Complexity | Transactional code is harder to write and debug |
| Failure modes | Transaction timeouts, zombie fencing, coordinator failures |
| Resource usage | More broker CPU and memory for transaction state |

The Decision Matrix

Which delivery semantic should you choose? The answer depends on your specific requirements:

| Factor | At-Most-Once | At-Least-Once | Exactly-Once |
| --- | --- | --- | --- |
| Data loss acceptable? | Yes | No | No |
| Duplicates acceptable? | N/A | Yes (with idempotency) | No |
| Latency requirement | Lowest | Low | Higher |
| Implementation complexity | Simple | Moderate | High |
| Consumer can be idempotent? | N/A | Yes = use this | N/A |

Decision Framework

Ask these questions in order:

1. Can you lose messages?

  • Yes: Use at-most-once (simplest)
  • No: Continue to question 2

2. Can your consumer handle duplicates?

  • Yes (consumer is idempotent): Use at-least-once
  • No: Continue to question 3

3. Can you make your consumer idempotent?

  • Yes: Use at-least-once + idempotent consumer (recommended)
  • No: Use exactly-once (last resort)
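The three questions can be encoded as a small helper. This is an illustrative sketch of the framework, not any library's API; the function name and parameters are invented for this example:

```python
def choose_delivery_semantic(can_lose_messages: bool,
                             handles_duplicates: bool,
                             can_be_made_idempotent: bool) -> str:
    """Walk the decision framework: answer the three questions in order."""
    if can_lose_messages:
        return "at-most-once"       # Simplest: commit before processing
    if handles_duplicates or can_be_made_idempotent:
        return "at-least-once"      # Recommended: pair with an idempotent consumer
    return "exactly-once"           # Last resort: Kafka transactions

# A payment consumer that can be made idempotent via deduplication keys:
print(choose_delivery_semantic(False, False, True))  # at-least-once
```

Notice that two of the four paths land on at-least-once, which mirrors how often it is the right answer in practice.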

Why idempotent consumers are usually better than exactly-once:

Exactly-once in Kafka only works when:

  • You're reading from Kafka AND writing to Kafka
  • Or you're using a transactional external store that supports Kafka transactions

Most real-world consumers write to databases, call APIs, send emails, or update caches. These operations don't participate in Kafka transactions. Making these consumers idempotent is simpler and more reliable than attempting exactly-once semantics.

Implementing Idempotent Consumers

The key insight: if your consumer can safely process the same message multiple times with the same result, at-least-once becomes as good as exactly-once from a business logic perspective.

Pattern 1: Deduplication with Idempotency Key

Store processed event IDs and check before processing:

```python
from confluent_kafka import Consumer
import redis
import json


class IdempotentConsumer:
    def __init__(self, consumer_config: dict, redis_client: redis.Redis):
        self.consumer = Consumer(consumer_config)
        self.redis = redis_client
        self.ttl_seconds = 86400 * 7  # Keep IDs for 7 days

    def is_duplicate(self, event_id: str) -> bool:
        """Check if we've already processed this event."""
        key = f"processed:{event_id}"
        return self.redis.exists(key) > 0

    def mark_processed(self, event_id: str) -> None:
        """Mark event as processed with TTL."""
        key = f"processed:{event_id}"
        self.redis.setex(key, self.ttl_seconds, "1")

    def process_with_deduplication(self, msg) -> bool:
        """Process message if not already processed."""
        event = json.loads(msg.value().decode())
        event_id = event.get('event_id')

        if not event_id:
            print("Warning: Message missing event_id, processing anyway")
            return self._do_process(event)

        if self.is_duplicate(event_id):
            print(f"Skipping duplicate: {event_id}")
            return True  # Already processed successfully

        # Process the event
        success = self._do_process(event)
        if success:
            self.mark_processed(event_id)
        return success

    def _do_process(self, event: dict) -> bool:
        """Actual processing logic."""
        # Your business logic here
        print(f"Processing: {event.get('event_id', 'unknown')}")
        return True
```

Output:

```text
>>> # First delivery
Processing: evt-123
>>> # Duplicate delivery (after crash/restart)
Skipping duplicate: evt-123
>>> # New message
Processing: evt-124
```

Pattern 2: Database Upsert with Natural Key

Use database constraints to prevent duplicates:

```python
from confluent_kafka import Consumer
import psycopg2
import json


def process_task_event(msg, db_conn):
    """Process task event with upsert for idempotency."""
    event = json.loads(msg.value().decode())

    with db_conn.cursor() as cur:
        # Upsert: insert or update based on task_id.
        # If task_id already exists, this updates instead of duplicating.
        cur.execute("""
            INSERT INTO tasks (task_id, title, status, updated_at)
            VALUES (%(task_id)s, %(title)s, %(status)s, NOW())
            ON CONFLICT (task_id) DO UPDATE SET
                title = EXCLUDED.title,
                status = EXCLUDED.status,
                updated_at = NOW()
        """, {
            'task_id': event['data']['task_id'],
            'title': event['data']['title'],
            'status': event['data'].get('status', 'pending')
        })

    db_conn.commit()
    print(f"Upserted task: {event['data']['task_id']}")
    return True
```

Output:

```text
>>> # First delivery
Upserted task: task-456
>>> # Duplicate delivery
Upserted task: task-456   # Same result, no duplicate row
>>> # Verify in database
SELECT COUNT(*) FROM tasks WHERE task_id = 'task-456';
count: 1
```

Pattern 3: Conditional Processing with Version Check

For state changes, check current state before applying:

```python
from confluent_kafka import Consumer
import json


def process_task_completion(msg, db_conn):
    """Only complete task if it's currently pending."""
    event = json.loads(msg.value().decode())
    task_id = event['data']['task_id']

    with db_conn.cursor() as cur:
        # Only update if task is still pending
        cur.execute("""
            UPDATE tasks
            SET status = 'completed', completed_at = NOW()
            WHERE task_id = %s AND status = 'pending'
            RETURNING task_id
        """, (task_id,))
        result = cur.fetchone()

    db_conn.commit()

    if result:
        print(f"Completed task: {task_id}")
        return True
    else:
        print(f"Task already completed or not found: {task_id}")
        return True  # Still success (idempotent)
```

Output:

```text
>>> # First delivery
Completed task: task-789
>>> # Duplicate delivery
Task already completed or not found: task-789
>>> # Both commits succeed, no duplicate state change
```

Choosing Your Idempotency Strategy

| Strategy | Best For | Complexity | Storage Requirement |
| --- | --- | --- | --- |
| Deduplication key | Any event type, external API calls | Low | Redis/cache for event IDs |
| Database upsert | Create/update operations | Low | Natural key in database |
| Version/state check | State transitions | Medium | State field in database |
| Outbox pattern | Multi-system coordination | High | Outbox table + CDC |

When Each Strategy Fits

Deduplication key (Pattern 1):

  • External API calls (send email, charge payment)
  • Operations without natural idempotency
  • High-volume events where storage is cheap

Database upsert (Pattern 2):

  • CRUD operations on entities with unique IDs
  • Simple inserts or updates
  • When the database is the source of truth

Version/state check (Pattern 3):

  • State machine transitions
  • When duplicate transitions should be no-ops
  • Audit-sensitive operations
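The outbox pattern appears in the table above but has no example in this chapter. Here is a minimal sketch of the write side, assuming a PostgreSQL `outbox` table; the function and table names are illustrative, and a separate relay process (or a CDC tool such as Debezium) would drain the table and publish rows to Kafka:

```python
import json
import uuid


def complete_task_with_outbox(db_conn, task_id: str) -> None:
    """Write the state change and the outgoing event in ONE database
    transaction. A separate relay later publishes outbox rows to Kafka,
    so the event can never exist without the state change (or vice versa)."""
    with db_conn.cursor() as cur:
        # 1. The business state change
        cur.execute(
            "UPDATE tasks SET status = 'completed' WHERE task_id = %s",
            (task_id,)
        )
        # 2. The event, recorded atomically with the change
        cur.execute(
            "INSERT INTO outbox (event_id, topic, payload) VALUES (%s, %s, %s)",
            (
                str(uuid.uuid4()),
                'task-events',
                json.dumps({'event_type': 'task.completed', 'task_id': task_id})
            )
        )
    db_conn.commit()  # Both rows commit together, or neither does
```

The relay side still delivers at-least-once, so downstream consumers should deduplicate on `event_id`; the outbox only removes the "wrote to the database but never published" failure mode.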

Collaborating on Delivery Strategy

When designing your event processing strategy, you might start with a simple question and discover the nuances through exploration.

Your initial question:

"My notification service sends emails when tasks are created. How do I prevent duplicate emails?"

Exploring the problem:

This is actually two separate concerns:

  1. Consumer reliability: Ensuring you don't miss events
  2. Idempotency: Ensuring you don't send duplicate emails

For consumer reliability, use at-least-once (commit after processing). For idempotency, you need to track which emails you've already sent.

Discovering the implementation:

A simple approach uses Redis to track sent notifications:

```python
def send_notification_idempotently(event_id: str, email: str, message: str) -> bool:
    """Send notification only if not already sent."""
    key = f"notification:sent:{event_id}"

    # Check if already sent
    if redis_client.exists(key):
        return True  # Already handled

    # Send the email
    success = email_service.send(email, message)

    if success:
        # Mark as sent with 7-day TTL
        redis_client.setex(key, 86400 * 7, "sent")

    return success
```

Refining the approach:

But what if sending the email succeeds and Redis fails? You'd send the email again on the next attempt. For truly idempotent email sending, you might need:

  1. Email service that accepts idempotency keys (like Stripe's API)
  2. Or database transaction that commits email ID before sending
  3. Or acceptance that rare duplicates are better than complex infrastructure
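Option 2 can be sketched as a "claim before send" pattern. This is a sketch under assumptions: a PostgreSQL `sent_emails` table with a unique constraint on `event_id`, and an `email_service` object passed in by the caller; all names are invented for illustration:

```python
def send_email_with_claim(db_conn, email_service, event_id: str,
                          recipient: str, body: str) -> bool:
    """Claim the event in the database BEFORE sending. The unique constraint
    on event_id means a redelivered message fails to claim, so we skip it."""
    with db_conn.cursor() as cur:
        cur.execute("""
            INSERT INTO sent_emails (event_id, recipient)
            VALUES (%s, %s)
            ON CONFLICT (event_id) DO NOTHING
            RETURNING event_id
        """, (event_id, recipient))
        claimed = cur.fetchone()  # None if another attempt already claimed it
    db_conn.commit()

    if claimed is None:
        return True  # Duplicate delivery: the claim already exists, skip

    # If we crash HERE, the claim exists but no email was sent: the send
    # itself becomes at-most-once. That is the trade-off this variant
    # accepts in exchange for never sending duplicates.
    return email_service.send(recipient, body)
```

Inverting the order (send first, claim second) flips the trade-off back to rare duplicates, which is exactly the Redis approach above.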

What emerged from this exploration:

  • At-least-once + idempotent consumer is the standard pattern
  • Idempotency can be implemented at different layers
  • Perfect idempotency requires careful thought about failure modes
  • Sometimes "good enough" idempotency beats "perfect" complexity

Reflect on Your Skill

You built a kafka-events skill in Chapter 1. Test and improve it based on what you learned.

Test Your Skill

Specification
Using my kafka-events skill, configure a consumer for exactly-once delivery semantics in a payment processing scenario.

Does my skill explain idempotent consumers, transactional reads, and deduplication strategies?

Identify Gaps

Ask yourself:

  • Did my skill explain the difference between at-least-once, at-most-once, and exactly-once?
  • Did it show when each delivery semantic is appropriate?

Improve Your Skill

If you found gaps:

Specification
My kafka-events skill is missing delivery semantics (at-least-once, exactly-once, idempotent consumers).

Update it to include how to achieve exactly-once processing and when the trade-offs are worth it.

Try With AI

Apply these delivery semantics to your own event processing scenarios.

Setup: Open Claude Code or your preferred AI assistant in your Kafka project directory.


Prompt 1: Analyze Your Processing Requirements

Specification
I'm building an order processing system with these events:

  • Order Created: Must create order in database
  • Payment Received: Must update order status and trigger fulfillment
  • Order Shipped: Must send shipping notification email

For each event, help me determine:

  1. Can the consumer be made idempotent? How?
  2. What delivery semantic should I use?
  3. What's the deduplication strategy?
  4. What happens if processing fails halfway?

Show me the consumer implementation for Payment Received with idempotency.

What you're learning: Mapping real business requirements to delivery semantics and implementing appropriate idempotency patterns for each event type.


Prompt 2: Design Idempotency for External API Calls

Specification
My consumer calls an external payment API that doesn't support idempotency keys. The flow is:

  1. Receive payment.requested event
  2. Call external API to charge card
  3. Publish payment.completed event
  4. Commit offset

If my consumer crashes after step 2 but before step 3, I'll charge the card twice when the message is redelivered. How do I make this idempotent?

Consider:

  • I can't modify the external API
  • I have access to PostgreSQL and Redis
  • I need an audit trail of all payment attempts

What you're learning: Handling idempotency for external systems that don't natively support it---a common real-world challenge.


Prompt 3: Evaluate Exactly-Once vs Idempotent Consumer

Specification
My team is debating whether to implement exactly-once semantics for our order processing pipeline. We currently use at-least-once with idempotent consumers.

Our pipeline:

  1. Read from 'orders' topic
  2. Validate order
  3. Write to 'validated-orders' topic
  4. Write to 'inventory-reservations' topic

Arguments for exactly-once:

  • "It's cleaner, no deduplication logic needed"
  • "Kafka supports it natively"

Arguments against:

  • "More complexity"
  • "Higher latency"

Help me analyze this decision. What questions should I ask? What are the hidden costs of each approach for THIS specific use case?

What you're learning: Critical evaluation of exactly-once vs at-least-once trade-offs for a specific architecture, not just theoretical understanding.


Safety Note: Test your idempotency logic by deliberately causing failures (kill consumer mid-processing, simulate network partitions). Idempotency bugs often only surface under failure conditions that are hard to reproduce in development.