Your ChatActor works beautifully for maintaining conversation state. But what happens when you need to process a task that spans hours or days? A customer onboarding flow that waits for document verification. An order fulfillment process that coordinates inventory, payment, and shipping. A multi-step AI analysis pipeline that should resume exactly where it stopped if the server restarts.
Actors excel at stateful entities with identity. But long-running orchestration across multiple services—where you need retries, timeouts, parallel execution, and rollback on failure—calls for a different pattern.
This is where Dapr Workflows shine.
Imagine you're building an AI agent that processes research requests. The workflow looks like:
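A plausible version of this workflow can be sketched as an ordered list of steps. The step names here are illustrative assumptions, chosen to match the narrative (a web search step followed by a human approval step), not a fixed Dapr API:

```python
# Illustrative step sequence for a research-agent workflow.
# Step names are assumptions for illustration only.
RESEARCH_WORKFLOW_STEPS = [
    "validate_request",      # 1. check the research question is well-formed
    "plan_queries",          # 2. break the question into search queries
    "run_web_search",        # 3. gather sources (may take minutes)
    "await_human_approval",  # 4. a human reviews the findings (may take days)
    "generate_report",       # 5. synthesize the approved material
]
```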
Each step might take minutes. The human approval step might take days. What happens if your server restarts between steps 3 and 4?
Without workflows, you'd lose everything. The user would need to start over. With Dapr Workflows, the process resumes exactly where it stopped—the search results are preserved, and the workflow continues waiting for human approval.
This is durable execution: your business logic survives infrastructure failures.
Traditional code runs in memory. When the process dies, everything is lost:
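A minimal sketch of the problem, in plain Python (the function and step names are hypothetical):

```python
# Non-durable, in-memory execution: progress lives only in local
# variables, so a crash between steps loses everything.
def process_order_in_memory():
    results = {}                       # held only in process memory
    results["inventory"] = "reserved"  # step 1
    results["payment"] = "charged"     # step 2
    # If the process dies here, 'results' vanishes: on restart there is
    # no record that steps 1 and 2 ever ran, so the user starts over.
    results["shipping"] = "scheduled"  # step 3
    return results
```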
Durable execution persists progress at each step:
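The same process with checkpointing can be sketched like this. A plain dict stands in for Dapr's external state store (Redis, PostgreSQL, etc.); the names are illustrative:

```python
# Durable execution sketch: each step's result is checkpointed to an
# external store before moving on, so a restarted run resumes where it
# stopped instead of starting over.
def run_step(name):
    return f"{name}-done"

def run_durably(steps, state_store):
    for step in steps:
        if step in state_store:
            continue                        # checkpoint found: step already done
        state_store[step] = run_step(step)  # execute, then persist the result
    return state_store
```

Because the store outlives the process, re-running `run_durably` after a crash skips every step that already has a checkpoint.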
Each checkpoint records what happened. On restart, the workflow engine replays the history to reconstruct state, then continues from where it left off.
Dapr Workflows use event sourcing to persist progress. Instead of storing "current state," they store "sequence of events that led to current state."
When your workflow calls an activity, the engine appends an "activity scheduled" event to the history, executes the activity, then appends an "activity completed" event carrying the result.
On restart, the workflow engine re-runs the workflow code from the beginning, but every activity call that already has a completed event in history returns the recorded result instead of executing again. When replay reaches a call with no recorded result, normal execution resumes from there.
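The record-and-replay mechanism can be illustrated with a toy engine (a simplification of the idea, not the Dapr implementation):

```python
# Toy event-sourced replay engine. History stores completed activity
# results; replay serves cached results until it runs past the recorded
# events, at which point real execution resumes.
class ToyWorkflowEngine:
    def __init__(self, history=None):
        self.history = history or []   # list of (activity_name, result) events
        self.cursor = 0

    def call_activity(self, name, fn):
        if self.cursor < len(self.history):
            recorded_name, result = self.history[self.cursor]
            assert recorded_name == name, "non-deterministic replay!"
            self.cursor += 1
            return result              # replay: return the cached result
        result = fn()                  # first execution: actually run it
        self.history.append((name, result))
        self.cursor += 1
        return result
```

Note the assertion: if the replayed code asks for a different activity than the history recorded, the engine cannot match events to code, which is exactly the determinism failure discussed below.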
This is why the execution is called "durable": the state store (Redis, PostgreSQL, etc.) outlives any single process, so the recorded history survives restarts.
Dapr Workflows run on the Durable Task Framework, the same foundation used by Azure Durable Functions. The pieces fit together like this: your application code registers workflow and activity functions with the workflow runtime via the SDK; the Dapr sidecar hosts the durable task engine, which schedules that work and dispatches it back to your application; and the engine persists each workflow's history to your configured state store.
Key insight: Dapr Workflows use the actor backend for state storage. This means workflow history is persisted through internal, framework-managed actors, so your state store component must have actorStateStore enabled, and workflow instances are distributed across your cluster by the same placement service that distributes actors.
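In practice, that requirement shows up in the state store component definition. A typical Redis-backed component looks roughly like this (a sketch; adjust the name, store type, and connection details for your environment):

```yaml
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: statestore
spec:
  type: state.redis
  version: v1
  metadata:
  - name: redisHost
    value: localhost:6379
  - name: actorStateStore   # required: workflows persist via the actor backend
    value: "true"
```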
Here's where many developers get confused—and where bugs become subtle and hard to debug.
Workflow code must be deterministic. Every time the workflow replays, it must make the same decisions in the same order. Otherwise, replay produces a different execution path than the original, and the workflow engine can't match history events to code execution.
These patterns cause replay failures:

- Reading the current time (e.g., datetime.now()) directly in workflow code
- Generating random numbers or UUIDs inline
- Making network, file, or database calls from the workflow function itself
- Iterating over unordered collections whose order can differ between runs
- Depending on environment variables or mutable globals that change between replays
Think of it this way: the workflow function is a script that gets re-run from the top on every replay. Activity results are the only values that are memoized, so everything else in the script must produce identical decisions every time it runs.
If you need to call an API, generate a random number, or get the current time—do it in an activity. The activity's result gets stored in history, and on replay, the workflow receives that cached result instead of re-executing.
This is the question every Dapr developer faces. Here's the decision framework:
Use Plain State When: you need simple key-value storage (caches, profiles, configuration), operations are independent reads and writes, and there is no per-entity concurrency to coordinate.

Use Actors When: you are modeling many stateful entities with identity (a user session, a device, a shopping cart), each entity needs serialized, single-threaded access to its own state, and interactions are short-lived request/response calls.

Use Workflows When: a process spans multiple steps, services, or long waits, and you need retries, timeouts, parallel fan-out, human-in-the-loop approval, or compensation on failure, with progress that must survive restarts.
Often the best architecture combines both: a workflow orchestrates the long-running process, while its activities call into actors that own the per-entity state. An order workflow, for example, might coordinate inventory and payment actors, compensating each one if a later step fails.
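A toy sketch of the hybrid pattern, with plain Python classes standing in for Dapr actors and a function standing in for the workflow (all names are illustrative):

```python
# An "actor" owning per-entity inventory state.
class InventoryActor:
    def __init__(self):
        self.reserved = 0
    def reserve(self, qty):
        self.reserved += qty
    def release(self, qty):
        self.reserved -= qty

# A workflow-style orchestrator: call the actors step by step, and run
# compensation (undo step 1) if a later step fails.
def order_workflow(inventory, charge_payment, qty):
    inventory.reserve(qty)         # step 1: actor owns the entity state
    try:
        charge_payment()           # step 2: may fail
    except Exception:
        inventory.release(qty)     # compensation: roll back step 1
        return "cancelled"
    return "confirmed"
```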
Does your dapr-deployment skill understand when to recommend workflows versus actors? Test it with a scenario like: "An order service must reserve inventory, charge payment, and schedule shipping. Payment can fail after inventory is already reserved. Should this use actors, workflows, or both?"
If your skill recommends a workflow with compensation logic for the failure case, it understands the pattern.
Prompt 1: Trace Event Sourcing and Replay

What you're learning: The mechanics of event sourcing and replay—understanding that workflow history is a sequence of events, not a snapshot of state.
Prompt 2: Identify Determinism Violations
What you're learning: How to audit workflow code for determinism issues and the pattern of moving non-deterministic operations into activities.
Prompt 3: Design Decision Framework Application
What you're learning: Applying the actors vs workflows decision framework to a realistic scenario, including hybrid patterns where both are needed.
Safety Note: When implementing workflows in production, always test your replay behavior by actually killing and restarting the workflow service mid-execution. Determinism bugs only surface during replay, and they can be subtle—the workflow might run fine normally but fail mysteriously on restart.