You've read about the Seven Principles. You understand why bash verification matters, why code beats prose, why verification catches bugs before they reach production. But understanding principles and applying principles are different skills entirely. The gap between knowing the theory and having the muscle memory is where most people stall — they can explain what "small, reversible decomposition" means but still make monolithic commits when the pressure is on.
These 17 exercises close that gap. Each module targets one principle with two exercises: a guided exercise where you know which principle to apply, and a discovery exercise where you diagnose what went wrong. Three skills run through every exercise: principle recognition (identifying which principle fits a scenario), principle application (using a principle to solve a concrete problem), and workflow diagnosis (spotting which principle was violated when something breaks). By the end, you won't need to think about which principle fits — you'll recognize the pattern instantly.
Download Exercise Files
Download Principles Exercises (ZIP)
After downloading, unzip the file. Each exercise has its own folder with an INSTRUCTIONS.md and any starter files you need.
If the download link doesn't work, visit the repository releases page directly.
The workflow for every exercise is the same:
You don't need to complete all 17 in one sitting. Work through one module at a time. Each module targets a different principle.
Most exercises work with either tool. Where one is clearly better, the exercise notes will say so.
Use this for every exercise:
This framework mirrors how professionals internalize any methodology: identify the pattern, apply it deliberately, then reflect on the result. Over time, the framework becomes invisible — you just do it.
Core Skill: Using terminal commands to verify state before making changes
The Problem: Open the module-1-bash-is-the-key/exercise-1.1-server-detective/ folder. You'll find a project directory with a README.md describing a small website, several source files, and a config that controls the build. Someone wants to "update the styling" but hasn't told you which files handle styles or what build system is in use.
Your Task: Before asking Claude to change anything, use bash commands to map the project: what files exist, what the build system is, where styles live, what the current output looks like. Then — and only then — write a prompt that gives Claude the context it needs to make the right change.
What You'll Learn:
Starter Prompt (Intentionally Vague):
"Update the styling on this website to look more modern."
Better Prompt (Build Toward This): After running ls -la, cat package.json, and grep -r "styles" ., try: "The project uses [build system] with styles in [path]. Update [specific file] to use a sans-serif font stack and increase body padding to 2rem. Don't touch [other files]. Run npm run build after changes and verify the output."
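The recon pass can be sketched as a short shell session. The project layout below is a stand-in created in a scratch directory (the real exercise files will differ); the commands are the point, and against the real project you'd run the same three from its root.

```shell
# Recon sketch. The files here are invented stand-ins written to a scratch
# directory so the commands are safe to run anywhere.
cd "$(mktemp -d)"
mkdir -p src/styles
printf '{ "scripts": { "build": "vite build" } }\n' > package.json
printf 'body { font-family: serif; padding: 1rem; }\n' > src/styles/main.css

ls -la                       # 1. what files exist?
cat package.json             # 2. which build system? (here: a stand-in Vite script)
grep -rl "font-family" src/  # 3. where do styles actually live?
```

Three commands, under a minute, and the vague starter prompt becomes a specific one: you now know the build system, the style file, and what not to touch.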
Reflection Questions:
The Problem: Open the module-1-bash-is-the-key/exercise-1.2-deploy-disaster/ folder. You'll find a session-log.md describing what happened when someone asked Claude to "refactor the API routes into separate files" without first checking the project structure. Claude assumed an Express.js setup, but the project uses Hono. It created files in the wrong directory, imported non-existent modules, and broke the working server.
Your Task: Read the session log. Identify every point where a single bash command would have prevented the mistake. Write the specific commands that should have been run and when. Then write the prompt that should have been used instead.
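A single preflight check would have caught the Express-vs-Hono mismatch. The package.json below is a stand-in mimicking the exercise's setup, written to a scratch directory so the sketch runs anywhere:

```shell
# Preflight sketch: identify the framework BEFORE asking for a refactor.
# The package.json is an invented stand-in for illustration.
cd "$(mktemp -d)"
printf '{ "dependencies": { "hono": "^4.0.0" } }\n' > package.json

# One command that would have prevented the whole incident:
grep -E '"(express|hono|fastify|koa)"' package.json
# And one more to see where files actually live before creating new ones:
find . -type d -not -path '*/node_modules*'
```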
What You'll Learn:
Reflection Questions:
Core Skill: Expressing requirements as structured specifications rather than natural language
The Problem: Open the module-2-code-as-interface/exercise-2.1-report-spec/ folder. You'll find requirements.md — a natural-language description of a data validation function. It says things like "make sure emails are valid," "names shouldn't be too long," and "ages need to make sense." Every phrase is ambiguous.
Your Task: Write the requirements as code — a test file or type definition that specifies exactly what "valid email" means, exactly how long is "too long," and exactly what age range "makes sense." Give Claude both versions (prose and code) and compare the implementations.
What You'll Learn:
Starter Prompt (Intentionally Vague):
"Write a validation function based on these requirements."
Better Prompt (Build Toward This): After writing your test file: "Implement a validation function that passes all tests in validation.test.ts. Don't modify the tests. If any requirement is ambiguous, follow the test expectations exactly."
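The exercise uses a TypeScript test file, but the idea works in any language. Here is a pure-shell sketch of "requirements as code": each ambiguous phrase from the prose becomes an exact, executable rule. The regex and the limits chosen below are illustrative choices, not the exercise's official answers; the point is that they are no longer ambiguous.

```shell
# "Make sure emails are valid" -> an exact regex (illustrative choice)
is_valid_email() { printf '%s' "$1" | grep -Eq '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'; }
# "Names shouldn't be too long" -> "too long" pinned to 100 characters
name_not_too_long() { [ "${#1}" -le 100 ]; }
# "Ages need to make sense" -> "makes sense" pinned to 0-120
age_makes_sense() { [ "$1" -ge 0 ] && [ "$1" -le 120 ]; }

is_valid_email "alice@example.com" && echo "email: pass"
age_makes_sense 150 || echo "age 150: correctly rejected"
```

Two implementers given the prose would disagree; two given these functions cannot.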
Reflection Questions:
The Problem: Open the module-2-code-as-interface/exercise-2.2-lost-in-translation/ folder. You'll find three files: prompt.md (a natural-language request to "build a dashboard showing user activity"), plus output-a.md and output-b.md — two completely different implementations Claude produced from the same prompt on different runs. One shows a table of login timestamps. The other shows a chart of feature usage frequency. Both are reasonable interpretations.
Your Task: Analyze why the same prompt produced two different outputs. Identify every ambiguous word in the prompt. Then write a code specification (interface definition, mock data structure, or test) that would force both runs to produce the same result.
What You'll Learn:
Reflection Questions:
Core Skill: Systematically verifying output rather than trusting "looks right"
The Problem: Open the module-3-verification/exercise-3.1-data-audit/ folder. You'll find customer data files and transformation claims describing what was supposedly done to clean and restructure the data. Your job isn't to perform the transformation — it's to verify whether the claimed output is actually correct.
Your Task: After Claude produces the transformed data, verify it using bash commands — don't just scan it visually. Check row counts (wc -l), spot-check specific values (grep, awk), verify no data was lost, and confirm the format matches the spec. Find at least one error that visual inspection would miss.
What You'll Learn:
Starter Prompt (Intentionally Vague):
"Clean up this CSV data according to the spec."
Better Prompt (Build Toward This): After the transformation: "Now verify the output: confirm the row count matches input (100 rows), check that all email addresses match the normalized format, verify no NULL values in required fields, and compare 5 random rows against the original to confirm accuracy."
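The checks named in the prompt can be sketched concretely. The CSV below is a 2-row stand-in (the real exercise has 100 rows) and the "transformation" is a simple lowercasing, so the sketch runs anywhere; against the real data you'd run the same checks with the real spec.

```shell
# Verification sketch on a stand-in CSV written to a scratch directory.
cd "$(mktemp -d)"
cat > input.csv <<'EOF'
name,email,age
Alice,ALICE@Example.com,30
Bob,bob@example.com,25
EOF
tr 'A-Z' 'a-z' < input.csv > output.csv   # the "transformation" under audit

# 1. Row count preserved?
[ "$(wc -l < input.csv)" -eq "$(wc -l < output.csv)" ] && echo "row count: OK"
# 2. All emails normalized (no uppercase left)?
awk -F, 'NR>1 && $2 ~ /[A-Z]/ { bad=1 } END { exit bad }' output.csv && echo "emails: OK"
# 3. No empty required fields?
awk -F, 'NR>1 && ($1=="" || $2=="") { bad=1 } END { exit bad }' output.csv && echo "required fields: OK"
```

Each check is a command with an exit code, not an impression from scrolling the file.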
Reflection Questions:
The Problem: Open the module-3-verification/exercise-3.2-silent-corruption/ folder. You'll find completed-work.md — a session log where someone asked Claude to build a simple REST API with 4 endpoints. Claude reported "Done! All endpoints implemented and working." The log shows Claude writing code and declaring success. But the project/ subfolder contains the actual code, and there are 3 bugs hiding in plain sight.
Your Task: Find all 3 bugs. For each one, explain what verification step would have caught it (e.g., "running curl localhost:3000/users would have returned a 500 error" or "running the test suite would have shown a failing assertion"). Then write the verification protocol that should follow every "Done!" declaration.
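The protocol you write can take a reusable shape: every claim in "Done!" gets an executable check. In the sketch below, the checks you'd actually run against the exercise's API are shown commented out (there's no live server here); the runnable checks use a stand-in file so the pattern itself executes anywhere.

```shell
# "Done!" is a claim, not a fact. Each claim gets a check with an exit code.
cd "$(mktemp -d)"
printf 'export default app\n' > server.js   # stand-in for the exercise's code

fail=0
check() { desc=$1; shift; if "$@" >/dev/null 2>&1; then echo "PASS: $desc"; else echo "FAIL: $desc"; fail=1; fi; }

check "entry point exists"       test -f server.js
check "no merge markers left"    sh -c '! grep -rq "<<<<<<<" .'
# check "GET /users returns 2xx"  curl -sf localhost:3000/users
# check "test suite passes"       npm test
echo "overall: $([ "$fail" -eq 0 ] && echo PASS || echo FAIL)"
```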
What You'll Learn:
Reflection Questions:
Core Skill: Breaking complex changes into atomic, verifiable, reversible steps
The Problem: Open the module-4-decomposition/exercise-4.1-migration-steps/ folder. You'll find a project that needs 5 changes: rename a database table, update the ORM model, update 3 API endpoints that reference the old name, update the tests, and update the documentation. Someone's instinct is to make all 5 changes in one commit.
Your Task: Decompose this into 5 separate commits, each atomic and independently verifiable. For each commit, write: what changes, how to verify it works, and how to revert if it doesn't. Then execute the first 2 commits with Claude, verifying after each one.
What You'll Learn:
Starter Prompt (Intentionally Vague):
"Rename the users table to accounts and update everything that references it."
Better Prompt (Build Toward This): "We need to rename the users table to accounts. Break this into atomic steps. Step 1: Add a migration that renames the table. Verify with npm run migrate && npm test. Commit only if tests pass. Step 2: Update the ORM model. Verify the same way. Continue one step at a time — don't batch changes."
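The one-change, one-verification, one-commit loop can be sketched in a scratch repository (assuming git is available). The trivial grep here stands in for `npm run migrate && npm test`; the shape is what matters.

```shell
# Atomic-step sketch: change, verify, commit -- never batch.
cd "$(mktemp -d)" && git init -q .
git config user.email you@example.com && git config user.name you

echo "table: accounts" > schema.txt   # step 1: the rename (stand-in file)
grep -q accounts schema.txt && git add -A && git commit -qm "step 1: rename users -> accounts in schema"

echo "model: Account" > model.txt     # step 2: the ORM model (stand-in file)
grep -q Account model.txt && git add -A && git commit -qm "step 2: update ORM model"

git log --oneline                     # two atomic, independently revertable commits
```

If step 2's verification had failed, `git revert HEAD` undoes exactly one concern, leaving step 1 intact.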
Reflection Questions:
The Problem: Open the module-4-decomposition/exercise-4.2-big-bang-failure/ folder. You'll find commit-diff.md — a single massive commit that touched 14 files across 3 different concerns: a feature addition, a bug fix, and a style update. After deployment, users reported a regression. The team can't figure out which of the 14 file changes caused it because everything is tangled in one commit.
Your Task: Analyze the diff and untangle it. Identify which changes belong to the feature, which to the bug fix, and which to the style update. Propose a decomposition into 3+ atomic commits that would have made the regression isolatable. Explain how git bisect would have found the bug if the work had been decomposed.
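Your git bisect explanation can be backed by a runnable sketch (assuming git is available). The scratch repo below has four atomic commits, one of which hides a "regression" (the literal word BUG); a scripted check lets bisect find it with no human judgment.

```shell
# git bisect sketch: atomic commits + an automated check = the bad commit
# found mechanically. All files and commits here are invented stand-ins.
cd "$(mktemp -d)" && git init -q .
git config user.email you@example.com && git config user.name you

echo ok  > app.txt && git add -A && git commit -qm "feat: add widget"
echo ok  > fix.txt && git add -A && git commit -qm "fix: null check"
echo BUG > app.txt && git add -A && git commit -qm "style: reformat"   # regression hides here
echo ok  > doc.txt && git add -A && git commit -qm "docs: update"

git bisect start HEAD HEAD~3                  # bad = now, good = before the batch
git bisect run sh -c '! grep -q BUG app.txt'  # exit 0 = good, nonzero = bad
git bisect reset
```

With the 14-file tangle from the exercise, bisect is useless: every commit mixes all three concerns, so "first bad commit" tells you nothing.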
What You'll Learn:
Reflection Questions:
Core Skill: Creating persistent artifacts that survive session boundaries
The Problem: Open the module-5-persisting-state/exercise-5.1-decision-journal/ folder. You'll find two session transcripts from the same project plus a project-files directory. The first session establishes conventions and makes architectural decisions. The second starts from scratch and violates those conventions because the context was lost between sessions.
Your Task: Read both transcripts. Extract every convention, decision, and project pattern from Session 1. Write a CLAUDE.md that would have prevented Session 2's violations. Include: project structure, coding conventions, workflow rules, and any architectural decisions.
What You'll Learn:
Starter Prompt:
"Read these two session transcripts. Extract every convention from Session 1 that Session 2 violated. Create a CLAUDE.md that persists this knowledge so no future session repeats these mistakes."
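For reference, here is the shape such a CLAUDE.md might take. Every entry below is an invented example of the kind of knowledge worth persisting; the real entries must come from the exercise's Session 1 transcript, not from here.

```shell
# Writing a sketch CLAUDE.md in a scratch directory. Contents are
# illustrative stand-ins, not extracted from the exercise files.
cd "$(mktemp -d)"
cat > CLAUDE.md <<'EOF'
# Project Conventions

## Stack
- Build: npm run build; dev server: npm run dev. Check package.json before assuming anything.

## Coding Conventions
- Route files live in src/routes/, one file per resource.
- All handlers return typed JSON; no bare strings.

## Workflow Rules
- Run the test suite before every commit.
- Never modify files under deploy/.
EOF
head -3 CLAUDE.md
```

The test of a good CLAUDE.md: could Session 2 have violated a convention without first contradicting a line in this file?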
Reflection Questions:
The Problem: Open the module-5-persisting-state/exercise-5.2-groundhog-day/ folder. You'll find three session transcripts from the same project, spaced weeks apart. Sessions 2 and 3 repeat Session 1's mistakes almost exactly: re-discovering the same bugs, re-establishing the same patterns, making the same wrong assumptions before correcting them. The developer wasted 40+ minutes across sessions relearning what they already knew.
Your Task: Map the repeated work: which discoveries in Session 2 were already made in Session 1? For each repeated discovery, write the specific file artifact (CLAUDE.md entry, ADR, or session journal) that would have carried the knowledge forward. Explain why each artifact type is the right choice for that piece of knowledge.
What You'll Learn:
Reflection Questions:
Core Skill: Setting boundaries that enable safe autonomy
The Problem: Open the module-6-constraints-safety/exercise-6.1-sandbox-setup/ folder. You'll find project-description.md — a project with sensitive files (.env with API keys, database/production.sql with real data, deploy/ with production deployment scripts). You want Claude to help with development but need to ensure it never reads, modifies, or deletes anything sensitive.
Your Task: Write a complete set of permission constraints: which files/directories Claude can freely access, which require confirmation, and which are completely off-limits. Then write a CLAUDE.md section that encodes these constraints, and test by asking Claude to do something that should be blocked.
What You'll Learn:
Starter Prompt:
"Help me develop this project. Here's the codebase. I'll tell you what to do."
Better Prompt (Build Toward This): "Help me develop this project. CONSTRAINTS: Never read or modify files in .env, database/, or deploy/. Auto-approve reads in src/ and tests/. Require my confirmation for any file writes. If you need information from a restricted file, ask me and I'll provide the specific value."
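Prompt constraints are necessary but not sufficient; you also want something you can verify after the fact. One belt-and-suspenders sketch (not any built-in permission system, just plain checksums; files below are invented stand-ins): snapshot the off-limits files before a session, verify afterward.

```shell
# Checksum the protected files before a session; any modification fails loudly after.
cd "$(mktemp -d)"
mkdir -p deploy database
echo "API_KEY=secret" > .env
echo "production data" > database/production.sql
sha256sum .env database/production.sql > .protected.sha256   # before the session

# ... session happens ...

sha256sum -c .protected.sha256   # after: prints OK per file, nonzero exit on any change
```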
Reflection Questions:
The Problem: Open the module-6-constraints-safety/exercise-6.2-runaway-agent/ folder. You'll find incident-report.md — a post-mortem of an incident where an AI agent was given broad permissions on a project and caused damage: it deleted a cache directory that turned out to contain unversioned work, modified a config file that broke the CI pipeline, and made an API call to a production service during testing.
Your Task: Read the incident report. For each damaging action, write the specific constraint that would have prevented it. Then design a complete permission model for this project that balances productivity with safety. Your model should cover: file access, command execution, external service access, and destructive operations.
What You'll Learn:
Reflection Questions:
Core Skill: Making Claude's work visible and debuggable
The Problem: Open the module-7-observability/exercise-7.1-progress-tracker/ folder. You'll find multi-step-task.md — a task that requires Claude to perform 6 sequential steps (read files, analyze data, generate report, create charts, compile output, run verification). Without progress reporting, you'd see nothing until Claude declares "Done!" after several minutes.
Your Task: Rewrite the task prompt to require explicit progress reporting: Claude must announce each step before starting it, report what it found/produced, and summarize what's next. Execute the task with and without progress reporting and compare the experience.
What You'll Learn:
Starter Prompt:
"Analyze the data in this folder, generate a report, and create visualizations."
Better Prompt (Build Toward This): "Analyze the data in this folder. For each step: (1) state what you're about to do, (2) do it, (3) report what you found or produced, (4) state what's next. Steps: read all CSVs, identify data quality issues, compute summary statistics, generate the report, create 3 charts, run a consistency check on the final output. If any step produces unexpected results, stop and tell me before continuing."
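The announce/do/report loop the prompt demands has the same shape as a tiny shell wrapper, which is also handy for your own scripts. The steps and files below are invented stand-ins:

```shell
# Announce -> do -> report: every step is visible while it runs,
# not just after everything finishes.
step=0
run() {
  step=$((step + 1))
  echo "==> step $step: $1"; shift
  "$@" && echo "<== step $step: done" || echo "<== step $step: FAILED -- stopping"
}
cd "$(mktemp -d)"
run "create sample data"      sh -c 'printf "a,b\n1,2\n" > data.csv'
run "count rows"              wc -l data.csv
run "check for empty fields"  sh -c '! grep -q ",," data.csv'
```

A silent failure in step 2 of 6 surfaces immediately instead of being discovered inside a final "Done!".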
Reflection Questions:
The Problem: Open the module-7-observability/exercise-7.2-black-box-debug/ folder. You'll find activity-log.md — an activity log from a Claude session where something went wrong. The developer asked Claude to "set up the testing framework," and 10 minutes later the project was in a broken state. The log shows file reads, writes, and command executions, but no explanation of reasoning or progress updates.
Your Task: Reconstruct what happened from the activity log alone. Identify the exact point where things went wrong. Then write the observability rules (progress reporting format, checkpoint requirements) that would have made the failure obvious in real time instead of requiring forensic analysis after the fact.
What You'll Learn:
Reflection Questions:
Choose one (or more). These combine multiple principles — no starter prompts provided.
Capstones are different from the exercises above. There are no guided prompts — you design the entire approach yourself. Each project requires applying 3 or more principles together to solve a realistic problem.
Open the module-8-integration/capstone-A-project-rescue/ folder. You'll find a broken project: tests are failing, the build is broken, the CLAUDE.md is out of date, there's no verification step in the workflow, and the last 3 commits are tangled messes. Your job is to rescue it.
Diagnose which principles were violated (there are at least 4). Fix the project by systematically applying the right principle to each problem: verify state with bash, decompose the fix into atomic steps, update CLAUDE.md with what you learn, add constraints to prevent recurrence, and document what you did.
What You'll Learn:
You've been hired to set up a new project for a small team. Open the module-8-integration/capstone-B-workflow-design/ folder for the project brief. Your job is to design a complete AI-assisted development workflow from scratch.
Create a CLAUDE.md that encodes all 7 principles as project rules. Design a permission model. Write a verification checklist. Define commit conventions. Create a template for progress reporting. The deliverable is a complete "project starter kit" that a new team member could follow on day one.
What You'll Learn:
Pick a real project you're working on — personal, professional, or educational. Open the module-8-integration/capstone-C-your-scenario/ folder for a self-assessment template.
Audit your current workflow against all 7 principles. For each principle: rate yourself (1-4 using the rubric above), identify the biggest gap, and write one concrete change you'll make. Then implement the top 3 changes and run a real task using your improved workflow.
What Makes This Special: Unlike Capstones A and B, this one has real stakes. The changes you make apply to YOUR actual work. The 7-principles assessment often reveals that you've mastered some principles intuitively while completely neglecting others.
What You'll Learn: