James had nine tools spread across four chapters: state tools in Module 9.3, Chapter 3; content tools in Module 9.3, Chapter 4; pedagogy tools in Module 9.3, Chapter 5; code and upgrade tools in Module 9.3, Chapter 6. Each one tested individually, each one working in isolation.
"Time to see if they play together," he said, opening the project directory.
Emma set her coffee down. "Start the server. Call every tool. In order. One bad import and the whole server refuses to start."
You are doing exactly what James is doing. Nine tools, one server, one verification run.
Open a terminal in your tutorclaw-mcp project directory and ask Claude Code:
Claude Code starts the server. Watch the startup output. You should see all nine tool names listed.
What to check:
If you see all nine: move to Step 2.
If a tool is missing or the server fails to start, read the error message. Common problems at this stage:
Import conflicts. Two tool modules import the same helper differently, or a circular import between modules prevents the server from loading. Paste the traceback to Claude Code:
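As an illustration of the usual fix (module and helper names here are hypothetical, not from your project), the shared helper moves into its own module that imports from neither tool module, so both tools pull it from one place and the cycle disappears:

```python
# Before the fix: tools/state_tools.py imported a helper from
# tools/content_tools.py, which imported a constant back from
# tools/state_tools.py -- a circular import that stops the server at startup.
# After the fix, the shared helper lives in its own module:

# tools/shared.py (hypothetical module name)
import json
from pathlib import Path

STATE_FILE = Path("data/state.json")

def load_state() -> dict:
    """Return the shared state, or an empty dict if no state file exists yet."""
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    return {}

# Both tool modules now import from one place and the cycle is gone:
# from tools.shared import load_state
```

Claude Code will usually propose this restructuring on its own once it sees the traceback; the sketch shows what the result tends to look like.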
Missing data directories. A tool tries to read from data/ or content/ but the directory does not exist yet. Claude Code can create the directory structure or add a check that creates it on first run.
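The create-on-first-run check is a few lines. This sketch assumes a hypothetical layout with data/ and content/chapters/; adjust the names to match your project:

```python
from pathlib import Path

# Hypothetical directory layout -- substitute your project's actual paths.
REQUIRED_DIRS = [Path("data"), Path("content/chapters")]

def ensure_directories() -> None:
    """Create any missing data directories so no tool fails on a fresh checkout."""
    for directory in REQUIRED_DIRS:
        directory.mkdir(parents=True, exist_ok=True)  # no-op when it already exists

ensure_directories()  # call once at server startup, before tools are registered
```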
Shared state file locks. Two tools try to open the same JSON file at the same time during registration. This is rare in single-process servers but can happen if the startup sequence initializes state. Describe the behavior to Claude Code and let it restructure the file access.
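The restructuring Claude Code typically lands on is a single accessor pair that opens, reads or writes, and closes in one call, so no file handle is held between tool calls. A minimal sketch, assuming a hypothetical data/state.json path:

```python
import json
from pathlib import Path

STATE_FILE = Path("data/state.json")  # hypothetical path

def read_state() -> dict:
    # Open, read, and close in a single call; no handle stays open between
    # tool calls, so two tools never hold the same file at the same time.
    if not STATE_FILE.exists():
        return {}
    return json.loads(STATE_FILE.read_text())

def write_state(state: dict) -> None:
    STATE_FILE.parent.mkdir(parents=True, exist_ok=True)
    STATE_FILE.write_text(json.dumps(state, indent=2))
```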
Now call every tool in the order a real tutoring session would use them. Ask Claude Code:
Claude Code calls each tool and shows the output. You built each of these tools in Module 9.3, Chapters 3 through 6, so the individual responses should look familiar. What is new here is the sequence: each tool reads or writes state that the next tool depends on. If the chain breaks at any point, you will see it in the output.
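To make the dependency chain concrete, here is a minimal sequence runner (a hypothetical helper, not part of your server) that calls steps in order and names the exact step where the chain breaks. The lambdas are stubs standing in for your real tools:

```python
def run_sequence(steps):
    """Call each (name, fn) step in order, passing the accumulated results so
    later steps can use what earlier steps returned -- like shared state."""
    results = {}
    for name, fn in steps:
        try:
            results[name] = fn(results)
        except Exception as exc:
            # Name the failing step so you see where in the chain it broke.
            raise RuntimeError(f"sequence broke at {name!r}: {exc}") from exc
    return results

# Stubs standing in for the first two real tools:
steps = [
    ("register_learner", lambda r: {"learner_id": "L1"}),
    ("get_learner_state", lambda r: {
        "learner_id": r["register_learner"]["learner_id"],
        "chapter": 1, "stage": "predict", "confidence": 0.5,
    }),
]
print(run_sequence(steps))
```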
If every tool returned a valid response, you are done with the verification. Skip to the closing.
If any tool failed, describe the error to Claude Code:
Give Claude Code the context of where in the sequence the failure happened. A tool that works in isolation might fail when called after another tool has modified the shared state. The sequence matters because each tool depends on what the previous tools wrote.
Common sequence failures:
State not found. get_learner_state fails because register_learner wrote the ID to a different key than get_learner_state expects. This is a contract mismatch between tools built in different chapters.
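A hypothetical sketch of this mismatch (the key names are invented for illustration): the writer and the reader disagree on the key, so state written by one tool is invisible to the next.

```python
_state: dict = {}

def register_learner(name: str) -> dict:
    _state["learnerId"] = "L1"                 # one chapter's tool wrote camelCase...
    return {"learner_id": "L1"}

def get_learner_state(learner_id: str) -> dict:
    return {"id": _state["learner_id"]}        # ...another expects snake_case -> KeyError

# The fix is one shared constant that both modules import:
LEARNER_ID_KEY = "learner_id"
```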
Content directory empty. get_chapter_content returns nothing because the sample content files from Module 9.3, Chapter 4 are missing or in the wrong path. Verify the content/chapters/ directory has your markdown files.
Wrong stage passed. generate_guidance rejects the stage parameter because the tool expects "predict" but the caller sent "Predict" (capitalization mismatch). Small contract issues like this surface only in integration.
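One common fix is to normalize the parameter at the tool boundary instead of rejecting near-misses. In this sketch, only "predict" comes from the chapter; the other stage names are hypothetical stand-ins for whatever your spec defines:

```python
# Stage names other than "predict" are hypothetical -- use your spec's list.
VALID_STAGES = {"predict", "explain", "practice"}

def normalize_stage(stage: str) -> str:
    """Accept 'Predict', 'PREDICT', or ' predict ' instead of rejecting them."""
    normalized = stage.strip().lower()
    if normalized not in VALID_STAGES:
        raise ValueError(
            f"unknown stage {stage!r}; expected one of {sorted(VALID_STAGES)}")
    return normalized

print(normalize_stage("Predict"))  # -> predict
```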
After each fix, run the full sequence again from the beginning. The goal is one clean run through all nine tools.
What you are learning: A product does not control the order users call tools. The agent might call assess_response before generate_guidance if the user jumps ahead. Each tool should handle unexpected sequences gracefully, returning an error or a sensible default rather than crashing.
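Graceful handling can be as simple as returning a structured error the agent can relay to the user. A sketch of the idea, with a hypothetical signature and state shape (your actual assess_response will differ):

```python
def assess_response(learner_id: str, answer: str, state: dict) -> dict:
    """Hypothetical sketch: when assess_response arrives before any guidance
    exists, return a structured error the agent can relay, not a traceback."""
    if "last_prompt" not in state:
        return {"ok": False,
                "error": "no active prompt; call generate_guidance first"}
    return {"ok": True, "assessment": f"evaluated answer of {len(answer)} chars"}
```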
What you are learning: The spec from Module 9.3, Chapter 2 is the contract. Integration testing is where you verify the implementation honors that contract. Checking spec against behavior is the core of the describe-steer-verify workflow. If the behavior drifts from the spec, the spec or the code needs to change.
What you are learning: The connection pattern is identical to Module 9.2: start the server, run openclaw mcp set, restart the gateway, check the dashboard. Planning the next step before executing it reinforces the workflow and helps you anticipate problems. The WhatsApp message you choose determines which tool the agent calls first.
James ran the sequence. register_learner returned a learner_id. get_learner_state showed chapter 1, predict stage, 0.5 confidence. get_chapter_content returned the first chapter. generate_guidance produced a prediction prompt. assess_response evaluated his test answer. update_progress recorded the interaction. get_exercises pulled practice problems. submit_code ran his print statement and returned the output. get_upgrade_url gave him a placeholder URL.
"All nine. Working together."
Emma looked at the terminal output. "When I built my first nine-tool server, I had three circular import errors and a shared state race condition. Yours started on the first try because Claude Code structured the modules properly." She paused. "I spent an afternoon untangling mine. You spent twenty minutes verifying yours."
"So it is done?"
"Locally." Emma closed her laptop halfway. "In Module 9.3, Chapter 8, you connect this to OpenClaw and test from your phone. That is a different kind of test. Your tools work when you control the input. When the agent decides which tool to call based on a WhatsApp message, you find out if your tool descriptions are good enough."
James looked at his terminal. Nine tools, nine responses, zero errors. "I will take that test."