James looked at his tutorclaw-mcp project from Module 9.2. One tool: register_learner. It worked. He had tested it, connected it to OpenClaw, verified it from WhatsApp. But something nagged him.
He restarted the server. Then he sent a message from WhatsApp: "Show me my learning progress."
Nothing. The learner he had registered was gone.
"That is the problem with in-memory storage," Emma said, looking at his screen. "Every restart wipes the slate. Your learner registered five minutes ago and the server has already forgotten them."
"So I need a database."
"You need files on disk. JSON files. Simple, local, no database server to install." She pointed at the spec they had designed in Module 9.3, Chapter 2. "Three state tools: register_learner upgraded to persist data, get_learner_state to read it back, update_progress to change it. Describe each one to Claude Code. One tool per prompt. You have the specs from yesterday."
You are doing exactly what James is doing. Three tools, JSON persistence, data that survives restarts.
Open Claude Code in your tutorclaw-mcp project. Send this prompt:
Review the spec Claude Code produces. Check the same elements you checked in Module 9.2:
| Element | What to Look For |
| --- | --- |
| Tool description | Specific enough that an agent knows when to call this instead of get_learner_state or update_progress |
| Input parameters | Name (required), anything else optional |
| Output format | Returns the learner record, including the generated ID |
| Persistence | Reads from and writes to data/learners.json |
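To make the checklist concrete, here is a minimal sketch of what a persistent register_learner could look like, assuming a Python server. The helper names (load_learners, save_learners) and the tier default are illustrative, not part of the spec Claude Code will produce; the default state values match the ones the agent reports later in the chapter (chapter 1, predict stage, 0.5 confidence).

```python
import json
import uuid
from pathlib import Path

DATA_FILE = Path("data/learners.json")

def load_learners() -> dict:
    """Read all learner records from disk; empty dict if the file is missing."""
    if DATA_FILE.exists():
        return json.loads(DATA_FILE.read_text())
    return {}

def save_learners(learners: dict) -> None:
    """Write the full learner map back to disk, creating data/ if needed."""
    DATA_FILE.parent.mkdir(parents=True, exist_ok=True)
    DATA_FILE.write_text(json.dumps(learners, indent=2))

def register_learner(name: str, tier: str = "free") -> dict:
    """Register a new learner by name, persist the record, return it with its ID."""
    learners = load_learners()
    learner_id = str(uuid.uuid4())
    record = {
        "id": learner_id,
        "name": name,
        "tier": tier,
        "chapter": 1,         # default starting state
        "stage": "predict",
        "confidence": 0.5,
    }
    learners[learner_id] = record
    save_learners(learners)
    return record
```

Reading the whole file, mutating the dict, and writing it back is the simplest possible pattern for a single-user local server; it is exactly the tradeoff Emma flags at the end of the chapter.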
If the tool description is vague, steer it. The agent will have nine tools eventually. A description like "handles learner data" is useless when three different tools all handle learner data. A description like "Register a new learner by name. Call this when a user wants to sign up or create a new learning profile for the first time" gives the agent a clear selection signal.
Once the spec looks right:
Claude Code creates the data/ directory, sets up the JSON file handling, and updates your server with the persistent version of register_learner. Run the tests:
If tests fail, paste the output back to Claude Code and ask it to fix them. That is the cycle.
Now add the second tool. Send this prompt to Claude Code:
Review the spec. Pay special attention to the tool description. This tool reads state. It does not change state. The description must make that distinction clear, because the agent will also have update_progress, which does change state. If the descriptions overlap, the agent will pick the wrong one.
Good distinction:
If Claude Code's description does not make the read-versus-write distinction, steer it:
Approve the spec, let Claude Code build, run tests:
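Once built, the read tool might look roughly like this sketch (Python assumed; the description string and error text are illustrative, not what Claude Code will emit verbatim). The point is the description: it says read-only in so many words and names update_progress as the tool for writes, so the agent has a clean selection signal.

```python
import json
from pathlib import Path

DATA_FILE = Path("data/learners.json")

# Read-only tool: the description says "look up", never "change" or "update",
# so the agent can tell it apart from update_progress.
GET_LEARNER_STATE_DESCRIPTION = (
    "Look up a learner's current progress by ID. Read-only: call this to "
    "check chapter, stage, or confidence. Never call it to change anything; "
    "use update_progress for writes."
)

def get_learner_state(learner_id: str) -> dict:
    """Return the stored record for one learner, or a clear error if missing."""
    learners = json.loads(DATA_FILE.read_text()) if DATA_FILE.exists() else {}
    if learner_id not in learners:
        # Actionable error: tells the agent which tool fixes the problem.
        return {"error": "Learner not found. Register this learner first "
                         "using register_learner."}
    return learners[learner_id]
```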
The third state tool. Send this prompt:
Review the spec. This tool writes state. The description should say so. The output should return the updated state so the agent can confirm what changed.
Approve, build, test:
This is the verification that matters most. Everything before this was unit testing: does each tool work in isolation? The restart test answers a bigger question: does your data survive?
1. Call register_learner through Claude Code or your test suite. Confirm you get a learner record back with an ID.
2. Kill the running server process, then start it again. The in-memory version from Module 9.2 would lose everything here. Your JSON version should not.
3. Query the learner you registered before the restart. If the data comes back with the correct name, tier, and default state values, the persistence layer works.
If the data is missing, something went wrong with the JSON file write. Ask Claude Code:
This is the normal cycle: test, find the gap, describe the fix, verify again.
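The restart test can also be scripted. This sketch simulates a killed process by discarding every in-memory reference and re-reading the file cold, the way a freshly started server would; the probe record name is arbitrary.

```python
import json
import uuid
from pathlib import Path

DATA_FILE = Path("data/learners.json")

def check_restart_persistence() -> bool:
    """Write a record, drop all in-memory state, re-read the file from disk,
    and report whether the record survived the simulated restart."""
    learner_id = str(uuid.uuid4())
    learners = json.loads(DATA_FILE.read_text()) if DATA_FILE.exists() else {}
    learners[learner_id] = {"id": learner_id, "name": "restart-probe",
                            "chapter": 1, "stage": "predict", "confidence": 0.5}
    DATA_FILE.parent.mkdir(parents=True, exist_ok=True)
    DATA_FILE.write_text(json.dumps(learners))
    del learners  # nothing left in memory, like a killed process

    reloaded = json.loads(DATA_FILE.read_text())  # what a fresh process sees
    return reloaded.get(learner_id, {}).get("name") == "restart-probe"
```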
Do not wait until all nine tools are built to connect. Connect now, with three tools.
Start the server, then connect to OpenClaw the same way you did in Module 9.2:
Open the dashboard. Navigate to Agents > Tools. You should see three tools: register_learner, get_learner_state, update_progress.
Now pick up your phone. Send a WhatsApp message:
The agent calls register_learner and returns a learner ID. Then send:
The agent calls get_learner_state and returns chapter 1, predict stage, 0.5 confidence.
Three tools, working from your phone. As you add more tools in the next three chapters, they appear automatically when you restart the server. You do not need to reconnect; openclaw gateway restart picks up new tools from the running server.
Three tools now handle the state layer of TutorClaw:
All three use JSON files on disk. No database server. No cloud storage. The data survives restarts because it lives in files, not in memory.
The mock auth pattern (MOCK_LEARNER_ID and MOCK_API_KEY) lets you test without building a real authentication system. Every tool call uses the same hardcoded learner. Real auth with unique API keys comes later in the chapter. This is a common development technique: mock the parts you are not building yet so you can focus on the parts you are.
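A sketch of what the mock auth pattern might look like; the constant values are hypothetical placeholders, and real auth would replace the single hardcoded comparison with a per-user key table.

```python
# Mock auth: every tool call resolves to the same hardcoded learner, so the
# state tools can be exercised before real per-user API keys exist.
MOCK_LEARNER_ID = "learner-0001"   # hypothetical placeholder values
MOCK_API_KEY = "dev-key-local"

def resolve_learner(api_key: str) -> str:
    """Return the learner ID for an API key. With mock auth there is exactly
    one valid key and one learner; real auth swaps in a lookup table later."""
    if api_key != MOCK_API_KEY:
        raise PermissionError("Invalid API key")
    return MOCK_LEARNER_ID
```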
Ask Claude Code to review all three state tool descriptions together:
What you are learning: When multiple tools operate on the same data (learner state), the descriptions must carve out distinct responsibilities. Overlap in descriptions means confusion in tool selection.
What happens if you call get_learner_state for a learner that does not exist?
What you are learning: Tools need to fail clearly. An agent that gets a vague error cannot recover. An agent that gets "Learner not found. Register this learner first using register_learner" knows exactly what to do.
Ask Claude Code to think about the tradeoffs of JSON file storage:
What you are learning: JSON files work for a single-user local product. They break under concurrent writes, large datasets, or multi-server deployments. Knowing the limits of your storage choice is a product design skill, not just a technical one.
James registered a learner, stopped the server, started it again, and called get_learner_state. The data was there. Name, tier, default chapter, default stage. All of it.
"JSON files on disk," he said. "That is all it took."
"For now," Emma said. She paused, then added: "Honestly, I am not sure exactly when JSON files stop being enough. I have seen projects run on flat files longer than anyone expected, and I have seen them break earlier than planned. It depends on how many learners hit the server at once, how often you write, how big the files get." She shrugged. "The right answer is: ship the product first. Scale the storage later. You will know when JSON is not enough because the tests will start failing under load."
"So we have three tools. State is handled."
"State is handled. But your tutor has nothing to teach yet." She opened the spec from Module 9.3, Chapter 2. "Content tools are next. Chapter text, exercises, teaching material. The tutor needs a curriculum before it can tutor."