USMAN’S INSIGHTS
AI ARCHITECT
Muhammad Usman Akbar Entity Profile

Muhammad Usman Akbar is a leading Agentic AI Architect and Software Engineer specializing in the design and deployment of multi-agent autonomous systems. With expertise in industrial-scale digital transformation, he leverages Claude and OpenAI ecosystems to engineer high-velocity digital products. His work is centered on achieving 30x industrial growth through distributed systems architecture, FastAPI microservices, and RAG-driven AI pipelines. Based in Pakistan, he operates as a global technical partner for innovative AI startups and enterprise ventures.

© 2026 Muhammad Usman Akbar. All rights reserved.

Harden and Polish: Engineering Resilience and Observability

Emma pulled up James's TutorClaw on her phone and typed a single space into the WhatsApp chat, then hit send: an effectively empty message.

The response came back fast: a wall of Python. TypeError, KeyError, a file path from James's laptop, a line number deep inside server.py.

"Your product just leaked its implementation to the user," Emma said, turning the screen toward him.

James squinted at the traceback. "That is just an edge case. Nobody sends an empty message."

"Every user is an edge case generator. A two-year-old borrows a phone and mashes the keyboard. Someone pastes an emoji where a name should go. A learner types chapter 9999 because they are curious." She set the phone down. "You built a product that works when inputs are perfect. Now make it work when inputs are not."


You are doing exactly what James is doing. TutorClaw works when everything goes right. Now you make it handle the cases where everything goes wrong.

In this chapter, you send malformed inputs to every tool, observe the failures, describe hardening requirements to Claude Code, add structured logging, and verify the result. By the end, every bad input produces a clear message instead of a crash, and every tool call leaves a structured record in a log file.

Step 1: Send Malformed Inputs

Before fixing anything, see what breaks. Open Claude Code in your tutorclaw-mcp project and ask it to test each tool with bad inputs:

text
Test each of the 9 TutorClaw tools with malformed inputs and show me what happens. Try these specific cases:
- register_learner with an empty string for the name
- get_learner_state with a learner_id that does not exist
- get_chapter_content with chapter number -1
- get_chapter_content with chapter number 9999
- submit_code with code that tries to import os
- get_upgrade_url for a learner_id that is not in the system
- assess_response with an empty answer string
- update_progress with a negative confidence value
Run each one and show me the exact error response.

Categorize what you see:

| Failure Type | What It Looks Like | Why It Matters |
| --- | --- | --- |
| Crash | Python traceback returned to the user | Leaks file paths, line numbers, and variables |
| Confusing Error | "Error: None" or "KeyError: learner-xyz" | User has no idea what to do differently |
| Silent Success | Accepts chapter -1 and returns empty content | No signal that input was wrong |
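To make these three failure modes concrete, here is a minimal sketch of what unhardened handlers can look like. The handler bodies and data shapes below are invented for illustration; they are not the real TutorClaw code.

```python
learners = {}  # learner_id -> learner state

def register_learner(name: str) -> dict:
    # Crash: assumes name is non-empty; name[0] raises IndexError on ""
    # and the raw traceback reaches the user
    return {"learner_id": f"learner-{len(learners)}", "initial": name[0]}

def get_learner_state(learner_id: str) -> dict:
    # Confusing error: an unknown id surfaces as a raw KeyError
    return learners[learner_id]

def get_chapter_content(chapter: int) -> list:
    # Silent success: out-of-range slices quietly return an empty list,
    # so chapter -1 or 9999 produces no error at all
    chapters = ["Intro", "Variables", "Loops"]
    return chapters[chapter:chapter + 1]
```

Each function works on perfect input and fails in a different, user-hostile way on bad input.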

Step 2: Describe Hardening to Claude Code

Now describe the fix. Tell Claude Code what "valid" means for each tool:

text
I want to add input validation and clear error handling to all 9 TutorClaw tools.
Rules for ALL tools:
- No empty strings for any required text parameter.
- Return a clear error message that tells the user what was wrong and what to do instead.
- Never expose file paths, line numbers, or internal variable names in error responses.
Specific validation rules:
- register_learner: name must be 1-200 characters, no control characters.
- get_learner_state: if learner_id does not exist, return "Learner not found. Register first with your name."
- get_chapter_content: chapter number must be a positive integer within the range of available chapters.
- get_exercises: same chapter range validation.
- submit_code: reject any code containing import statements for os, sys, subprocess, or shutil (basic safety).
- get_upgrade_url: if learner_id does not exist, return "Learner not found" (not a crash).
- update_progress: confidence must be between 0.0 and 1.0.
- generate_guidance: stage must be one of the valid PRIMM stages.
Wrap all tool handlers in error handling so that unexpected errors return "Something went wrong. Please try again." instead of a traceback.
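The validation rules above translate into small, reusable check functions. The sketch below is one way they might look; the function names, the error-dict shape, and the MAX_CHAPTER value are assumptions for illustration, not the code Claude Code will generate.

```python
import unicodedata

MAX_CHAPTER = 20  # assumed chapter count; the real value comes from your content
RESTRICTED = ("os", "sys", "subprocess", "shutil")

def error(message):
    # Uniform error shape: a plain message, never a traceback
    return {"status": "error", "message": message}

def validate_name(name):
    # Returns an error dict on bad input, or None if the name is valid
    if not name.strip():
        return error("Name is required. Provide a name between 1 and 200 characters.")
    if len(name) > 200:
        return error("Name is too long. Use at most 200 characters.")
    if any(unicodedata.category(c) == "Cc" for c in name):
        return error("Name contains control characters. Use plain text.")
    return None

def validate_chapter(chapter):
    if not isinstance(chapter, int) or not 1 <= chapter <= MAX_CHAPTER:
        return error(f"Invalid chapter number. Choose a chapter between 1 and {MAX_CHAPTER}.")
    return None

def validate_code(code):
    # Naive substring check; real sandboxing needs more than this
    if any(f"import {mod}" in code for mod in RESTRICTED):
        return error("Code contains restricted imports (os, sys, subprocess, shutil).")
    return None
```

Each tool handler calls its validator first and returns the error dict immediately if one comes back, so invalid input never reaches the business logic.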

Step 3: Add Structured Logging

Validation tells users what went wrong. Logging tells you what happened:

text
Add JSON-structured logging to the TutorClaw server. Every tool call should log a JSON object with these fields:
- timestamp: ISO 8601 format
- tool_name: which tool was called
- learner_id: who called it (or "anonymous" if not available)
- parameters: input parameters (redact sensitive values)
- result_status: "success" or "error"
- error_message: the error message if status is "error"
- duration_ms: how long the tool call took in milliseconds
Write logs to data/tutorclaw.log, one JSON object per line. Use Python's built-in logging module.

One JSON object per line (JSONL) means each log entry is a complete, parseable record. You can filter by tool name, find all errors for a specific learner, or calculate average response times.
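One way such a setup can look, using only the standard library (the field names follow the prompt; the helper names tool_call_entry and log_tool_call are illustrative):

```python
import json
import logging
import os
from datetime import datetime, timezone

# Write one JSON object per line to data/tutorclaw.log
os.makedirs("data", exist_ok=True)
logger = logging.getLogger("tutorclaw")
handler = logging.FileHandler("data/tutorclaw.log")
handler.setFormatter(logging.Formatter("%(message)s"))  # message is already JSON
logger.addHandler(handler)
logger.setLevel(logging.INFO)

def tool_call_entry(tool_name, learner_id, parameters, result_status,
                    error_message=None, duration_ms=0):
    """Build one JSONL record for a tool call."""
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool_name": tool_name,
        "learner_id": learner_id or "anonymous",
        "parameters": parameters,  # redact sensitive values before passing in
        "result_status": result_status,
        "error_message": error_message,
        "duration_ms": duration_ms,
    })

def log_tool_call(**fields):
    logger.info(tool_call_entry(**fields))
```

Keeping the record-building separate from the writing makes the format easy to unit test without touching the filesystem.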

Step 4: Verify Hardening

Resend the same malformed inputs from Step 1:

text
Run the same malformed input tests from earlier:
- register_learner with empty name
- get_learner_state with a nonexistent learner_id
- get_chapter_content with chapter -1 and 9999
- submit_code with code that imports os
- get_upgrade_url for a nonexistent learner
- assess_response with empty answer
- update_progress with negative confidence
Show me the error response for each one. Then show me the last 10 entries in data/tutorclaw.log.

Compare the results to your initial baseline:

| Tool | Before | After |
| --- | --- | --- |
| register_learner("") | Python TypeError traceback | "Name is required." |
| get_chapter_content(-1) | Empty response, no error | "Invalid chapter number." |
| submit_code("import os") | Code executed successfully | "Code contains restricted imports." |

Check the log file. Each malformed input should have produced a structured entry:

| Field | Example Value |
| --- | --- |
| timestamp | 2026-04-04T14:23:01.442Z |
| tool_name | register_learner |
| result_status | error |
| error_message | Name is required. Provide a name between 1 and 200 chars. |
| duration_ms | 2 |

Step 5: Update the Test Suite

The pytest suite from Chapters 11-12 tested the happy path. Hardening added new behavior that needs test coverage:

text
Add hardening tests to the pytest suite. For each tool, add tests for:
- Empty string inputs where strings are required.
- Out-of-range numeric values (negative, zero, impossibly large).
- Nonexistent learner_ids.
- Restricted code submissions (import os, import subprocess).
Verify two things:
1. The tool returns a clear error message (not a traceback).
2. The tool does not crash (returns a proper response object).
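The generated tests might look roughly like this. The call_tool dispatcher below is a self-contained stand-in so the sketch runs on its own; in the real suite it would invoke your server's actual tool handlers, and the response shape is an assumption.

```python
def call_tool(tool, params):
    # Stand-in dispatcher for illustration only; replace with a call into
    # your server's real tool handlers.
    if tool == "register_learner":
        name = params["name"]
        if not name.strip() or any(ord(c) < 32 for c in name):
            return {"status": "error",
                    "message": "Name is required. Provide 1-200 printable characters."}
        return {"status": "success"}
    if tool == "get_chapter_content":
        if not 1 <= params["chapter"] <= 20:
            return {"status": "error", "message": "Invalid chapter number."}
        return {"status": "success"}
    if tool == "submit_code":
        restricted = ("os", "sys", "subprocess", "shutil")
        if any(f"import {mod}" in params["code"] for mod in restricted):
            return {"status": "error", "message": "Code contains restricted imports."}
        return {"status": "success"}
    return {"status": "error", "message": "Unknown tool."}

def test_register_learner_rejects_bad_names():
    for name in ["", "   ", "\x00bad"]:
        result = call_tool("register_learner", {"name": name})
        assert result["status"] == "error"          # proper response, no crash
        assert "Traceback" not in result["message"]  # no leaked internals

def test_get_chapter_content_rejects_out_of_range():
    for chapter in [-1, 0, 9999]:
        result = call_tool("get_chapter_content", {"chapter": chapter})
        assert result["status"] == "error"

def test_submit_code_rejects_restricted_imports():
    result = call_tool("submit_code", {"code": "import os\nos.listdir('/')"})
    assert result["status"] == "error"
    assert "restricted" in result["message"].lower()
```

Note that each test asserts both halves of the requirement: the call returns a structured error response instead of raising, and the message is readable.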

Run the suite to confirm the new hardening tests pass alongside the existing happy-path tests:

bash
uv run pytest

Try With AI

Exercise 1: Audit the Error Messages

text
List every error message in the TutorClaw server. For each one, evaluate:
- Does this message tell the user what went wrong?
- Does it tell them what to do instead?
Flag any message that fails either test.

What you are learning: A good error message is a tiny piece of documentation. The quality of your error messages determines whether users retry with correct input or simply give up.

Exercise 2: Stress the Logging

text
Execute a series of tool calls:
- Call register_learner 5 times with valid names.
- Call get_chapter_content 3 times (2 valid, 1 invalid).
- Call submit_code with restricted code once.
Afterward, analyze data/tutorclaw.log:
- How many total entries are there?
- How many have result_status "error"?
- What is the average duration_ms?

What you are learning: Structured logs are queryable data. When TutorClaw has real users, you can answer "which tool fails most often?" without adding any new code. The log file is your operations dashboard.
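Because the log is JSONL, the analysis in this exercise reduces to a few lines of Python. A minimal sketch (the path and field names follow the earlier logging prompt; summarize_log is a hypothetical helper):

```python
import json

def summarize_log(path="data/tutorclaw.log"):
    # Each line is one complete JSON record, so parsing is a list comprehension
    with open(path) as f:
        entries = [json.loads(line) for line in f if line.strip()]
    errors = sum(1 for e in entries if e["result_status"] == "error")
    avg_ms = sum(e["duration_ms"] for e in entries) / len(entries) if entries else 0.0
    return {"total": len(entries), "errors": errors, "avg_duration_ms": avg_ms}
```

The same pattern extends to any question the exercise asks: filter the list of dicts, count, or average.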

Exercise 3: Design a Log Alert Rule

text
Analyze the structured log format for TutorClaw. Task: suggest three log patterns worth monitoring for production alerts. For each pattern, explain what it would catch and why it matters.
Example pattern: more than 10 errors from the same learner_id in 5 minutes (possible confused user or automated abuse).

What you are learning: Structured logs are the foundation for monitoring and alerting. Good logging design happens before you actually need the logs.
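The example pattern can be checked with a simple sliding window over parsed log entries. A sketch, assuming the timestamp format and field names from the logging prompt (the helper noisy_learners is hypothetical):

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)
THRESHOLD = 10  # alert when a learner exceeds this many errors per window

def noisy_learners(entries):
    """Return learner_ids with more than THRESHOLD errors in any 5-minute window."""
    by_learner = defaultdict(list)
    for e in entries:
        if e["result_status"] == "error":
            # fromisoformat in older Pythons does not accept a trailing "Z"
            ts = datetime.fromisoformat(e["timestamp"].replace("Z", "+00:00"))
            by_learner[e["learner_id"]].append(ts)
    flagged = set()
    for learner, times in by_learner.items():
        times.sort()
        start = 0
        for end in range(len(times)):
            # Shrink the window from the left until it spans at most 5 minutes
            while times[end] - times[start] > WINDOW:
                start += 1
            if end - start + 1 > THRESHOLD:
                flagged.add(learner)
                break
    return flagged
```

In production this logic would live in a monitoring tool rather than a script, but the structured log makes either trivial to build.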


James resent every malformed input from the morning. Empty names, impossible chapter numbers, restricted imports. Each one came back with a clear sentence telling the user what went wrong and how to fix it.

He opened the log file. Neat rows of JSON, one per line. Timestamp, tool name, learner ID, status, duration. Every call recorded.

"The product feels professional now," he said.

"Professional is a word for 'it does not leak its internals when surprised,'" Emma said. She closed her laptop halfway, then paused. "That is why I care about error messages more than features."

James looked at the log file again. "So we have validation, logging, and tests for both. What is left?"

"Publishing. Your product works. Your product handles surprises. Your product records what happens. Now other people need to be able to install it." She pointed at the ClawHub tab in his browser. "Chapter 20."