The Great Inversion answered one question and raised another. James now understood that TutorClaw's operating cost was $50-70 per month because the learner provides their own LLM. But that raised an obvious follow-up.
"If my learners are paying for their own tokens," James said, "then I need to know what that costs them. If the cheapest option is ten dollars a day, the inversion looks good on my balance sheet but terrible for adoption."
Emma pulled up a pricing table. "Look at the rightmost column first. Then look at the leftmost. Tell me the ratio."
You are doing exactly what James is doing. The Great Inversion shifts LLM costs to the learner. Now you need to understand the range of what those costs look like, because it directly affects whether learners will use your product.
Claude Sonnet output costs $15 per million tokens. GPT-5 Nano costs $0.40 per million tokens. That is a 37x difference across the practical range most learners will choose from. Claude Opus sits above this range at $75/M (187x compared to Nano), but few learners use Opus for daily tutoring.
This table is the terrain your learners navigate. A learner using Claude Opus pays 300x more per input token than a learner using GPT-5 Nano. On output tokens (which dominate tutoring), the practical range is 37x.
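The ratios above fall straight out of the quoted prices. A quick sketch, using the per-million-token output rates stated in this section (treat them as illustrative snapshots; provider pricing changes over time):

```python
# Output-token prices quoted in the text, in USD per million tokens.
OUTPUT_PRICE = {
    "claude-opus": 75.00,
    "claude-sonnet": 15.00,
    "gpt-5-nano": 0.40,
}

def ratio(expensive: str, cheap: str) -> float:
    """Price ratio between two models' output tokens."""
    return OUTPUT_PRICE[expensive] / OUTPUT_PRICE[cheap]

print(ratio("claude-sonnet", "gpt-5-nano"))  # 37.5 -> the "37x" practical range
print(ratio("claude-opus", "gpt-5-nano"))    # 187.5 -> Opus sits above that range
```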
You do not pay for these tokens. So why should you care?
Because your product's reputation depends on the learner's experience. If a learner picks the cheapest model and gets confusing, garbled tutoring, they do not blame the model. They blame TutorClaw.
This creates a product design constraint: TutorClaw must work acceptably across the entire 37x range.
The MCP server makes this possible. TutorClaw's MCP server returns structured tool responses: which chapter the learner is in, which PRIMM stage to use, which exercise to present. This pedagogical structure comes from the server, not the model. The LLM wraps that structure in natural language, but the core teaching logic is model-independent.
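To make the separation concrete, here is a minimal sketch of what a structured tool response might look like. The field names and values are hypothetical illustrations, not TutorClaw's actual schema:

```python
# Hypothetical shape of a TutorClaw MCP tool response. The server owns
# the pedagogy; the LLM's only job is to phrase it in natural language.
def get_session_state(learner_id: str) -> dict:
    """Return the pedagogical state for the model to wrap in prose."""
    return {
        "chapter": "loops",        # which chapter the learner is in
        "primm_stage": "Predict",  # which PRIMM stage to use
        "exercise_id": "loops-03", # which exercise to present
    }

state = get_session_state("learner-42")
assert state["primm_stage"] in {"Predict", "Run", "Investigate", "Modify", "Make"}
```

Because the teaching logic lives in the server's response, swapping a 37x-cheaper model changes the phrasing, not the curriculum.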
The naive way to evaluate model costs is cost per token. The better metric is cost per accepted output (CPAO):
CPAO = (Token Cost + Human Correction Cost) / Accepted Outputs
In pure token economics, Scenario B (the premium model) costs 24x more than Scenario A (the budget model). The budget model wins on price even after adjusting for failures. However, CPAO does not capture trust erosion: a 40% failure rate causes churn, regardless of how cheap the tokens are.
Pick two models and estimate success rates. Calculate the CPAO for both:
The model with the lower CPAO is the better value per successful interaction.
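A worked sketch of the exercise. The token costs, success rates, and zero correction cost below are illustrative assumptions, not figures from the text; the 24x token-cost gap mirrors the scenario comparison above:

```python
def cpao(token_cost: float, correction_cost: float,
         total_outputs: int, success_rate: float) -> float:
    """Cost Per Accepted Output = (token cost + correction cost) / accepted outputs."""
    accepted = total_outputs * success_rate
    return (token_cost + correction_cost) / accepted

# Assumed numbers: 100 exchanges, a 24x token-cost gap, no separate
# correction cost (the learner simply retries failed exchanges).
budget  = cpao(token_cost=0.10, correction_cost=0.0, total_outputs=100, success_rate=0.60)
premium = cpao(token_cost=2.40, correction_cost=0.0, total_outputs=100, success_rate=0.95)

print(f"budget  CPAO: ${budget:.4f}")   # $0.0017 per accepted output
print(f"premium CPAO: ${premium:.4f}")  # $0.0253 per accepted output
```

Under these assumptions the budget model still wins on CPAO, which is exactly the point: CPAO prices the retries, but it does not price the trust the learner loses while retrying.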
James ran the numbers. "Fifty exchanges a day on Claude Opus is $7.50. On GPT-5 Nano, it is four cents. My learners are making that choice every day."
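James's figures check out if each exchange produces roughly 2,000 output tokens, which is the assumption (not stated in the text) that makes both numbers work:

```python
# Assumes ~2,000 output tokens per tutoring exchange.
EXCHANGES_PER_DAY = 50
TOKENS_PER_EXCHANGE = 2_000

def daily_cost(price_per_million: float) -> float:
    """Daily output-token spend at a given per-million-token price."""
    tokens = EXCHANGES_PER_DAY * TOKENS_PER_EXCHANGE  # 100,000 tokens/day
    return tokens / 1_000_000 * price_per_million

print(f"${daily_cost(75.00):.2f}")  # Claude Opus output at $75/M  -> $7.50
print(f"${daily_cost(0.40):.2f}")   # GPT-5 Nano output at $0.40/M -> $0.04
```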
"And some will pick the cheapest option regardless of quality," Emma said. "Architecture 4 is designed to be indifferent to model pricing. The question is which budget tier remains viable for actual tutoring."
James thought in supply chain terms. "The recipe is the intelligence. The ingredients are the model. My job is to make the recipe work with whatever ingredients they source."