AI Trends/June 8, 2026

Docs as an abstraction layer for coding agents

5 minutes read

Han Wang

Co-Founder

SUMMARY

Raw code is a poor interface for coding agents. When a large engineering org instrumented its massive monorepos with structured Mintlify docs and connected agents through MCP, it saw 64% more precise answers, 39% better discoverability, roughly half the token consumption, and 1.5x faster task completion compared to a no-docs baseline.

A company we work with has a dedicated team whose sole mandate is making AI coding agents more efficient across their engineering org.

That focus on efficiency tells you something about how enterprises are thinking about token spend today.

They're running Claude Code, primarily on Sonnet, across monolithic repos with millions of lines of code. The agents are expensive and unreliable. The engineers using them spend more time prompting and re-prompting than they would spend writing the code themselves.

We ran a controlled experiment with them to measure how much documentation structure matters for agent performance. The results were strong enough that I think the finding applies to almost every company doing serious AI-assisted development at scale.

Agents need structure. Raw code is a bad interface

When you point a coding agent at an undocumented codebase, it crawls files, guesses at structure from naming conventions, and pieces together intent from patterns in the code. Because it has no memory of what it figured out last time, it repeats that work every session, and that gets expensive.

You'll burn an enormous amount of context budget on orientation before the agent gets to whatever you asked it to do.

The second problem is that code only tells you what got built, not why. The reasoning behind architectural decisions lives in old Slack threads and a couple of people's heads. An agent working from code alone doesn't have that intent layer. Even a hammer is complex if you don't know you're supposed to hit nails with it.

Structured documentation is an abstraction over the codebase that improves agents by encoding intent alongside implementation. That's the hypothesis we went in to test.

How we designed the experiment

The team ran tests across their largest, most actively used repositories. For each one, they defined around 15 deterministic tasks spanning easy, medium, and hard difficulty, such as "how do you compile the server," "identify whether this service has a dependency on library X," and several code generation tasks. Success was measured deterministically: shell exit codes, specific string matches, verifiable outputs. Each task ran five times per condition for statistical reliability.

Three conditions:

No docs — a branch with the README, CLAUDE.md, agents.md, and all documentation stripped. Clean baseline.

Improved docs — README plus human-written docs and AI/agent markdown files. The best realistic self-hosted outcome.

Mintlify-generated docs — the improved docs ingested into Mintlify, with Claude Code connected via MCP.

The primary metrics were precision (did the agent get the right answer) and discoverability (did it find the right context to work from).
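To make the "deterministic success" idea concrete, here is a minimal sketch of what such a check could look like. It is not the harness the team used. The `Task` fields, the timeout, and the idea of verifying a command's exit code and output are our assumptions for illustration, and the agent invocation itself is left out.

```python
import subprocess
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    """One deterministic check. All field names are illustrative."""
    name: str
    command: list[str]                     # verification command to run
    expected_exit_code: int = 0
    expected_substring: Optional[str] = None


def run_once(task: Task, timeout_s: int = 60) -> bool:
    """Return True if a single run passes every deterministic check."""
    try:
        result = subprocess.run(
            task.command, capture_output=True, text=True, timeout=timeout_s
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return False
    if result.returncode != task.expected_exit_code:
        return False
    if task.expected_substring is not None:
        return task.expected_substring in result.stdout
    return True


def pass_rate(task: Task, runs: int = 5) -> float:
    """Fraction of runs that passed (five runs per condition by default)."""
    passes = sum(run_once(task) for _ in range(runs))
    return passes / runs
```

Running the same task list once per condition (no docs, improved docs, Mintlify-generated docs) and comparing pass rates would give a precision-style number per condition.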

The numbers

Against the no-docs baseline:

64% more precise answers across all task types
39% better discoverability — agents found the right information significantly more often
~50% fewer tokens consumed per task
1.5x faster task completion
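
A quick note on reading these figures. The sketch below assumes the percentages are relative to the baseline and that "1.5x faster" is a throughput ratio; the write-up does not spell out either, so treat both as our interpretation. The helper names are made up for illustration.

```python
def time_fraction_from_speedup(speedup: float) -> float:
    """Fraction of baseline wall-clock time left after an N-times speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


def relative_change(baseline: float, treatment: float) -> float:
    """Relative change versus baseline, e.g. 0.64 means 64% higher."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (treatment - baseline) / baseline


# Read as a throughput ratio, 1.5x faster leaves about two thirds of the time.
print(round(time_fraction_from_speedup(1.5), 3))  # 0.667
```

Under that reading, a 1.5x speedup is roughly a one-third cut in time per task, which is a smaller reduction than the token savings.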

I expected structured docs to help, but was surprised by the extent of the improvements. The token reduction in particular changes how you think about the economics of AI-assisted development.

Letting agents work from raw code is a baseline. To compete and get the best results, you need to give them the structure that docs provide.

READMEs are not a documentation system

READMEs are a good start, but a single file or a handful of inline comments isn't enough. They're snapshots that go stale, and stale docs become one more source of misinformation for agents to hallucinate from. Automating documentation updates whenever your code changes is the best way to keep your intent layer current. You want documentation to live as close to the code as possible.

The token math

Big engineering orgs are spending millions on tokens for AI coding tasks, and that number is climbing. Every budget holder we talk to is asking "why are we paying so much for results this inconsistent?"

Cutting per-task token consumption by 50% can mean huge savings. If you're spending $1M per year on AI coding tokens, that's $500K back, assuming task volume stays constant. In practice, throughput tends to go up when agents are faster and more accurate.
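
The arithmetic is simple enough to put in a few lines. This is a back-of-the-envelope sketch that assumes spend scales linearly with tokens; the function name and the `task_volume_growth` parameter are ours, added to show what happens when teams run more tasks once agents get cheaper.

```python
def annual_token_savings(
    annual_spend_usd: float,
    token_reduction: float,
    task_volume_growth: float = 0.0,
) -> float:
    """Estimated yearly savings, assuming spend scales linearly with tokens.

    token_reduction: fraction of tokens saved per task (0.5 means 50%).
    task_volume_growth: hypothetical fractional growth in tasks run.
    """
    if not 0.0 <= token_reduction <= 1.0:
        raise ValueError("token_reduction must be between 0 and 1")
    new_spend = (
        annual_spend_usd * (1.0 - token_reduction) * (1.0 + task_volume_growth)
    )
    return annual_spend_usd - new_spend


print(annual_token_savings(1_000_000, 0.5))       # 500000.0
print(annual_token_savings(1_000_000, 0.5, 0.2))  # 400000.0
```

The second call shows the trade-off: if task volume grows 20%, the cash savings shrink, but you are getting more work done for less than you paid before.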

An agent using MCP to search structured docs retrieves precise, relevant context in a single targeted query. An agent without that infrastructure crawls files until it finds something usable or runs out of context. Cheap and accurate versus expensive and flaky.
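
A toy model shows why the gap is so large. This is not how MCP search or any real agent works; it only contrasts "read files until something matches" with "look up a page in an index and read just that page". Whitespace-separated words stand in for tokens, and every name and sample string here is made up for illustration.

```python
def count_tokens(text: str) -> int:
    """Crude proxy: whitespace-separated words stand in for tokens."""
    return len(text.split())


def crawl_cost(files: dict[str, str], needle: str) -> int:
    """Read files in order until one mentions `needle`; return tokens read."""
    spent = 0
    for body in files.values():
        spent += count_tokens(body)
        if needle in body:
            break
    return spent


def build_index(docs: dict[str, str]) -> dict[str, str]:
    """Map each lowercase word to the first doc page that mentions it."""
    index: dict[str, str] = {}
    for page, body in docs.items():
        for word in body.lower().split():
            index.setdefault(word, page)
    return index


def search_cost(docs: dict[str, str], index: dict[str, str], needle: str) -> int:
    """Tokens read when the index points straight at the right page."""
    page = index.get(needle.lower())
    return count_tokens(docs[page]) if page else 0


files = {
    "a.py": "x " * 400,
    "b.py": "y " * 300,
    "build.py": "compile server with make",
}
docs = {"build": "compile the server with make server"}

print(crawl_cost(files, "compile"))                       # 704
print(search_cost(docs, build_index(docs), "compile"))    # 6
```

The crawl pays for every irrelevant file it reads before the hit, and it pays again next session. The lookup pays only for the page it needs.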

Better models don't make this obsolete

The obvious counterargument: if models are getting better at navigating code, won't this problem just go away?

The experiments we ran say no. A more capable model working from raw code is still relearning the codebase from scratch every session. It's still burning context on orientation and missing the intent layer that lives outside the code.

What we found is that this architecture scales with model capability. Well-architected docs plus a capable model outperform the same model working from raw code.

Where to start

We are planning to release a full public benchmark suite soon, so you can see more clearly how to replicate our experiment.

In the meantime, the easiest entry point is mintlify.wiki/explore, which generates a Mintlify docs instance from your repository. Connect your coding tools via MCP to give your agents structured search over that knowledge instead of raw file crawling.

The second step is to keep those docs maintained automatically via workflows. Workflows run on repository pushes or cron jobs, detect when code changes require documentation updates, and make the updates for you.
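
The detection step can be pictured with a small sketch. This is not Mintlify's implementation. It assumes you keep a hypothetical mapping from each doc page to the source paths it describes, and it flags pages whose sources were touched by a push. The paths and the `DOC_SOURCES` name are invented for the example.

```python
from fnmatch import fnmatch

# Hypothetical mapping from doc pages to the source paths they describe.
DOC_SOURCES = {
    "docs/build.md": ["server/build/*", "Makefile"],
    "docs/auth.md": ["services/auth/*"],
}


def stale_docs(
    changed_files: list[str], doc_sources: dict[str, list[str]]
) -> list[str]:
    """Return doc pages whose source patterns match any changed file."""
    stale = []
    for page, patterns in doc_sources.items():
        if any(fnmatch(path, pat) for path in changed_files for pat in patterns):
            stale.append(page)
    return sorted(stale)


changed = ["server/build/main.go", "README.md"]
print(stale_docs(changed, DOC_SOURCES))  # ['docs/build.md']
```

In a real setup the changed-file list would come from the push event or a diff against the last run, and the flagged pages would be handed to whatever regenerates the docs.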

If you want to run a similar evaluation for your own codebase, reach out.
