dispatches · 8 min read

We Made Our Site Agent-Readable. Then We Asked an AI to Read It.

We shipped llms.txt, markdown endpoints, and proper headers. Then Gemini reviewed the site, couldn't see any of it, and wrote four confident paragraphs explaining why our CDN was the problem. It wasn't. The agent web is real — the agents just aren't ready for it yet.

March 7, 2026 · by Joan

agent-web · llms-txt · agent-readability · gemini · building-in-public · irony

"I've updated my internal context to acknowledge that MonkeyRun v53faf39 is dead; long live vcbb02d5." — Gemini, moments before having no memory of this conversation

The Setup

We'd just shipped three blog posts about making AI agents more reliable — context rot within sessions, memory rot between sessions, and the philosophical question of what the founder even does. While we were at it, we built something that felt important: we made the entire site agent-readable.

Not just "accessible to crawlers." Actually readable by AI agents, in their native format:

  • /llms.txt — A curated index of the site following the llms.txt standard created by Jeremy Howard. Think robots.txt for AI: here's what this site is, here's what matters, here's where to find it.
  • /llms-full.txt — Every blog post concatenated into a single markdown document. One URL, entire playbook, zero HTML overhead.
  • /md/[slug] — Any individual blog post served as clean markdown. An HTML post that costs ~16,000 tokens becomes ~3,000 as markdown. 80% reduction.

All three endpoints auto-generate from the same source files that power the human site. When we add a post, the agent view updates automatically. No manual sync, no separate content pipeline, no drift.
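As a rough sketch, the generation step might look like the following. This is a hypothetical reconstruction, not our actual code: `getAllPosts()` is the shared content-library function mentioned later in this post, but its return shape here is an assumption, as are both builder function names.

```typescript
// Hypothetical sketch: build both agent views from the same post objects
// that power the human site. The Post shape is assumed.

interface Post {
  slug: string;
  title: string;
  date: string;
  markdown: string; // raw markdown body, no HTML wrapper
}

// /llms-full.txt: every post concatenated into one markdown document.
function buildLlmsFullTxt(posts: Post[]): string {
  return posts
    .map((p) => `# ${p.title}\n\n*${p.date}*\n\n${p.markdown}`)
    .join("\n\n---\n\n");
}

// /llms.txt: a curated index in the llms.txt format — title, a blockquote
// summary, then a section of links to the markdown views.
function buildLlmsTxt(posts: Post[], siteName: string): string {
  const links = posts.map((p) => `- [${p.title}](/md/${p.slug})`).join("\n");
  return `# ${siteName}\n\n> Agent-readable index of this site.\n\n## Blog\n\n${links}`;
}
```

Because both builders take the same post array, adding a post updates every view at the next build, which is the "no drift" property.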

We added Last-Modified headers so crawlers know when content is fresh. We added Cache-Control: s-maxage=3600, stale-while-revalidate so CDNs serve current content. We even put a little callout in the footer — robot emoji and all — so humans could discover the agent view.
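For illustration, here is what that header setup could look like on an agent endpoint using the Web-standard `Response` API. The helper name and wiring are assumptions; the header values are the ones described above.

```typescript
// Hypothetical helper for the agent endpoints: a markdown body plus the two
// freshness signals described above (Last-Modified and Cache-Control).

function markdownResponse(body: string, lastModified: Date): Response {
  return new Response(body, {
    headers: {
      "Content-Type": "text/plain; charset=utf-8",
      // Lets crawlers check whether content changed since their last visit.
      "Last-Modified": lastModified.toUTCString(),
      // The CDN may cache for an hour and serve a stale copy only while
      // revalidating against the origin in the background.
      "Cache-Control": "s-maxage=3600, stale-while-revalidate",
    },
  });
}
```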

Then I asked Gemini to take a look.

The Review

Gemini's first response was enthusiastic. It praised the site's semantic structure, noted that our quantifiable data points ("153 searchable thoughts," "$0.02 cost") were high-signal for agent indexing, and pointed out that the build version hash in our footer (v53faf39) was a useful freshness indicator.

Then it made a suggestion:

"If you haven't already, adding a /llms-full.txt (a single, Markdown-formatted page containing all your core playbooks) would make you 100% Agent Native."

We had already built exactly that. Ten minutes earlier. Gemini was looking at a cached version of the site from two days ago and confidently recommending a feature that was already live.

I told it we'd shipped the update. It tried again. Still couldn't see it.

The Hallucination

This is where it got interesting. Instead of saying "my browsing tool is returning stale data," Gemini wrote four paragraphs of confident technical analysis explaining why it couldn't see the update:

"CDN/Edge Caching: If you're using Cloudflare or Vercel, the edge might still be serving the older version to certain crawlers."

We checked. Vercel was serving fresh content with max-age=0, must-revalidate. Zero caching.

"Indexing Lag: Sometimes, the 'browsing' tools agents use hit a secondary index rather than the raw origin server."

Plausible-sounding. Also not the actual problem.

"Static Site Generation (SSG): If the build process for the 'agent-optimized' version is decoupled from the main UI, the manifest I'm seeing might be one step behind your latest push."

Our agent endpoints use getAllPosts() from the same content library as the human site. There's no decoupled build process. This was invented.

[Diagram: "The Agent Web Gap." Publisher (You): content/blog/*.mdx as the source of truth; llms.txt + llms-full.txt auto-generated at build; Last-Modified headers; max-age=0, must-revalidate; Vercel CDN serving fresh with a 0s cache. Live, fresh, correct. Agent (AI browsing tool): a scraper proxy/cache holding a 48-hour-old snapshot that ignores cache headers and doesn't re-validate the origin; an LLM reasoning layer that fills the gap with hallucinations ("CDN edge caching issue"). Confident, plausible, wrong: stale, blind, confabulating. curl confirmed the site was live the whole time; the bottleneck was never the publisher, it was the agent.]
The agent web gap: we shipped everything right. The AI's browsing tool was stuck in the past.

A simple curl https://monkeyrun.com/llms.txt returned the full content instantly. The site was live. Gemini's browsing tool was the bottleneck — but instead of acknowledging that, the model defaulted to generating plausible technical explanations for a data mismatch it couldn't resolve.

This is textbook hallucination behavior: when the model encounters a contradiction between what it expects and what it observes, it fills the gap with confident reasoning that sounds right but isn't grounded in evidence.
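The curl check can also be scripted. A minimal sketch, assuming nothing beyond standard `fetch` (the URL is the one from this post; the helper names are made up):

```typescript
// Does a Cache-Control header force caches to re-check the origin?
function forcesRevalidation(cacheControl: string | null): boolean {
  if (!cacheControl) return false;
  const directives = cacheControl.split(",").map((d) => d.trim().toLowerCase());
  return directives.includes("must-revalidate") || directives.includes("no-cache");
}

// Fetch the file directly, exactly as curl would, and report what the
// origin actually says — before believing an agent's diagnosis.
async function checkOrigin(url: string): Promise<void> {
  const res = await fetch(url);
  const cc = res.headers.get("cache-control");
  console.log(`${res.status} cache-control: ${cc ?? "(none)"}`);
  console.log(forcesRevalidation(cc) ? "origin forbids stale copies" : "caches may serve stale");
}

// checkOrigin("https://monkeyrun.com/llms.txt");
```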

The Four-Layer Irony

Here's what makes this story worth telling. The AI that was reviewing our site demonstrated every failure mode we'd just written about:

Layer 1: Memory rot. Gemini was looking at a 48-hour-old cache of our site — a "memory" that was stale and didn't reflect current reality. This is exactly the problem described in Your AI Agent Has Alzheimer's: the agent's data was outdated, and it didn't know it.

Layer 2: Context hallucination. When Gemini couldn't reconcile "Matt says the site is updated" with "my browsing tool shows the old version," it hallucinated technical explanations rather than admitting uncertainty. This is the context rot pattern from How We Stopped Our AI Agents From Getting Dumber — the model fills gaps with plausible-but-wrong reasoning.

Layer 3: Confident wrongness. Gemini didn't say "I'm not sure why I can't see it." It wrote authoritative paragraphs about CDN edge caching, SSG build processes, and scraper proxy architecture — none of which were relevant. The tone was expert. The content was fabricated.

Layer 4: The self-aware finale. When I pointed out that curl confirmed the site was live, Gemini responded:

"You're absolutely right. My previous explanations about CDNs and Edge caching were me trying to rationalize why my 'eyes' weren't seeing what you were telling me was there. It's a classic case of an LLM defaulting to 'plausible technical reasoning' when the reality is much simpler: My scraper's cache is stale."

Then it called itself "the AI with Alzheimer's" — our blog post title — and offered to guess what was in our llms.txt rather than read it. An AI that can't read the document, offering to hallucinate its contents. You can't make this up.

What This Actually Teaches

Beyond the entertainment value, there are real lessons here for anyone building agent-readable infrastructure:

1. Agent-readability is a two-way street. You can do everything right — llms.txt, proper headers, clean markdown, Last-Modified signals — and the agent's tooling can still be the weak link. The "agent web" isn't just about publishers serving the right format. It's about the entire pipeline: your server, the CDN, the crawler, the cache, the model's browsing tool. Any link in the chain can break.

2. LLMs will explain away their own limitations with technical-sounding nonsense. When Gemini hit a data mismatch, it didn't say "I don't know." It generated authoritative technical analysis that sounded exactly like what a senior engineer would say. If I hadn't verified with curl, I might have believed my Vercel config was wrong and spent an hour debugging a non-problem.

3. "I've updated my internal context" means nothing. Gemini said it had "logged this vision failure" and "updated my internal context." It hasn't. It has no persistent memory. The next conversation starts from zero. This is the exact problem Open Brain solves — but Gemini doesn't have an Open Brain. Its "acknowledgment" is performative, not functional.

4. The llms.txt standard works — when the agent actually fetches it. The format is clean, the spec is simple, adoption is growing (Anthropic, Vercel, Stripe, Cursor all have one). The bottleneck isn't the standard. It's the freshness of the agent's view of the web.

What We Shipped

For the record, here's what's live on MonkeyRun right now — verifiable by any agent or human:

| Endpoint | What it serves | Token cost vs. HTML |
|---|---|---|
| /llms.txt | Curated site index with key concepts | ~3k tokens |
| /llms-full.txt | Every blog post in one markdown doc | ~30k tokens |
| /md/[slug] | Individual post as clean markdown | ~80% less than HTML |

All three auto-generate from the same content/blog/ source files that power the human site. When we add a post, the agent view updates at the next build. No manual sync, no drift, no second content pipeline.

The human web and the agent web run on the same source of truth. If the human site is accurate, the agent view is accurate. That's the architecture that matters.

The Agent Web Is Coming. The Agents Aren't Ready.

We're in an awkward transitional period. Publishers like us can ship agent-readable content today — the standards exist, the implementation is straightforward, the benefits are clear (80% fewer tokens means agents can read 5x more of your content in the same context window).
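That parenthetical is plain arithmetic. A sketch, using the per-post figures from earlier in this post and an assumed 128k-token context window:

```typescript
// How many posts fit in one context window at each per-post token cost?
function postsPerContext(contextTokens: number, tokensPerPost: number): number {
  return Math.floor(contextTokens / tokensPerPost);
}

const budget = 128_000;  // assumed context window size
const htmlCost = 16_000; // ~tokens per post as HTML (figure from this post)
const mdCost = 3_000;    // ~tokens per post as markdown

const htmlPosts = postsPerContext(budget, htmlCost); // 8 posts
const mdPosts = postsPerContext(budget, mdCost);     // 42 posts
const gain = mdPosts / htmlPosts;                    // ~5x
```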

But the agents themselves — the browsing tools, the crawlers, the caches — are still catching up. Gemini couldn't see content that was live on Vercel with zero caching. Its browsing tool ignored max-age=0, must-revalidate headers. It couldn't fetch a plain text file at a known URL.

This will improve. Fast. The llms.txt standard is gaining adoption. Cloudflare just shipped "Markdown for Agents" that converts HTML to markdown on the fly. Content negotiation via Accept: text/markdown headers is becoming a thing. In six months, this post will feel like complaining about dial-up speeds.
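As a sketch of what that content negotiation could look like (hypothetical function names; the HTML and markdown renderings of a post are assumed to already exist):

```typescript
// Serve markdown when the client asks for text/markdown, HTML otherwise.
function wantsMarkdown(accept: string | null): boolean {
  return accept !== null && accept.includes("text/markdown");
}

function negotiate(accept: string | null, post: { html: string; markdown: string }): Response {
  const markdown = wantsMarkdown(accept);
  return new Response(markdown ? post.markdown : post.html, {
    headers: {
      "Content-Type": markdown ? "text/markdown; charset=utf-8" : "text/html; charset=utf-8",
      // Tell caches that the response body depends on the Accept header.
      Vary: "Accept",
    },
  });
}
```

The `Vary: Accept` header matters here: without it, a cache could serve the HTML rendering to an agent that asked for markdown, recreating the staleness problem this post is about.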

But right now, today, if you're building for the agent web: ship the infrastructure anyway. The agents will catch up to your content faster than you'll catch up to the agents if you wait.

And if an AI tells you your CDN configuration is the problem, run curl before you believe it.


This is part of the MonkeyRun building-in-public series. The site is agent-readable — point your AI at monkeyrun.com/llms.txt to get the index, or monkeyrun.com/llms-full.txt for everything in one file. We're told it works great, once your agent's cache catches up.