In a multi-agent startup studio, your biggest enemy is amnesia.
Every time you open a new Cursor window, start a new Claude Code session, or spin up a new agent, it starts from zero. It doesn't know the architectural decision you made yesterday. It doesn't know the bug you fixed last week.
To fix this, we built Open Brain—a persistent AI memory layer based on Nate B. Jones's architecture. We wrote previously about how we hooked it up to Gmail to passively capture context.
But the real magic isn't the capture. It's the retrieval. Here is how we built the Model Context Protocol (MCP) server that connects every AI tool we use to a single, shared brain—and the walls we hit trying to do it.
The Architecture
The goal was simple: one Postgres database (via Supabase), accessed by any AI client that supports MCP.
- The Database: Supabase with `pgvector` enabled.
- The Server: A Deno edge function acting as the MCP server.
- The Clients: Cursor, Claude Code, and Claude Desktop.
- The Inputs: A Discord `#capture` channel for manual thoughts, and a Gmail pipeline for passive capture.
When an agent needs context, it calls the MCP tool search_thoughts with a query. The server embeds the query using OpenRouter (text-embedding-3-small), runs a vector similarity search in Supabase, and returns the results.
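For reference, the table behind that flow can be sketched in SQL. This is illustrative, not the exact schema from our repo — but the embedding dimension is real: text-embedding-3-small produces 1536-dimensional vectors.

```sql
-- Illustrative schema; exact column names in our repo may differ.
create extension if not exists vector;

create table thoughts (
  id         bigint generated always as identity primary key,
  content    text not null,
  source     text not null,        -- e.g. 'discord' or 'gmail'
  embedding  vector(1536),         -- text-embedding-3-small output size
  created_at timestamptz default now()
);
```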
It sounds easy. It wasn't.
Bug 1: The IVFFlat vs. HNSW Trap
When you first set up pgvector, almost every tutorial tells you to use an IVFFlat index. It's standard. It's fast.
So we built our thoughts table and added an IVFFlat index. We started dumping thoughts into it from Discord. Then we asked Cursor to search for a thought we knew was in there.
Zero results.
We queried the database directly. The thought was there. The embedding was correct. The cosine similarity was high. But the index was returning nothing.
The fix: IVFFlat computes its cluster centroids at index build time, which is why the tutorials assume you build the index after you already have a substantial amount of data (usually >10,000 rows). If you build an IVFFlat index on an empty table and then add rows incrementally (exactly what a memory layer does), the centroids are meaningless, and the index fails silently.
We dropped IVFFlat and switched to HNSW (Hierarchical Navigable Small World). HNSW is slightly more memory-intensive, but it handles incremental additions perfectly. The moment we switched, the MCP server started returning perfect matches.
(Note: We've added this as a pattern on our Patterns page and included the exact SQL in our GitHub repo.)
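The switch itself is a small migration. A sketch of the shape (index names are illustrative; `m` and `ef_construction` are the pgvector defaults, and `vector_cosine_ops` matches the cosine similarity search described above):

```sql
-- Drop the silently failing IVFFlat index...
drop index if exists thoughts_embedding_ivfflat_idx;

-- ...and replace it with HNSW, which builds its graph incrementally
-- and so works fine on a table that starts empty and grows over time.
create index thoughts_embedding_hnsw_idx
  on thoughts
  using hnsw (embedding vector_cosine_ops)
  with (m = 16, ef_construction = 64);  -- pgvector defaults

-- A nearest-neighbour lookup by cosine distance then looks like:
-- select id, content, 1 - (embedding <=> $1) as similarity
-- from thoughts order by embedding <=> $1 limit 10;
```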
Bug 2: The Claude Desktop OAuth Wall
We wanted to use Claude Desktop as one of our clients. But Claude Desktop runs locally and has strict rules about how it connects to remote MCP servers.
Our initial plan was to use Supabase's built-in JWT authentication. The MCP server would pass a token, Supabase would verify it, and we'd have secure row-level security.
The problem? Claude Desktop's MCP implementation doesn't currently support complex OAuth flows or dynamic token refresh for custom remote servers out of the box. If the token expires, the MCP connection breaks, and the AI just tells you "I can't access that tool right now."
The fix: We had to abandon JWTs for the MCP server. Instead, we implemented Key-Based Auth. We generate a long-lived, high-entropy API key, store it as a secret in Supabase, and require the MCP client to pass it via a custom header (or query param, depending on the transport).
It's less elegant than JWTs, but it's bulletproof. Cursor, Claude Code, and Claude Desktop all support static environment variables for MCP servers. You set it once in the config, and it never expires.
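One way the client side can look, assuming you bridge to the remote server through the mcp-remote proxy (the URL, header name, and key are all placeholders — check your client's docs for its exact config file and header syntax):

```json
{
  "mcpServers": {
    "open-brain": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "https://<project-ref>.supabase.co/functions/v1/open-brain-mcp",
        "--header",
        "x-api-key:${OPEN_BRAIN_API_KEY}"
      ],
      "env": {
        "OPEN_BRAIN_API_KEY": "<long-lived high-entropy key>"
      }
    }
  }
}
```

Because the key lives in a static env block rather than an OAuth flow, there is nothing to refresh and nothing to expire.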
Bug 3: The "Push + Pull" Context Collapse
Once the MCP server was working, we realized we had a data shape problem.
We had two streams of data coming in:
- Push (Discord): Short, highly contextual, manually written thoughts. "Hey, we are using HWW-1.5 layout for all new repos."
- Pull (Gmail): Long, noisy, automated chunks of email threads.
When an agent searched the brain, the vector search would often return the Gmail chunks because they were longer and had more keyword overlap, drowning out the concise, high-value Discord thoughts.
The fix: We had to add a weight column to our schema.
Manual thoughts (from Discord) get a weight of 1.5. Automated thoughts (from Gmail) get a weight of 1.0. We updated our Supabase match_thoughts RPC function to multiply the cosine similarity score by the weight before sorting the results.
Suddenly, the AI's memory prioritized explicit human directives over passive background noise, while still keeping the background noise available if nothing explicit matched.
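The weighted RPC can be sketched as follows. The function signature and column names are illustrative, but the key line is the one that multiplies cosine similarity by the source weight before ordering:

```sql
-- Add the weight column: 1.5 for manual (Discord), 1.0 for automated (Gmail).
alter table thoughts
  add column if not exists weight real not null default 1.0;

-- Illustrative version of the match_thoughts RPC.
create or replace function match_thoughts(
  query_embedding vector(1536),
  match_count     int default 10
)
returns table (id bigint, content text, score float)
language sql stable
as $$
  select
    t.id,
    t.content,
    -- Cosine similarity (1 - cosine distance), boosted by source weight:
    (1 - (t.embedding <=> query_embedding)) * t.weight as score
  from thoughts t
  order by score desc
  limit match_count;
$$;
```

Nothing is filtered out — Gmail chunks still surface when they're the best match; they just no longer outrank a directly relevant human-written thought.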
The Result: A Shared Mind
The ROI on this build has been absurd.
Yesterday, I opened a brand new project in Cursor. I hadn't written a line of code yet. I opened the AI pane and typed: "Draft the initial architecture based on our standard multi-agent studio conventions."
Cursor called the Open Brain MCP server. It searched for "multi-agent studio conventions." It pulled back the thoughts I had dropped into Discord three weeks ago, plus an email I sent to a founder last month.
It wrote the exact folder structure we use, without me having to explain it.
AI models are commodities. The intelligence isn't the moat. The context is the moat. And now, every agent we spin up inherits the entire moat on day one.
Note: Open Brain was created by Nate B. Jones. If you're not already following his work, you should be. His Substack is the best zero-hype AI implementation resource we've found, and his original Open Brain video is what started all of this.
For the specific extensions we built for MonkeyRun (the Deno MCP server, the Discord capture, and the HNSW migration), you can check out our open-source repo on GitHub.