The Markdown File That Beat a $50M Vector Database

Source: Micheal Lanham, cited in Email 08 bundle (2026-04-13).

Key takeaways

Provocative headline — real architectural point underneath. Well-structured markdown + good LLM + CLAUDE.md beat elaborate vector-search infrastructure for many retrieval use cases.
Confirms the Karpathy “LLM Wiki” thesis (News & Research page) — retrieval-with-context is easier than retrieval-alone for LLMs.
Directly relevant to any Harris team considering whether to build elaborate RAG infrastructure or start with a plain-markdown wiki pattern.

The core argument

“A well-organised markdown directory, maintained by an LLM, can outperform a $50M enterprise vector database for most retrieval-augmented generation use cases.”

The reason isn’t that vector search is bad. It’s that:

LLMs read markdown natively — no retrieval step needed
Structure beats similarity for most “find-the-right-piece” problems
A wiki maintained by the LLM compounds; a vector DB doesn’t
You can audit a markdown file; you can’t audit a vector

Practical takeaway

Before building a vector-search layer for your Claude Code workflow, try the Karpathy LLM Wiki pattern first. Ingest → Query → Lint with markdown-only storage. If that breaks down at scale, then add a vector layer.

For any Harris internal knowledge base (customer support, engineering onboarding, compliance references), this is a direct recommendation: start with markdown, add complexity only when you prove you need it.

Knowledge & Context — full LLM Wiki pattern reference
Karpathy LLM Wiki — the foundational pattern
The Orchestrator Was Missing — how orchestration reduces the need for elaborate retrieval

The Markdown File That Beat a $50M Vector Database

Key takeaways

The core argument

Practical takeaway

Related Playbook pages