For two years the reflex has been the same: if an agent can’t find the right answer, build it a better index. Chunk the documents, embed them, add a reranker, then a knowledge graph, then summaries over the graph. Each layer is more machinery between the question and the answer, and each one promises that this time retrieval will finally be solved.
The labs that actually build the models went the other way.
Claude Code started with a vector database over your repo and quietly dropped it — letting the agent read the files directly turned out to be simpler, fresher, and far less of a security headache. Claude’s own memory isn’t a vector store; it’s a folder of Markdown files the model reads at the start of a session, which Anthropic describes, without irony, as a personal wiki. Karpathy spent a week of 2026 talking everyone into doing the same thing by hand. The pattern underneath all three is the same, and it’s worth naming.
Warehouse vs. notebook
A retrieval index is a warehouse. Every question sends a worker running into the stacks to grab a few boxes, skim some scraps, assemble an answer, and then forget the whole trip. The next question starts the run from scratch. The warehouse never gets smarter; it only gets bigger, and a bigger warehouse is a longer run.
A notebook is the other model. The agent keeps its knowledge as plain text it can read the way you’d read a wiki — flip to the page, follow the link. When something isn’t in the notebook, it looks it up once and writes it down. And every so often it tidies up: merges the duplicate pages, reconciles the entries that contradict each other, fills the gaps it kept tripping over.
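As a rough sketch of that habit in TypeScript: read the page if it exists, otherwise look the answer up once and write it down. Everything here is an assumption made for illustration, including the `Notebook` class, the `lookUp` callback standing in for whatever slow path finds the answer, and the `[[wiki-style]]` link syntax. The tidying pass is left out.

```typescript
// Hypothetical sketch: page titles map to markdown bodies held in memory.
type Lookup = (topic: string) => string;

class Notebook {
  // Object.create(null) avoids collisions with built-in keys like "constructor".
  private pages: Record<string, string> = Object.create(null);
  private lookUp: Lookup;
  lookups = 0; // how many times we had to go outside the notebook

  constructor(lookUp: Lookup) {
    this.lookUp = lookUp;
  }

  // Read the page if it exists; otherwise look it up once and write it down.
  read(topic: string): string {
    const existing = this.pages[topic];
    if (existing !== undefined) {
      return existing;
    }
    this.lookups += 1;
    const body = this.lookUp(topic);
    this.pages[topic] = body;
    return body;
  }

  // Collect the [[wiki-style]] links on a page (the link syntax is an assumption).
  links(topic: string): string[] {
    const body = this.read(topic);
    const pattern = /\[\[([^\]]+)\]\]/g;
    const found: string[] = [];
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(body)) !== null) {
      found.push(match[1]);
    }
    return found;
  }
}

const notebook = new Notebook((topic) => `# ${topic}\n\nSee also [[deploys]].`);
notebook.read("rollbacks"); // not in the notebook yet: one lookup, then written down
notebook.read("rollbacks"); // served from the notebook, no second lookup
console.log(notebook.lookups); // 1
console.log(notebook.links("rollbacks")); // one link: "deploys"
```

The counter is the point: the second read costs nothing, which is the difference between keeping a page and re-running the search.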
A warehouse you search. A notebook you keep. Only one of them is smarter next week than it was this week.
Write less, on purpose
The trap is to read “write it down” as “log everything,” which just rebuilds the warehouse inside the notebook. The discipline is in the write step — small, opinionated, and run after the fact:
// after answering something worth keeping
const page = await consolidate(answer, sources, {
  merge: existingPages,   // fold into what's already there
  flag: "contradictions", // surface conflicts, don't silently overwrite
});
notebook.upsert(page); // plain markdown, in git
The consolidation is the part people skip, and it’s the part that matters. Staleness isn’t an AI problem; it’s a cache-invalidation problem. A page is derived from some sources, so when the sources change, the page is stale: mark it, and rewrite it the next time someone opens it. Track that lineage, and both freshness and provenance fall out of the same bookkeeping: every claim points back to where it came from.
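One way to picture that bookkeeping, as a minimal sketch rather than a real implementation: each page records which source versions it was written from, a sweep marks pages whose sources have moved on, and the rewrite happens lazily when the page is next opened. The `SourceRef` and `Page` shapes, the `markStale` and `openPage` helpers, and the idea of a `version` string (a content hash or commit id would both do) are assumptions for illustration.

```typescript
// Hypothetical shapes: a source is an id plus the version the page was written from.
interface SourceRef {
  id: string;
  version: string;
}

interface Page {
  title: string;
  body: string;
  derivedFrom: SourceRef[]; // lineage, which doubles as provenance
  stale: boolean;
}

// Mark every page whose recorded source versions no longer match the current ones.
// A source missing from `current` (deleted) also counts as changed.
function markStale(pages: Page[], current: Record<string, string>): string[] {
  const marked: string[] = [];
  for (const page of pages) {
    const changed = page.derivedFrom.some((s) => current[s.id] !== s.version);
    if (changed && !page.stale) {
      page.stale = true;
      marked.push(page.title);
    }
  }
  return marked;
}

// Open a page; if it is stale, rewrite it first and record the new lineage.
function openPage(
  page: Page,
  current: Record<string, string>,
  rewrite: (p: Page) => string,
): string {
  if (page.stale) {
    page.body = rewrite(page);
    page.derivedFrom = page.derivedFrom
      .filter((s) => current[s.id] !== undefined) // drop sources that no longer exist
      .map((s) => ({ id: s.id, version: current[s.id] }));
    page.stale = false;
  }
  return page.body;
}

const rollbacks: Page = {
  title: "rollbacks",
  body: "Roll back by redeploying the previous tag.",
  derivedFrom: [{ id: "runbook.md", version: "v1" }],
  stale: false,
};
const current: Record<string, string> = { "runbook.md": "v2" }; // the runbook changed

const marked = markStale([rollbacks], current); // marks "rollbacks" as stale
const body = openPage(rollbacks, current, (p) => p.body + " (rechecked against v2)");
```

Nothing is rewritten at the moment a source changes; the sweep only flips a flag, and the cost of the rewrite is paid by the next reader who actually needs the page.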
The honest version
This isn’t magic, and it isn’t universal. A notebook still struggles with questions whose answer lives only in a relationship spread across a hundred documents — “what do all our contracts have in common” is a graph problem, and pretending otherwise just hands you a confident, wrong page. The background tidying is real work, and a lazy model that never reconciles will quietly rot the whole thing.
But the trade is a good one. Plain text in git means a human can open the file and fix what the model got wrong — try that with a vector embedding. The knowledge stops being a black box that answers and starts being an asset that accrues. And you stop betting your architecture on the index getting clever, and start betting on the boring thing that already won: write it down, keep it readable, tidy it on a schedule.
The model was never the bottleneck. The notebook is.