Writing

Notes from the middle layer.

Essays on AI agents, the engineering around them, and the design choices that make software feel calm.

Inside Kimi Code: a teardown of its agent engine

I cloned a shipping coding agent and read it the way I read a paper — to see which design decisions survive contact with production. Here are the ones worth stealing.

Engineering
Don't put symptoms in the prompt

When an agent misses a case, the reflex is to teach it that case in the system prompt. That's whack-a-mole. The fix is structural.

Engineering
The verifier is the hard part

Loop engineering took off in coding because coding ships with a free oracle. Everywhere else, you have to build one.

Engineering
The day my agent lied about its job

A daily digest agent reported "all sources fetched" — while quietly dropping one. The bug wasn't the fetch. It was the missing guardrail.

Engineering
From context to loop engineering

The thing worth getting right keeps moving up a level — prompt, context, harness, loop. The hard part was never the while-loop.

Engineering
The agentic composition stack

Stop adding a tool for every request. Autonomy is a stack of composition types with two layers of glue, not a single trick.

Engineering
Why my agent runtime stays in-process

I migrated a production agent onto a server runtime, ran the whole playbook, then rolled it all back. Sometimes the senior move is not adopting the thing.

Engineering
Code-action: write code, don't collect tools

Standard tool-calling can't express composition. So stop picking tools — let the model write the code that calls them.

Engineering
Designing for the model's mistakes

Reliable agent UX isn't about preventing errors — it's about making the wrong answer cheap to notice and undo.

Design
The four layers of agent memory

Everyone treats agent memory as one framework, and compares the options as if they're rivals. They answer different questions — here's the map.

Engineering
Six ways to give a model knowledge — and what each is for

Vector RAG, RAGFlow, GraphRAG, PageIndex, an LLM wiki, Obsidian — they get lumped together as rivals. They aren't even at the same layer. A field guide to which one fits which job.

Engineering
There's no best agent framework, only yours

I scored two agent frameworks across eleven dimensions. Strip the weights and they tie. The gap was never about quality — it was about who I am.

Engineering
Your knowledge base wants to be a notebook, not a warehouse

Everyone is racing to build a smarter index. The labs that build the models quietly went the other way — toward plain text the agent reads and tidies itself.

Essay
Why agents need memory, not just context

A bigger context window buys you a better goldfish. Persistence is a different problem — and most of the work happens between the prompts.

Essay
The quiet cost of tool-use loops

Every retry feels free in isolation. Stacked across a session, retries are where latency, cost, and confusion quietly accumulate.

Engineering