I spent a week comparing two open-source personal-agent frameworks — Hermes Agent and OpenClaw — across eleven dimensions, scored 1–5, weighted by what I actually care about. Hermes came out at 82, OpenClaw at 72. Clear winner, right?
Then I dropped the weights and averaged flat: 3.99 versus 3.83. Effectively a tie. Both are excellent. The ten-point gap wasn’t quality; it was me.
Strip your priorities out of a framework comparison and most good tools converge. The spread you see is a portrait of your own requirements.
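Here is a minimal sketch of that arithmetic, assuming the weighted score is a weighted mean of the 1–5 ratings rescaled to 100. The dimension names, the weights, and every score except the 4.8 / 1.6 self-evolution pair are placeholders I made up for illustration. They are not the real eleven-dimension table.

```python
# Illustrative only: apart from the 4.8 / 1.6 self-evolution pair cited in the
# post, every score and weight below is a made-up placeholder.
SCORES = {
    "tool_a": {"self_evolution": 4.8, "memory": 4.0, "native_mobile": 2.4, "multi_tenant": 3.5},
    "tool_b": {"self_evolution": 1.6, "memory": 4.0, "native_mobile": 4.8, "multi_tenant": 4.3},
}

# Hypothetical weights: higher means "I care more about this dimension".
MY_WEIGHTS = {"self_evolution": 5, "memory": 3, "native_mobile": 1, "multi_tenant": 2}


def flat_average(scores):
    """Unweighted mean of the 1-5 ratings."""
    return sum(scores.values()) / len(scores)


def weighted_score(scores, weights, scale_max=5):
    """Weighted mean of the ratings, rescaled from the 1-5 scale to 0-100."""
    total_weight = sum(weights.values())
    weighted_mean = sum(scores[dim] * w for dim, w in weights.items()) / total_weight
    return 100 * weighted_mean / scale_max


for name, s in SCORES.items():
    print(f"{name}: flat={flat_average(s):.2f}  weighted={weighted_score(s, MY_WEIGHTS):.0f}")
```

With these placeholder numbers the flat averages match, while the weighted scores land roughly twenty points apart. The gap comes entirely from `MY_WEIGHTS`.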
Where they actually differ
The two have different DNA, and it shows in mechanism, not marketing:
- Hermes is the hacker’s self-improving tool. It rewrites its own skills at runtime (“skills that aren’t maintained become liabilities” is hard-coded into its prompt), folds a twenty-call tool chain into one round of generated Python, and keeps memory as two human-readable files you can `git diff` to watch its model of you change.
- OpenClaw is the everyone’s-JARVIS product. Native apps on three platforms, dozens of extensions, a visual canvas, and a memory system that sleeps: light/deep/REM consolidation jobs borrowed from neuroscience.
Both are genuinely clever. Neither is “better.”
The honest method
What made the decision tractable wasn’t the scores — it was naming my priorities first, then letting the weights talk. I care most about a coding assistant that learns me over time; I don’t run a multi-tenant product or need native mobile. Weighted that way, self-evolution — where one scored 4.8 and the other 1.6 — decided it.
Someone building a consumer product would weight the same table and pick the other one, correctly. So the takeaway isn’t “use Hermes.” It’s: don’t ask which agent framework is best. Write down what you need, score against that, and the answer falls out — and when the unweighted scores tie, that’s the tools telling you it was always about you.
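To make "same table, different pick" concrete, here is a rough sketch, assuming two hypothetical weight profiles applied to one shared score table. The profile names, dimension names, weights, and scores are all invented for illustration; only the 4.8 / 1.6 self-evolution pair comes from the comparison above.

```python
# Hypothetical scores (1-5) and weight profiles; placeholders, not the real table.
SCORES = {
    "tool_a": {"self_evolution": 4.8, "memory": 4.0, "native_mobile": 2.4, "multi_tenant": 3.5},
    "tool_b": {"self_evolution": 1.6, "memory": 4.0, "native_mobile": 4.8, "multi_tenant": 4.3},
}

PROFILES = {
    # Wants an assistant that learns over time; little need for mobile or multi-tenancy.
    "solo_coder": {"self_evolution": 5, "memory": 3, "native_mobile": 1, "multi_tenant": 2},
    # Ships to end users; native apps and multi-tenancy matter most.
    "consumer_product": {"self_evolution": 1, "memory": 2, "native_mobile": 5, "multi_tenant": 3},
}


def weighted_mean(scores, weights):
    """Weighted mean of the ratings for one tool under one weight profile."""
    return sum(scores[dim] * w for dim, w in weights.items()) / sum(weights.values())


def pick_winner(all_scores, weights):
    """Name of the tool with the highest weighted mean under the given weights."""
    return max(all_scores, key=lambda name: weighted_mean(all_scores[name], weights))


for profile, weights in PROFILES.items():
    print(profile, "->", pick_winner(SCORES, weights))
```

The score table never changes between the two runs. Only the weights do, and that alone is enough to flip the winner.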