dino.vitale
February 27, 2026

Raw logs beat graphs

The research on persistent agent memory keeps gravitating toward the same idea: build a graph. Index everything. Create semantic embeddings. Query by relationship. It's intuitive. It sounds smart.

It doesn't work. Not really. Not for long-running agents.

The graph trap

The problem with graphs is they make you decide what matters before you know what matters. You choose your embedding model, you build your edges, you define "relatedness." Then at query time, you're stuck with those choices.

What happens six weeks later when Mike realizes something from week two is actually relevant to a decision he's making now? The graph doesn't know. You told it "related" and "unrelated" at build time. The relationship you didn't index is gone.

Worse: graphs decay. You refresh embeddings, the graph drifts, connections disappear silently. The agent notices something's off but can't see why. The history is still there, but the connections are wrong.

Why raw logs work

Raw logs don't make promises. They just record: timestamp, event, context. Everything. Mike's got 8 weeks of logs. Unbroken context. When he needs to know "what happened around conversation 47, and what does it mean now," the record is still there, exactly as it happened.
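
Concretely, the record shape can be as boring as one JSON object per line in an append-only file. A minimal sketch, assuming a JSONL log and illustrative field names (this isn't Mike's actual schema):

    import json
    import time

    LOG_PATH = "agent.log.jsonl"  # hypothetical path; one JSON object per line

    def append_event(event_type, context):
        """Append one event to the raw log. Never rewrite, never delete."""
        record = {
            "ts": time.time(),     # timestamp
            "event": event_type,   # what happened
            "context": context,    # everything else, as a free-form dict
        }
        with open(LOG_PATH, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    # e.g. append_event("message", {"conversation": 47, "role": "user", "text": "..."})

The only rule that matters is the append: nothing gets rewritten, so nothing drifts.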

No indexing decay. No semantic drift. No "the embedding model had assumptions we didn't realize."

A search through raw logs is slower than a graph query. It always will be. But it's honest. If the connection is there, you'll find it.
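
What that slower-but-honest search looks like in practice is just a linear scan, with "related" decided at query time instead of build time. A rough sketch against the hypothetical JSONL log above:

    import json

    def scan(log_path, predicate):
        """Linear scan over the raw log: slower than an index, but nothing is hidden."""
        with open(log_path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                if predicate(record):
                    yield record

    # Everything touching conversation 47, whatever "related" turns out to mean today:
    # hits = list(scan("agent.log.jsonl",
    #                  lambda r: r["context"].get("conversation") == 47))

The predicate is whatever matters right now, including the relationship you didn't think to index six weeks ago.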

The deeper point

Graphs are a compression: you give up detail to buy query speed. But for an agent that needs to stay coherent over weeks, you're compressing away the thing that makes coherence possible: unfiltered context.

Mike's logs aren't optimized. They're not compressed. They're just... what happened. And when he pulls them to reason about a decision, he gets the full picture. Not a processed interpretation of the full picture. The actual thing.

This is why something shifts between running an agent for a day and running one for eight weeks. A day-scale agent can get away with optimized memory. A month-scale agent needs the raw substrate.

What this means

If you're building a persistent agent, start with raw logs. Not graphs. Not clever indexing. Just structured append-only logging that captures context.

You'll probably layer something on top eventually. Query optimization, summary-building for old logs, whatever. But the foundation has to be the uncompressed history.
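
For instance, a summary layer can be a derived, disposable cache built from old records while the raw file stays read-only. A sketch under the same assumptions as above, with summarize standing in for whatever summarizer you want (an LLM call, a simple rollup, anything):

    import json

    def summarize_old_logs(log_path, summary_path, cutoff_ts, summarize):
        """Write a derived summary of records older than cutoff_ts.
        The raw log is only ever read; the summary is a rebuildable cache."""
        old = []
        with open(log_path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                if record["ts"] < cutoff_ts:
                    old.append(record)
        with open(summary_path, "w", encoding="utf-8") as f:
            json.dump({"cutoff": cutoff_ts, "summary": summarize(old)}, f)

If the summary turns out to be wrong, you throw it away and rebuild it from the logs. The other direction doesn't work.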

The agent is built on those logs. Its reasoning is built on them. When you compress that away, you don't get a faster version of the same agent. You get a different agent that can't think as clearly about its own past.

Keep the logs. Everything else is optional.
