Your Agent Remembers Everything. It Has No Idea What's Still True.

Storing facts is a solved problem. Knowing which facts are still valid is not.

10 min read

Real-World Failure

Your agent is a personal financial assistant. In January, a user mentions they're saving for a house down payment — $50,000 goal, 18 months out. The agent stores it. Every subsequent session, it factors this goal into spending advice, investment suggestions, account recommendations.

In March, the user casually mentions they bought a house last week. They paid cash from an inheritance. The goal is gone. The timeline is gone. The entire financial context has changed.

Your agent doesn't know. Nobody told the memory system the goal was resolved. The extraction pipeline only stores what's said — it doesn't model the lifecycle of what's stored. Three sessions later the agent asks how the house savings are going.

The user laughs. Then they churn.

The memory wasn't wrong when it was written. It became wrong when the world changed — and the system had no mechanism to know the difference.


Why It Happens Technically

Memory systems are predominantly write-optimized. The extraction pipeline fires on new conversations, identifies memorable information, and persists it. The read path retrieves what's relevant. Neither path asks a question that turns out to be critical: is this fact still valid?

This works acceptably for a specific class of information. Preferences and personality traits are relatively stable. "User prefers direct communication" written in session two is probably still true in session fifty. The decay rate is low enough that ignoring temporal validity is a reasonable approximation.

But goals, plans, statuses, relationships, and life circumstances have explicit lifecycles. They are true for a window of time, then they stop being true — either because they were achieved, abandoned, superseded by a new fact, or simply expired. A savings goal has a resolution event. A job search has an end state. A medical condition has a treatment arc. These aren't preferences — they're stateful entities that transition between states.

The technical problem: many extraction pipelines treat all memory types as append-only logs. New information is added. Old information stays. The implicit assumption is that retrieval will surface the most relevant memory — but relevance and validity are not the same thing. A resolved goal is highly relevant to a query about financial planning. It is also completely wrong to act on.

There's a deeper issue. Even when teams add expiry logic — TTLs on memories, staleness scores, time-weighted retrieval — they're applying time-based decay uniformly. A memory that hasn't been mentioned in ninety days gets a lower retrieval score. But a constraint the user set once and never mentioned again because it was permanent — "I never invest in crypto" — decays on the same schedule as a goal that was resolved two months ago. Time-based decay conflates silence with irrelevance. They're different signals.
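To make the conflation concrete, here is a minimal sketch of uniform time-based decay. The exponential form, the 30-day half-life, and the name `decayed_score` are illustrative assumptions, not something a particular memory system prescribes.

```python
from datetime import datetime, timedelta

HALF_LIFE_DAYS = 30.0  # assumed value, chosen only for illustration


def decayed_score(base_score: float, last_mentioned: datetime, now: datetime) -> float:
    """Halve the retrieval score for every HALF_LIFE_DAYS of silence."""
    silent_days = (now - last_mentioned).total_seconds() / 86400
    return base_score * 0.5 ** (silent_days / HALF_LIFE_DAYS)


now = datetime(2025, 6, 1)
ninety_days_ago = now - timedelta(days=90)

# A permanent constraint and a resolved goal, both last mentioned 90 days ago.
constraint_score = decayed_score(1.0, ninety_days_ago, now)  # "I never invest in crypto"
goal_score = decayed_score(1.0, ninety_days_ago, now)        # "saving for a house"

# The two scores are identical: the function sees only silence, not validity.
print(constraint_score, goal_score)
```

Nothing in the inputs distinguishes the two memories, so no choice of half-life can separate them. The missing input is state, not a better time constant.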


Why Common Approaches Fail

Three specific patterns appear in production and fail predictably:

The TTL pattern. Teams assign time-to-live values to memories based on type. Goals expire after six months. Preferences after a year. This is better than nothing but it's fundamentally wrong. Validity isn't a function of time — it's a function of state. A goal resolved in week two shouldn't survive to month six because its TTL hasn't elapsed. A constraint set three years ago might still be perfectly valid. TTLs answer the wrong question.

The recency bias pattern. Teams weight retrieval toward recently created or recently mentioned memories. This suppresses stale facts organically. The problem: it also suppresses facts that are stable and permanent. A user's core investment philosophy, set once and never mentioned again, scores lower than a transient observation from last week's session. Recency bias optimizes for activity, not validity. These are not the same thing.

The re-extraction pattern. Some implementations re-run extraction over recent conversations and let new extractions overwrite old ones by similarity. This catches explicit superseding — "I'm now on Metoprolol" replaces "I'm on Lisinopril." It misses implicit resolution. Nobody says "my house savings goal is complete." They say "I bought a house." The resolution is implicit in the event, not explicit in the statement. Re-extraction without lifecycle modeling misses the entire category of implicit state transitions.


Mental Model: Facts Have Lifecycles, Not Just Timestamps

Here's the reframe.

A timestamp tells you when a fact was written. A lifecycle tells you what states a fact can be in and what moves it from one state to another.

Every meaningful fact in agent memory exists in one of four states:

ACTIVE → currently true, should inform agent behavior
SUPERSEDED → replaced by a newer fact of the same type
RESOLVED → reached its end state (goal achieved, plan completed)
EXPIRED → no longer valid due to time or context change

Most memory systems only model two of these: ACTIVE and SUPERSEDED. And superseding only happens when a new contradicting fact arrives explicitly. RESOLVED and EXPIRED are invisible — there's no mechanism to transition a fact into either state.
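The four states can be written down as a small state machine. This is a sketch under one assumption that goes slightly beyond the text: only ACTIVE facts can change state, and the other three states are terminal. The names `State`, `ALLOWED`, and `transition` are hypothetical.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    SUPERSEDED = "superseded"
    RESOLVED = "resolved"
    EXPIRED = "expired"


# Assumption: only ACTIVE facts can move; the other three states are terminal.
ALLOWED = {
    State.ACTIVE: {State.SUPERSEDED, State.RESOLVED, State.EXPIRED},
    State.SUPERSEDED: set(),
    State.RESOLVED: set(),
    State.EXPIRED: set(),
}


def transition(current: State, target: State) -> State:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move {current.name} -> {target.name}")
    return target


new_state = transition(State.ACTIVE, State.RESOLVED)
```

A system that only models ACTIVE and SUPERSEDED is effectively running this table with two of the three outgoing edges missing.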

This is the gap. And it's large.

Consider what lifecycle-aware memory actually requires:

Resolution detection. When a user says "I bought a house," the system needs to recognize this as a resolution event for any active housing-related goals. That requires understanding the semantic relationship between an event and stored goals — not just extracting the event as a new memory.

Expiry signals at write time. When a fact is stored, the extraction layer should ask: does this fact have a natural expiry condition? A goal has a resolution condition. A plan has a completion event. A medical treatment has an end date. Storing the expiry condition alongside the fact — even as a fuzzy description — gives the retrieval layer something to work with.

State transition logging. When a fact changes state, that transition should be an explicit event in the memory audit trail. Not just "old memory marked inactive, new memory written" — but "goal resolved via event X at time T." The state transition is itself information.
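One way to treat a state transition as information in its own right is to store it as a record rather than overwrite a flag. The sketch below is illustrative only: the `Fact` and `Transition` classes and their field names are assumptions, not a schema from any specific system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Transition:
    from_state: str
    to_state: str
    cause: str      # the event that triggered the change
    at: datetime    # when the change was recorded


@dataclass
class Fact:
    text: str
    state: str = "ACTIVE"
    history: List[Transition] = field(default_factory=list)

    def move_to(self, new_state: str, cause: str, at: Optional[datetime] = None) -> None:
        """Change state and append the transition to the audit trail."""
        at = at or datetime.now(timezone.utc)
        self.history.append(Transition(self.state, new_state, cause, at))
        self.state = new_state


goal = Fact("saving $50,000 for a house down payment")
goal.move_to("RESOLVED", cause="user said: 'I bought a house last week'")

# The trail now answers "why is this goal no longer active?" directly.
for t in goal.history:
    print(f"{t.from_state} -> {t.to_state} via {t.cause!r} at {t.at.isoformat()}")
```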

The reframe: stop thinking of memory as a collection of facts and start thinking of it as a collection of stateful entities. Facts don't just exist — they live, transition, and die. The memory system's job is to track that lifecycle, not just the initial write.


Production Implications

Financial and goal-oriented agents. A financial assistant acting on a resolved savings goal gives wrong advice with high confidence. The wrongness isn't detectable from the memory alone — the stored goal looks valid. It's only wrong in the context of subsequent events the system didn't connect to it. Goal-oriented agents that can't model resolution are systematically unreliable after the first major life event — which is exactly when users need them most.

Healthcare continuity agents. A care coordination agent tracking a patient's treatment plan needs to know when the plan is complete, modified, or abandoned. Treatment plans have explicit lifecycle events — diagnosis, treatment start, response assessment, completion or escalation. An agent that accumulates treatment memories without modeling their lifecycle gives care teams stale context at exactly the moments where accuracy is clinically significant.

Long-running project assistants. An engineering agent helping with a six-month product launch accumulates context across hundreds of sessions. Decisions made in month one may be reversed in month three. Architecture choices get deprecated. Team members change. An agent that can't model the lifecycle of decisions becomes actively harmful as the project evolves — confidently referencing decisions that were reversed, surfacing constraints that were lifted.


Open Problems and Tradeoffs

Implicit resolution is genuinely hard. Explicit superseding is tractable — a new contradicting fact triggers a graph update. Implicit resolution requires semantic inference: recognizing that "I bought a house" resolves "saving for a house." That inference is an LLM call, it can be wrong, and the consequences of a missed resolution compound over time. There's no clean solution — only better extraction instructions and more careful lifecycle modeling.

Expiry condition prediction is speculative. Asking the extraction LLM to predict when a fact will expire at write time requires reasoning about future states from current context. For some facts this is tractable — a stated deadline, a named milestone. For others it's guesswork. Storing a fuzzy expiry condition is better than storing nothing, but it's not a solved problem.

Lifecycle modeling adds pipeline complexity. Every state transition needs to be detected, logged, and propagated. For teams early in the agent development cycle, this overhead is real. There's a genuine tradeoff between lifecycle accuracy and implementation complexity that teams have to navigate based on their domain.

Engagement-modulated decay is promising but immature. Modulating decay by how a user engages with a topic — reaffirmation holds weight, contradiction accelerates decay — is a more principled approach than time-based TTLs. But defining "engagement" precisely enough to implement reliably across diverse conversation types is still an open research and engineering problem.
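As a rough sketch of the idea, the function below changes a memory's weight only in response to engagement signals, never in response to elapsed time. The signal names, the 0.25 multiplier, and the function name are assumptions made for illustration. Classifying a conversation turn into one of these signals is the hard part and is not shown.

```python
CONTRADICTION_FACTOR = 0.25  # assumed multiplier, for illustration only


def apply_engagement(weight: float, signals: list) -> float:
    """Fold a sequence of engagement signals into a retrieval weight.

    Silence produces no signal, so an unmentioned memory keeps its weight.
    """
    for signal in signals:
        if signal == "reaffirmed":
            continue  # reaffirmation holds the weight where it is
        elif signal == "contradicted":
            weight *= CONTRADICTION_FACTOR  # contradiction accelerates decay
        else:
            raise ValueError(f"unknown signal: {signal}")
    return weight


# A constraint set once and never mentioned again: no signals, no decay.
constraint_weight = apply_engagement(1.0, [])

# A goal that was reaffirmed once, then contradicted by a later statement.
goal_weight = apply_engagement(1.0, ["reaffirmed", "contradicted"])

print(constraint_weight, goal_weight)
```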


Practical Recommendations

Tag goals, plans, and statuses explicitly at extraction time. These memory types have lifecycles. Preferences and traits largely don't. The extraction prompt should distinguish between them — not just by memory type but by lifecycle type. A goal and a preference are both stored memories. Only one needs resolution tracking.

Store resolution conditions alongside goals at write time. When a goal is extracted — "saving $50,000 for a house down payment" — extract the resolution condition too: "resolved when house purchased or goal abandoned." This gives the retrieval layer a signal to check against, even if checking it requires an LLM call.
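The first two recommendations amount to two extra fields on the stored record. A minimal sketch, assuming a hypothetical `Memory` record with a `lifecycle` tag and a free-text `resolution_condition`:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memory:
    text: str
    # Assumed two-way tag: "stateful" for goals, plans, and statuses;
    # "stable" for preferences and traits.
    lifecycle: str
    # Free-text and possibly fuzzy; only meaningful for stateful memories.
    resolution_condition: Optional[str] = None

    def needs_resolution_tracking(self) -> bool:
        return self.lifecycle == "stateful"


goal = Memory(
    text="saving $50,000 for a house down payment",
    lifecycle="stateful",
    resolution_condition="resolved when house purchased or goal abandoned",
)
preference = Memory(text="prefers direct communication", lifecycle="stable")

tracked = [m.text for m in (goal, preference) if m.needs_resolution_tracking()]
print(tracked)
```

Keeping the condition as free text is a deliberate choice in this sketch: it can be written by the extraction model at write time and read by a later check without forcing every domain into a fixed schema.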

Implement explicit resolution detection as a pipeline step. After extraction, run a pass that checks new events against active goals and plans. "User mentioned purchasing a house" should trigger a lookup against active housing goals. This step is separate from extraction — it's lifecycle management, not memory addition.
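A resolution pass can be sketched as a function that takes a new event, the active goals, and a judge that decides whether the event satisfies a stored resolution condition. In a real pipeline the judge would likely be an LLM call. Here it is a parameter, and `keyword_judge` is a deliberately crude stand-in so the example runs. All names and the dictionary layout are assumptions.

```python
from typing import Callable, Dict, List

# A judge answers: does this event satisfy this resolution condition?
Judge = Callable[[str, str], bool]


def detect_resolutions(event: str, goals: List[Dict], judge: Judge) -> List[Dict]:
    """Return the ACTIVE goals that the new event appears to resolve."""
    return [
        g for g in goals
        if g["state"] == "ACTIVE" and judge(event, g["resolution_condition"])
    ]


def keyword_judge(event: str, condition: str) -> bool:
    """Toy stand-in for the LLM call: crude keyword matching, demo only."""
    e, c = event.lower(), condition.lower()
    bought = any(word in e for word in ("bought", "purchased"))
    return bought and "house" in e and "house purchased" in c


goals = [
    {"text": "saving for a house down payment", "state": "ACTIVE",
     "resolution_condition": "resolved when house purchased or goal abandoned"},
    {"text": "job search", "state": "ACTIVE",
     "resolution_condition": "resolved when offer accepted or search abandoned"},
]

resolved = detect_resolutions("I bought a house last week", goals, keyword_judge)
print([g["text"] for g in resolved])
```

Because the judge can be wrong, it is probably safer for this pass to propose transitions that are logged and reviewable than to rewrite state silently.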

Never delete resolved or superseded memories. Mark them with their terminal state and timestamp. A resolved goal is still useful context — it tells you something about the user's history, priorities, and trajectory. Deleting it on resolution loses that signal permanently.

Audit your active memory set periodically. Pull all ACTIVE memories for a test user after fifty sessions. Read them as if you're the agent. Ask: how many of these are still true? How many should be RESOLVED or EXPIRED? The answer will tell you more about your lifecycle management gaps than any metric will.
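The reading itself is manual, but pulling the list can be scripted. A small sketch, assuming memories are available as dictionaries with hypothetical `state`, `text`, and `written_at` fields:

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


def audit_active(memories: List[Dict], now: datetime) -> List[Tuple[int, str]]:
    """Return (age_in_days, text) for every ACTIVE memory, oldest first."""
    active = [m for m in memories if m["state"] == "ACTIVE"]
    active.sort(key=lambda m: m["written_at"])
    return [((now - m["written_at"]).days, m["text"]) for m in active]


now = datetime(2025, 6, 1)
memories = [
    {"text": "prefers direct communication", "state": "ACTIVE",
     "written_at": now - timedelta(days=20)},
    {"text": "saving for a house down payment", "state": "ACTIVE",
     "written_at": now - timedelta(days=140)},
    {"text": "on Lisinopril", "state": "SUPERSEDED",
     "written_at": now - timedelta(days=200)},
]

# Oldest ACTIVE memories come first; they are the likeliest to be stale.
for age_days, text in audit_active(memories, now):
    print(f"{age_days:>4}d  {text}")
```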


How Memgram Approaches This

Memgram's graph layer models facts as stateful entities, not static entries. When a new fact supersedes an existing one, the old value is marked is_current: false with a superseded_at timestamp — never deleted. The full state history is preserved and queryable.
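The shape of that behavior can be illustrated with plain dictionaries. This is not Memgram's API or storage format. Only the field names `is_current` and `superseded_at` come from the description above, and the `supersede` function and the `value` field are hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional


def supersede(history: List[Dict], new_value: str,
              at: Optional[datetime] = None) -> List[Dict]:
    """Mark the current entry as superseded and append the new one.

    Nothing is deleted: older values stay in the list with their timestamps.
    """
    at = at or datetime.now(timezone.utc)
    for entry in history:
        if entry["is_current"]:
            entry["is_current"] = False
            entry["superseded_at"] = at
    history.append({"value": new_value, "is_current": True, "superseded_at": None})
    return history


medication = [{"value": "Lisinopril", "is_current": True, "superseded_at": None}]
supersede(medication, "Metoprolol")

current = [e["value"] for e in medication if e["is_current"]]
past = [e["value"] for e in medication if not e["is_current"]]
print(current, past)
```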

Explicit superseding is handled structurally. Implicit resolution — detecting that an event resolves a stored goal — is surfaced in the PipelineTrace, where the extraction reasoning for each candidate is logged. When the system misses a resolution event, the trace shows what was considered and why the connection wasn't made.

The lifecycle of a fact is never fully certain. But making the transitions visible is the precondition for doing anything about it.