
Two Ways to Forget: Compliance Decay vs. Adaptive Decay in Kizuna-Mem

author Jonathan Conway
timestamp 23 May 2026
classification kizuna-mem / decay / memory / forgetting-curves / compliance / ebbinghaus / temporal / agent-memory

Two Ways to Forget

Every memory system needs a theory of forgetting. Without one, the graph grows without bound, retrieval quality degrades as irrelevant old nodes crowd out recent ones, and storage costs climb indefinitely. The question is not whether to forget, but how.

Until this release, Kizuna-Mem had one answer: exponential decay at a fixed rate. score *= 0.85^days. Simple, predictable, and the same for every node in the graph regardless of how important that node has been to the user.

We shipped a second answer: adaptive decay. It is opt-in, non-default, and fundamentally different in its assumptions about what forgetting should mean.

This post explains both modes, when you would choose each, and why they cannot coexist in the same deployment.


Compliance decay: the default

The default decay model in Kizuna-Mem is a pure exponential function of age:

score *= 0.85 ^ days_since_event

A memory from 7 days ago retains 32% of its original score. A memory from 30 days ago retains 0.76%. A memory from 90 days ago retains 0.00004%. The curve is steep, deterministic, and identical for every node.
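The percentages above are plain arithmetic and can be reproduced in a few lines. This is a minimal Python sketch, not Kizuna-Mem code; the helper name compliance_factor is made up for illustration.

```python
def compliance_factor(age_days: float, base: float = 0.85) -> float:
    """Fraction of the original score left after age_days (illustrative helper)."""
    return base ** age_days


# Print the residual score at the three ages discussed in the text.
for days in (7, 30, 90):
    print(f"{days:>3} days: {compliance_factor(days):.7%} of the original score")
```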

Figure 1. Decay comparison, compliance vs. adaptive (tau = 180 days, eta = 0.8). Both panels plot score (0 to 1.0) against age in days (0 to 120) for memory A (10 references) and memory B (1 reference). Left, compliance decay (score = 0.85^days): both memories decay identically regardless of how often they were referenced. Right, adaptive decay (R = exp(-(t - r) / (tau * (1 + eta * ln(1 + n))))): usage stretches the half-life, so the frequently-referenced memory retains strength much longer. Ten references slow decay roughly 3x relative to an unreferenced memory.

This model has three properties that matter for production deployments:

Determinism. Given a node’s creation timestamp and today’s date, you can compute its decay factor without knowing anything else about the system’s state. No runtime context required.

Auditability. A compliance officer can verify that “any memory older than X days has a decay factor below threshold Y.” The guarantee is a closed-form inequality: 0.85^X < Y. For X=90 days, the residual is below 0.0001%. This is the kind of guarantee that regulated environments need for data retention policies.
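The inequality can also be run in the other direction: given a threshold Y, find the first day on which every memory is guaranteed to be below it. The sketch below is an illustration in Python, not part of Kizuna-Mem, and the function name min_days_below is hypothetical.

```python
import math


def min_days_below(threshold: float, base: float = 0.85) -> int:
    """Smallest whole number of days X such that base**X < threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    # Closed form: X > ln(threshold) / ln(base). Start at the floor and step
    # up so that float rounding near an integer boundary cannot cause a miss.
    days = max(0, math.floor(math.log(threshold) / math.log(base)))
    while base ** days >= threshold:
        days += 1
    return days


print(min_days_below(0.01))   # first day on which the factor is under 1%
print(min_days_below(1e-6))   # first day on which it is under 0.0001%
```

Because the answer depends only on the threshold and the fixed base, an auditor can check it without access to the running system.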

Uniformity. Every node decays at the same rate. A memory the user mentioned once in passing decays exactly as fast as a memory the user has referenced in 50 sessions. No exceptions, no special cases, no hidden state influencing the outcome.

The cost of these properties is that compliance decay has no concept of importance. A fact the user relies on daily fades at the same rate as a fact from a forgotten tangent. For enterprise deployments with data retention requirements, this is acceptable. For a personal assistant on someone’s laptop, it means the system forgets things the user clearly cares about.


Adaptive decay: the opt-in alternative

Adaptive decay replaces the fixed exponential with a reinforcement-aware forgetting curve inspired by Ebbinghaus’s 1885 experiments on memory retention. The formula:

R(m, t) = exp(-(t - r_m) / (tau * (1 + eta * ln(1 + n_m))))

Where:

Parameter   Meaning
t           Current time
r_m         Time of last reference to memory m
tau         Base half-life in days (default: 180)
eta         Reinforcement sensitivity (default: 0.8)
n_m         consolidation_count on the node, incremented every time the memory is referenced during retrieval or consolidated by the pipeline

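The key difference is in the denominator. Each reference to a memory stretches its effective half-life. A memory with consolidation_count = 0 decays at the base rate. A memory with consolidation_count = 10 has an effective half-life of tau * (1 + 0.8 * ln(11)) = tau * 2.92, roughly three times that of an unreferenced memory. Strictly speaking, tau is the time constant of the exponential rather than a half-life: the score falls to 1/e (about 37%) after tau days without a reference, and to one half after tau * ln 2 days (about 125 days at the default). The multiplier applies equally to both, so this post keeps the term half-life.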
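As a rough illustration of the formula, here is a minimal Python sketch (the actual implementation is in Zig and appears later in this post). The function names effective_tau and adaptive_retention are made up for this example; the defaults are the documented tau = 180 and eta = 0.8.

```python
import math


def effective_tau(tau_days: float, eta: float, n: int) -> float:
    """Denominator of the adaptive formula: tau * (1 + eta * ln(1 + n))."""
    return tau_days * (1.0 + eta * math.log1p(n))


def adaptive_retention(days_since_ref: float, n: int,
                       tau_days: float = 180.0, eta: float = 0.8) -> float:
    """R(m, t) with t - r_m expressed as days since the last reference."""
    return math.exp(-days_since_ref / effective_tau(tau_days, eta, n))


# Same elapsed time, different reinforcement counts.
for n in (0, 10):
    print(f"n={n:>2}: retention after 120 days = {adaptive_retention(120, n):.2f}")
```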

This matches a well-known finding from cognitive science: spaced repetition strengthens memory traces. A fact you encounter once fades fast. A fact you encounter repeatedly, across different contexts, persists. Ebbinghaus measured this in 1885 with nonsense syllables. The math has held up.

Configuration

Adaptive decay is off by default. You enable it per tenant:

[decay]
mode = "adaptive"    # default: "compliance"
tau_days = 180       # base half-life
eta = 0.8            # reinforcement sensitivity

Or via the API:

POST /api/v1/tenants/{tenant_id}/config
{
  "decay": {
    "mode": "adaptive",
    "tau_days": 180,
    "eta": 0.8
  }
}

Recommended parameters

Use case               tau (days)   eta   Rationale
Personal assistant     180          0.8   Aggressive reinforcement. Daily-use memories persist for months.
General consumer app   365          0.5   Moderate reinforcement. Slower forgetting overall.
Single-user dev tool   90           1.0   Short base half-life, but strong reinforcement for actively-used memories.
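To get a feel for what these presets mean, the sketch below evaluates the adaptive formula 90 days after the last reference, once for an unreinforced memory (n = 0) and once for a memory with n = 10. It is illustrative Python rather than Kizuna-Mem code; the PRESETS dictionary and the retention helper are names invented for this example, and the 90-day horizon is an arbitrary choice.

```python
import math

# Presets copied from the table above: (tau_days, eta).
PRESETS = {
    "personal assistant": (180.0, 0.8),
    "general consumer app": (365.0, 0.5),
    "single-user dev tool": (90.0, 1.0),
}


def retention(days_since_ref: float, n: int, tau_days: float, eta: float) -> float:
    """Adaptive retention R = exp(-days / (tau * (1 + eta * ln(1 + n))))."""
    tau_eff = tau_days * (1.0 + eta * math.log1p(n))
    return math.exp(-days_since_ref / tau_eff)


for name, (tau_days, eta) in PRESETS.items():
    once = retention(90, 0, tau_days, eta)
    often = retention(90, 10, tau_days, eta)
    print(f"{name:<22} n=0: {once:.2f}   n=10: {often:.2f}")
```

The dev-tool preset shows the widest gap between the two columns, which is what a short base half-life combined with strong reinforcement is meant to produce.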

Where consolidation_count comes from

The consolidation_count field lives on every node in the graph. It starts at zero when the Observer creates the node during ingestion. Three pipeline stages increment it:

Figure 2. consolidation_count in the pipeline. Observer (ingestion, L0 abstracts): count = 0. Reflector (entity resolution, dedup): count += 1 on entity merge. Consolidator (communities, L1 overviews, profiles): count += 1 on each consolidation pass. Retrieval (spreading activation, fusion): count += 1 on each accessed node. Example lifecycle: ingested (0), merged (1), consolidated (2), retrieved three times (5). Every pipeline interaction reinforces the memory, so frequently-accessed nodes accumulate higher counts and decay more slowly under adaptive mode.

Observer. Creates the node with consolidation_count = 0. The node starts life unreinforced.

Reflector. When entity resolution merges two references to the same real-world entity, the surviving node’s consolidation_count increments. If the user mentions “Sarah” in three different sessions and the Reflector resolves all three to the same entity node, the second and third mentions each trigger a merge, so that node’s count reaches 2 through merges alone.

Consolidator. When the Consolidator runs its background pass (community detection, L1 overview generation, profile building), it increments the count on every node it touches. A node that participates in a community formation or gets summarized into an L1 overview is, by definition, a node with enough connections to be structurally important.

Retrieval. When a node appears in a retrieval result set, its consolidation_count increments. This is the most direct signal of importance: the user asked a question and this node was part of the answer.

The result is that consolidation_count is a natural proxy for memory importance. It accumulates passively through pipeline operations and actively through user queries. No explicit “mark as important” API is needed.
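The lifecycle can be traced with a toy model. The Python below is only a sketch of the bookkeeping described above: the Node class and the three helper functions are hypothetical stand-ins, and the day numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a graph node; only the fields used here."""
    name: str
    consolidation_count: int = 0   # the Observer creates nodes at zero
    last_accessed_day: int = 0


def merge_into(survivor: Node, absorbed: Node) -> None:
    """Reflector: the surviving node is reinforced once per entity merge."""
    survivor.consolidation_count += 1


def consolidate(node: Node) -> None:
    """Consolidator: every node touched by a background pass is reinforced."""
    node.consolidation_count += 1


def record_access(node: Node, day: int) -> None:
    """Retrieval: reinforce the node and restart its adaptive decay clock."""
    node.consolidation_count += 1
    node.last_accessed_day = day


sarah = Node("Sarah")                # ingested: count 0
merge_into(sarah, Node("Sarah"))     # merged: count 1
consolidate(sarah)                   # consolidated: count 2
for day in (3, 5, 9):                # retrieved three times: count 5
    record_access(sarah, day)
print(sarah.consolidation_count, sarah.last_accessed_day)
```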


Why the two modes are incompatible

This is the part most people miss on first reading.

Compliance decay guarantees a predictable retention envelope. A data protection officer can state: “All memories older than 90 days have decayed below the retrieval threshold.” They can prove it with arithmetic. The guarantee holds regardless of how the system was used, because usage does not affect decay.

Adaptive decay breaks that guarantee. With the default parameters, a memory that has been referenced 20 times across 15 sessions has an effective half-life about 3.4 times the base (1 + 0.8 * ln(21) = 3.44). Ninety days after its last reference, instead of being below the retrieval threshold, it is still at roughly 86% strength (exp(-90 / (180 * 3.44))), and every new reference resets the clock. The DPO cannot make the same guarantee without knowing every memory’s individual consolidation_count, which defeats the purpose of a simple policy.
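One way to see the problem is to ask how long a memory must go unreferenced before it falls below a given threshold under adaptive decay. The Python sketch below does that; it is an illustration only, the function name days_until_below is made up, and the 0.01 threshold is a hypothetical value chosen for the example rather than a Kizuna-Mem default.

```python
import math


def days_until_below(threshold: float, n: int,
                     tau_days: float = 180.0, eta: float = 0.8) -> float:
    """Days without a reference before adaptive retention falls below threshold."""
    tau_eff = tau_days * (1.0 + eta * math.log1p(n))
    # Solve exp(-d / tau_eff) < threshold for d.
    return -tau_eff * math.log(threshold)


THRESHOLD = 0.01  # hypothetical retrieval threshold, for illustration only

for n in (0, 1, 20):
    days = days_until_below(THRESHOLD, n)
    print(f"n={n:>2}: below {THRESHOLD} after {days:.0f} days without a reference")
```

The answer changes with n, and the count restarts on every access, so no statement about age alone can bound the result.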

This is not a bug in adaptive decay. It is the point of it. Reinforcement-aware forgetting exists specifically so that important memories persist longer. But “important memories persist longer” and “all memories older than X have decayed below Y” are contradictory guarantees. You can have one or the other.

We could have tried to merge them, maybe by applying a hard ceiling on adaptive decay so that no memory survives past a configured maximum age regardless of reinforcement. But a hard ceiling with adaptive decay below it would create a false sense of compliance. The decay curve would not be monotonic in age, the retention envelope would depend on usage patterns, and auditors would need to understand the full formula to verify anything. That is worse than having two clean, well-understood modes.

So we shipped them as separate modes. You pick one per tenant. The default is compliance.


When to use each mode

Figure 3. Mode selection guide. Three questions are asked in order: is the data regulated (GDPR, HIPAA, SOC2), is the deployment multi-tenant, and do you need verifiable retention? A yes to any of them leads to compliance decay (0.85^days, deterministic). A no to all three leads to adaptive decay (reinforcement-aware). The default is compliance; adaptive is opt-in for personal and single-user deployments.

Compliance decay: regulated and multi-tenant

If any of these apply, use compliance decay:

  • The deployment handles personal data subject to GDPR, HIPAA, or equivalent frameworks
  • You need to make verifiable retention guarantees to auditors or regulators
  • Multiple tenants share the system and you need uniform data handling policies
  • Right-to-erasure requests require provable decay timelines
  • Your compliance team needs to state “all data older than N days is below threshold T”

The entire point is that the math is trivial. 0.85^90 = 0.00000044. Any auditor with a calculator can verify the retention claim. No knowledge of usage patterns is required.

Adaptive decay: personal and single-user

If all of these apply, use adaptive decay:

  • The deployment serves a single user or a small number of trusted users
  • There is no regulatory requirement for deterministic retention guarantees
  • The user benefits from the system remembering frequently-referenced facts longer
  • You want the system’s forgetting behavior to reflect actual usage patterns

The canonical use case is a personal AI assistant running on someone’s laptop. The user talks about their project deadlines every day. They mentioned a restaurant once, three months ago. Under compliance decay, both memories are at the same strength at the same age. Under adaptive decay, the project deadlines persist because the user keeps referencing them, while the restaurant fades naturally.
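The decision guide reduces to a single rule: any one of the three compliance triggers is enough. A deployment checklist could encode it as below. This is a hypothetical Python helper, not a Kizuna-Mem API; the function and parameter names are invented for illustration, and the returned strings match the documented values of decay.mode.

```python
def choose_decay_mode(regulated_data: bool,
                      multi_tenant: bool,
                      needs_verifiable_retention: bool) -> str:
    """Return the decay.mode value suggested by the three questions in the guide."""
    if regulated_data or multi_tenant or needs_verifiable_retention:
        return "compliance"
    return "adaptive"


# A personal assistant on one laptop, with no regulatory obligations.
print(choose_decay_mode(False, False, False))
# A shared deployment handling regulated personal data.
print(choose_decay_mode(True, True, True))
```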


The stale reinforcement paradox

There is a failure mode specific to adaptive decay that is worth understanding.

Consider this sequence:

  1. In January, the user says “I work at Acme Corp.” This fact gets referenced across 12 sessions over two months. consolidation_count = 14.
  2. In March, the user says “I left Acme. I joined Globex.” This fact has been mentioned twice. consolidation_count = 2.

Under adaptive decay, the January fact has a much longer effective half-life because of its high consolidation count. With tau=180 and eta=0.8, Acme’s effective half-life is 570 days versus Globex’s 338 days. If neither fact gets referenced again after March, the Acme memory decays more slowly. Given enough time without new references to Globex, the stale fact could overtake the correction.

This is the stale reinforcement paradox: a heavily-reinforced old fact can outlive a lightly-reinforced correction.
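The crossover point can be worked out directly from the formula. Suppose the old fact was last referenced on day 0 and the correction was last referenced gap_days later. Setting the two retention values equal gives t = gap_days * tau_old / (tau_old - tau_new), measured from the old fact's last reference. The Python below is a sketch of that calculation; the function names and the 30-day gap are assumptions made for the example.

```python
import math


def eff_tau(n: int, tau_days: float = 180.0, eta: float = 0.8) -> float:
    """Effective decay constant tau * (1 + eta * ln(1 + n))."""
    return tau_days * (1.0 + eta * math.log1p(n))


def crossover_day(gap_days: float, n_old: int, n_new: int):
    """Day (counted from the old fact's last reference) on which the old fact's
    score catches up with the newer fact's. None if it never catches up."""
    tau_old, tau_new = eff_tau(n_old), eff_tau(n_new)
    if tau_old <= tau_new:
        return None
    return gap_days * tau_old / (tau_old - tau_new)


# Acme (n = 14) last referenced on day 0; Globex (n = 2) last referenced
# on day 30. The 30-day gap is a made-up value.
print(round(eff_tau(14)), round(eff_tau(2)))
print(crossover_day(30, 14, 2))
```

With these numbers the stale fact draws level after roughly 74 days and stays ahead from then on, unless the correction is referenced again or the contradiction is detected.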

Figure 4. Stale reinforcement paradox and its resolution. Both panels plot score (0 to 1.0) from January to September for “works at Acme” (n = 14) and “joined Globex” (n = 2). Without contradiction detection, the curves cross over and the stale Acme fact outscores the correction. With contradiction detection, the conflict is detected, Acme is marked dormant and suppressed, and “joined Globex” wins. Resolution mechanism: (1) the Reflector groups edges by (source_entity, relation_type) in O(E) time; (2) edges with the same subject and predicate but a different object are flagged as conflicts; (3) older edges are invalidated (t_event_invalid set), so the newer fact wins regardless of count. Contradiction detection is a hard override: bitemporal filtering excludes dormant nodes before scoring, so no amount of reinforcement can keep a contradicted fact in the retrieval pool.

Kizuna-Mem’s contradiction detection prevents this from being a real problem. The Reflector runs O(E) conflict detection on every ingestion pass. When it sees two edges with the same source entity and relation type but different targets (e.g., “user works_at Acme” and “user works_at Globex”), it invalidates the older edge by setting t_event_invalid. The invalidated edge is excluded from retrieval before scoring even begins. Bitemporal filtering is a hard gate, not a soft signal.
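The grouping step can be sketched in a few lines. The Python below is an illustration of the idea, not the Reflector's code: the Edge class, the integer timestamps, and the choice to set t_event_invalid to the newer edge's start time are all assumptions. It also assumes every relation is single-valued, which would not hold for a relation such as "likes".

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Edge:
    """Hypothetical edge record; timestamps are arbitrary integer ticks."""
    source: str
    relation: str
    target: str
    t_event_valid: int
    t_event_invalid: Optional[int] = None


def invalidate_contradicted(edges: list) -> None:
    """Group edges by (source, relation) and invalidate older conflicting ones."""
    groups = defaultdict(list)
    for edge in edges:                      # one pass over all edges
        groups[(edge.source, edge.relation)].append(edge)
    for group in groups.values():           # group sizes sum to E, so O(E) overall
        newest = max(group, key=lambda e: e.t_event_valid)
        for edge in group:
            conflicts = edge is not newest and edge.target != newest.target
            if conflicts and edge.t_event_invalid is None:
                # Assumption: the old fact stops being valid when the new one starts.
                edge.t_event_invalid = newest.t_event_valid


edges = [
    Edge("user", "works_at", "Acme", t_event_valid=1),
    Edge("user", "works_at", "Globex", t_event_valid=3),
    Edge("user", "lives_in", "Osaka", t_event_valid=2),
]
invalidate_contradicted(edges)
print([(e.target, e.t_event_invalid) for e in edges])
```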

The Consolidator adds a second layer. It groups atomic memories by entity mention, sorts by timestamp, and creates explicit “replaces” edges from newer facts to older ones. Older facts are marked dormant = true, which deprioritizes them in retrieval even if they somehow survive temporal filtering.

In concrete terms: once the user says “I joined Globex,” the Reflector detects the contradiction with “I work at Acme” on the next ingestion pass. The Acme edge gets invalidated. Its consolidation_count of 14 no longer matters, because the edge is temporally dead. Adaptive decay never gets a chance to keep it alive, because bitemporal filtering runs before decay scoring.
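The ordering is what matters here, and it is easy to demonstrate. In the Python sketch below, the Memory class and both functions are hypothetical; as a simplification, any memory with t_event_invalid set is treated as invalid at query time. The adaptive score uses the formula and defaults from earlier in the post.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memory:
    """Hypothetical candidate record for the ranking step."""
    text: str
    consolidation_count: int
    days_since_ref: float
    t_event_invalid: Optional[float] = None   # set when the fact is contradicted


def adaptive_score(m: Memory, tau_days: float = 180.0, eta: float = 0.8) -> float:
    tau_eff = tau_days * (1.0 + eta * math.log1p(m.consolidation_count))
    return math.exp(-m.days_since_ref / tau_eff)


def rank(candidates: list) -> list:
    # Hard gate first: invalidated memories are dropped before any scoring.
    live = [m for m in candidates if m.t_event_invalid is None]
    return sorted(live, key=adaptive_score, reverse=True)


acme = Memory("works at Acme", 14, days_since_ref=60, t_event_invalid=0.0)
globex = Memory("joined Globex", 2, days_since_ref=60)

print(adaptive_score(acme) > adaptive_score(globex))   # Acme would win on score alone
print([m.text for m in rank([acme, globex])])          # but it never reaches scoring
```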

The stale reinforcement paradox only manifests if contradictions go undetected. In Kizuna-Mem, the multi-layer contradiction detection (Reflector-level edge grouping, Consolidator-level atomic memory dedup, trait-level timestamp comparison) catches them. We measured a 20 percentage point accuracy improvement on knowledge-update queries after adding conflict detection. That improvement applies in both decay modes.

Figure 5. Retrieval order of operations: query input, candidate selection (embedding similarity plus graph neighbors), bitemporal filter (temporally invalid nodes removed), decay scoring (×0.85^d or ×R(m,t), depending on mode), spreading activation (traverse graph edges, accumulate scores), fusion (merge multi-hop results, rank), response (top-k memories returned). Filter, then score: contradicted nodes never reach decay scoring, so the consolidation count on a contradicted node is irrelevant.

Implementation details

The decay calculation lives in the temporal index and runs during fusion retrieval. The relevant code path is in zig/src/index/temporal.zig:

fn computeDecayFactor(
    node: *const Node,
    now_ns: i64,
    config: DecayConfig,
) f32 {
    return switch (config.mode) {
        // Compliance: age-only decay, measured from when the fact became true.
        .compliance => comp: {
            // Whole days elapsed; @divTrunc drops the fractional day.
            const age_days = @as(f32, @floatFromInt(
                @divTrunc(now_ns - node.t_event_valid_ns, DAY_NS)
            ));
            break :comp std.math.pow(f32, 0.85, age_days);
        },
        // Adaptive: decay measured from the last access, stretched by usage.
        .adaptive => adapt: {
            const t_since_ref = @as(f32, @floatFromInt(
                @divTrunc(now_ns - node.last_accessed_ns, DAY_NS)
            ));
            const n = @as(f32, @floatFromInt(node.consolidation_count));
            // tau * (1 + eta * ln(1 + n)); @log is the natural logarithm.
            const effective_tau = config.tau_days * (1.0 + config.eta * @log(1.0 + n));
            break :adapt @exp(-t_since_ref / effective_tau);
        },
    };
}

Two things to notice. First, compliance decay uses t_event_valid_ns (when the fact became true), while adaptive decay uses last_accessed_ns (when the memory was last referenced). This means adaptive decay resets its clock on every access, which is the reinforcement mechanism. Second, the adaptive formula uses @log(1.0 + n) rather than n directly, so the relationship between references and half-life is logarithmic. The first few references have a large effect. The 50th reference barely moves the needle. This prevents a single heavily-accessed node from becoming effectively immortal.
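The saturation is easy to see numerically. The Python sketch below tabulates the half-life multiplier f(n) = 1 + eta * ln(1 + n) for eta = 0.8 and the extra stretch contributed by the 50th reference alone. The function name half_life_multiplier is invented for this example.

```python
import math


def half_life_multiplier(n: int, eta: float = 0.8) -> float:
    """f(n) = 1 + eta * ln(1 + n), the factor applied to tau."""
    return 1.0 + eta * math.log1p(n)


for n in (0, 1, 5, 10, 50):
    print(f"n={n:>2}: {half_life_multiplier(n):.2f}x")

# Marginal effect of a single reference, early versus late.
first = half_life_multiplier(1) - half_life_multiplier(0)
fiftieth = half_life_multiplier(50) - half_life_multiplier(49)
print(f"1st reference adds {first:.3f}, 50th adds {fiftieth:.3f}")
```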

Figure 6. Reinforcement saturation. Half-life multiplier f(n) = 1 + eta * ln(1 + n) plotted against consolidation_count (0 to 50) for tau = 180 days and eta = 0.8: n=0 gives 1.00x, n=1 gives 1.55x, n=5 gives 2.43x, n=10 gives 2.92x, n=50 gives 4.15x. The first 10 references roughly triple the half-life; the next 40 add only about 40% more.

The consolidation_count increment

The count increments in three places:

// In reflector.zig - on entity merge
fn mergeEntities(self: *Reflector, survivor: *Node, absorbed: *Node) !void {
    survivor.consolidation_count += 1;
    // ... merge edges, update indexes
}

// In consolidator.zig - on consolidation pass
fn consolidateNode(self: *Consolidator, node: *Node) !void {
    node.consolidation_count += 1;
    // ... generate L1, update community membership
}

// In fusion.zig - on retrieval access
fn recordAccess(node: *Node, now_ns: i64) void {
    node.consolidation_count += 1;
    node.last_accessed_ns = now_ns;
}

What we did not do

We considered and rejected several alternatives:

Per-node decay rates. Instead of a single formula with a reinforcement term, we could have let users set custom decay rates per node. This pushes the policy decision onto every API caller, which is the wrong abstraction level. The point of adaptive decay is that importance is inferred from usage, not declared by the user.

Hybrid mode. We considered a mode that uses compliance decay as a baseline with adaptive reinforcement as a bonus. The problem is that the compliance guarantee depends on having a known ceiling on the decay factor. Adding any usage-dependent term, even as a bonus, breaks that ceiling.

Decay mode as a per-node property. We considered letting individual nodes have different decay modes within the same tenant. This creates an auditing nightmare where some nodes decay predictably and others do not, with no tenant-level guarantee about overall behavior.

Automatic mode selection. We considered detecting whether a tenant is in a “regulated” context and auto-selecting compliance decay. But we have no reliable way to detect that, and getting it wrong in either direction has bad consequences.


Summary

Kizuna-Mem now has two decay modes. Compliance decay is the default. It is deterministic, auditable, and uniform. Every memory fades at 0.85^days regardless of usage. Adaptive decay is opt-in. It uses reinforcement-aware forgetting where frequently-referenced memories persist longer, following R(m,t) = exp(-(t - r_m) / (tau * (1 + eta * ln(1 + n_m)))).

The two modes serve different deployment contexts. Compliance decay is for regulated environments, multi-tenant systems, and anywhere that needs verifiable retention guarantees. Adaptive decay is for personal assistants, consumer apps, and single-user deployments where the forgetting curve should reflect what the user actually cares about.

They cannot be combined without undermining the guarantees that make compliance decay useful. So we did not combine them.

Contradiction detection works independently of the decay mode. A contradicted fact is temporally invalidated regardless of its consolidation_count. The stale reinforcement paradox exists in theory but not in practice, because bitemporal filtering runs before decay scoring.

Adaptive decay is available now. Set decay.mode = "adaptive" in your tenant config. The default remains "compliance".


For background on the retrieval pipeline, spreading activation, and bitemporal model: Your Agent’s Memory Is Lying to You. For the tuning infrastructure that optimizes retrieval weights: Nobody Else Is Tuning Their Memory Engine. For governed memory and compliance: Governed Memory: OAMP 1.2 ships.