Article

Memory that was never there: hard-delete, governance by omission, and GDPR as a feature of the factory

Author: Jonathan Conway
Timestamp: 12 May 2026
Classification: gdpr / hard-delete / governance-by-omission / governed-memory / oamp / bitemporal / eu-ai-act / regulated / substrate

In January 2026 a European fintech received a Subject Access Request from a retail customer. Routine enough. The SAR process kicked off, compliance ran the standard query against the customer’s record, and the team dutifully assembled the response pack. Then someone asked the question that should have been asked on day one: “Do we know exactly which facts the customer’s AI profile assistant was using to reason about them six months ago?”

They did not. The vector store held embeddings. The graph DB held nodes. But the AI memory layer had been accumulating context for months, some of it derived from records that had since been corrected, and nobody could say with certainty what the system had known at any given point, let alone demonstrate that a deletion request had propagated cleanly into it. The regulator described the situation, diplomatically, as “incomplete data mapping.” The less diplomatic word is exposure.

This is not an edge case. It is the structural condition of almost every agentic system built today. Memory is treated as a fast path for context retrieval, and deletion is something that happens to the source database, with a vague expectation that the downstream AI layer will eventually catch up. It will not catch up reliably, and the liability does not wait.

The rest of this post explains why. It covers what governance by omission actually means as a design principle, why the deletion problem is harder than it looks in the specific context of AI memory systems, how the governed memory engine and OAMP approach it as a cryptographic first-class operation, and what the concrete obligation looks like under GDPR Article 17 and the EU AI Act simultaneously. The sibling post, “GDPR hard-delete and the right to be forgotten in agent memory graphs”, goes deeper on the ePrivacy and DPA enforcement mechanics; this one leads from the design principle.

The design principle: governance by omission

Most memory governance is additive. You store data, then you add access controls on top of it, then you add audit logs on top of those, then you add deletion workflows on top of those. At every layer you are trying to close a gap that the previous layer left open. The gaps multiply, and the machinery to close them becomes the dominant maintenance burden.

Governance by omission works differently. The starting point is not “everything is allowed unless restricted.” The starting point is “nothing exists for a given scope unless it was explicitly placed there.” An agent operating under a specific mission or policy perimeter has access only to the memories that are in scope for that mission. Out-of-scope memories do not return a denial response. They return nothing. They are not there.

This sounds like a semantic distinction. It is not. When a memory system returns a 403 Forbidden on an out-of-scope query, it tells the querying agent that the memory exists but is inaccessible. That is itself information leakage in some regulatory contexts. A 404 Not Found is architecturally different: from the agent’s perspective, the fact never existed. OAMP v1.3’s existence-hiding requirement formalises this precisely: out-of-scope entries MUST return 404, not 403, and the denial event in the audit log MUST NOT record the entry id. The form of the denial cannot reveal the existence of what was denied.

Governance by omission is why the governed memory engine’s architecture is not a “memory store with a permission layer bolted on.” The governed perimeter is the query interface itself. If a memory is outside the declared scope of the current mission, it is absent from the result set, absent from count responses, absent from streaming events. The architecture makes it structurally impossible for an agent to know that something was withheld, which is the meaningful guarantee.
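The existence-hiding behaviour can be sketched in a few lines. This is a minimal illustration, not the governed memory engine's implementation: `ScopedMemory`, `NotFound`, and the audit event shape are hypothetical names chosen for the example, and the only behaviour taken from the text is that out-of-scope entries look identical to missing ones and that the denial event omits the entry id.

```python
from dataclasses import dataclass, field


class NotFound(Exception):
    """Raised for missing AND out-of-scope entries (the 404 case)."""


@dataclass
class ScopedMemory:
    # entry_id -> (scope, text); a stand-in for a real store (assumption)
    entries: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def put(self, entry_id, scope, text):
        self.entries[entry_id] = (scope, text)

    def get(self, entry_id, mission_scope):
        record = self.entries.get(entry_id)
        if record is None or record[0] != mission_scope:
            # Same error in both cases, and the audit event carries no
            # entry id, so the denial cannot confirm that the entry exists.
            self.audit_log.append({"event": "denied", "scope": mission_scope})
            raise NotFound()
        return record[1]

    def count(self, mission_scope):
        # Counts only see in-scope entries, so totals leak nothing either.
        return sum(1 for scope, _ in self.entries.values() if scope == mission_scope)
```

An agent running under the `claims` scope that asks for a `marketing` entry gets the same `NotFound` it would get for an id that was never written, and `count("claims")` never includes the marketing entry.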

Hard-delete is the most demanding test of this principle. When a data subject exercises their right to erasure, the question is not whether you can mark a record as deleted. The question is whether you can make it disappear from every surface the AI memory system exposes, including derived facts, cached embeddings, indexes that reference the original, and the activation paths that a spreading-activation retrieval engine would have traversed. Most systems fail that test in practice, not because of bad intentions, but because of structural choices made long before the deletion request arrived.

Why deletion is harder than it looks in AI memory

A traditional database deletion is conceptually simple. You issue a DELETE statement, the row is gone from the primary table, the transaction commits, and downstream systems can be notified. In practice there are replicas and backups to manage, but the model is coherent.

An AI memory system, especially one that uses vector embeddings for retrieval, violates that coherence in several ways at once.

The first problem is the embedding itself. When you ingest a fact into a vector store, you produce an embedding vector from the text. That vector encodes the semantic meaning of the fact. Delete the source fact and the embedding vector still exists in the index. The vector does not point back at the deleted fact in any way that a vacuum process can reliably find, particularly if the embedding was produced from a chunk that contained multiple facts, only one of which is being deleted. Rebuilding the index from scratch is the only reliable option, and at scale that is not a casual operation.
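One way to make the embedding problem tractable without a full rebuild is to record, at ingestion time, which facts went into each embedded chunk. The sketch below assumes such a provenance map exists; `plan_vector_erasure` and the `chunk_facts` shape are hypothetical and only illustrate the distinction between vectors that can be dropped and vectors that must be re-embedded.

```python
def plan_vector_erasure(chunk_facts, fact_id):
    """Split affected vectors into 'drop' and 're-embed' lists.

    chunk_facts: vector_id -> set of fact ids embedded in that chunk
    (an assumed provenance map recorded at ingestion time).
    """
    drop, reembed = [], []
    for vector_id, facts in chunk_facts.items():
        if fact_id not in facts:
            continue  # vector is unrelated to the erased fact
        if facts == {fact_id}:
            drop.append(vector_id)      # chunk held only the erased fact
        else:
            reembed.append(vector_id)   # rebuild the chunk without the fact
    return sorted(drop), sorted(reembed)
```

Without the provenance map there is nothing to iterate over, which is why a store that embeds multi-fact chunks and keeps no record of their contents is left with the rebuild as its only reliable option.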

The second problem is derived facts. Agentic memory systems do not just store raw observations. They consolidate them. They run deduplication. They produce synthesised summaries that combine multiple source facts into a single semantic node. When you delete the source fact, the synthesised node may still hold the deleted information in a form that is factually accurate but no longer legally permissible to retain. You need to identify every derived node that was produced from the deleted source, assess whether the derived information survives the deletion, and cascade the deletion or regeneration accordingly. That is a graph surgery problem, not a row deletion problem.
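The graph surgery can be sketched as a traversal over provenance edges. This assumes the memory system records, for every synthesised node, the set of nodes it was derived from; the function name and the delete-versus-regenerate policy are illustrative choices, not a description of any particular engine.

```python
from collections import deque


def cascade(derived_from, erased_source):
    """Return (delete, regenerate) sets for an erased source node.

    derived_from: node -> set of nodes it was synthesised from (assumed).
    """
    # Invert the provenance edges: source -> nodes derived from it.
    children = {}
    for node, sources in derived_from.items():
        for source in sources:
            children.setdefault(source, set()).add(node)

    # Breadth-first walk to find everything transitively derived.
    affected, queue = set(), deque([erased_source])
    while queue:
        current = queue.popleft()
        for child in children.get(current, ()):
            if child not in affected:
                affected.add(child)
                queue.append(child)

    # A node is deleted only if every one of its inputs is gone;
    # otherwise it is regenerated from the inputs that survive.
    gone = {erased_source}
    changed = True
    while changed:
        changed = False
        for node in affected - gone:
            if derived_from[node] <= gone:
                gone.add(node)
                changed = True

    delete = gone - {erased_source}
    return delete, affected - delete
```

Whether a regenerated node is legally clean still needs a human or policy decision; the traversal only guarantees that no derived node is overlooked.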

The third problem is caches and replicas. A memory system optimised for the ~3 ms recall figure that the governed memory engine achieves will have multiple layers of caching. If a deletion request comes in while a particular fact is resident in a warm cache, the cache needs to know about the deletion before the next retrieval, not after the next cache eviction. This is a consistency problem that most memory architectures do not treat as first-class, because they were designed for retrieval performance, not for erasure guarantees.
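The ordering requirement, that the cache learns about the deletion before the next retrieval, can be shown with a single-process sketch. `CachedStore` is a hypothetical toy; a real deployment with distributed caches would need an equivalent guarantee across nodes, which this example does not attempt.

```python
import threading


class CachedStore:
    """Toy read-through cache where deletes evict before returning."""

    def __init__(self):
        self._lock = threading.Lock()
        self._primary = {}
        self._cache = {}

    def put(self, key, value):
        with self._lock:
            self._primary[key] = value
            self._cache.pop(key, None)

    def get(self, key):
        with self._lock:
            if key in self._cache:
                return self._cache[key]
            value = self._primary.get(key)
            if value is not None:
                self._cache[key] = value  # warm the cache on first read
            return value

    def hard_delete(self, key):
        with self._lock:
            # Evict and remove under one lock, so no reader can observe
            # a cached copy once hard_delete has returned.
            self._cache.pop(key, None)
            return self._primary.pop(key, None) is not None
```

The design choice is that eviction is part of the delete operation itself rather than a side effect of a later cache expiry.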

The fourth problem is the audit trail for the deletion itself. GDPR Article 17(1) creates the right to erasure. Article 5(2) creates the accountability obligation: the data controller must be able to demonstrate compliance with the principles. Demonstrating compliance with an erasure request requires evidence that the erasure happened, when it happened, what data was affected, and that it was complete. A soft-delete with a “will be vacuumed eventually” posture generates none of that evidence reliably. The deletion event needs to be durable, tamper-evident, and specific.

Interactive figure (bitemporal timeline): selecting a fact and applying a GDPR hard delete removes it from both time axes and adds a signed deletion event to the audit log. Scrubbing the as-of slider confirms the fact is absent at all time points.

The diagram above shows the two-axis bitemporal structure that makes this tractable. Each fact has a valid-time (when it was true in the world) and a transaction-time (when it was recorded in the system). A hard delete removes the fact from both axes simultaneously, not just from the valid-time view. This is important because a regulator asking “what did your system know about this person at any point during this period?” needs a clean answer across the transaction-time axis, not just the valid-time one. The deletion event that appears in the audit log has its own transaction-time entry: it is the permanent record of when the erasure was carried out and by what authority.
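The two-axis model can be reduced to a small sketch. The class and field names below are assumptions made for illustration; the point is only that an as-of query takes both a valid-time and a transaction-time argument, and that a hard delete removes the fact for every combination of the two while the deletion event gets its own transaction-time entry.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Fact:
    fact_id: str
    text: str
    valid_from: date   # valid time: when it became true in the world
    recorded_on: date  # transaction time: when the system recorded it


class BitemporalStore:
    def __init__(self):
        self.facts = []
        self.audit = []

    def record(self, fact):
        self.facts.append(fact)

    def as_of(self, valid_at, known_at):
        """Facts true at valid_at, as the system knew them at known_at."""
        return [f for f in self.facts
                if f.valid_from <= valid_at and f.recorded_on <= known_at]

    def hard_delete(self, fact_id, deleted_on, authority):
        # Removing the row removes it from both axes at once.
        self.facts = [f for f in self.facts if f.fact_id != fact_id]
        # The deletion event has its own transaction time and no payload.
        self.audit.append({"event": "hard_delete", "fact_id": fact_id,
                           "recorded_on": deleted_on, "authority": authority})
```

After `hard_delete`, `as_of` returns no trace of the fact for any `known_at`, including dates before the deletion, which is the property a soft-delete flag filtered at query time cannot offer at the storage level.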

How the governed memory engine and OAMP treat it as a first-class operation

The practical test for any memory system is what happens when you call the delete endpoint. Three outcomes are possible.

The first is the soft-delete: the record is marked as deleted, excluded from query results going forward, but not actually removed from storage. Vacuum or garbage collection will get to it eventually. This is adequate for operational performance reasons but inadequate for regulatory ones: the data is still there, and demonstrating that it cannot be accessed is a matter of trusting the access-control layer rather than the storage layer.

The second is the hard-delete without provenance: the record is removed from storage. Better. But the audit trail for the deletion may itself be incomplete, and derived data may remain.

The third is what the governed memory engine calls a cryptographic hard-delete. The operation is atomic across all representations of the fact: the primary storage, the embedding index, the consolidated graph nodes derived from it, and any cached copies. The operation produces a signed deletion record that commits to a hash of the deleted data, the time of the deletion, the identity (through the identity service’s Ed25519 authority) of the operation that triggered it, and the scope under which the deletion was authorised. The deletion record itself enters the append-only signed audit chain. You cannot remove the deletion record without breaking the chain.
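The shape of such a deletion record can be sketched with the standard library. Two loud assumptions: the field names are hypothetical, and HMAC-SHA256 stands in for the Ed25519 signature described above, because Python's standard library has no Ed25519 implementation. A real system would sign with an asymmetric key so that verifiers do not need the signing secret.

```python
import hashlib
import hmac
import json


def _canonical(body):
    # Stable serialisation so signer and verifier hash identical bytes.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def make_deletion_record(payload, scope, actor, deleted_at, signing_key):
    body = {
        "data_sha256": hashlib.sha256(payload).hexdigest(),  # commitment only
        "scope": scope,
        "actor": actor,
        "deleted_at": deleted_at,
    }
    # HMAC is a stand-in for the Ed25519 signature (assumption, see lead-in).
    signature = hmac.new(signing_key, _canonical(body), hashlib.sha256).hexdigest()
    return {**body, "signature": signature}


def verify_deletion_record(record, signing_key):
    body = {k: v for k, v in record.items() if k != "signature"}
    expected = hmac.new(signing_key, _canonical(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["signature"])
```

The record commits to what was deleted, under which scope, by whom, and when, without containing the deleted data itself.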

The phrase “cryptographic hard-delete” is doing specific work here. It means the deletion is verifiable: a regulator or an auditor can inspect the signed deletion record and confirm that a specific piece of data was removed, from a specific scope, at a specific time, by a specific authorised operation, and that the audit chain has not been altered since. It is not a claim that data has been perfectly expunged from every conceivable physical medium (backups, disaster recovery tapes). It is a claim that the live memory system has no accessible representation of the data, and that the evidence of the deletion is itself tamper-evident.

OAMP provides the portable semantic contract for this operation. The crypto_shred delete mode in the OAMP delete semantics requires backends to apply the operation across all derived representations, report specifically which derived nodes were affected, and produce a provenance record for the operation. Backends that advertise governance.hard_delete.supported: true on their capabilities endpoint have declared compliance with this contract. A regulated buyer can therefore write RFP language that requires OAMP hard-delete compliance and test against it, rather than relying on vendor prose.
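A buyer testing this claim might start by checking the advertised flag. The sketch below assumes the capabilities endpoint returns a JSON document in which `governance.hard_delete.supported` is a nested path; that layout, and the helper names, are assumptions for illustration rather than something the OAMP text quoted here specifies.

```python
def capability(doc, dotted_path, default=None):
    """Walk a nested dict along a dotted path, returning default if absent."""
    node = doc
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def supports_hard_delete(doc):
    # Fail closed: only the literal boolean True counts; a missing key
    # or a string such as "true" is treated as not supported.
    return capability(doc, "governance.hard_delete.supported") is True
```

The flag is a declaration, not evidence; it tells you which backends are worth running the actual deletion test against.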

Interactive figure (audit chain): the deletion event is a signed block in the audit chain, recorded alongside the regular action history of the agent. Each block exposes its signed fields, and a tamper simulation shows what happens to chain integrity when a deletion record is altered.

This is the audit chain view. The deletion event is not a separate log. It is a block in the same signed hash chain as every other agent action: it has a block index, a prev-hash pointer, the identity of the authorising operation, and its own Ed25519 signature. Tampering with the deletion record would break the chain exactly as tampering with any other block would. The deletion is provable to the same standard as any other action the factory took.
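The tamper-evidence property comes from the hash chain and can be demonstrated in a few lines. This sketch covers only the block index and prev-hash linkage; the per-block Ed25519 signatures are omitted, and the block layout and function names are assumptions for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # prev-hash used for the first block


def block_hash(index, prev_hash, event):
    canonical = json.dumps(
        {"index": index, "prev_hash": prev_hash, "event": event},
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode()).hexdigest()


def append(chain, event):
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    index = len(chain)
    chain.append({"index": index, "prev_hash": prev_hash, "event": event,
                  "hash": block_hash(index, prev_hash, event)})


def first_broken_block(chain):
    """Return the index of the first inconsistent block, or None if intact."""
    prev_hash = GENESIS
    for i, block in enumerate(chain):
        if (block["index"] != i
                or block["prev_hash"] != prev_hash
                or block["hash"] != block_hash(i, prev_hash, block["event"])):
            return i
        prev_hash = block["hash"]
    return None
```

Altering a deletion event after the fact changes its hash, so verification fails at that block; recomputing the hash to hide the change breaks the prev-hash pointer of the next block instead.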

What this satisfies simultaneously: GDPR Art 17 and the EU AI Act

These two instruments are often treated as parallel obligations with separate compliance programmes. They share a common structural requirement that makes treating them separately fragile.

GDPR Article 17 creates the right to erasure. The conditions are well-known: the data is no longer necessary for the purpose it was collected, the data subject withdraws consent, the data subject objects under Article 21, the data was unlawfully processed, or erasure is required by Union or Member State law. Where any of these applies, the controller must erase the personal data without undue delay and, where the data has been made public or disclosed to recipients, take reasonable steps to inform those parties of the erasure (Articles 17(2) and 19).

The complication for AI memory systems is that the definition of “personal data” in the context of AI-derived facts is not always obvious. A synthesised memory node that combines publicly stated preferences with behavioural patterns may constitute personal data under GDPR’s broad definition even if neither input is obviously personal in isolation. The practical advice from several EU national DPAs in 2025 and early 2026 has been to treat any AI-derived fact that relates to or could be used to infer characteristics about an identified or identifiable individual as personal data unless the controller can positively demonstrate otherwise. That puts the burden on the controller, not on the regulator.

The EU AI Act’s Article 12 obligation (applicable to high-risk systems under Annex III, with August 2026 as the enforcement deadline for the initial scope) requires that high-risk AI systems technically allow for the automatic recording of events (logs) over the lifetime of the system. Under Article 12(2), the logs must support identifying situations that may present a risk or lead to a substantial modification, post-market monitoring, and monitoring of the system’s operation. For the remote biometric identification systems in Annex III point 1(a), Article 12(3) adds a specific minimum: the period of each use of the system, the reference database against which the input data was checked, the input data that led to a match, and the identity of the natural persons involved in verifying the results.

The conjunction is this: the same event log that satisfies Article 12’s logging requirement must also be consistent with Article 17’s erasure requirement. If you delete a personal data fact from the memory system and the Article 12 log contains direct copies of that fact, you have a conflict between two binding obligations. The GDPR erasure applies. The AI Act log requirement also applies. The only way to satisfy both simultaneously is to design the audit log such that it records the existence and effect of events without storing the personal data values themselves as retrievable content after an erasure.

The governed memory engine’s audit chain stores hashes and metadata, not payload content, in the tamper-evident record. The deletion event records a hash of the deleted data, not the data itself, and the hash is sufficient for later verification that the correct data was deleted. A regulator who wants to verify that a specific piece of personal data has been erased can compute the hash of the data from the subject’s original request and confirm it matches the deletion record, without the audit system needing to hold the personal data in the log.
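A payload-free audit event and the later hash comparison might look like the following. The event fields and function names are hypothetical, and the sketch assumes both parties agree on one canonical serialisation of the data, since any difference in field order or encoding would change the digest.

```python
import hashlib
import json


def digest(fields):
    # Canonical form: sorted keys, no extra whitespace, UTF-8 bytes.
    canonical = json.dumps(fields, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit_event(action, fields, at):
    # Only the digest and metadata are logged; the field values are not.
    return {"action": action, "at": at, "payload_sha256": digest(fields)}


def matches(event, candidate_fields):
    """True if the candidate data is what the logged event committed to."""
    return event["payload_sha256"] == digest(candidate_fields)
```

The log can then answer "was this exact data the subject of the recorded deletion?" for someone who already holds the data, while giving nothing retrievable to someone who does not.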

This is not incidental design. It is the only architecture that actually satisfies both obligations simultaneously rather than trading one off against the other.

The sector walkthrough: healthcare claims and the patient who changed their mind

Take a concrete scenario. A healthcare AI system operating under a hospital trust processes clinical coding and claims. An agent ingests consultation notes (with appropriate consent and data processing agreements), extracts diagnoses, maps them to ICD-10 codes, and builds a memory representation of the patient’s clinical history to support future coding accuracy.

Eighteen months later, the patient requests erasure of their data from the system. This is a legitimate Article 17 request. The hospital trust’s data protection officer must act on it.

Before the governed factory approach, the process would look like this. The DPO runs a data mapping exercise to identify all systems holding the patient’s data. The clinical coding AI system appears on the list. The vendor is contacted. They confirm that the patient’s records can be deleted from the primary database. But the AI memory layer is a separate system. The embeddings are in a vector index. The agent’s memory graph contains synthesised nodes derived from the patient’s historical records. The vendor is not entirely sure what was synthesised, or whether deleting the source records cascades to the derived representations. The DPO marks the erasure request as pending further investigation. Legal is involved. Three months later the situation is still not fully resolved.

After the governed factory approach with the governed memory engine:

The DPO issues a deletion request via OAMP’s cryptographic hard-delete API, scoped to the patient’s identity. The operation is atomic: primary records, embedding index entries, derived graph nodes that were produced from the patient’s data, and cached representations are all covered in a single transactional operation. A signed deletion record enters the audit chain with the DPO’s authorised identity, the affected scope, and the timestamp. The operation completes in seconds and returns a cryptographic receipt. The OAMP capabilities endpoint allows the DPO to confirm that the backend declares conformance with the v1.3 normative requirements, and the receipt is the evidence that this particular operation was executed. The Article 17 obligation is discharged. The Article 12 log still exists and is tamper-evident, but it does not hold the deleted personal data in retrievable form.

Before: months of uncertainty, legal exposure, and a compliance programme that looks unimpressive to the regulator. After: an automated, evidenced, verifiable operation that produces its own proof.

The same pattern applies to financial services, where GDPR erasure requests interact with MiFID II record-keeping requirements in ways that require careful architectural design. The same pattern applies to government casework, where citizen data subject rights apply to AI-assisted decision making under the UK Data Protection Act 2018 and equivalent instruments. In each case, the enabler is the same: an AI memory system that treats deletion as a first-class cryptographic operation rather than an operational afterthought.

The liability surface for glue-stack memory

If you are evaluating an agentic system that uses a third-party vector store, a graph database not purpose-built for AI memory, or a roll-your-own Postgres schema for agent context, the deletion question is where the liability surface concentrates.

Vector stores are explicitly designed for fast approximate-nearest-neighbour search. Deletion from a vector index is an operation that most vector database vendors acknowledge is harder than insertion. Some implementations require a full index rebuild to guarantee deletion completeness. In a production system with continuous ingestion, a full index rebuild can be a multi-hour operation, during which time the “deleted” vectors may still be retrievable. The “without undue delay” standard in GDPR Article 17(1) gives some room, but it is not unlimited, and the right to erasure does not come with a carve-out for “technically inconvenient at scale.”

Graph databases designed for general-purpose knowledge representation face the derived-facts problem described above. A deletion from a general graph DB removes a node and its direct edges. It does not automatically propagate to nodes that were synthesised from the deleted node’s content. That synthesis cascade is application logic that the system integrator has to write and maintain. Most do not write it at all, and many do not know they need to.

The Postgres pattern, where agent memory is stored as rows in a relational table and an agent queries against it, is the most familiar and the most misleading. The rows are deletable. But if an LLM has read those rows and stored a summary or embedding derived from them, the deletion of the source rows does not affect the model’s derived representation. If that representation is cached in another table, or in a vector extension, or in an agent’s in-context window that gets persisted somewhere, the deletion is incomplete.

None of this is a reason to avoid building agentic systems. It is a reason to choose memory infrastructure that was designed with deletion as a first-class requirement from the beginning, rather than infrastructure designed for retrieval that has deletion bolted on, or not bolted on at all.

What to demand in an RFP

If you are conducting a procurement for any agentic system that will hold personal data about EU, UK, or California-based data subjects (among others), the memory deletion question should appear explicitly in the technical specification.

Ask the vendor to describe their hard-delete mechanism for AI memory, specifically including: how derived representations (synthesised nodes, embeddings, cached inference outputs) are handled; what the latency of a complete deletion operation is from request to verified completion; and what evidence they produce that the deletion was executed correctly.

Ask whether their memory system distinguishes between soft-delete (marking as deleted) and hard-delete (removing from all storage representations). Ask whether the hard-delete is an atomic transaction or a deferred process. Ask what the consistency guarantee is if a retrieval request arrives during a deletion operation.

Ask whether deletion events are logged in the same tamper-evident audit chain as regular agent actions. If the answer is “in a separate log,” ask how they prove the deletion log has not been altered.

Ask whether their system satisfies both GDPR Article 17 and EU AI Act Article 12 simultaneously, and ask them to describe the specific design that achieves this. If they answer “yes” without being able to describe the design, the design probably does not exist.

Ask for a demonstration. Specifically: insert a known test record, issue a hard-delete for it, retrieve the deletion receipt, then attempt to retrieve the deleted record and verify that it returns nothing. Then ask how the system handles derived representations of the deleted record. The second question will tell you more than the first.
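The demonstration can be written down as a small acceptance check. The backend interface here (`insert`, `get`, `derived_from`, `hard_delete`) and the receipt fields are hypothetical, and `InMemoryBackend` is a toy included only so the check runs; a vendor's real API will differ and would be wrapped to fit.

```python
class InMemoryBackend:
    """Toy backend, used only to make the check runnable (assumption)."""

    def __init__(self):
        self.records, self.derived = {}, {}

    def insert(self, record_id, text):
        self.records[record_id] = text
        # Simulate a synthesised summary derived from the record.
        self.derived[f"summary:{record_id}"] = {"source": record_id,
                                                "text": text[:20]}

    def get(self, record_id):
        return self.records.get(record_id)

    def derived_from(self, record_id):
        return [k for k, v in self.derived.items() if v["source"] == record_id]

    def hard_delete(self, record_id):
        affected = self.derived_from(record_id)
        for key in affected:
            del self.derived[key]
        existed = self.records.pop(record_id, None) is not None
        return {"record_id": record_id, "deleted": existed,
                "derived_removed": affected}


def run_deletion_demo(backend, record_id="test-001"):
    """Insert a known record, hard-delete it, and report what survives."""
    backend.insert(record_id, "known test record for the deletion demo")
    receipt = backend.hard_delete(record_id)
    return {
        "receipt_issued": bool(receipt and receipt.get("deleted")),
        "record_gone": backend.get(record_id) is None,
        "derived_gone": backend.derived_from(record_id) == [],
    }
```

The `derived_gone` result is the one to watch: a backend that deletes the primary record but leaves the summary behind passes the first two checks and fails the third.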

Finally, ask how the system behaves in an air-gapped or private cloud deployment, where no vendor infrastructure is involved. The deletion mechanism should be entirely under the customer’s control. If the deletion guarantee depends on a hosted API, that guarantee has the same sovereignty properties as the data itself.

A 90-day pilot design

For most regulated enterprises, the fastest way to test the deletion claim is to run a parallel operation alongside an existing data subject request process.

Pick a category of personal data that already generates occasional Article 17 requests. Enrol a small group of data subjects into a sandboxed version of the agentic system. Allow the system to build a genuine memory representation over four to six weeks. Then process a simulated deletion request for one of the enrolled subjects.

Three things tell you whether the system is real. The latency from request to verified completion: if it takes more than a few minutes for a non-bulk deletion, the architecture may have a deferred-cascade problem. The completeness of the affected-scope report: a well-designed system should be able to tell you exactly which primary records, derived nodes, and cached representations were covered by the deletion. And the audit evidence: the signed deletion record should be verifiable against the audit chain independently, without needing to contact the vendor.

If all three are satisfactory in the sandbox, the factory’s deletion capability is a feature you can rely on in production. If any of them reveals a gap, you have found the gap in a controlled environment at low cost, rather than in response to an actual regulatory enforcement action.

The governance by omission principle, properly implemented, means that a regulator auditing your system after a deletion should find not a redacted record, not a tombstone, not a gap in the sequence numbers, but simply nothing. The memory was never there for this scope. The only trace is the signed deletion event that proves it used to be and was properly removed. That is the standard. It is achievable. And the factory is the infrastructure that makes it the default rather than an aspiration.

For the deeper mechanics on how the governed memory engine’s bitemporal structure underpins compliance across time, read “Bitemporal memory as the compliance backbone”. For how OAMP’s interoperability layer preserves this guarantee when you compose memory across multiple backends, read “OAMP: compose best-in-class memory without lock-in”. The specific EU AI Act logging obligations that constrain what the audit chain must and must not contain are mapped in detail in “EU AI Act Article 12: what high-risk systems must log”.

If you are at the stage of evaluating the factory for a regulated use case and want the full architecture picture with the head-to-head against glue stacks, the investor brief covers that in depth. You can request it from the Substrate page.