Article

Your Agent's Memory Dies When You Switch Tools. We're Fixing That.

author Jonathan Conway
timestamp 27 April 2026
classification oamp / agent-memory / open-standard / portability / privacy / rust / typescript / python / go / elixir / interoperability

We’ve been building agent memory systems for months. Every week we hear the same story from developers: they spent weeks teaching an agent about their codebase, their preferences, their deployment patterns. The agent learned that they hate verbose explanations. It learned their Rust error handling style. It learned that they deploy to staging before production, always.

Then they switched tools. All of it gone.

Not because the data was deleted. The memories still exist on some server, serialized in whatever format that particular framework invented. Corrections the user spent real time making. Expertise the agent had calibrated across dozens of sessions. Communication preferences refined over weeks of interaction. All locked behind a proprietary schema that no other tool can read.

Every agent framework stores memory differently. Mem0 has its format. Zep has its format. Letta has its format. Custom solutions have theirs. If you want to move from one to another, you write a migration script. If you want to use two agents that share context about you, you can’t. The memory exists but it’s trapped.

We built the Open Agent Memory Protocol to fix this. OAMP defines a JSON schema for agent memory documents, a REST API contract for backends, and privacy requirements for every compliant implementation. The spec is at v1.0.0. Reference implementations ship in Rust, TypeScript, Python, Go, and Elixir. A FastAPI reference backend ships via pip install oamp-server or as a Docker image. MIT licensed.

This post covers what we built, why we made the design decisions we made, and how to use it.

OAMP portability before and after


What OAMP defines

OAMP is three things stacked together. A document schema (JSON Schema draft 2020-12, with Protocol Buffer definitions for high-throughput pipelines). A REST API contract (ten normative endpoints). A set of privacy requirements written in RFC 2119 language as MUST clauses, not advisory text.

OAMP layered architecture

OAMP has three document types: Knowledge Entries, Knowledge Stores, and User Models.

A Knowledge Entry is a discrete piece of information an agent has learned about a user. Every entry has a category, a confidence score, and provenance tracking. Here’s what one looks like:

{
  "oamp_version": "1.0.0",
  "type": "knowledge_entry",
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "user_id": "user-alice-123",
  "category": "preference",
  "content": "User prefers concise, direct answers without excessive explanation.",
  "confidence": 0.85,
  "source": {
    "session_id": "sess-2026-03-15-001",
    "agent_id": "ultrasushitron-v2",
    "timestamp": "2026-03-15T14:32:00Z"
  },
  "decay": {
    "half_life_days": 140.0,
    "last_confirmed": "2026-03-28T09:15:00Z"
  },
  "tags": ["communication", "response-style"],
  "metadata": {}
}

Every field in that document earns its place. Let’s walk through the ones that matter most.

Four categories, no more

The category field accepts exactly four values: fact, preference, pattern, and correction. We locked this at four in v1.0 and barred implementations from adding custom categories.

Facts are objective. “User works at Acme Corp.” “Project uses PostgreSQL 15.” No evaluation, no opinion.

Preferences are stated or inferred choices about how the agent should behave. “Prefers Rust over Python.” “Likes dark mode.” These carry decay because preferences change.

Patterns are recurring behaviors the agent has observed across multiple sessions. “Deploys to staging before production.” “Reviews PRs in the morning.” Patterns require multiple observations. A single data point is a fact, not a pattern.

Corrections are first-class data. When a user tells an agent “Don’t use unwrap(), use proper error handling,” that’s not a side effect. It’s one of the most valuable signals an agent can receive. Most memory systems treat corrections as ephemeral conversation context. We gave them their own category with a recommended confidence floor of 0.9 and no temporal decay. Corrections persist until the user explicitly supersedes them.

Why only four? Vendor extensions go in tags and metadata. If we let everyone define custom categories, import/export breaks immediately. An agent that exports a "habit" entry to a system that only understands the base four has to either drop it or guess. Four categories that every implementation shares is better than twenty that none agree on.
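The escape hatch looks like this in practice: a sketch of a vendor-specific "habit"-style signal riding inside the base four. The habit tags and the vendor.* metadata field are invented for illustration, not part of the spec:

```python
import json

# A vendor that wants a "habit" concept stays within the four base
# categories: the closest base category carries the content, and the
# vendor-specific detail rides in tags and metadata.
entry = {
    "oamp_version": "1.0.0",
    "type": "knowledge_entry",
    "id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
    "user_id": "user-alice-123",
    "category": "pattern",  # not "habit" -- every importer understands "pattern"
    "content": "Checks CI status first thing every morning.",
    "confidence": 0.7,
    "source": {"session_id": "sess-010", "timestamp": "2026-04-01T10:00:00Z"},
    "tags": ["habit", "morning-routine"],
    "metadata": {"vendor.habit_strength": 0.8},  # illustrative vendor field
}

# The document round-trips through any compliant system unchanged; a
# backend that ignores the metadata still handles the entry correctly.
assert entry["category"] in {"fact", "preference", "pattern", "correction"}
doc = json.loads(json.dumps(entry))
```

A system that doesn’t understand the vendor fields simply carries them along; nothing gets dropped and nothing has to be guessed.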

Confidence with temporal decay

Every Knowledge Entry carries a confidence score from 0.0 to 1.0. Stated facts from users get high initial confidence. Inferred patterns get lower scores. Corrections from users get 0.9 or above because the user explicitly told the agent what to do.

Confidence decays over time using an exponential half-life model:

confidence_t = confidence_0 * e^(-ln(2) / half_life_days * age_days)

Default half-lives vary by category. Facts decay slowly at 365 days because your employer doesn’t change often. Preferences decay at 70 days because they evolve. Patterns decay at 90 days. Corrections don’t decay at all.

A preference stored six months ago with no reconfirmation should not carry the same weight as one confirmed yesterday. Without decay, stale memories accumulate and pollute retrieval. With decay, the system naturally deprioritizes old knowledge that hasn’t been reinforced.
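The model is a one-liner in any language. A minimal sketch in Python, using the default half-lives from the paragraph above (the constant and function names are ours, not the SDK’s):

```python
import math
from typing import Optional

# Per-category default half-lives, in days, matching the spec text.
DEFAULT_HALF_LIFE_DAYS: dict[str, Optional[float]] = {
    "fact": 365.0,
    "preference": 70.0,
    "pattern": 90.0,
    "correction": None,  # corrections never decay
}

def decayed_confidence(confidence_0: float,
                       half_life_days: Optional[float],
                       age_days: float) -> float:
    """Exponential half-life decay: c_t = c_0 * e^(-ln(2)/h * t)."""
    if half_life_days is None:
        return confidence_0  # no decay for this category
    return confidence_0 * math.exp(-math.log(2) / half_life_days * age_days)

# A preference at its default half-life drops to half weight in 70 days.
print(round(decayed_confidence(0.85, 70.0, 70.0), 3))  # 0.425
# A correction keeps its confidence regardless of age.
print(decayed_confidence(0.98, None, 200.0))           # 0.98
```

A retrieval layer can apply this at query time rather than rewriting stored confidence values, so the original score and `last_confirmed` timestamp stay auditable.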

Provenance tracking

Every Knowledge Entry MUST have a source object with a session_id and timestamp. No anonymous knowledge entries. If an agent stores something about a user, the user has the right to know when it was stored and in what context.

Provenance makes auditing possible. It makes GDPR compliance possible. And it makes debugging possible. When an agent acts on stale knowledge, provenance tells you exactly when that knowledge was acquired and whether it was ever confirmed.


The User Model

Knowledge Entries capture individual facts and preferences. The User Model captures the agent’s structured understanding of who this person is. Expertise levels, communication style, correction history, stated preferences.

{
  "oamp_version": "1.0.0",
  "type": "user_model",
  "user_id": "user-alice-123",
  "model_version": 7,
  "updated_at": "2026-03-28T12:00:00Z",
  "communication": {
    "verbosity": -0.6,
    "formality": 0.2,
    "prefers_examples": true,
    "prefers_explanations": false,
    "languages": ["en", "ja"]
  },
  "expertise": [
    {
      "domain": "rust",
      "level": "expert",
      "confidence": 0.95,
      "evidence_sessions": ["sess-001", "sess-003", "sess-005"],
      "last_observed": "2026-03-28T09:00:00Z"
    },
    {
      "domain": "kubernetes",
      "level": "advanced",
      "confidence": 0.80,
      "evidence_sessions": ["sess-002"],
      "last_observed": "2026-03-15T14:00:00Z"
    }
  ],
  "corrections": [
    {
      "what_agent_did": "Suggested using unwrap() for quick prototyping",
      "what_user_wanted": "Always use proper error handling, even in examples",
      "context": "Rust code generation",
      "session_id": "sess-003",
      "timestamp": "2026-03-12T16:45:00Z"
    }
  ],
  "stated_preferences": [
    { "key": "theme", "value": "dark", "timestamp": "2026-03-10T10:00:00Z" },
    { "key": "response-length", "value": "concise", "timestamp": "2026-03-15T14:00:00Z" }
  ]
}

Communication profiles on continuous scales

Verbosity and formality are continuous floats from -1.0 to 1.0. Not “low/medium/high.” Not “casual/formal.”

We chose continuous scales because categorical labels collapse meaningful differences. A user who wants slightly less verbose output (-0.3) and a user who wants telegram-style brevity (-0.9) both fall into “low verbosity” under a categorical model. The agent treats them identically. Continuous scales preserve the gradient.

Zero is the default. Agents start there and adjust as they learn. Moving from 0.0 to -0.6 over ten sessions tells a story that “terse” never could.
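How an agent consumes the float is outside the spec. The mapping below is invented for illustration only; whether you translate verbosity into a token budget, system-prompt phrasing, or sampling settings is an application choice:

```python
def target_response_tokens(verbosity: float, baseline: int = 400) -> int:
    """Map a [-1.0, 1.0] verbosity score onto a token budget.

    At -1.0 the budget drops to 25% of baseline; at +1.0 it grows to 4x.
    The exponential mapping and the baseline are application choices,
    not part of OAMP -- the spec only guarantees the scale's meaning.
    """
    if not -1.0 <= verbosity <= 1.0:
        raise ValueError("verbosity must be in [-1.0, 1.0]")
    return int(baseline * 2.0 ** (2.0 * verbosity))

print(target_response_tokens(0.0))   # 400 -- neutral default
print(target_response_tokens(-0.6))  # 174 -- noticeably terse
print(target_response_tokens(-0.9))  # 114 -- telegram-style brevity
```

Note how -0.6 and -0.9 land on meaningfully different budgets; a categorical "low verbosity" label would have collapsed them into one.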

Expertise domains

Each entry in the expertise array tracks a domain, a level (novice/intermediate/advanced/expert), the agent’s confidence in that assessment, and the sessions where expertise was observed. An agent that knows a user is a Rust expert can skip basic explanations. An agent that knows a user is a React novice can provide more scaffolding.

Expertise levels are categorical here because the difference between levels is qualitative, not quantitative. An “advanced” Kubernetes user and an “expert” Kubernetes user need different kinds of help. The confidence score captures how certain the agent is in the assessment.

Stated preferences vs inferred preferences

OAMP separates stated_preferences from inferred knowledge in the User Model. When a user explicitly says “I want concise responses,” that carries more weight than an inference the agent made from observing short reply patterns. Stated preferences are timestamped declarations. Inferred preferences live in Knowledge Entries with lower confidence scores.
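A consumer applies that precedence at lookup time. A sketch of the resolution order (the helper is ours; the field names match the documents above):

```python
def resolve_preference(key, stated_preferences, knowledge_entries):
    """Prefer an explicit stated preference over inferred knowledge.

    Falls back to the highest-confidence "preference" knowledge entry
    tagged with the key. The helper itself is illustrative; the
    precedence rule is the point: stated declarations outrank inferences.
    """
    for pref in stated_preferences:
        if pref["key"] == key:
            return pref["value"], "stated"
    candidates = [e for e in knowledge_entries
                  if e["category"] == "preference" and key in e.get("tags", [])]
    if candidates:
        best = max(candidates, key=lambda e: e["confidence"])
        return best["content"], "inferred"
    return None, None

stated = [{"key": "response-length", "value": "concise",
           "timestamp": "2026-03-15T14:00:00Z"}]
inferred = [{"category": "preference", "confidence": 0.6,
             "content": "Seems to prefer long answers",
             "tags": ["response-length"]}]
print(resolve_preference("response-length", stated, inferred))
# ('concise', 'stated') -- the declaration wins over the inference
```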

Knowledge entry and user model schema


Privacy as a hard requirement

Most agent memory specs treat privacy as a feature. OAMP treats it as a compliance requirement. Violations mean you cannot claim OAMP compliance.

Encryption at rest is MUST, not SHOULD. All stored knowledge and user model data MUST be encrypted at rest. AES-256-GCM is recommended. Plaintext storage is a compliance violation. Not a warning. Not a suggestion. A violation.

We made this non-negotiable because agent memory is personal data. It contains user preferences, behavioral patterns, corrections, expertise assessments, and communication styles. Under GDPR Article 17, users have the right to erasure. Under CCPA, they have the right to deletion. An agent memory system that stores this data in plaintext on disk is storing personal data without adequate protection. It shouldn’t exist.

Real deletion, not soft-delete. When a user calls the DELETE endpoint, the data is gone. Permanently. Not flagged as deleted. Not moved to a tombstone table. Gone. Backends that implement soft-deletion cannot claim OAMP compliance.

No content in logs. Implementations MUST NOT log knowledge content, user model field values, or correction text. Log the entry ID, the category, the timestamp. Never the content. A log file that contains “User prefers dark mode” is a content leak. A log file that contains “PATCH knowledge/550e8400 category=preference” is fine.
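A sketch of what a compliant audit line looks like in practice (the exact format is ours; the spec only forbids content):

```python
import logging

logger = logging.getLogger("oamp.backend")

def audit_line(method: str, entry_id: str, category: str) -> str:
    """Build a content-free audit line: operation, ID prefix, category.

    The knowledge content never reaches the formatter, so it can never
    reach a log file, a log shipper, or a third-party log service.
    """
    line = f"{method} knowledge/{entry_id[:8]} category={category}"
    logger.info(line)
    return line

print(audit_line("PATCH", "550e8400-e29b-41d4-a716-446655440000", "preference"))
# PATCH knowledge/550e8400 category=preference
```

The discipline worth copying is structural: keep content fields out of the logging call entirely, rather than trying to redact them downstream.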

Provenance is mandatory. Every Knowledge Entry requires a source with session_id and timestamp. No anonymous data collection. Users can audit what was stored, when, and by which agent.

We reference GDPR Article 17 and CCPA in the companion security guide because these aren’t theoretical concerns. Companies building agent systems for European or Californian users are already subject to these regulations. The spec is opinionated here because the alternative is letting every implementation decide independently how seriously to take privacy. We’ve seen how that goes.


The backend REST API

OAMP defines a REST API contract for memory backends. Ten endpoints covering knowledge CRUD, user model management, and bulk import/export.

POST   /v1/knowledge             -- store a KnowledgeEntry
GET    /v1/knowledge?query=      -- search knowledge
GET    /v1/knowledge/:id         -- retrieve by ID
DELETE /v1/knowledge/:id         -- delete (permanent)
PATCH  /v1/knowledge/:id         -- update confidence, confirm

POST   /v1/user-model            -- store/update UserModel
GET    /v1/user-model/:user_id   -- retrieve
DELETE /v1/user-model/:user_id   -- delete (full reset, including all knowledge)

POST   /v1/export                -- export all data for a user
POST   /v1/import                -- import an OAMP document
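A client needs nothing beyond HTTP and JSON. A minimal sketch of building the store call (the helper name is ours; the payload shape follows the Knowledge Entry example earlier in this post):

```python
import json
import uuid
from datetime import datetime, timezone

def build_store_request(user_id: str, category: str, content: str,
                        confidence: float, session_id: str, agent_id: str):
    """Build the (method, path, JSON body) for POST /v1/knowledge."""
    entry = {
        "oamp_version": "1.0.0",
        "type": "knowledge_entry",
        "id": str(uuid.uuid4()),
        "user_id": user_id,
        "category": category,
        "content": content,
        "confidence": confidence,
        "source": {
            "session_id": session_id,
            "agent_id": agent_id,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        },
        "tags": [],
        "metadata": {},
    }
    return "POST", "/v1/knowledge", json.dumps(entry)

method, path, body = build_store_request(
    "user-alice-123", "preference", "Prefers concise answers",
    0.85, "sess-001", "my-agent")
print(method, path)  # POST /v1/knowledge
```

Pair it with any HTTP client (urllib, requests, httpx) to POST the body to a running backend; no generated client code required.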

Why REST over gRPC

gRPC gives you better performance, binary efficiency, and generated client code. We chose REST as the primary interface because the adoption barrier is lower. Any HTTP client can talk to an OAMP backend. No code generation step. No protobuf toolchain. A curl command works.

For backends that want binary efficiency, the spec supports content negotiation. Send Accept: application/protobuf and get protobuf back. The protobuf definitions ship in the repo under proto/oamp/v1/. But JSON is the baseline that every implementation must support.

Search is deliberately underspecified

GET /v1/knowledge?query= accepts a text query. The spec does NOT mandate full-text search, vector search, hybrid search, or any specific algorithm. Backends choose their implementation. Results must be ranked by relevance. Results must be returned as KnowledgeEntry arrays.

We made this choice because search is where backend differentiation lives. A backend built on PostgreSQL with pgvector has different strengths than one built on a custom spreading activation engine. Mandating a search algorithm would either be too restrictive or too vague. The contract is: take a query, return ranked entries. How you rank them is your business.

Merge semantics

When importing a Knowledge Store, entries with new IDs get inserted. Entries with existing IDs use confidence-based resolution by default. Higher confidence wins. Implementations can define other strategies but must document them. Rejected entries must be reported, never silently dropped.
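The default strategy fits in a dozen lines. A sketch, assuming the store is keyed by entry ID (the helper name is ours):

```python
def merge_import(existing: dict, incoming: list) -> tuple[dict, list]:
    """Default OAMP import merge: insert new IDs, resolve collisions by
    confidence, and report (never silently drop) the losing entries."""
    rejected = []
    for entry in incoming:
        current = existing.get(entry["id"])
        if current is None or entry["confidence"] > current["confidence"]:
            existing[entry["id"]] = entry
        else:
            # Ties keep the existing entry here; the spec leaves
            # tie-breaking to the implementation's documented strategy.
            rejected.append(entry)
    return existing, rejected

store = {"a": {"id": "a", "confidence": 0.9, "content": "original entry"}}
merged, rejected = merge_import(store, [
    {"id": "a", "confidence": 0.5, "content": "lower-confidence duplicate"},
    {"id": "b", "confidence": 0.7, "content": "brand new entry"},
])
print(len(merged), len(rejected))  # 2 1
```

An implementation that chooses a different strategy (newest-wins, manual review) satisfies the contract as long as it documents the strategy and reports what it rejected.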


Reference implementations

A spec document without working code is a wish list. We ship typed libraries in five languages and a working server.

The SDKs are: oamp-types on crates.io for Rust, @deepthinking/oamp-types on npm for TypeScript, oamp-types on PyPI for Python, oamp-go (source in the spec repo) for Go, and oamp_types on Hex for Elixir. Same types, same validators, same JSON output across all five.

The reference backend ships as a separate package, oamp-server. It is a FastAPI app with SQLite storage, FTS5 full-text search, AES-256-GCM encryption at rest on every content field, monotonic version conflict detection on user model writes, and 59 tests covering CRUD, search, validation, and round-trip serialization through the spec example documents. Two ways to run it:

pip install oamp-server && python -m oamp_server
docker run -p 8000:8000 ghcr.io/deep-thinking-llc/oamp-server:latest

OpenAPI and Swagger UI are available at /docs once the server is running.

The two language examples below show the typed SDK pattern. The other three (Python, Go, Elixir) follow the same shape.

Rust: oamp-types on crates.io

use oamp_types::{KnowledgeEntry, KnowledgeCategory};

let entry = KnowledgeEntry::new(
    "user-alice-123",
    KnowledgeCategory::Correction,
    "Always use proper error handling, even in examples",
    0.98,
    "sess-003",
);

let json = serde_json::to_string_pretty(&entry).expect("KnowledgeEntry serializes cleanly");

The crate provides KnowledgeEntry, KnowledgeStore, UserModel, and all nested types. Everything derives Serialize and Deserialize via serde. The validate module provides validation functions:

use oamp_types::validate::validate_knowledge_entry;

match validate_knowledge_entry(&entry) {
    Ok(()) => println!("Valid"),
    Err(errors) => {
        for e in errors {
            eprintln!("Validation error: {}", e);
        }
    }
}

Validation checks confidence range, required fields, type discriminators, and communication profile bounds. Each function returns Result<(), Vec<String>> so you get all errors at once, not one at a time.

TypeScript: @deepthinking/oamp-types on npm

The TypeScript package uses Zod schemas for runtime validation. Every type is both a Zod schema and a TypeScript type.

import { KnowledgeEntry, KnowledgeCategory } from '@deepthinking/oamp-types';

const entry = KnowledgeEntry.parse({
  oamp_version: '1.0.0',
  type: 'knowledge_entry',
  id: crypto.randomUUID(),
  user_id: 'user-alice-123',
  category: 'correction',
  content: 'Always use proper error handling, even in examples',
  confidence: 0.98,
  source: {
    session_id: 'sess-003',
    timestamp: new Date().toISOString(),
  },
});

If validation fails, Zod throws a ZodError with structured error messages. You get type safety at compile time from TypeScript and runtime validation from Zod. Both at once.

The User Model works the same way:

import { UserModel, CommunicationProfile } from '@deepthinking/oamp-types';

const model = UserModel.parse({
  oamp_version: '1.0.0',
  type: 'user_model',
  user_id: 'user-alice-123',
  model_version: 7,
  updated_at: new Date().toISOString(),
  communication: {
    verbosity: -0.6,
    formality: 0.2,
    prefers_examples: true,
    prefers_explanations: false,
    languages: ['en', 'ja'],
  },
  expertise: [{
    domain: 'rust',
    level: 'expert',
    confidence: 0.95,
  }],
});

Why we ship typed libraries

Spec documents get misinterpreted. JSON Schema validation catches structural errors but not semantic ones. Typed libraries close that gap. Run cargo add oamp-types, npm install @deepthinking/oamp-types, or pip install oamp-types; add oamp_types to your mix.exs for Elixir; or clone the repo for the Go SDK (at reference/go/). You start producing valid documents immediately. The type system catches mistakes at compile time. The validation layer catches them at runtime. Together they make it hard to produce an invalid OAMP document.


How OAMP compares to what exists

Several real products already exist in the agent memory space. Here’s how OAMP relates to them.

Honcho is building a hosted dialectic memory service. Their approach models user-agent interaction as an ongoing dialogue. They have opinions about how memory should be structured and queried. They’re building a product.

Mem0 is memory-as-a-service. You send interactions, they extract and store memories, you query them later. They handle the extraction pipeline and storage. They’re building a product.

Zep does memory for LLM applications with Graphiti, their temporal knowledge graph. They have entity resolution, temporal queries, and a managed cloud offering. They’re building a product.

All three are doing good work. None of them define a portable interchange format. If you use Honcho and want to migrate to a self-hosted solution, you write a custom export script. If you use Mem0 and want to bring your memory to a different agent, you reverse-engineer their data model.

OAMP isn’t competing with these products. It’s the interchange layer they should all speak. If Honcho implements OAMP export, users can move their memory to Zep without losing anything. If Mem0 implements the OAMP REST contract, any OAMP-compliant agent can use Mem0 as a backend without Mem0-specific integration code.

Standards succeed or fail based on implementation count. We’re releasing the spec and reference libraries first. Adoption comes from making it easy to implement and painful to ignore.


What we deliberately left out of v1

v1 is tight on purpose. Here’s what we excluded and why.

Session outcomes. Structured records of what was accomplished in each session. Useful for project-management agents. Too implementation-specific for v1. The shape of a “session outcome” varies wildly between coding agents, writing agents, and scheduling agents. We’d rather wait for community input than guess wrong.

Skill metrics. Execution statistics for reusable skills or workflows. How fast did the agent complete a task? What was the success rate? Valuable data, but the definition of “skill” differs across frameworks. v1 agents can store this in metadata if they need it.

Work patterns. Active hours, common task types, tool preferences. Relevant for scheduling-aware agents. Too personal for a v1 requirement without more thought about privacy implications.

Activity timing. Hour-of-day and day-of-week behavioral patterns. Same concern as work patterns.

All of these are candidates for v2. Community feedback drives what gets promoted from metadata to first-class fields. We’d rather ship a tight spec that people actually implement than a sprawling one that nobody finishes reading.


Using OAMP today

The spec is at github.com/deep-thinking-llc/open-agent-memory-protocol. Typed SDKs are published to crates.io (Rust), npm (TypeScript), PyPI (Python), and Hex (Elixir), with the Go SDK in the spec repo. The reference backend ships as oamp-server on PyPI and as a Docker image. MIT licensed.

Building an agent framework and want memory portability? The spec defines the document structure. The reference implementations give you typed libraries to produce and consume OAMP documents.

If you’re building a memory backend and want interop with multiple agents, the REST contract in Section 6 defines the ten endpoints. Implement those, enforce the privacy requirements, and any OAMP-compliant agent can use your backend.

Building a memory product? The export/import endpoints with Knowledge Store documents give users a complete memory snapshot. That snapshot works across any compliant system.

The privacy requirements are not optional. Agent memory is personal data. It deserves the same care as medical records. Encryption at rest, real deletion, no content logging, mandatory provenance. These aren’t features. They’re the minimum bar for responsibly storing what an AI has learned about a person.

We’re building this because we need it for our own products. We’re open-sourcing it because the ecosystem needs it too.
