DEVELOPERS [1 / 8]

remember() here. recall() everywhere. One memory for every agent.

A two-call surface, sub-200ms p95 recall, capability-token identity, and a governed multi-agent memory underneath. Wrap your client, add the MCP plugin, or call the SDK/REST — your subagents inherit memory automatically.

$ npx ultramemory setup

from ultramemory import Memory

mem = Memory()

mem.remember("Acme ships on Fridays")    # 202 -> { trace_id }
ctx = mem.recall("when does Acme ship?") # ranked, token-budgeted
mem.trace(ctx.trace_id)                   # provenance: where each fact came from

Surface: 2 calls · 3 altitudes
Recall p95: sub-200ms SLO
Auth: capability tokens
Clients: SDK · MCP · REST

Two calls. Three altitudes. Works with Claude Code, Cursor, Codex, Cline, and any MCP client.

remember returns a trace_id; recall returns ranked, token-budgeted context; trace shows the lineage of every fact.
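To make "ranked, token-budgeted" concrete, here is a minimal sketch of one way a recall result could be packed into a fixed budget. It is not ultramemory's implementation: the `Fact` type, the `pack_context` helper, the sample facts beyond the Acme one, and the word-count token estimate are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    text: str
    score: float  # hypothetical relevance score, higher is better


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def pack_context(facts: list[Fact], budget: int) -> list[Fact]:
    """Greedily keep the highest-scoring facts that still fit within `budget` tokens."""
    packed: list[Fact] = []
    used = 0
    for fact in sorted(facts, key=lambda f: f.score, reverse=True):
        cost = estimate_tokens(fact.text)
        if used + cost <= budget:
            packed.append(fact)
            used += cost
    return packed


# Made-up sample facts; only the first one appears in the page's own example.
facts = [
    Fact("Acme ships on Fridays", 0.92),
    Fact("Acme's office is in Austin and has a large cafeteria", 0.40),
    Fact("Acme uses UPS", 0.75),
]
context = pack_context(facts, budget=8)
print([f.text for f in context])
```

With a budget of 8, the two short, high-scoring facts fit and the long, low-scoring one is dropped, which is the general idea behind keeping recalled context small.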

START HERE [2 / 8]

Pick a door.

Route by intent — get to the right page in one click.

  • Quickstart

    remember → recall in 5 minutes.

  • API reference

    Every endpoint, every field, copy-paste examples.

  • SDKs

    TypeScript + Python, typed, batteries included.

ALTITUDES [3 / 8]

Pick your altitude — three ways in.

How you integrate is the first decision to make. Start with the Proxy: it is the lowest-effort path.

  • Proxy

    Wrap your model client; we auto-remember and auto-recall behind the scenes. Zero new tools on the agent.

  • MCP plugin

    Add to Claude Code / Cursor / Codex / Cline. Exactly two tools — recall + remember, ~200 tokens. Admin stays server-side.

  • SDK + REST

    Full control from TS, Python, or plain REST — buckets, promotion, identity, audit, all off the agent's context.

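The Proxy altitude described above can be pictured as a wrapper that recalls before each model call and remembers after it. The sketch below only illustrates that pattern: `InMemoryStore`, `with_memory`, and `fake_model` are hypothetical stand-ins, not the ultramemory client or its actual proxy.

```python
from typing import Callable


class InMemoryStore:
    """Hypothetical stand-in for a memory backend; not the ultramemory client."""

    def __init__(self) -> None:
        self.facts: list[str] = []

    def remember(self, text: str) -> None:
        self.facts.append(text)

    def recall(self, query: str) -> list[str]:
        # Naive keyword overlap instead of real ranked retrieval.
        words = set(query.lower().split())
        return [f for f in self.facts if words & set(f.lower().split())]


def with_memory(model: Callable[[str], str], store: InMemoryStore) -> Callable[[str], str]:
    """Wrap a prompt -> reply function so it recalls before and remembers after each call."""

    def wrapped(prompt: str) -> str:
        recalled = store.recall(prompt)  # auto-recall
        context = "\n".join(recalled)
        full_prompt = f"{context}\n\n{prompt}" if context else prompt
        reply = model(full_prompt)
        store.remember(prompt)  # auto-remember the user turn
        return reply

    return wrapped


def fake_model(prompt: str) -> str:
    # Placeholder for a real model client; reports how many lines it was given.
    return f"saw {len(prompt.splitlines())} line(s)"


store = InMemoryStore()
chat = with_memory(fake_model, store)
first = chat("Acme ships on Fridays")
second = chat("When does Acme ship?")
print(first)
print(second)
```

The caller keeps using `chat` exactly as it used the unwrapped function, which is why this style of integration adds no new tools to the agent.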
WORKS WITH [4 / 8]

Yes, your stack.

Any MCP client works; the SDK is model-agnostic.

Plugs into

  • Claude Code
  • Cursor
  • Codex
  • Cline
  • MCP
  • OpenAI
  • Anthropic
  • Gemini
  • Vercel AI SDK

See how it works with any model →

THE NUMBERS [5 / 8]

The numbers devs care about.

Fast enough to sit in the hot path; clean enough that the model never gets slower or dumber.

RECALL p95: sub-200ms

The speed SLO we hold to — fast enough to sit in the hot path.

  • RECALL QUALITY: high
    our LongMemEval-style eval
  • MODEL DEGRADATION: zero
    token-budgeted context, so the model never gets slower or dumber

Our own numbers, on our own eval, until independent benchmarks land.

Instant recall → · Keeps your AI sharp →

CONNECT ONCE [6 / 8]

Connect once, subagents inherit.

Wire up the parent once. Every helper it spawns gets scoped memory automatically — no re-wiring, no human in the loop.

Capability-token lineage: scope ⊆ parent

[Diagram: nodes are shared bucket, Orchestrator, Researcher, Coder, Reviewer, Web search, and Scratch tool. Legend: parent, subagent, peer, tool.]
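The "scope ⊆ parent" rule can be sketched as a subset check at delegation time. This is an illustration under stated assumptions, not ultramemory's token format: the `CapabilityToken` class, the `delegate` method, and scope strings such as `shared:read` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityToken:
    holder: str
    scopes: frozenset[str]  # hypothetical scope names, e.g. "shared:read"

    def delegate(self, holder: str, scopes: set[str]) -> "CapabilityToken":
        """Mint a child token; refuse any scope the parent does not hold."""
        requested = frozenset(scopes)
        if not requested <= self.scopes:
            extra = sorted(requested - self.scopes)
            raise PermissionError(f"scopes not held by parent: {extra}")
        return CapabilityToken(holder, requested)


orchestrator = CapabilityToken("Orchestrator", frozenset({"shared:read", "shared:write"}))
researcher = orchestrator.delegate("Researcher", {"shared:read"})
print(researcher.scopes <= orchestrator.scopes)

# A subagent cannot hand out more than it was given.
try:
    researcher.delegate("Web search", {"shared:write"})
except PermissionError as err:
    print(err)
```

Because every child is minted from its parent's scopes, permissions can only narrow as the lineage gets deeper.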

GO DEEPER [7 / 8]

Under the hood.

The plain-language Product pages link here for readers who want the details.

  • Integration

    Three altitudes, the 2-tool MCP surface, and how each one wires in.

  • Identity

    Capability-token delegation — parent-attenuated scopes for every subagent.

  • Retrieval

    Hybrid recall, RRF fusion, and the governance rerank that ranks what's current.

  • Governance

    Private / shared buckets, the promotion gate, and cross-agent conflict resolution.

  • Security

    Poisoning resistance — suspicious memories caught at the gate.

  • Architecture

    The read / write path on Workers + Durable Objects.

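The Retrieval item above mentions RRF fusion. Reciprocal rank fusion is a standard way to merge several ranked lists: each item scores the sum of 1 / (k + rank) across the lists it appears in. The sketch below shows the general technique only. The two input lists (assumed here to come from keyword and vector search), the memory IDs, and the conventional k = 60 are illustrative assumptions, not details of ultramemory's pipeline.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists of IDs with reciprocal rank fusion (ranks are 1-based)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda d: scores[d], reverse=True)


# Hypothetical memory IDs from two retrievers.
keyword_hits = ["m1", "m2", "m3"]
vector_hits = ["m3", "m1", "m4"]
fused = rrf_fuse([keyword_hits, vector_hits])
print(fused)
```

Items that rank well in both lists rise to the top, without having to compare raw scores from retrievers that use different scales.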
SHIP-READY [8 / 8]

A maintained, dependable platform.

What's new, what's live, and where the code lives.

Two calls

Start with two calls.

remember → recall, then trace every fact back to its source.