AI Memory

Give your AI agents persistent, per-user memory — conversations, distilled facts, and prompt-ready working context

You've used AI Agents to call an LLM from an Appivo action. The agent answers the question in front of it, then forgets everything. The next time the user shows up, the agent has no idea who they are, what they prefer, or what was decided yesterday.

This guide is about turning that around. Appivo ships a memory subsystem that lets your AI features behave like they remember — picking up where the user left off, recalling preferences, knowing facts the user established days or weeks ago.

It's all reachable through the context object inside any action. Multi-tenant and per-user isolation are automatic; you do not have to think about them.

When To Use This

Reach for the memory subsystem whenever your AI feature should:

Continue a conversation the user started earlier — yesterday, last week, or a month ago.
Recall preferences and identity facts ("I'm vegetarian", "my role is CTO", "my deadline is Friday") without re-asking.
Reason about change over time — what did the user believe last month vs. now.
Build a prompt that mixes recent chat, a rolling summary, and the most relevant remembered facts — automatically, within a token budget.
Satisfy GDPR export and delete requests with a single call.

If your AI feature is one-shot ("translate this text", "clean this HTML"), you don't need memory. Use it when continuity matters.

What You Get

The platform gives you four cooperating capabilities, all accessible through the context object inside any action.

Capability	Accessor	What it does
Conversations	`context.getConversations()`	Persist user/assistant turns, scoped per user, replayable later.
Memory	`context.getMemory()`	Long-term storage of distilled facts ("atoms") plus the entities they refer to.
Working Context	`context.getWorkingContext()`	Compose a prompt-ready block from the conversation tail, the rolling summary, and recalled memories.
Admin (GDPR)	`context.getAIAdmin()`	Export or delete a user's full footprint — single call, every collection.

Every capability is multi-tenant and per-user by default. Cross-user reads are impossible by construction; you don't write tenant checks.

The same capabilities are available over REST at /ai-conversations, /ai-memory, and /ai-working-context for non-JS clients (mobile, third-party integrations). This guide focuses on the JavaScript surface — that's where Actions live.

The mental model

If you take away one thing from this guide, take this picture:

A conversation is distilled into memory atoms; recall plus a rolling summary compose the working context fed into the next prompt.

A conversation is an append-only stream of messages — the verbatim record of what was said. It is short-term: you may keep it forever, but you would not feed all of it back into the LLM every turn.

A memory atom is a single, distilled fact extracted from a conversation (or written directly). It is structured: it has a category, an importance score, a validity window, and links back to the source messages. Atoms live forever and accumulate into the user's long-term memory.

The bridge between them is extraction — when a conversation closes, the platform runs an LLM over its content, produces atoms, and writes them to memory. You don't write that code; the platform does it for you. You just configure a binding that says "conversations in this namespace produce atoms in this memory space."

When the user starts a new conversation, the working context builder assembles a prompt block: the rolling summary of past turns, the relevant atoms recalled by topic, and the recent message tail. You feed that into the LLM and the LLM behaves as if it remembers.

That's the whole story. Everything below is mechanics.

Quick Start: A Chatbot With Memory

Let's build the simplest useful case end-to-end. We'll write an action that powers a daily check-in chatbot. It should:

Continue an existing conversation if there is one.
Build a working context that pulls in the user's identity facts and goals.
Run a turn (LLM call).
Persist everything.

// Action: dailyCheckin
// arguments: { userMessage }
// returns: { reply }
function dailyCheckin(arguments) {
    var conv   = context.getConversations();
    var memory = context.getMemory();
    var wc     = context.getWorkingContext();

    // 1. Reuse the open check-in conversation, or create one
    var existing = conv.listConversations({}).find(function(c) {
        return c.namespace === "daily-checkin" && c.status === "open";
    });
    var conversation = existing || conv.createConversation({
        namespace: "daily-checkin",
        title:     "Daily check-in",
        metadata:  { date: new Date().toISOString().slice(0, 10) }
    });

    // 2. Find or create the memory space we extract into
    var space = memory.listMemorySpaces({}).find(function(s) {
        return s.name === "checkin";
    });
    if (!space) {
        space = memory.createMemorySpace({ name: "checkin" });
    }

    // 3. Assemble the working context: rolling summary +
    //    semantic recall + recent turns
    var ctx = wc.buildWorkingContext(conversation.id, {
        memorySpaceId: space.id,
        recallQuery:   arguments.userMessage,
        recentTurns:   8,
        recallLimit:   6,
        alwaysOnCategoryNames: ["identity", "preference", "goal"]
    });

    // 4. Run the LLM — runTurn does the dance for you
    var outcome = conv.runTurn(conversation.id, {
        provider: "anthropic",
        model:    "claude-sonnet-4-5",
        userText: arguments.userMessage,
        options:  { systemPrompt: ctx.contextBlock }
    });

    var reply = outcome.appendedMessages
        .filter(function(m) { return m.role === "assistant"; })
        .map(function(m) {
            return m.content
                .filter(function(b) { return b.type === "text"; })
                .map(function(b) { return b.text; }).join("");
        })
        .join("\n");

    return { reply: reply };
}

That's the whole app. A few things worth noticing:

We never wrote anything to memory ourselves. When the conversation eventually closes (typically when the user finishes their session, or via a scheduled job that closes stale conversations), a binding will fire extraction automatically. The atoms it produces will be there the next time we call buildWorkingContext for this user.
alwaysOnCategoryNames pulls in atoms tagged identity, preference, or goal regardless of the query — those are the facts you almost always want present. The recallQuery handles anything topic-specific.
We did not pass any user id — the platform reads the active user from the action context. Multi-tenancy is automatic.
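The "scheduled job that closes stale conversations" mentioned above could look roughly like the sketch below. It assumes listConversations returns an array whose items expose namespace, status, and an ISO-formatted createdAt, and it judges staleness by creation time only. The helper name, the maxAgeMs parameter, and passing the conversations object and the clock in as arguments are choices made for illustration, not platform API. Because listConversations is scoped to the current user, how a scheduled job would iterate over users is left open here.

```javascript
// Hypothetical helper: close open conversations older than maxAgeMs.
// `conv` is the object returned by context.getConversations();
// `now` is a millisecond timestamp such as Date.now().
function closeStaleConversations(conv, namespace, maxAgeMs, now) {
    var closedIds = [];
    conv.listConversations({}).forEach(function(c) {
        var isOpen = c.namespace === namespace && c.status === "open";
        var ageMs  = now - new Date(c.createdAt).getTime();
        if (isOpen && ageMs > maxAgeMs) {
            conv.closeConversation(c.id);   // closing is what triggers extraction
            closedIds.push(c.id);
        }
    });
    return closedIds;
}
```

Inside an action this might be called as `closeStaleConversations(context.getConversations(), "daily-checkin", 12 * 60 * 60 * 1000, Date.now())`.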

The next sections explain each step in depth, plus the one-time binding setup you need before extraction will actually run.

Conversations

A conversation in Appivo is more than a chat window — it's a scoped, append-only event stream. Every message you append is permanent. Every change emits an event on the internal message bus. Every conversation is partitioned by the four-tuple (tenantId, appId, namespace, userId), so cross-tenant or cross-user reads cannot happen.

The Four-Tuple

Component	Where it comes from	What it controls
`tenantId`	Auth filter (thread-local)	Tenant boundary — strict isolation
`appId`	The active app schema	Which Appivo app owns the conversation
`namespace`	You choose, per call	App-internal partition, e.g. `"daily-checkin"`, `"support-chat"`
`userId`	The current user	Conversation owner

Three of the four (tenantId, appId, and userId) are derived for you. You only ever supply the namespace. Pick a stable name per "kind" of conversation your app has; it's the seam at which bindings and recall hook in.

Lifecycle: Open → Closed

Conversations have two lifecycle states: open and closed. You can append while open; you cannot append once closed. Closing emits a conversation.closed event, which is what triggers memory extraction.

var conv = context.getConversations();

// Create
var c = conv.createConversation({
    namespace: "support-chat",
    title:     "Refund question",
    metadata:  { source: "web", priority: "normal" }
});
// c.id, c.status === "open", c.createdAt

// Append
conv.appendUserMessage(c.id, { content: "Hi, where's my refund?" });
conv.appendAssistantTurn(c.id, {
    content:    "Looking it up now...",
    stopReason: "end_turn",
    model:      "claude-sonnet-4-5"
});

// Close — idempotent; re-closing is a no-op and fires no second event
conv.closeConversation(c.id);

Two Ways To Record A Turn

You have two patterns for recording a turn.

Pattern A — runTurn does it for you. Preferred when you let the platform call the LLM:

conv.runTurn(c.id, {
    provider: "anthropic",
    model:    "claude-sonnet-4-5",
    userText: "What's my refund status?",
    tools:    [ /* ... */ ],
    toolHandler: function(toolName, args) {
        if (toolName === "lookupRefund") {
            return JSON.stringify(myLookup(args.orderId));
        }
    }
});
// User message + assistant response (+ any tool turns) all persisted
// with a shared turnId. Sequence numbers allocated atomically.

Pattern B — appendTurn after your own LLM call. Preferred when you call the LLM yourself and want the platform to persist the result:

var llmResult = callMyOwnLLM(buildMyOwnPrompt(/* ... */));
conv.appendTurn(c.id, {
    userContent: arguments.userMessage,
    assistant: {
        content:    llmResult.text,
        stopReason: llmResult.stopReason,
        model:      "claude-sonnet-4-5",
        usage:      { inputTokens: 1234, outputTokens: 567, latencyMs: 800 }
    },
    idempotencyKey: arguments.turnId   // safe retries
});

Both patterns produce the same persistent shape, fire the same events, and feed the same downstream extraction and recall. Pick whichever fits your call site.

Reading Messages

// Last 50 user-visible messages — what a chat UI should render.
var visible = conv.getMessages(c.id, { limit: 50 });

// Same range but include internal messages — for replay,
// debugging, or feeding the LLM.
var raw = conv.getRawTurns(c.id, { limit: 50 });
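Message content can be a plain string or a list of content blocks (the Quick Start reads blocks with type "text"), so rendering code has to handle both. The sketch below is one way to flatten either form into text; the helper names messageText and toTranscript are illustrative, and the assumption that each message exposes role and content is taken from the Quick Start rather than from a formal schema.

```javascript
// Flatten one message's content to plain text.
// Assumes content is either a string or an array of blocks such as
// { type: "text", text: "..." }; non-text blocks are skipped.
function messageText(message) {
    if (typeof message.content === "string") {
        return message.content;
    }
    return (message.content || [])
        .filter(function(b) { return b.type === "text"; })
        .map(function(b) { return b.text; })
        .join("");
}

// Render a message list as "role: text" lines for a simple transcript.
function toTranscript(messages) {
    return messages.map(function(m) {
        return m.role + ": " + messageText(m);
    }).join("\n");
}
```

A chat UI could then call `toTranscript(conv.getMessages(c.id, { limit: 50 }))`, assuming getMessages returns an array.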

Message Visibility

Every message has a visibility field with three valid values:

Value	User sees it	LLM sees it
`"user"` (default)	Yes	Yes
`"internal"`	No	Yes
`"hidden"`	No	No

Use "internal" for system messages, framework-injected context, or pre-/post-processing artifacts the user shouldn't see but the LLM should. Use "hidden" for messages you persisted for audit but don't want any downstream LLM call to see.
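The visibility table can be read as a small lookup. The sketch below models it as data and filters a message list for a given audience. It is a reading aid, not the platform's implementation: the VISIBILITY constant, the visibleTo helper, and the audience labels "user" and "llm" are hypothetical names.

```javascript
// The visibility table as data: who may see a message with a given visibility.
var VISIBILITY = {
    user:     { user: true,  llm: true  },
    internal: { user: false, llm: true  },
    hidden:   { user: false, llm: false }
};

// audience is "user" (chat UI) or "llm" (prompt assembly).
// Messages with an unknown visibility value are excluded.
function visibleTo(messages, audience) {
    return messages.filter(function(m) {
        var rule = VISIBILITY[m.visibility || "user"];   // "user" is the default
        return Boolean(rule) && rule[audience];
    });
}
```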

How Memories Are Stored

The memory store has three primitive types. Once you have the mental model for them, the recall API and the extraction pipeline are straightforward.

Atoms — The Unit Of Memory

An atom is a single, self-contained, absolute-timestamped fact. Three rules govern atoms:

Self-contained — the text reads correctly without any surrounding context. "User prefers morning meetings" is an atom; "Yes" is not.
Append-mostly — when a fact changes, you don't update the old atom; you insert a new one and mark the old one superseded. The history is preserved forever.
Cited — every atom carries sourceConversationId and sourceMessageIds, so a recall result can be traced back to the originating turn.

Each atom carries:

Field	Purpose
`text`	The fact itself, in natural language
`category`	A `(name, kind)` tuple — see below
`importance`	1–5, used in recall ranking
`confidence`	0.0–1.0, extraction's certainty
`validFrom`, `validTo`	Bi-temporal validity window (`validTo: null` means currently valid)
`entityIds`	Links to the entities the atom is about
`embedding`	Vector for semantic recall (auto-computed)
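The append-mostly rule and the validity window are easier to see in a toy model. The sketch below keeps atoms in a plain array: superseding closes the old atom's validTo and appends a replacement, so a query "as of" an earlier instant still finds the old fact. This is an in-memory illustration under assumptions (ISO timestamps in one consistent UTC format, hypothetical helper names supersede and validAt); it does not describe how the platform stores atoms.

```javascript
// Toy in-memory model of supersession (not the platform's storage code).
// Returns a new array: the old atom's validity window is closed at `now`
// and the replacement becomes the currently valid atom.
function supersede(atoms, oldId, replacement, now) {
    var closed = atoms.map(function(a) {
        if (a.id !== oldId) { return a; }
        return { id: a.id, text: a.text, validFrom: a.validFrom, validTo: now };
    });
    return closed.concat([{
        id: replacement.id,
        text: replacement.text,
        validFrom: now,
        validTo: null          // null = currently valid
    }]);
}

// Atoms valid at instant `at`. ISO strings in the same UTC format
// compare correctly as plain strings.
function validAt(atoms, at) {
    return atoms.filter(function(a) {
        return a.validFrom <= at && (a.validTo === null || at < a.validTo);
    });
}
```

On the platform itself, the equivalent write would go through supersedeAtom rather than through code like this.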

Categories — A `(name, kind)` Tuple

Categories let you classify atoms in app-specific ways while giving the platform enough structure to apply universal logic (decay rules, pattern-mining exclusions, etc.).

The name is yours — "identity", "goal", "preference", "contact", "medical-allergy", whatever your domain needs. The kind is platform-defined and one of:

Kind	Means	Decay
`FACT`	A timeless attribute	Slow
`RULE`	A constraint or policy	Slow
`INTENTION`	A planned action or goal	Medium
`EPISODE`	A recorded event	Fast
`PREFERENCE`	A taste or convention	Slow
`PATTERN`	Mined from many episodes (platform-only)	Slow

// Direct write — used for migrations and explicit "remember this" flows
context.getMemory().addAtom(spaceId, {
    text:       "User prefers morning meetings",
    category:   { name: "preference", kind: "PREFERENCE" },
    importance: 4,
    confidence: 0.95
});

You will almost never write atoms directly — extraction handles it. The API is there for migrations, admin imports, and explicit "remember this" flows in your UI.

Memory Spaces — The Container

A memory space is a named partition of atoms within a single user. You typically create one space per topic your app cares about:

var memory = context.getMemory();

// Personal goal-tracking memory
var goals = memory.createMemorySpace({ name: "goals" });

// Customer-support context memory
var support = memory.createMemorySpace({ name: "support" });

Why bother with spaces? Two reasons:

Recall stays focused — searching the goals space doesn't return support-ticket memories.
Bindings target spaces — different conversation namespaces extract into different spaces.

A simple app might have one space per user. A complex app might have several. Use as many as you have natural topic boundaries.

Entities — People, Places, And Things

When the extractor produces atoms, it also produces entities: the people, organisations, concepts, and events the atoms refer to. Every atom links to its entities; entity-based recall ("everything about Acme Corp") becomes possible without re-reading every atom.

Entity types are platform-defined: PERSON, ORGANIZATION, PLACE, OBJECT, CONCEPT, EVENT, AGREEMENT. The extractor classifies; you read.

// Resolve an entity by name
var acme = memory.findEntityByName(spaceId, {
    type: "ORGANIZATION",
    name: "Acme Corporation"
});

// Get every memory atom that mentions Acme
// (findEntityByName returns null when nothing matches)
var hits = acme ? memory.recallByEntity(spaceId, {
    entityIdOrName: acme.id,
    limit: 20
}) : null;

You will rarely create entities by hand — the extractor does. If extraction produces two entities that should be one ("Acme" and "Acme Corporation"), fold the alias under the canonical:

memory.mergeEntities(canonicalId, aliasId);
// All atom references repointed; aliases preserved.

Bindings: Where Conversations Become Memories

So far we've talked about conversations on one side and memory on the other. The thing that connects them is a binding: a small configuration record that tells the platform "when conversations in namespace X close, extract atoms into memory space Y."

Without a binding, no extraction happens. Conversations accumulate but never get distilled.

Creating A Binding

var memory = context.getMemory();

// Find or create the destination space
var space = memory.listMemorySpaces({}).find(function(s) {
    return s.name === "daily";
}) || memory.createMemorySpace({ name: "daily" });

memory.createBinding({
    conversationScope: {
        namespace: "daily-checkin",
        userId:    "*"   // every user
    },
    memorySpaceIds: [space.id],
    extractionPolicy: {
        extractionVersion:    "v1",   // which prompt version
        onConversationClosed: true,   // extract on close
        windowed:             false,  // also extract every N turns?
        windowTurns:          0
    }
});

From this point forward, every time a conversation in daily-checkin closes for any user, the extractor runs and produces atoms in that user's daily space.

Trigger Modes

You can configure two trigger modes (and combine them):

Trigger	When it fires	Best for
`onConversationClosed: true`	Conversation transitions from open → closed	Most apps. Cheapest, most accurate.
`windowed: true, windowTurns: N`	Every N turns within an open conversation	Long sessions where you want progressive memory before the user signs off.

Most apps want onConversationClosed: true only. Windowed extraction is for long-running sessions where the agent needs to recall this session's earlier turns before it ends.
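Combining the two triggers is a matter of setting both flags in the extraction policy. The sketch below builds a createBinding payload with the field names from the binding example earlier in this guide. The helper name buildBindingConfig and the convention that a windowTurns of zero or less disables windowed extraction are assumptions made for this example.

```javascript
// Build a createBinding payload that extracts on close and, optionally,
// every `windowTurns` turns. windowTurns <= 0 disables windowed extraction.
function buildBindingConfig(namespace, spaceId, windowTurns) {
    var windowed = windowTurns > 0;
    return {
        conversationScope: { namespace: namespace, userId: "*" },
        memorySpaceIds:    [spaceId],
        extractionPolicy: {
            extractionVersion:    "v1",
            onConversationClosed: true,
            windowed:             windowed,
            windowTurns:          windowed ? windowTurns : 0
        }
    };
}
```

For a long-running session this might be used as `context.getMemory().createBinding(buildBindingConfig("coaching-session", space.id, 20))`, where "coaching-session" is a made-up namespace.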

Binding Scope

The conversationScope tuple supports wildcards on namespace and userId:

// All daily-checkin conversations for any user (the typical case)
{ namespace: "daily-checkin", userId: "*" }

// All conversations of any namespace for one specific user
{ namespace: "*", userId: "user_42" }

// Multiple namespaces extracting into one space
[
    { namespace: "morning-checkin", userId: "*" },
    { namespace: "evening-checkin", userId: "*" }
]

You can have multiple bindings; they all match independently. A conversation that matches three bindings, each targeting a single space, produces three extraction jobs (one per binding × space pair).
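The matching rule can be sketched as a few pure functions. This is a model of the behaviour described above, not the platform's matcher. It assumes "*" is the only wildcard, that conversationScope may be either one scope or an array of scopes, and that each matching binding yields one job per entry in memorySpaceIds. All three function names are hypothetical.

```javascript
// Does one scope entry match a conversation? "*" is the wildcard.
function scopeMatches(scope, conversation) {
    var nsOk   = scope.namespace === "*" || scope.namespace === conversation.namespace;
    var userOk = scope.userId === "*" || scope.userId === conversation.userId;
    return nsOk && userOk;
}

// A binding's conversationScope may be one scope or an array of scopes.
function bindingMatches(binding, conversation) {
    var scopes = [].concat(binding.conversationScope);
    return scopes.some(function(s) { return scopeMatches(s, conversation); });
}

// One job per (matching binding, target space) pair.
function countExtractionJobs(bindings, conversation) {
    return bindings.reduce(function(total, b) {
        return total + (bindingMatches(b, conversation) ? b.memorySpaceIds.length : 0);
    }, 0);
}
```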

Extraction Prompt Versioning

The extractionVersion field pins which prompt the extractor uses (extraction-v1.txt, extraction-v2.txt, …). Bumping the version is the intentional re-extraction path: change the prompt to add a new category, or improve the extraction quality, then run a replay over historical conversations to re-extract them.

You won't typically write extraction prompts yourself — the platform ships sensible defaults — but you may bump the version when the platform team releases an improved prompt.

Manual Replay

To re-run extraction for one conversation (e.g., to test a new prompt version):

memory.runExtractionForConversation(conversationId, {
    extractionVersion: "v2"   // optional override
});

This produces a fresh extraction job that the worker pool picks up.

Recall: Three Ways To Find Memories

Recall is how memories make their way back into the LLM's context. The platform offers three modes. They are exposed as three separately named methods rather than one method with a mode argument, because LLMs pick the right tool more reliably from a clear name and description than from a single tool with a discriminator argument.

`recallByTopic` — Semantic Search

The hot path. You provide a free-text query, and the platform returns the atoms most semantically relevant to it (ranked by embedding similarity, with decay and importance weighting applied).

var hits = memory.recallByTopic(spaceId, {
    query: "the user's preferred meeting times",
    limit: 5
});

// hits.hits = [
//   { atom: { text: "User prefers Tuesday mornings", ... },
//     score: 0.91, decayWeight: 1.0, ... },
//   { atom: { text: "User dislikes meetings before 9am", ... },
//     score: 0.87, ... },
//   ...
// ]

When Atlas Vector Search isn't available, the platform falls back to a token-overlap scoring function. Same return shape; slightly worse ranking. The seam is invisible to your code.

Use it when: most of the time. Working context uses this under the hood. Anywhere the agent needs to "remember something about a topic."

`recallByEntity` — Everything About X

Given an entity (a person, organisation, place, etc.), return every atom that mentions it. Newest first.

var acme = memory.findEntityByName(spaceId, {
    type: "ORGANIZATION",
    name: "Acme Corporation"
});

// findEntityByName returns null when nothing matches
var history = acme ? memory.recallByEntity(spaceId, {
    entityIdOrName: acme.id,
    limit: 50
}) : null;

Use it when: profile-style summaries. Customer detail pages. "Tell me everything about user X." Anywhere an entity is the lens, not a topic.

`recallTimeline` — What Was True At Time T

Bi-temporal queries: include superseded atoms, slice by validity window, see the history of changes.

// What did we know about Acme between Jan 1 and April 30?
var window = memory.recallTimeline(spaceId, {
    entityIdOrName: "Acme Corporation",
    from: "2026-01-01T00:00:00Z",
    to:   "2026-04-30T23:59:59Z",
    limit: 20
});

recallTimeline includes superseded atoms by default — that's the whole point. You see the change history.

Use it when: audits, "what did we know at the time we made this decision", reasoning about change over time, compliance reports.

Recall Result Shape

All three modes return the same shape:

{
    mode:            "BY_TOPIC",   // or BY_ENTITY / TIMELINE
    totalCandidates: 47,           // candidates before limit
    latencyMs:       12,
    hits: [
        {
            atom: {
                id:            "atom_abc123",
                text:          "User prefers morning meetings",
                category:      { name: "preference", kind: "PREFERENCE" },
                importance:    4,
                confidence:    0.95,
                validFrom:     "2026-04-30T...",
                validTo:       null,            // null = currently valid
                sourceConversationId: "conv_xyz",
                sourceMessageIds:     ["msg_1", "msg_5"],
                entityIds:     [ ... ]
            },
            score:            0.91,             // post-fusion ranking
            decayWeight:      1.0,
            entityMatchBonus: 1.0
        },
        // ...
    ]
}

The score is the platform's final ranking — it already combines vector similarity, decay, and importance. Sort by it directly.
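When you consume recall results yourself instead of going through working context, a small formatter is usually all you need. The sketch below keeps only currently valid atoms above a score threshold, orders them by the platform's score, and renders one line per atom. The helper name formatHits, the minScore parameter, and the line format are arbitrary choices for this example; the field names come from the result shape above.

```javascript
// Keep currently valid atoms, order by the platform's score, and render
// each as one prompt line. The line format is an arbitrary choice.
function formatHits(result, minScore) {
    return result.hits
        .filter(function(h) {
            return h.atom.validTo === null && h.score >= minScore;
        })
        .sort(function(a, b) { return b.score - a.score; })
        .map(function(h) {
            return "- " + h.atom.text + " (" + h.atom.category.name +
                   ", score " + h.score.toFixed(2) + ")";
        });
}
```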

Working Context: A Prompt That "Remembers"

Recall is a primitive. Working Context is the ready-to-feed-the-LLM block built from recall + the conversation tail + the rolling summary.

Instead of writing 50 lines of code to build a prompt that pulls in identity facts + recent conversation summary + relevant memories + the recent message tail, you call one method:

var ctx = context.getWorkingContext().buildWorkingContext(
    conversationId,
    {
        memorySpaceId: space.id,
        recallQuery:   "the user's question right now",
        recentTurns:   10,
        recallLimit:   8,
        tokenBudget:   8000,
        alwaysOnCategoryNames: ["identity", "preference"],
        includeRollingSummary: true
    }
);

// ctx is a Map with:
//   contextBlock:    "...assembled prompt..."   (single string)
//   messages:        [ ... ]                    (canonical message list)
//   atomsUsed:       [ ... ]                    (the atoms that landed)
//   tokensEstimated: 4321

Feed ctx.contextBlock into your LLM call as the system prompt (or as the leading content of your user prompt — the template controls where it lands).

What Gets Included

By default, the working context block has four sections, assembled in this order:

Rolling summary (when includeRollingSummary) — a running précis of earlier turns.
Always-on atoms (alwaysOnCategoryNames) — facts you always want present, such as identity and preferences.
Recalled atoms — the atoms semantically relevant to recallQuery (a BY_TOPIC recall).
Recent turns — the tail of the conversation (recentTurns messages).

A concrete block looks like this:

Rolling summary:
  Across previous turns, the user works in product management at Acme,
  prefers morning meetings, and is working towards the Q3 release.

Always-on atoms:
  User name: Johan Eriksson
  User role: CTO at Snubbas
  User prefers concise answers

Recalled atoms:
  User cancelled their last 3 meetings on Tuesday afternoons
  User mentioned a conflict with the Q2 review meeting

Recent turns:
  user:      Can we move next week's status to morning?
  assistant: Sure, which morning?
  user:      Tuesday.

The token budget is enforced top-to-bottom: rolling summary first (cheap, always in), then always-on atoms, then recall hits, then recent turns. If you exceed the budget, recent turns get trimmed first, then recall hits.
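The trimming order can be illustrated with a toy model. The sketch below is not the platform's builder. It assumes a rough estimate of four characters per token, that the oldest recent turns are dropped first, and that the lowest-ranked recall hits (taken to be last in the list) are dropped next, while the rolling summary and always-on atoms are never trimmed. The function names and the shape of the sections object are hypothetical.

```javascript
// Rough token estimate: about 4 characters per token (an assumption).
function estimateTokens(text) {
    return Math.ceil(text.length / 4);
}

function totalTokens(sections) {
    var all = [sections.summary]
        .concat(sections.alwaysOn, sections.recalled, sections.recentTurns);
    return all.reduce(function(sum, s) { return sum + estimateTokens(s); }, 0);
}

// Toy model of the trimming order: drop the oldest recent turns first,
// then the lowest-ranked recall hits. Summary and always-on atoms stay.
function fitToBudget(sections, tokenBudget) {
    var out = {
        summary:     sections.summary,
        alwaysOn:    sections.alwaysOn.slice(),
        recalled:    sections.recalled.slice(),
        recentTurns: sections.recentTurns.slice()
    };
    while (totalTokens(out) > tokenBudget && out.recentTurns.length > 0) {
        out.recentTurns.shift();      // oldest turn goes first
    }
    while (totalTokens(out) > tokenBudget && out.recalled.length > 0) {
        out.recalled.pop();           // last hit = lowest ranked
    }
    return out;
}
```

The model shows why a tight budget costs you recent conversation first: recall hits are only touched once every recent turn has already been dropped.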

Tuning Knobs

Parameter	Default	When to change it
`recentTurns`	10	Fewer for tighter token budgets; more for chat-heavy apps
`recallLimit`	8	Lower (3–5) when the user message is short; higher (10–15) for complex queries
`tokenBudget`	8000	Match your model's context window minus output budget
`includeRollingSummary`	`true`	Set `false` for very short / single-turn conversations
`alwaysOnCategoryNames`	`[]`	Heavily app-dependent; `identity` + `preference` are common
`recallQuery`	(none)	Usually the user's current message
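The tokenBudget advice in the table is simple arithmetic, sketched below as an upper bound. All three inputs are numbers you supply; nothing here is read from the platform, and the helper name chooseTokenBudget and the otherPromptTokens parameter (for instructions or tool definitions you add outside the working context) are assumptions for this example. A budget well below this ceiling is often the cheaper choice.

```javascript
// Upper bound for tokenBudget:
//   context window - reserved output tokens - other prompt overhead.
function chooseTokenBudget(contextWindow, maxOutputTokens, otherPromptTokens) {
    var budget = contextWindow - maxOutputTokens - (otherPromptTokens || 0);
    if (budget <= 0) {
        throw new Error("No room left for working context");
    }
    return budget;
}
```

For a 16000-token window with 2000 tokens reserved for output and 1000 tokens of other prompt text, `chooseTokenBudget(16000, 2000, 1000)` returns 13000.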

Custom Templates

The default template renders the four sections above. If you need a different layout (e.g., XML tags for Anthropic's prompt preferences, or a chat-message array instead of a single string), register a custom Mustache template via the platform startup hooks. For most apps the default is fine.

Letting The Agent Recall On Its Own

The patterns above assume you decide when to recall. The platform also ships native AITools the LLM can call during a turn — three tools, one per recall mode:

Tool	When the LLM picks it
`RecallByTopic`	"I should look up what the user said about X"
`RecallByEntity`	"I have an entity reference; let me see its history"
`RecallTimeline`	"I need to reason about how things changed over time"

Add them to your AI Agent chain JSON:

{
  "tools": [
    {
      "type": "RECALL_BY_TOPIC",
      "name": "recallMemory",
      "memorySpaceId": "$userMemorySpaceId"
    },
    {
      "type": "RECALL_BY_ENTITY",
      "name": "lookupEntity",
      "memorySpaceId": "$userMemorySpaceId"
    }
  ]
}

The "$userMemorySpaceId" syntax is the platform's standard variable-binding convention. At chain-execution time the runtime reads userMemorySpaceId from the AIStep's input arguments. The practical consequence: one agent JSON can serve every user; you just pass the user's memory space id when you invoke the agent.

context.getAIFunctions().invokeAgent("supportAgent", {
    arguments: {
        userMemorySpaceId: userSpace.id,
        userMessage:       arguments.message
    }
});

If your space is app-global (a shared knowledge base, not per-user), drop the $ and use a literal id:

{
  "type": "RECALL_BY_TOPIC",
  "name": "recallHandbook",
  "memorySpaceId": "ms_company_handbook"
}

The agent then calls these tools mid-turn:

User:      "Did I tell you about my dietary restrictions?"
Assistant: [calls recallMemory with query "user dietary restrictions"]
Tool:      [returns: "User is vegetarian; user has a peanut allergy"]
Assistant: "Yes — you mentioned you're vegetarian and have a peanut allergy."

Apps With Their Own Chat History

Some agents put much more than plain chat in the user prompt — they render their own (often richly-structured) history and don't use the platform's AIChatHistory at all. They still want conversation persistence so memory extraction, recall, and GDPR export work.

The integration point is appendTurn — one call per turn; both messages share a server-allocated turn id:

function myCustomAgent(arguments) {
    var conv = context.getConversations();

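    // 1. Get or create the conversation
    var c = conv.listConversations({}).find(function(x) {
        return x.namespace === "custom-agent" && x.status === "open";
    }) || conv.createConversation({
        namespace: "custom-agent",
        title:     arguments.title
    });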

    // 2. Build my own rich prompt from internal history + user input
    var prompt = buildMyRichPrompt(arguments.myHistory, arguments.userInput);

    // 3. Call my own LLM
    var llmResult = callMyOwnLLM(prompt);

    // 4. One call to persist user + assistant under one turn
    var saved = conv.appendTurn(c.id, {
        userContent: arguments.userInput,
        assistant: {
            content:    llmResult.text,
            stopReason: llmResult.stopReason,
            model:      llmResult.model,
            provider:   "anthropic",
            usage: {
                inputTokens:  llmResult.inputTokens,
                outputTokens: llmResult.outputTokens,
                latencyMs:    llmResult.latencyMs
            }
        },
        idempotencyKey: arguments.turnId   // safe retries
    });

    return { reply: llmResult.text, turnId: saved.turnId };
}

Three guarantees this gives you:

Shared turnId on both messages — replay tooling groups them as one turn even though the LLM-protocol message array doesn't reflect that.
Atomic ordering — user at sequence N, assistant at N+1.
Idempotency on the user side — retries with the same idempotencyKey return the existing user message; the assistant always gets a fresh row (different LLM responses for the same logical turn must not collide on dedupe).
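The three guarantees are easier to reason about with a toy model of the retry behaviour. The sketch below is an in-memory illustration only: createTurnLog, the "turn_" id format, and the seq field are invented for the example and say nothing about how the platform stores messages.

```javascript
// Toy in-memory model of appendTurn's retry behaviour (not platform code).
function createTurnLog() {
    var messages = [];
    var userByKey = Object.create(null);   // idempotencyKey -> stored user message
    var nextSeq = 1;
    var nextTurn = 1;

    return {
        messages: messages,
        appendTurn: function(turn) {
            var key = turn.idempotencyKey;
            var user = key ? userByKey[key] : undefined;
            if (!user) {
                // First attempt: store the user message at the next sequence number.
                user = { turnId: "turn_" + nextTurn++, seq: nextSeq++,
                         role: "user", content: turn.userContent };
                messages.push(user);
                if (key) { userByKey[key] = user; }
            }
            // The assistant message is never deduplicated: every call adds a row.
            var assistant = { turnId: user.turnId, seq: nextSeq++,
                              role: "assistant", content: turn.assistant.content };
            messages.push(assistant);
            return { turnId: user.turnId, userMessage: user,
                     assistantMessage: assistant };
        }
    };
}
```

Calling appendTurn twice with the same idempotencyKey in this model leaves one user message and two assistant messages, all under the same turn id.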

Common Patterns

Per-User Memory Space

Most apps give each user one memory space per domain area, named after that area:

function getOrCreateUserSpace(name) {
    var memory = context.getMemory();
    var existing = memory.listMemorySpaces({})
        .find(function(s) { return s.name === name; });
    return existing || memory.createMemorySpace({ name: name });
}

var goals     = getOrCreateUserSpace("goals");
var prefs     = getOrCreateUserSpace("preferences");
var support   = getOrCreateUserSpace("support");

Spaces are cheap. Use them as natural topic boundaries.

Shared Knowledge Base

For app-global knowledge (handbooks, FAQs, company facts) you have two choices:

Use a separate per-app system user that owns the shared memory space, and recall from that space in addition to the per-user space.
Bypass the memory subsystem and use a regular SearchIndex. Memory is for personal, evolving facts — if your "memory" is actually a static document, document search will likely serve you better.
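For the first option, recalling from both spaces and merging the results might look like the sketch below. It assumes recallByTopic accepts the shared space's id from the calling user's action and that scores from two spaces are comparable enough to sort together; neither point is established by this guide. The helper name recallAcrossSpaces is hypothetical.

```javascript
// Recall from the user's space and a shared space, then merge by score.
// `memory` is context.getMemory(); both space ids are supplied by the caller.
function recallAcrossSpaces(memory, userSpaceId, sharedSpaceId, query, limit) {
    var personal = memory.recallByTopic(userSpaceId,   { query: query, limit: limit });
    var shared   = memory.recallByTopic(sharedSpaceId, { query: query, limit: limit });
    return personal.hits.concat(shared.hits)
        .sort(function(a, b) { return b.score - a.score; })
        .slice(0, limit);
}
```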

"Remember This" UI Flow

Sometimes the user explicitly asks the agent to remember something. Instead of waiting for extraction, write the atom directly:

context.getMemory().addAtom(spaceId, {
    text:       arguments.factText,
    category:   { name: "preference", kind: "PREFERENCE" },
    importance: 5,         // user-volunteered facts are high signal
    confidence: 1.0,       // user-confirmed, max confidence
    sourceConversationId: arguments.conversationId
});

Cite The Source In The UI

Atoms carry sourceConversationId and sourceMessageIds. Render those as citations when the LLM uses recall:

var hits = memory.recallByTopic(spaceId, { query: q, limit: 5 });
hits.hits.forEach(function(h) {
    var sourceDate = h.atom.validFrom.slice(0, 10);
    var convLink   = "/history/" + h.atom.sourceConversationId;
    // render "Said on YYYY-MM-DD" → conversation history view
});

GDPR Export And Delete

Single calls handle the GDPR right-to-export and right-to-erasure paths:

// Export — typically wired to "download my data"
var dump = context.getAIAdmin().exportUserData({
    userId: arguments.userId   // optional; defaults to current user
});
// dump = { conversations: [...], spaces: [...], atoms: [...],
//          entities: [...], bindings: [...] }

// Delete — wired to "delete my data"
context.getAIAdmin().deleteUserData({
    userId:  arguments.userId,
    confirm: true
});

The delete is a hard delete: it removes the user's data from every collection. The call is idempotent.

What To Watch Out For

Don't Mix Conversation Namespaces

A namespace is the seam at which extraction fires. If you mix unrelated topics in one namespace, the extractor produces a muddled set of atoms. Use namespaces generously — "daily-checkin", "goal-setting", "weekly-review" — not a single "chat" for everything.

Don't Write Atoms Directly When Extraction Will Handle It

If a fact comes up naturally in conversation, the extractor will pick it up. Writing it directly in addition produces a duplicate that the reconciler may or may not merge. Reserve direct writes for migrations and explicit "remember this" flows.

Don't Conflate Visibility Levels

"internal" is for messages the LLM sees but the user UI doesn't (system prompts, framework injections). "hidden" hides from both. Defaulting to "internal" for system messages is correct; setting things to "hidden" is rare.

Mind The Token Budget

Working context trims recent turns first, then recall hits. If your tokenBudget is too tight you'll lose recent conversation context — usually the most useful part. Err on the high side.

Bumping `extractionVersion` Doesn't Auto-Replay

Existing atoms stay where they are; only conversations that close after the bump are extracted at the new version. To re-extract historical conversations, call runExtractionForConversation for each one.
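A replay loop over historical conversations could be sketched as follows. It assumes listConversations returns an array with namespace and status fields, and it only covers the current user's conversations, since that is the scope of listConversations. Whether re-extraction supersedes or duplicates atoms produced by the earlier version is not covered here. The helper name replayNamespace is hypothetical.

```javascript
// Re-extract every closed conversation in one namespace at a given version.
// `conv` and `memory` are context.getConversations() and context.getMemory().
function replayNamespace(conv, memory, namespace, extractionVersion) {
    var replayed = [];
    conv.listConversations({}).forEach(function(c) {
        if (c.namespace === namespace && c.status === "closed") {
            memory.runExtractionForConversation(c.id, {
                extractionVersion: extractionVersion
            });
            replayed.push(c.id);
        }
    });
    return replayed;
}
```

Inside an action this might be called as `replayNamespace(context.getConversations(), context.getMemory(), "daily-checkin", "v2")`.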

Bindings Are Cluster-Wide

A binding lives on the platform; every node sees it. If you delete a binding, extraction stops everywhere. Use enabled: false on the binding to pause without losing config.

Reference Cheat Sheet

Quick lookups for the methods and shapes you'll reach for most often.

`context.getConversations()`

Method	Returns	Notes
`createConversation({namespace, title?, sessionId?, metadata?})`	conversation	`namespace` required
`getConversation(id)`	conversation \| null	scope-checked
`listConversations({})`	conversation list	for current user
`closeConversation(id)`	conversation	idempotent
`appendUserMessage(id, {content, idempotencyKey?, traceId?})`	message	string or content-block list
`appendAssistantTurn(id, {content, stopReason?, model?, provider?, usage?, traceId?})`	message
`appendToolResult(id, {toolUseId, toolName?, content, isError?})`	message
`appendSystemMessage(id, {content, visibility?})`	message	defaults to `"internal"`
`appendTurn(id, {userContent, userVisibility?, assistant: {content, stopReason?, model?, provider?, usage?}, idempotencyKey?})`	`{turnId, userMessage, assistantMessage}`	one call, shared turn
`getMessages(id, {limit?, includeInternal?})`	message list	excludes internal by default
`getRawTurns(id, {limit?})`	message list	always includes internal
`runTurn(id, {provider, model, userText? \| content?, tools?, toolHandler?, options?, maxCycles?, tailLimit?, idempotencyKey?})`	`{status, appendedMessages, aggregateUsage, toolInvocations, totalLatencyMs}`	full LLM turn
`refreshRollingSummary(id)`	`{outcome, rollingSummary, rollingSummaryUntilSeq}`	manual

`context.getMemory()`

Method	Returns	Notes
`createMemorySpace({name?, metadata?})`	space	`name` is yours
`getMemorySpace(id)`	space \| null
`listMemorySpaces({})`	space list
`deleteMemorySpace(id)`	void	soft delete
`addAtom(spaceId, {text, category: {name, kind}, importance?, confidence?, validFrom?, sourceConversationId?, sourceMessageIds?})`	atom	direct write
`getAtom(id)`	atom \| null
`supersedeAtom(oldId, {…AddAtom})`	atom	closes old, inserts new
`archiveAtom(id)`	void	excluded from default recall
`listAtoms(spaceId, {category?, status?, validAt?, limit?})`	atom list
`createEntity(spaceId, {type, canonicalName, description?, attributes?})`	entity	`type` ∈ PERSON / ORGANIZATION / PLACE / OBJECT / CONCEPT / EVENT / AGREEMENT
`getEntity(id)`	entity \| null
`findEntityByName(spaceId, {type, name})`	entity \| null	exact / alias match
`mergeEntities(canonicalId, aliasId)`	entity	re-points atom refs
`searchEntities(spaceId, {type?, query, limit?})`	entity list
`recallByTopic(spaceId, {query, limit?, categoryNames?, minImportance?, validAt?})`	recall result	semantic
`recallByEntity(spaceId, {entityIdOrName, limit?, categoryNames?})`	recall result	newest-first
`recallTimeline(spaceId, {entityIdOrName? \| query?, from?, to?, limit?, includeSuperseded?})`	recall result	bi-temporal
`createBinding({conversationScope, memorySpaceIds, extractionPolicy})`	binding	enables extraction
`listBindings({})`	binding list
`updateBinding(id, {extractionPolicy?, enabled?})`	binding	scope is immutable
`deleteBinding(id)`	void
`runExtractionForConversation(convId, {extractionVersion?})`	job	manual replay

`context.getWorkingContext()`

Method	Returns	Notes
`buildWorkingContext(convId, {memorySpaceId, recallQuery?, recentTurns?, recallLimit?, tokenBudget?, includeRollingSummary?, alwaysOnCategoryNames?, templateId?})`	`{contextBlock, messages, atomsUsed, tokensEstimated}`	all knobs documented in JSDoc

`context.getAIAdmin()`

Method	Returns	Notes
`exportUserData({userId?})`	`{conversations, spaces, atoms, entities, bindings}`	GDPR right-to-export
`deleteUserData({userId?, confirm: true})`	`{deleted: {…counts}}`	GDPR right-to-erasure

Common Shapes

Content block (used in content fields):

{ type: "text",        text: "..." }
{ type: "image",       url: "...", mimeType: "image/png" }
{ type: "tool_use",    id: "tu_1", name: "lookup", arguments: { ... } }
{ type: "tool_result", toolUseId: "tu_1", content: "...", isError: false }
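
Since a `content` field can hold either a plain string or a list of these blocks, code that only needs the text has to handle both. The helper below is a minimal sketch; `contentToText` is a hypothetical name, and joining text blocks with a newline is an arbitrary choice.

```javascript
// Returns the plain text of a `content` value, which may be a string
// or a list of content blocks. Non-text blocks are skipped.
function contentToText(content) {
    if (typeof content === "string") {
        return content;
    }
    if (!Array.isArray(content)) {
        return "";
    }
    return content
        .filter(block => block && block.type === "text" && typeof block.text === "string")
        .map(block => block.text)
        .join("\n");
}
```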

Atom:

{
    id, memorySpaceId, tenantId, appId, userId,
    text,
    category:    { name, kind },
    importance:  1..5,
    confidence:  0.0..1.0,
    validFrom, validTo,                  // ISO instants; validTo null = current
    status:      "ACTIVE" | "ARCHIVED" | "DELETED",
    sourceConversationId,
    sourceMessageIds: [ ... ],
    entityIds:        [ ... ],
    createdAt, updatedAt
}
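
The `validFrom` / `validTo` pair can be checked client-side when you already hold an atom. This sketch assumes `validFrom` is inclusive and `validTo` is exclusive, which the shape above does not specify; `isValidAt` is a hypothetical helper name.

```javascript
// True when the atom is ACTIVE and its validity interval covers `at`.
// Assumption: validFrom is inclusive, validTo is exclusive, null validTo means "still current".
function isValidAt(atom, at) {
    if (atom.status !== "ACTIVE") {
        return false;
    }
    const t = new Date(at).getTime();
    const from = atom.validFrom ? new Date(atom.validFrom).getTime() : -Infinity;
    const to = atom.validTo ? new Date(atom.validTo).getTime() : Infinity;
    return from <= t && t < to;
}
```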

Entity:

{
    id, memorySpaceId, tenantId, appId, userId,
    type:           "PERSON" | "ORGANIZATION" | ...,
    canonicalName,  description,
    aliases:        [ ... ],
    status:         "ACTIVE" | "MERGED",
    firstSeenAt, lastSeenAt, atomCount,
    createdAt, updatedAt
}
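
The `aliases` list and the `MERGED` status suggest that the same real-world thing can end up recorded more than once. Looking an entity up before creating it is one way to keep duplicates down. The sketch uses `findEntityByName` and `createEntity` as listed in the cheat sheet; `findOrCreateEntity` is a hypothetical helper, and the lookup followed by the create is not atomic.

```javascript
// Returns the existing entity with this type and name (exact or alias match),
// or creates one with `name` as its canonical name.
function findOrCreateEntity(context, spaceId, type, name) {
    const memory = context.getMemory();
    const existing = memory.findEntityByName(spaceId, { type: type, name: name });
    if (existing) {
        return existing;
    }
    return memory.createEntity(spaceId, { type: type, canonicalName: name });
}
```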

Binding:

{
    id, tenantId,
    conversationScope: { namespace, userId },   // or array of these
    memorySpaceIds:    [ ... ],
    extractionPolicy:  {
        extractionVersion:    "v1",
        onConversationClosed: true,
        windowed:             false,
        windowTurns:          0
    },
    enabled: true,
    createdAt, updatedAt
}
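
Creating a binding with this shape might look like the sketch below. It passes the three fields that `createBinding` lists in the cheat sheet and copies the policy values from the shape above. `bindNamespaceToSpace` is a hypothetical helper name, and the policy values are an example, not recommended settings.

```javascript
// Binds one user's conversations in a namespace to a single memory space,
// extracting when a conversation closes.
function bindNamespaceToSpace(context, namespace, userId, memorySpaceId) {
    return context.getMemory().createBinding({
        conversationScope: { namespace: namespace, userId: userId },
        memorySpaceIds: [memorySpaceId],
        extractionPolicy: {
            extractionVersion: "v1",
            onConversationClosed: true,
            windowed: false,
            windowTurns: 0
        }
    });
}
```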

Recall hit:

{
    atom:             { …atom shape… },
    score:            0.0..1.0,           // post-fusion ranking
    decayWeight:      0.0..1.0,           // multiplier applied
    entityMatchBonus: 1.0..2.0
}
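
Given a list of recall hits in this shape, a caller can filter on `score` before turning them into prompt lines. The sketch takes the hit list as a plain array, since how hits are wrapped inside a recall result is not shown here. `hitsToPromptLines` and the bullet format are hypothetical.

```javascript
// Keeps hits at or above `minScore`, orders them best-first,
// and renders each atom's text as one bullet line.
function hitsToPromptLines(hits, minScore) {
    return hits
        .filter(hit => hit.score >= minScore)   // filter returns a new array, so sorting is safe
        .sort((a, b) => b.score - a.score)
        .map(hit => "- " + hit.atom.text);
}
```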

Variable Binding In Agent JSON

When a tool or agent configuration needs a per-user value sourced from AIStep inputs, reference the input by name with a $ prefix:

{
  "memorySpaceId": "$userMemorySpaceId"
}

The $ prefix tells the platform to read the named value (here, userMemorySpaceId) from the AIStep's input map at runtime. Drop the $ for literal values that are the same for every user. Then invoke the agent with the variable bound:

context.getAIFunctions().invokeAgent("agentName", {
    arguments: {
        userMemorySpaceId: someResolvedSpaceId
        // ... other agent inputs
    }
});
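
To make the `$` rule concrete, the function below mimics the lookup described above against a plain input map. It is an illustration of the semantics only, not the platform's implementation. `resolveBindings` is a hypothetical name, and throwing on a missing input is an assumption about how you might want to fail.

```javascript
// Illustrative only: resolves "$name" strings against an input map
// and passes every other value through as a literal.
function resolveBindings(config, inputs) {
    const resolved = {};
    for (const key of Object.keys(config)) {
        const value = config[key];
        if (typeof value === "string" && value.startsWith("$")) {
            const inputName = value.slice(1);
            if (!Object.prototype.hasOwnProperty.call(inputs, inputName)) {
                throw new Error("Missing input: " + inputName);
            }
            resolved[key] = inputs[inputName]; // per-user value from the input map
        } else {
            resolved[key] = value;             // literal, same for every user
        }
    }
    return resolved;
}
```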

Next Steps

AI Agents — build and invoke the agents that use this memory
Rules and Actions — schedule a job that closes stale conversations to trigger extraction
Security — understand the tenant + user isolation that backs every memory call