AI Memory
Give your AI agents persistent, per-user memory — conversations, distilled facts, and prompt-ready working context
AI Memory Guide
You've used AI Agents to call an LLM from an Appivo action. The agent answers the question in front of it, then forgets everything. The next time the user shows up, the agent has no idea who they are, what they prefer, or what was decided yesterday.
This guide is about turning that around. Appivo ships a memory subsystem that lets your AI features behave like they remember — picking up where the user left off, recalling preferences, knowing facts the user established days or weeks ago.
It's all reachable through the context object inside any action. Multi-tenant and per-user isolation are automatic; you do not have to think about them.
When To Use This
Reach for the memory subsystem whenever your AI feature should:
- Continue a conversation the user started earlier — yesterday, last week, or a month ago.
- Recall preferences and identity facts ("I'm vegetarian", "my role is CTO", "my deadline is Friday") without re-asking.
- Reason about change over time — what did the user believe last month vs. now.
- Build a prompt that mixes recent chat, a rolling summary, and the most relevant remembered facts — automatically, within a token budget.
- Satisfy GDPR export and delete requests with a single call.
If your AI feature is one-shot ("translate this text", "clean this HTML"), you don't need memory. Use it when continuity matters.
What You Get
The platform gives you four cooperating capabilities, all accessible through the context object inside any action.
| Capability | Accessor | What it does |
|---|---|---|
| Conversations | context.getConversations() | Persist user/assistant turns, scoped per user, replayable later. |
| Memory | context.getMemory() | Long-term storage of distilled facts ("atoms") plus the entities they refer to. |
| Working Context | context.getWorkingContext() | Compose a prompt-ready block from the conversation tail, the rolling summary, and recalled memories. |
| Admin (GDPR) | context.getAIAdmin() | Export or delete a user's full footprint — single call, every collection. |
Every capability is multi-tenant and per-user by default. Cross-user reads are impossible by construction; you don't write tenant checks.
The same capabilities are available over REST at /ai-conversations, /ai-memory, and /ai-working-context for non-JS clients (mobile, third-party integrations). This guide focuses on the JavaScript surface — that's where Actions live.
The Mental Model In One Picture
If you take away one thing from this guide, take this:
     Conversation                         Memory
    ──────────────                        ──────
    (short-term,                   (long-term, bi-temporal,
 verbatim, replayable)              distilled, recallable)
┌──────────────────────┐       ┌──────────────────────┐
│ user: "I prefer      │       │ atom: "User prefers  │
│ meetings on          │  ───► │ morning meetings."   │
│ Tuesday mornings."   │       │                      │
│                      │       │ importance: 4        │
│ ai: "Got it!"        │       │ validFrom: 2026...   │
│                      │       │ confidence: 0.95     │
└──────────────────────┘       └──────────────────────┘
           ▲                               │
           │                               │
           │  extraction                   │  recall
           │  (event-driven,               │  (BY_TOPIC,
           │   runs on close)              │   BY_ENTITY,
           │                               │   TIMELINE)
           │                               ▼
           │                   ┌──────────────────────┐
           └───────────────────┤ Working Context      │
                               │ = summary            │
                               │ + relevant atoms     │
                               │ + recent turns       │
                               │ ──► next prompt      │
                               └──────────────────────┘
A conversation is an append-only stream of messages — the verbatim record of what was said. It is short-term: you may keep it forever, but you would not feed all of it back into the LLM every turn.
A memory atom is a single, distilled fact extracted from a conversation (or written directly). It is structured: it has a category, an importance score, a validity window, and links back to the source messages. Atoms live forever and accumulate into the user's long-term memory.
The bridge between them is extraction — when a conversation closes, the platform runs an LLM over its content, produces atoms, and writes them to memory. You don't write that code; the platform does it for you. You just configure a binding that says "conversations in this namespace produce atoms in this memory space."
When the user starts a new conversation, the working context builder assembles a prompt block: the rolling summary of past turns, the relevant atoms recalled by topic, and the recent message tail. You feed that into the LLM and the LLM behaves as if it remembers.
That's the whole story. Everything below is mechanics.
Quick Start: A Chatbot With Memory
Let's build the simplest useful case end-to-end. We'll write an action that powers a daily check-in chatbot. It should:
- Continue an existing conversation if there is one.
- Build a working context that pulls in the user's identity facts and goals.
- Run a turn (LLM call).
- Persist everything.
// Action: dailyCheckin
// arguments: { userMessage }
// returns: { reply }
function dailyCheckin(arguments) {
var conv = context.getConversations();
var memory = context.getMemory();
var wc = context.getWorkingContext();
// 1. Reuse the open check-in conversation, or create one
var existing = conv.listConversations({}).find(function(c) {
return c.namespace === "daily-checkin" && c.status === "open";
});
var conversation = existing || conv.createConversation({
namespace: "daily-checkin",
title: "Daily check-in",
metadata: { date: new Date().toISOString().slice(0, 10) }
});
// 2. Find or create the memory space we extract into
var space = memory.listMemorySpaces({}).find(function(s) {
return s.name === "checkin";
});
if (!space) {
space = memory.createMemorySpace({ name: "checkin" });
}
// 3. Assemble the working context: rolling summary +
// semantic recall + recent turns
var ctx = wc.buildWorkingContext(conversation.id, {
memorySpaceId: space.id,
recallQuery: arguments.userMessage,
recentTurns: 8,
recallLimit: 6,
alwaysOnCategoryNames: ["identity", "preference", "goal"]
});
// 4. Run the LLM — runTurn does the dance for you
var outcome = conv.runTurn(conversation.id, {
provider: "anthropic",
model: "claude-sonnet-4-5",
userText: arguments.userMessage,
options: { systemPrompt: ctx.contextBlock }
});
var reply = outcome.appendedMessages
.filter(function(m) { return m.role === "assistant"; })
.map(function(m) {
return m.content
.filter(function(b) { return b.type === "text"; })
.map(function(b) { return b.text; }).join("");
})
.join("\n");
return { reply: reply };
}
That's the whole app. A few things worth noticing:
- We never wrote anything to memory ourselves. When the conversation eventually closes (typically when the user finishes their session, or via a scheduled job that closes stale conversations), a binding will fire extraction automatically. The atoms it produces will be there the next time we call buildWorkingContext for this user.
- alwaysOnCategoryNames pulls in atoms tagged identity, preference, or goal regardless of the query; those are the facts you almost always want present. The recallQuery handles anything topic-specific.
- We did not pass any user id; the platform reads the active user from the action context. Multi-tenancy is automatic.
The next sections explain each step in depth, plus the one-time binding setup you need before extraction will actually run.
Conversations
A conversation in Appivo is more than a chat window — it's a scoped, append-only event stream. Every message you append is permanent. Every change emits an event on the internal message bus. Every conversation is partitioned by the four-tuple (tenantId, appId, namespace, userId), so cross-tenant or cross-user reads cannot happen.
The Four-Tuple
| Component | Where it comes from | What it controls |
|---|---|---|
| tenantId | Auth filter (thread-local) | Tenant boundary: strict isolation |
| appId | The active app schema | Which Appivo app owns the conversation |
| namespace | You choose, per call | App-internal partition, e.g. "daily-checkin", "support-chat" |
| userId | The current user | Conversation owner |
tenantId, appId, and userId are derived for you. You only ever supply the namespace. Pick a stable name per "kind" of conversation your app has; it's the seam at which bindings and recall hook in.
Lifecycle: Open → Closed
Conversations have two lifecycle states: open and closed. You can append while open; you cannot append once closed. Closing emits a conversation.closed event, which is what triggers memory extraction.
var conv = context.getConversations();
// Create
var c = conv.createConversation({
namespace: "support-chat",
title: "Refund question",
metadata: { source: "web", priority: "normal" }
});
// c.id, c.status === "open", c.createdAt
// Append
conv.appendUserMessage(c.id, { content: "Hi, where's my refund?" });
conv.appendAssistantTurn(c.id, {
content: "Looking it up now...",
stopReason: "end_turn",
model: "claude-sonnet-4-5"
});
// Close — idempotent; re-closing is a no-op and fires no second event
conv.closeConversation(c.id);
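Who closes a conversation the user simply walks away from? Here is a minimal sketch of the stale-conversation sweep mentioned in the Quick Start. It assumes createdAt is an ISO-8601 string and uses it as the age signal (a last-activity timestamp would be a better signal if your conversations carry one). closeStaleConversations, maxAgeMs, and now are hypothetical names, not platform API; conv stands for the object returned by context.getConversations().

```javascript
// Hypothetical helper: close open conversations older than maxAgeMs.
// `conv` is the object returned by context.getConversations().
// Assumption: createdAt is an ISO-8601 string (anything Date can parse).
function closeStaleConversations(conv, namespace, maxAgeMs, now) {
  var cutoff = now.getTime() - maxAgeMs;
  var closedIds = [];
  conv.listConversations({}).forEach(function (c) {
    var isStale = c.namespace === namespace &&
      c.status === "open" &&
      new Date(c.createdAt).getTime() < cutoff;
    if (isStale) {
      conv.closeConversation(c.id); // closing is what triggers extraction
      closedIds.push(c.id);
    }
  });
  return closedIds;
}
```

Because listConversations returns conversations for the current user, a scheduled job would presumably need to run this once per user context; how you arrange that depends on your app.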
Two Ways To Record A Turn
You have two patterns for recording a turn.
Pattern A — runTurn does it for you. Preferred when you let the platform call the LLM:
conv.runTurn(c.id, {
provider: "anthropic",
model: "claude-sonnet-4-5",
userText: "What's my refund status?",
tools: [ /* ... */ ],
toolHandler: function(toolName, args) {
if (toolName === "lookupRefund") {
return JSON.stringify(myLookup(args.orderId));
}
}
});
// User message + assistant response (+ any tool turns) all persisted
// with a shared turnId. Sequence numbers allocated atomically.
Pattern B — appendTurn after your own LLM call. Preferred when you call the LLM yourself and want the platform to persist the result:
var llmResult = callMyOwnLLM(buildMyOwnPrompt(/* ... */));
conv.appendTurn(c.id, {
userContent: arguments.userMessage,
assistant: {
content: llmResult.text,
stopReason: llmResult.stopReason,
model: "claude-sonnet-4-5",
usage: { inputTokens: 1234, outputTokens: 567, latencyMs: 800 }
},
idempotencyKey: arguments.turnId // safe retries
});
Both patterns produce the same persistent shape, fire the same events, and feed the same downstream extraction and recall. Pick whichever fits your call site.
Reading Messages
// Last 50 user-visible messages — what a chat UI should render.
var visible = conv.getMessages(c.id, { limit: 50 });
// Same range but include internal messages — for replay,
// debugging, or feeding the LLM.
var raw = conv.getRawTurns(c.id, { limit: 50 });
Message Visibility
Every message has a visibility field with three valid values:
| Value | User sees it | LLM sees it |
|---|---|---|
| "user" (default) | Yes | Yes |
| "internal" | No | Yes |
| "hidden" | No | No |
Use "internal" for system messages, framework-injected context, or pre-/post-processing artifacts the user shouldn't see but the LLM should. Use "hidden" for messages you persisted for audit but don't want any downstream LLM call to see.
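To make the visibility table concrete, here is a small sketch that applies it to a message list. It is not how the platform filters internally; filterByAudience and the "user" / "llm" audience labels are hypothetical names chosen for illustration.

```javascript
// Which audiences may see each visibility level, per the table above.
var VISIBILITY_RULES = {
  user:     { user: true,  llm: true  },
  internal: { user: false, llm: true  },
  hidden:   { user: false, llm: false }
};

// Hypothetical helper: keep only the messages that `audience`
// ("user" or "llm") is allowed to see. A missing visibility field is
// treated as "user", the documented default.
function filterByAudience(messages, audience) {
  return messages.filter(function (m) {
    var rule = VISIBILITY_RULES[m.visibility || "user"];
    return Boolean(rule && rule[audience]);
  });
}
```

A chat UI would render filterByAudience(messages, "user"), while a prompt builder would work from filterByAudience(messages, "llm").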
How Memories Are Stored
The memory store has three primitive types. Once you have the mental model for them, the recall API and the extraction pipeline are straightforward.
Atoms — The Unit Of Memory
An atom is a single, self-contained, absolute-timestamped fact. Three rules govern atoms:
- Self-contained: the text reads correctly without any surrounding context. "User prefers morning meetings" is an atom; "Yes" is not.
- Append-mostly: when a fact changes, you don't update the old atom; you insert a new one and mark the old one superseded. The history is preserved forever.
- Cited: every atom carries sourceConversationId and sourceMessageIds, so a recall result can be traced back to the originating turn.
Each atom carries:
| Field | Purpose |
|---|---|
| text | The fact itself, in natural language |
| category | A (name, kind) tuple (see below) |
| importance | 1–5, used in recall ranking |
| confidence | 0.0–1.0, extraction's certainty |
| validFrom, validTo | Bi-temporal validity window (validTo: null means currently valid) |
| entityIds | Links to the entities the atom is about |
| embedding | Vector for semantic recall (auto-computed) |
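The append-mostly rule and the validity window are easier to see in code. The following is an in-memory sketch of the idea only: the platform's own supersedeAtom does the real work, and supersede / validAt here are hypothetical helpers that assume timestamps are same-format ISO-8601 UTC strings.

```javascript
// In-memory sketch of append-mostly storage. Timestamps are same-format
// ISO-8601 UTC strings, which compare correctly as plain strings.
function supersede(atoms, oldId, replacement, nowIso) {
  var next = atoms.map(function (a) {
    // Close the old atom's validity window instead of editing its text.
    return a.id === oldId && a.validTo === null
      ? Object.assign({}, a, { validTo: nowIso })
      : a;
  });
  next.push(Object.assign({}, replacement, { validFrom: nowIso, validTo: null }));
  return next;
}

// Atoms valid at instant `atIso`: validFrom <= at < validTo,
// where validTo === null means the window is still open.
function validAt(atoms, atIso) {
  return atoms.filter(function (a) {
    return a.validFrom <= atIso && (a.validTo === null || atIso < a.validTo);
  });
}
```

The old atom is never deleted or rewritten; only its validTo is set. That is what lets a later query ask what was believed at an earlier point in time.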
Categories — A (name, kind) Tuple
Categories let you classify atoms in app-specific ways while giving the platform enough structure to apply universal logic (decay rules, pattern-mining exclusions, etc.).
The name is yours — "identity", "goal", "preference", "contact", "medical-allergy", whatever your domain needs. The kind is platform-defined and one of:
| Kind | Means | Decay |
|---|---|---|
| FACT | A timeless attribute | Slow |
| RULE | A constraint or policy | Slow |
| INTENTION | A planned action or goal | Medium |
| EPISODE | A recorded event | Fast |
| PREFERENCE | A taste or convention | Slow |
| PATTERN | Mined from many episodes (platform-only) | Slow |
// Direct write — used for migrations and explicit "remember this" flows
context.getMemory().addAtom(spaceId, {
text: "User prefers morning meetings",
category: { name: "preference", kind: "PREFERENCE" },
importance: 4,
confidence: 0.95
});
You will almost never write atoms directly — extraction handles it. The API is there for migrations, admin imports, and explicit "remember this" flows in your UI.
Memory Spaces — The Container
A memory space is a named partition of a single user's atoms. You typically create one space per topic your app cares about:
var memory = context.getMemory();
// Personal goal-tracking memory
var goals = memory.createMemorySpace({ name: "goals" });
// Customer-support context memory
var support = memory.createMemorySpace({ name: "support" });
Why bother with spaces? Two reasons:
- Recall stays focused — searching the goals space doesn't return support-ticket memories.
- Bindings target spaces — different conversation namespaces extract into different spaces.
A simple app might have one space per user. A complex app might have several. Use as many as you have natural topic boundaries.
Entities — People, Places, And Things
When the extractor produces atoms, it also produces entities: the people, organisations, concepts, and events the atoms refer to. Every atom links to its entities; entity-based recall ("everything about Acme Corp") becomes possible without re-reading every atom.
Entity types are platform-defined: PERSON, ORGANIZATION, PLACE, OBJECT, CONCEPT, EVENT, AGREEMENT. The extractor classifies; you read.
// Resolve an entity by name
var acme = memory.findEntityByName(spaceId, {
type: "ORGANIZATION",
name: "Acme Corporation"
});
// Get every memory atom that mentions Acme
var hits = memory.recallByEntity(spaceId, {
entityIdOrName: acme.id,
limit: 20
});
You will rarely create entities by hand — the extractor does. If extraction produces two entities that should be one ("Acme" and "Acme Corporation"), fold the alias under the canonical:
memory.mergeEntities(canonicalId, aliasId);
// All atom references repointed; aliases preserved.
Bindings: Where Conversations Become Memories
So far we've talked about conversations on one side and memory on the other. The thing that connects them is a binding: a small configuration record that tells the platform "when conversations in namespace X close, extract atoms into memory space Y."
Without a binding, no extraction happens. Conversations accumulate but never get distilled.
Creating A Binding
var memory = context.getMemory();
// Find or create the destination space
var space = memory.listMemorySpaces({}).find(function(s) {
return s.name === "daily";
}) || memory.createMemorySpace({ name: "daily" });
memory.createBinding({
conversationScope: {
namespace: "daily-checkin",
userId: "*" // every user
},
memorySpaceIds: [space.id],
extractionPolicy: {
extractionVersion: "v1", // which prompt version
onConversationClosed: true, // extract on close
windowed: false, // also extract every N turns?
windowTurns: 0
}
});
From this point forward, every time a conversation in daily-checkin closes for any user, the extractor runs and produces atoms in that user's daily space.
Trigger Modes
You can configure two trigger modes (and combine them):
| Trigger | When it fires | Best for |
|---|---|---|
| onConversationClosed: true | Conversation transitions from open → closed | Most apps. Cheapest, most accurate. |
| windowed: true, windowTurns: N | Every N turns within an open conversation | Long sessions where you want progressive memory before the user signs off. |
Most apps want onConversationClosed: true only. Windowed extraction is for long-running sessions where the agent needs to recall this session's earlier turns before it ends.
Binding Scope
The conversationScope tuple supports wildcards on namespace and userId:
// All daily-checkin conversations for any user (the typical case)
{ namespace: "daily-checkin", userId: "*" }
// All conversations of any namespace for one specific user
{ namespace: "*", userId: "user_42" }
// Multiple namespaces extracting into one space
[
{ namespace: "morning-checkin", userId: "*" },
{ namespace: "evening-checkin", userId: "*" }
]
You can have multiple bindings; they all match independently. Each matching binding produces one extraction job per target memory space, so a conversation that matches three single-space bindings produces three extraction jobs.
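The matching rule can be sketched in a few lines. This is an illustration of the wildcard and "binding × space" semantics described above, not the platform's matcher; scopeMatches and planExtractionJobs are hypothetical names, and the job shape is invented for the example.

```javascript
// Does one scope entry match a conversation? "*" is the wildcard.
function scopeMatches(scope, conversation) {
  return (scope.namespace === "*" || scope.namespace === conversation.namespace) &&
         (scope.userId === "*" || scope.userId === conversation.userId);
}

// Hypothetical planner: one job per (matching binding, target space) pair.
// conversationScope may be a single scope object or an array of them.
function planExtractionJobs(bindings, conversation) {
  var jobs = [];
  bindings.forEach(function (b) {
    if (b.enabled === false) return; // paused bindings produce no jobs
    var scopes = [].concat(b.conversationScope);
    var matches = scopes.some(function (s) { return scopeMatches(s, conversation); });
    if (!matches) return;
    b.memorySpaceIds.forEach(function (spaceId) {
      jobs.push({
        bindingId: b.id,
        memorySpaceId: spaceId,
        conversationId: conversation.id
      });
    });
  });
  return jobs;
}
```

Note that a single binding with two target spaces yields two jobs under this reading, which is worth keeping in mind when estimating extraction cost.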
Extraction Prompt Versioning
The extractionVersion field pins which prompt the extractor uses (extraction-v1.txt, extraction-v2.txt, …). Bumping the version is the intentional re-extraction path: change the prompt to add a new category, or improve the extraction quality, then run a replay over historical conversations to re-extract them.
You won't typically write extraction prompts yourself — the platform ships sensible defaults — but you may bump the version when the platform team releases an improved prompt.
Manual Replay
To re-run extraction for one conversation (e.g., to test a new prompt version):
memory.runExtractionForConversation(conversationId, {
extractionVersion: "v2" // optional override
});
This produces a fresh extraction job that the worker pool picks up.
Recall: Three Ways To Find Memories
Recall is how memories make their way back into the LLM's context. The platform offers three modes — they exist because LLMs pick the right tool more reliably from a clear name + description than from a single tool with a discriminator argument.
recallByTopic — Semantic Search
The hot path. You provide a free-text query, and the platform returns the atoms most semantically relevant to it (ranked by embedding similarity, with decay and importance weighting applied).
var hits = memory.recallByTopic(spaceId, {
query: "the user's preferred meeting times",
limit: 5
});
// hits.hits = [
// { atom: { text: "User prefers Tuesday mornings", ... },
// score: 0.91, decayWeight: 1.0, ... },
// { atom: { text: "User dislikes meetings before 9am", ... },
// score: 0.87, ... },
// ...
// ]
When Atlas Vector Search isn't available, the platform falls back to a token-overlap scoring function. Same return shape; slightly worse ranking. The seam is invisible to your code.
Use it when: most of the time. Working context uses this under the hood. Anywhere the agent needs to "remember something about a topic."
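For intuition about the fallback, here is one possible token-overlap scorer. The platform's actual function is not documented here, so treat this as an assumption: it uses Jaccard overlap on lowercase alphanumeric tokens, and tokenize, overlapScore, and recallByOverlap are hypothetical names.

```javascript
// Lowercase, split on non-alphanumerics, drop empties, de-duplicate.
function tokenize(text) {
  var seen = Object.create(null); // no inherited keys such as "constructor"
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(function (t) {
    if (!t || seen[t]) return false;
    seen[t] = true;
    return true;
  });
}

// Jaccard overlap between query tokens and atom tokens, in [0, 1].
function overlapScore(query, atomText) {
  var q = tokenize(query);
  var a = tokenize(atomText);
  if (q.length === 0 || a.length === 0) return 0;
  var shared = q.filter(function (t) { return a.indexOf(t) !== -1; }).length;
  return shared / (q.length + a.length - shared);
}

// Rank atoms by overlap, highest first, and keep the top `limit`.
function recallByOverlap(atoms, query, limit) {
  return atoms
    .map(function (atom) {
      return { atom: atom, score: overlapScore(query, atom.text) };
    })
    .filter(function (h) { return h.score > 0; })
    .sort(function (x, y) { return y.score - x.score; })
    .slice(0, limit);
}
```

A scorer like this has no notion of meaning or word stems: "meeting" does not match "meetings", and "dietary restrictions" does not match "vegetarian". That is the kind of gap that makes lexical ranking weaker than embedding similarity.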
recallByEntity — Everything About X
Given an entity (a person, organisation, place, etc.), return every atom that mentions it. Newest first.
var acme = memory.findEntityByName(spaceId, {
type: "ORGANIZATION",
name: "Acme Corporation"
});
var history = memory.recallByEntity(spaceId, {
entityIdOrName: acme.id,
limit: 50
});
Use it when: profile-style summaries. Customer detail pages. "Tell me everything about user X." Anywhere an entity is the lens, not a topic.
recallTimeline — What Was True At Time T
Bi-temporal queries: include superseded atoms, slice by validity window, see the history of changes.
// What did we know about Acme between Jan 1 and April 30?
var window = memory.recallTimeline(spaceId, {
entityIdOrName: "Acme Corporation",
from: "2026-01-01T00:00:00Z",
to: "2026-04-30T23:59:59Z",
limit: 20
});
recallTimeline includes superseded atoms by default — that's the whole point. You see the change history.
Use it when: audits, "what did we know at the time we made this decision", reasoning about change over time, compliance reports.
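The core of a timeline query is a window-overlap test on each atom's validity range. The sketch below shows that test under two assumptions: timestamps are same-format ISO-8601 UTC strings, and results are ordered oldest first (the platform's actual ordering is not stated here). overlapsWindow and timelineSlice are hypothetical names.

```javascript
// Does the atom's validity window [validFrom, validTo) overlap the query
// window [fromIso, toIso]? validTo === null means "still valid".
// Same-format ISO-8601 UTC strings sort chronologically as plain strings.
function overlapsWindow(atom, fromIso, toIso) {
  var startsBeforeWindowEnds = atom.validFrom <= toIso;
  var endsAfterWindowStarts = atom.validTo === null || atom.validTo > fromIso;
  return startsBeforeWindowEnds && endsAfterWindowStarts;
}

// Sketch of a timeline slice: superseded atoms are kept, oldest first.
function timelineSlice(atoms, fromIso, toIso) {
  return atoms
    .filter(function (a) { return overlapsWindow(a, fromIso, toIso); })
    .sort(function (a, b) {
      if (a.validFrom < b.validFrom) return -1;
      if (a.validFrom > b.validFrom) return 1;
      return 0;
    });
}
```

An atom that was superseded in February still appears in a January to April slice, alongside the atom that replaced it. That pair is the change history.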
Recall Result Shape
All three modes return the same shape:
{
mode: "BY_TOPIC", // or BY_ENTITY / TIMELINE
totalCandidates: 47, // candidates before limit
latencyMs: 12,
hits: [
{
atom: {
id: "atom_abc123",
text: "User prefers morning meetings",
category: { name: "preference", kind: "PREFERENCE" },
importance: 4,
confidence: 0.95,
validFrom: "2026-04-30T...",
validTo: null, // null = currently valid
sourceConversationId: "conv_xyz",
sourceMessageIds: ["msg_1", "msg_5"],
entityIds: [ ... ]
},
score: 0.91, // post-fusion ranking
decayWeight: 1.0,
entityMatchBonus: 1.0
},
// ...
]
}
The score is the platform's final ranking — it already combines vector similarity, decay, and importance. Sort by it directly.
Working Context: A Prompt That "Remembers"
Recall is a primitive. Working Context is the ready-to-feed-the-LLM block built from recall + the conversation tail + the rolling summary.
Instead of writing 50 lines of code to build a prompt that pulls in identity facts + recent conversation summary + relevant memories + the recent message tail, you call one method:
var ctx = context.getWorkingContext().buildWorkingContext(
conversationId,
{
memorySpaceId: space.id,
recallQuery: "the user's question right now",
recentTurns: 10,
recallLimit: 8,
tokenBudget: 8000,
alwaysOnCategoryNames: ["identity", "preference"],
includeRollingSummary: true
}
);
// ctx is a Map with:
// contextBlock: "...assembled prompt..." (single string)
// messages: [ ... ] (canonical message list)
// atomsUsed: [ ... ] (the atoms that landed)
// tokensEstimated: 4321
Feed ctx.contextBlock into your LLM call as the system prompt (or as the leading content of your user prompt — the template controls where it lands).
What Gets Included
By default, the working context block has four sections:
═══════════════════════════════════════════════════════════
1. ROLLING SUMMARY (if includeRollingSummary)
"Across previous turns, the user established that they
work in product management at Acme, prefer morning
meetings, and are working towards launching the Q3 release..."
2. ALWAYS-ON ATOMS (alwaysOnCategoryNames)
"User name: Johan Eriksson"
"User role: CTO at Snubbas"
"User prefers concise answers"
3. RECALLED ATOMS (semantic, recallQuery → BY_TOPIC)
"User cancelled their last 3 meetings on Tuesday afternoons"
"User mentioned a conflict with the Q2 review meeting"
4. RECENT TURNS (the conversation tail, recentTurns count)
user: "Can we move next week's status to morning?"
assistant: "Sure, which morning?"
user: "Tuesday."
═══════════════════════════════════════════════════════════
The token budget is enforced top-to-bottom: rolling summary first (cheap, always in), then always-on atoms, then recall hits, then recent turns. If you exceed the budget, recent turns get trimmed first, then recall hits.
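The trimming order is easier to reason about with a small model. The sketch below is an assumption-laden illustration, not the platform's builder: it estimates tokens as roughly four characters each, drops the oldest recent turns first, and then drops the lowest-ranked recall hits. estimateTokens, totalTokens, and fitToBudget are hypothetical names.

```javascript
// Rough token estimate: about 4 characters per token (an assumption,
// not the platform's tokenizer).
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

function totalTokens(sections) {
  var all = [sections.summary]
    .concat(sections.alwaysOn, sections.recalled, sections.recentTurns);
  return all.reduce(function (sum, s) { return sum + estimateTokens(s); }, 0);
}

// Trim to fit: oldest recent turns go first, then lowest-ranked recall
// hits. Summary and always-on atoms are never trimmed in this sketch.
function fitToBudget(sections, tokenBudget) {
  var out = {
    summary: sections.summary,
    alwaysOn: sections.alwaysOn.slice(),
    recalled: sections.recalled.slice(),      // best hit first
    recentTurns: sections.recentTurns.slice() // oldest turn first
  };
  while (totalTokens(out) > tokenBudget && out.recentTurns.length > 0) {
    out.recentTurns.shift();
  }
  while (totalTokens(out) > tokenBudget && out.recalled.length > 0) {
    out.recalled.pop();
  }
  return out;
}
```

Running numbers through a model like this before picking a tokenBudget shows how quickly a tight budget eats into the conversation tail.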
Tuning Knobs
| Parameter | Default | When to change it |
|---|---|---|
| recentTurns | 10 | Fewer for tighter token budgets; more for chat-heavy apps |
| recallLimit | 8 | Lower (3–5) when the user message is short; higher (10–15) for complex queries |
| tokenBudget | 8000 | Match your model's context window minus output budget |
| includeRollingSummary | true | Set false for very short / single-turn conversations |
| alwaysOnCategoryNames | [] | Heavily app-dependent; identity + preference are common |
| recallQuery | (none) | Usually the user's current message |
Custom Templates
The default template renders the four sections above. If you need a different layout (e.g., XML tags for Anthropic's prompt preferences, or a chat-message array instead of a single string), register a custom Mustache template via the platform startup hooks. For most apps the default is fine.
Letting The Agent Recall On Its Own
The patterns above assume you decide when to recall. The platform also ships native AITools the LLM can call during a turn — three tools, one per recall mode:
| Tool | When the LLM picks it |
|---|---|
| RecallByTopic | "I should look up what the user said about X" |
| RecallByEntity | "I have an entity reference; let me see its history" |
| RecallTimeline | "I need to reason about how things changed over time" |
Add them to your AI Agent chain JSON:
{
"tools": [
{
"type": "RECALL_BY_TOPIC",
"name": "recallMemory",
"memorySpaceId": "$userMemorySpaceId"
},
{
"type": "RECALL_BY_ENTITY",
"name": "lookupEntity",
"memorySpaceId": "$userMemorySpaceId"
}
]
}
The "$userMemorySpaceId" syntax is the platform's standard variable-binding convention. At chain-execution time the runtime reads userMemorySpaceId from the AIStep's input arguments. The practical consequence: one agent JSON can serve every user; you just pass the user's memory space id when you invoke the agent.
context.getAIFunctions().invokeAgent("supportAgent", {
arguments: {
userMemorySpaceId: userSpace.id,
userMessage: arguments.message
}
});
If your space is app-global (a shared knowledge base, not per-user), drop the $ and use a literal id:
{
"type": "RECALL_BY_TOPIC",
"name": "recallHandbook",
"memorySpaceId": "ms_company_handbook"
}
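To illustrate the difference between a "$" reference and a literal id, here is a sketch of how such a binding could be resolved. The runtime's real resolver is not shown in this guide; resolveToolConfig is a hypothetical name, and throwing on a missing argument is an assumption about reasonable behavior rather than documented behavior.

```javascript
// Hypothetical resolver: replace "$name" string values in a tool config
// with the matching entry from the invocation arguments. Literal values
// pass through unchanged.
function resolveToolConfig(tool, args) {
  var resolved = {};
  Object.keys(tool).forEach(function (key) {
    var value = tool[key];
    if (typeof value === "string" && value.charAt(0) === "$") {
      var name = value.slice(1);
      if (!Object.prototype.hasOwnProperty.call(args, name)) {
        throw new Error("Missing argument for " + value);
      }
      resolved[key] = args[name];
    } else {
      resolved[key] = value;
    }
  });
  return resolved;
}
```

With this model, the per-user tool config resolves differently on every invocation, while the handbook config resolves to the same space for everyone.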
The agent then calls these tools mid-turn:
User: "Did I tell you about my dietary restrictions?"
Assistant: [calls recallMemory with query "user dietary restrictions"]
Tool: [returns: "User is vegetarian; user has a peanut allergy"]
Assistant: "Yes — you mentioned you're vegetarian and have a peanut allergy."
Apps With Their Own Chat History
Some agents put much more than plain chat in the user prompt — they render their own (often richly-structured) history and don't use the platform's AIChatHistory at all. They still want conversation persistence so memory extraction, recall, and GDPR export work.
The integration point is appendTurn — one call per turn; both messages share a server-allocated turn id:
function myCustomAgent(arguments) {
var conv = context.getConversations();
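// 1. Get or create the conversation
var c = conv.listConversations({}).find(function(existing) {
return existing.namespace === "custom-agent" && existing.status === "open";
}) || conv.createConversation({
namespace: "custom-agent",
title: arguments.title
});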
// 2. Build my own rich prompt from internal history + user input
var prompt = buildMyRichPrompt(arguments.myHistory, arguments.userInput);
// 3. Call my own LLM
var llmResult = callMyOwnLLM(prompt);
// 4. One call to persist user + assistant under one turn
var saved = conv.appendTurn(c.id, {
userContent: arguments.userInput,
assistant: {
content: llmResult.text,
stopReason: llmResult.stopReason,
model: llmResult.model,
provider: "anthropic",
usage: {
inputTokens: llmResult.inputTokens,
outputTokens: llmResult.outputTokens,
latencyMs: llmResult.latencyMs
}
},
idempotencyKey: arguments.turnId // safe retries
});
return { reply: llmResult.text, turnId: saved.turnId };
}
Three guarantees this gives you:
- Shared turnId on both messages: replay tooling groups them as one turn even though the LLM-protocol message array doesn't reflect that.
- Atomic ordering: user at sequence N, assistant at N+1.
- Idempotency on the user side: retries with the same idempotencyKey return the existing user message; the assistant always gets a fresh row (different LLM responses for the same logical turn must not collide on dedupe).
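These guarantees can be modeled with a small in-memory log. The sketch is a mental model only: createLog and appendTurnSketch are hypothetical names, the id and sequence formats are invented, and reusing the original turnId for the assistant row on a retry is an assumption.

```javascript
// In-memory sketch of one conversation's message log.
function createLog() {
  return { messages: [], nextSeq: 1, nextTurn: 1, userByKey: Object.create(null) };
}

function appendTurnSketch(log, turn) {
  var key = turn.idempotencyKey;
  var userMessage = key !== undefined ? log.userByKey[key] : undefined;
  if (!userMessage) {
    // First time we see this key (or no key at all): write the user row.
    userMessage = {
      role: "user",
      content: turn.userContent,
      turnId: "turn_" + log.nextTurn++,
      sequence: log.nextSeq++
    };
    log.messages.push(userMessage);
    if (key !== undefined) log.userByKey[key] = userMessage;
  }
  // The assistant row is never deduplicated: a retry may carry a
  // different LLM response for the same logical turn.
  var assistantMessage = {
    role: "assistant",
    content: turn.assistant.content,
    turnId: userMessage.turnId,
    sequence: log.nextSeq++
  };
  log.messages.push(assistantMessage);
  return {
    turnId: userMessage.turnId,
    userMessage: userMessage,
    assistantMessage: assistantMessage
  };
}
```

In this model a retried turn leaves one user row and two assistant rows under the same turnId, so a reader of the log can see both responses without the user's message being duplicated.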
Common Patterns
Per-User Memory Space
Most apps give each user one memory space per domain area, named after that area:
function getOrCreateUserSpace(name) {
var memory = context.getMemory();
var existing = memory.listMemorySpaces({})
.find(function(s) { return s.name === name; });
return existing || memory.createMemorySpace({ name: name });
}
var goals = getOrCreateUserSpace("goals");
var prefs = getOrCreateUserSpace("preferences");
var support = getOrCreateUserSpace("support");
Spaces are cheap. Use them as natural topic boundaries.
Shared Knowledge Base
For app-global knowledge (handbooks, FAQs, company facts) you have two choices:
- Use a separate per-app system user that owns the shared memory space, and recall from that space in addition to the per-user space.
- Bypass the memory subsystem and use a regular SearchIndex. Memory is for personal, evolving facts — if your "memory" is actually a static document, document search will likely serve you better.
"Remember This" UI Flow
Sometimes the user explicitly asks the agent to remember something. Instead of waiting for extraction, write the atom directly:
context.getMemory().addAtom(spaceId, {
text: arguments.factText,
category: { name: "preference", kind: "PREFERENCE" },
importance: 5, // user-volunteered facts are high signal
confidence: 1.0, // user-confirmed, max confidence
sourceConversationId: arguments.conversationId
});
Cite The Source In The UI
Atoms carry sourceConversationId and sourceMessageIds. Render those as citations when the LLM uses recall:
var hits = memory.recallByTopic(spaceId, { query: q, limit: 5 });
hits.hits.forEach(function(h) {
var sourceDate = h.atom.validFrom.slice(0, 10);
var convLink = "/history/" + h.atom.sourceConversationId;
// render "Said on YYYY-MM-DD" → conversation history view
});
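The loop above stops at a comment, so here is one way the rendering step could be filled in. It is a sketch: buildCitations is a hypothetical helper, the "/history/" route is the same placeholder used in the snippet above, and atoms written directly without a sourceConversationId simply get no link.

```javascript
// Hypothetical helper: turn a recall result into citation records that a
// UI can render as "Said on YYYY-MM-DD" links.
function buildCitations(recallResult) {
  return recallResult.hits.map(function (h) {
    var convId = h.atom.sourceConversationId;
    return {
      text: h.atom.text,
      saidOn: h.atom.validFrom.slice(0, 10), // YYYY-MM-DD
      // Directly written atoms may have no source conversation.
      href: convId ? "/history/" + encodeURIComponent(convId) : null,
      messageIds: h.atom.sourceMessageIds || []
    };
  });
}
```

Keeping the citation data separate from the markup lets the same records feed a web view, a mobile client, or an audit export.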
GDPR Export And Delete
Single calls handle the GDPR right-to-export and right-to-erasure paths:
// Export — typically wired to "download my data"
var dump = context.getAIAdmin().exportUserData({
userId: arguments.userId // optional; defaults to current user
});
// dump = { conversations: [...], spaces: [...], atoms: [...],
// entities: [...], bindings: [...] }
// Delete — wired to "delete my data"
context.getAIAdmin().deleteUserData({
userId: arguments.userId,
confirm: true
});
The delete is hard — it removes data from every collection. Idempotent.
What To Watch Out For
Don't Mix Conversation Namespaces
A namespace is the seam at which extraction fires. If you mix unrelated topics in one namespace, the extractor produces a muddled set of atoms. Use namespaces generously — "daily-checkin", "goal-setting", "weekly-review" — not a single "chat" for everything.
Don't Write Atoms Directly When Extraction Will Handle It
If a fact comes up naturally in conversation, the extractor will pick it up. Writing it directly in addition produces a duplicate that the reconciler may or may not merge. Reserve direct writes for migrations and explicit "remember this" flows.
Don't Conflate Visibility Levels
"internal" is for messages the LLM sees but the user UI doesn't (system prompts, framework injections). "hidden" hides from both. Defaulting to "internal" for system messages is correct; setting things to "hidden" is rare.
Mind The Token Budget
Working context trims recent turns first, then recall hits. If your tokenBudget is too tight you'll lose recent conversation context — usually the most useful part. Err on the high side.
Bumping extractionVersion Doesn't Auto-Replay
Existing atoms stay where they are; only conversations that close after the bump are extracted at the new version. To re-extract historical conversations, call runExtractionForConversation for each one.
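A replay over historical conversations can be as small as the loop below. It is a sketch under assumptions: replayNamespace is a hypothetical helper, it only uses listConversations and runExtractionForConversation as described in this guide, and it assumes listConversations returns everything for the current user in one call. conv and memory stand for the objects returned by context.getConversations() and context.getMemory().

```javascript
// Hypothetical helper: queue re-extraction for every closed conversation
// in one namespace, pinned to a specific prompt version.
function replayNamespace(conv, memory, namespace, extractionVersion) {
  var jobs = [];
  conv.listConversations({}).forEach(function (c) {
    // Open conversations are skipped; they will be extracted on close.
    if (c.namespace !== namespace || c.status !== "closed") return;
    jobs.push(memory.runExtractionForConversation(c.id, {
      extractionVersion: extractionVersion
    }));
  });
  return jobs;
}
```

Since conversations are scoped per user, a full backfill would presumably run this once for each user. Try it on a handful of conversations first and inspect the resulting atoms before replaying everything.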
Bindings Are Cluster-Wide
A binding lives on the platform; every node sees it. If you delete a binding, extraction stops everywhere. Use enabled: false on the binding to pause without losing config.
Reference Cheat Sheet
Quick lookups for the methods and shapes you'll reach for most often.
context.getConversations()
| Method | Returns | Notes |
|---|---|---|
| createConversation({namespace, title?, sessionId?, metadata?}) | conversation | namespace required |
| getConversation(id) | conversation \| null | scope-checked |
| listConversations({}) | conversation | for current user |
| closeConversation(id) | conversation | idempotent |
| appendUserMessage(id, {content, idempotencyKey?, traceId?}) | message | string or content-block list |
| appendAssistantTurn(id, {content, stopReason?, model?, provider?, usage?, traceId?}) | message | |
| appendToolResult(id, {toolUseId, toolName?, content, isError?}) | message | |
| appendSystemMessage(id, {content, visibility?}) | message | defaults to "internal" |
| appendTurn(id, {userContent, userVisibility?, assistant: {content, stopReason?, model?, provider?, usage?}, idempotencyKey?}) | {turnId, userMessage, assistantMessage} | one call, shared turn |
| getMessages(id, {limit?, includeInternal?}) | message | excludes internal by default |
| getRawTurns(id, {limit?}) | message | always includes internal |
| runTurn(id, {provider, model, userText? \| content?, tools?, toolHandler?, options?, maxCycles?, tailLimit?, idempotencyKey?}) | {status, appendedMessages, aggregateUsage, toolInvocations, totalLatencyMs} | full LLM turn |
| refreshRollingSummary(id) | {outcome, rollingSummary, rollingSummaryUntilSeq} | manual |
context.getMemory()
| Method | Returns | Notes |
|---|---|---|
| createMemorySpace({name?, metadata?}) | space | name is yours |
| getMemorySpace(id) | space \| null | |
| listMemorySpaces({}) | space | |
| deleteMemorySpace(id) | void | soft delete |
| addAtom(spaceId, {text, category: {name, kind}, importance?, confidence?, validFrom?, sourceConversationId?, sourceMessageIds?}) | atom | direct write |
| getAtom(id) | atom \| null | |
| supersedeAtom(oldId, {…AddAtom}) | atom | closes old, inserts new |
| archiveAtom(id) | void | excluded from default recall |
| listAtoms(spaceId, {category?, status?, validAt?, limit?}) | atom | |
| createEntity(spaceId, {type, canonicalName, description?, attributes?}) | entity | type ∈ PERSON / ORGANIZATION / PLACE / OBJECT / CONCEPT / EVENT / AGREEMENT |
| getEntity(id) | entity \| null | |
| findEntityByName(spaceId, {type, name}) | entity \| null | exact / alias match |
| mergeEntities(canonicalId, aliasId) | entity | re-points atom refs |
| searchEntities(spaceId, {type?, query, limit?}) | entity | |
| recallByTopic(spaceId, {query, limit?, categoryNames?, minImportance?, validAt?}) | recall result | semantic |
| recallByEntity(spaceId, {entityIdOrName, limit?, categoryNames?}) | recall result | newest-first |
| recallTimeline(spaceId, {entityIdOrName? \| query?, from?, to?, limit?, includeSuperseded?}) | recall result | bi-temporal |
| createBinding({conversationScope, memorySpaceIds, extractionPolicy}) | binding | enables extraction |
| listBindings({}) | binding | |
| updateBinding(id, {extractionPolicy?, enabled?}) | binding | scope is immutable |
| deleteBinding(id) | void | |
| runExtractionForConversation(convId, {extractionVersion?}) | job | manual replay |
context.getWorkingContext()
| Method | Returns | Notes |
|---|---|---|
| buildWorkingContext(convId, {memorySpaceId, recallQuery?, recentTurns?, recallLimit?, tokenBudget?, includeRollingSummary?, alwaysOnCategoryNames?, templateId?}) | {contextBlock, messages, atomsUsed, tokensEstimated} | all knobs documented in JSDoc |
context.getAIAdmin()
| Method | Returns | Notes |
|---|---|---|
| exportUserData({userId?}) | {conversations, spaces, atoms, entities, bindings} | GDPR right-to-export |
| deleteUserData({userId?, confirm: true}) | {deleted: {…counts}} | GDPR right-to-erasure |
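An erasure request is often handled as "export first, then delete". The sketch below shows that ordering with the two admin calls. `stubAdminContext` is a hypothetical in-memory stand-in, `exportThenErase` is an illustrative helper name, the stub's refusal to delete without `confirm: true` is an assumption about platform behavior, and calls are treated as synchronous.

```javascript
// Hypothetical stand-in for the platform's context object.
const stubAdminContext = {
  getAIAdmin: () => ({
    exportUserData: ({ userId }) => ({
      conversations: [],
      spaces: [],
      atoms: [],
      entities: [],
      bindings: [],
    }),
    deleteUserData: ({ userId, confirm }) => {
      // Assumption: a delete without confirm: true is rejected.
      if (confirm !== true) throw new Error("deleteUserData requires confirm: true");
      return { deleted: {} }; // per-type counts omitted in this stub
    },
  }),
};

// Take an export before erasing, so the archive can be handed to the user.
function exportThenErase(context, userId) {
  const admin = context.getAIAdmin();
  const archive = admin.exportUserData({ userId });
  const receipt = admin.deleteUserData({ userId, confirm: true });
  return { archive, receipt };
}

const gdprResult = exportThenErase(stubAdminContext, "user-42");
console.log(Object.keys(gdprResult.archive).join(", "));
```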
Common Shapes
Content block (used in content fields):
{ type: "text", text: "..." }
{ type: "image", url: "...", mimeType: "image/png" }
{ type: "tool_use", id: "tu_1", name: "lookup", arguments: { ... } }
{ type: "tool_result", toolUseId: "tu_1", content: "...", isError: false }
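Because `content` can be either a plain string or a list of these blocks, code that displays or logs messages usually normalizes it first. The helper below is a minimal sketch of that normalization. `contentToText` is an illustrative name, not a platform function.

```javascript
// Flatten a content value (string or content-block list) to plain text.
// Non-text blocks (image, tool_use, tool_result) are skipped.
function contentToText(content) {
  if (typeof content === "string") return content;
  return content
    .filter((block) => block.type === "text")
    .map((block) => block.text)
    .join("\n");
}

const mixedContent = [
  { type: "text", text: "Here is the chart you asked for." },
  { type: "image", url: "https://example.com/chart.png", mimeType: "image/png" },
  { type: "text", text: "Let me know if you need another view." },
];

const plainText = contentToText(mixedContent);
console.log(plainText);
```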
Atom:
{
  id, memorySpaceId, tenantId, appId, userId,
  text,
  category: { name, kind },
  importance: 1..5,
  confidence: 0.0..1.0,
  validFrom, validTo, // ISO instants; validTo null = current
  status: "ACTIVE" | "ARCHIVED" | "DELETED",
  sourceConversationId,
  sourceMessageIds: [ ... ],
  entityIds: [ ... ],
  createdAt, updatedAt
}
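The `validFrom` / `validTo` pair and `status` together determine whether an atom is in effect at a given moment. The sketch below shows one way to express that check on the client side. `isCurrent` is an illustrative helper, and treating `validTo` as an exclusive upper bound is an assumption, not documented platform behavior.

```javascript
// True when the atom is ACTIVE and its validity interval contains `at`.
// Assumption: validTo is exclusive; validTo null means "still current".
function isCurrent(atom, at = new Date()) {
  if (atom.status !== "ACTIVE") return false;
  const from = atom.validFrom ? new Date(atom.validFrom) : null;
  const to = atom.validTo ? new Date(atom.validTo) : null;
  return (from === null || from <= at) && (to === null || at < to);
}

const sampleAtoms = [
  { id: "a1", status: "ACTIVE", validFrom: "2024-01-01T00:00:00Z", validTo: null },
  { id: "a2", status: "ACTIVE", validFrom: "2023-01-01T00:00:00Z", validTo: "2024-01-01T00:00:00Z" },
  { id: "a3", status: "ARCHIVED", validFrom: "2024-02-01T00:00:00Z", validTo: null },
];

const asOf = new Date("2024-06-01T00:00:00Z");
const currentIds = sampleAtoms.filter((atom) => isCurrent(atom, asOf)).map((atom) => atom.id);
console.log(currentIds);
```

Passing an earlier `asOf` answers "what was believed back then", which is the same question the `validAt` option on `listAtoms` and `recallByTopic` is there to answer on the server side.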
Entity:
{
  id, memorySpaceId, tenantId, appId, userId,
  type: "PERSON" | "ORGANIZATION" | ...,
  canonicalName, description,
  aliases: [ ... ],
  status: "ACTIVE" | "MERGED",
  firstSeenAt, lastSeenAt, atomCount,
  createdAt, updatedAt
}
Binding:
{
  id, tenantId,
  conversationScope: { namespace, userId }, // or array of these
  memorySpaceIds: [ ... ],
  extractionPolicy: {
    extractionVersion: "v1",
    onConversationClosed: true,
    windowed: false,
    windowTurns: 0
  },
  enabled: true,
  createdAt, updatedAt
}
Recall hit:
{
  atom: { …atom shape… },
  score: 0.0..1.0, // post-fusion ranking
  decayWeight: 0.0..1.0, // multiplier applied
  entityMatchBonus: 1.0..2.0
}
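If you assemble a prompt by hand instead of using `buildWorkingContext`, recall hits typically need to be filtered and rendered as text. The sketch below assumes a recall result can be treated as an array of hits in the shape above. `formatHits` and the 0.5 score cutoff are illustrative choices, not platform defaults.

```javascript
// Keep hits at or above minScore, order best-first, and render one line each.
function formatHits(hits, minScore = 0.5) {
  return hits
    .filter((hit) => hit.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .map((hit) => `- ${hit.atom.text} (importance ${hit.atom.importance})`);
}

const sampleHits = [
  { atom: { text: "User is vegetarian", importance: 4 }, score: 0.82, decayWeight: 0.9, entityMatchBonus: 1.0 },
  { atom: { text: "User once mentioned sushi", importance: 2 }, score: 0.31, decayWeight: 0.4, entityMatchBonus: 1.0 },
  { atom: { text: "User's deadline is Friday", importance: 5 }, score: 0.67, decayWeight: 0.8, entityMatchBonus: 1.2 },
];

const promptLines = formatHits(sampleHits);
console.log(promptLines.join("\n"));
```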
Variable Binding In Agent JSON
When configuring tools or agents that need per-user values sourced from AIStep inputs:
{
  "memorySpaceId": "$userMemorySpaceId"
}
The $paramName prefix tells the platform to read the value from the AIStep's input map at runtime. Drop the $ for literal values that are the same for every user. Then invoke with the variable bound:
context.getAIFunctions().invokeAgent("agentName", {
  arguments: {
    userMemorySpaceId: someResolvedSpaceId
    // ... other agent inputs
  }
});
Next Steps
- AI Agents — build and invoke the agents that use this memory
- Rules and Actions — schedule a job that closes stale conversations to trigger extraction
- Security — understand the tenant + user isolation that backs every memory call