Ingest is the write path. You send conversation messages; the server runs LLM-based extraction to pull out facts (and, when relevant, artifacts and episodes), embeds each one, and stores them in your org’s vector index.

The mental model

Ingest is asynchronous by default. Extraction is LLM-bound (typically 3–10 seconds), so the API returns a job immediately and does the work in the background. Your code either polls the job or opts into sync mode.
┌──────────┐                              ┌───────────────┐
│  Client  │  POST /v1/memories  ──────►  │   Memory API  │
│          │  ◄────  IngestJob (pending)  │  (returns 1s) │
└──────────┘                              └───────────────┘
                                                  │
                                                  │  extraction (3–10s)
                                                  ▼
                                          status: succeeded
                                          result.memories_created: [...]

Required fields

Every ingest needs:
  • messages — array of { role, content }. Empty array → 400.
  • user_id — keys the per-user session namespace
  • conv_id — anchors every extracted memory to a conversation (for replay, export, bulk retract)
Optional: agent_id, app_id, metadata (arbitrary key/value, becomes filterable on search), extract_artifacts: true (opts into the more expensive artifact-extraction stage).
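The field list above can be written down as a type plus a small pre-flight check. This is a minimal sketch: the IngestRequest interface and validateIngestRequest helper are hypothetical names inferred from the prose, not the SDK's published types, and the server remains the source of truth for validation.

```typescript
// Hypothetical request shape inferred from the field list above.
interface Message {
  role: string;
  content: string;
}

interface IngestRequest {
  messages: Message[];
  user_id: string;
  conv_id: string;
  agent_id?: string;
  app_id?: string;
  metadata?: Record<string, unknown>;
  extract_artifacts?: boolean;
}

// Returns a list of problems; an empty list means the required fields are present.
function validateIngestRequest(req: Partial<IngestRequest>): string[] {
  const problems: string[] = [];
  if (!Array.isArray(req.messages) || req.messages.length === 0) {
    // The server answers an empty messages array with a 400.
    problems.push('messages must be a non-empty array');
  }
  if (!req.user_id) problems.push('user_id is required');
  if (!req.conv_id) problems.push('conv_id is required');
  return problems;
}
```

Running a check like this before calling ingest lets you catch an empty messages array locally instead of paying for a round trip that ends in a 400.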

Async ingest (default)

const job = await client.memories.ingest({
  messages: [
    { role: 'user', content: 'My favorite food is pad see ew.' },
    { role: 'assistant', content: 'Noted — Thai food.' },
  ],
  user_id: 'alice',
  conv_id: 'conv_2026_05_16',
});

// pollUntilDone handles exponential backoff (500ms → 5s) and timeout.
const done = await client.memories.jobs.pollUntilDone(job.id);

if (done.status === 'failed') {
  throw new Error(`Ingest failed: ${done.error?.message}`);
}

console.log('Created', done.result?.memories_created.length, 'memories');
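If you want to see roughly what a polling helper does, or need your own loop, the sketch below shows one way to back off from 500 ms to a 5 s cap. It is not the SDK's implementation: the doubling factor, the attempt limit, the simplified Job shape, and the injected getJob fetcher are all assumptions made for illustration.

```typescript
// Status names come from this page; the Job shape is a simplified assumption.
type JobStatus = 'pending' | 'running' | 'succeeded' | 'failed';

interface Job {
  id: string;
  status: JobStatus;
}

// Delay schedule: start at initialMs, multiply by factor, cap at maxMs.
// The factor of 2 is an assumption; the docs only state the 500ms → 5s range.
function backoffDelays(attempts: number, initialMs = 500, maxMs = 5000, factor = 2): number[] {
  const delays: number[] = [];
  let delay = initialMs;
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(delay, maxMs));
    delay = Math.min(delay * factor, maxMs);
  }
  return delays;
}

// Polls until the job is terminal or the attempt budget runs out.
// getJob is a hypothetical fetcher you supply (for example a wrapper over a job lookup call).
async function pollJob(
  getJob: (id: string) => Promise<Job>,
  id: string,
  maxAttempts = 20,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<Job> {
  for (const delay of backoffDelays(maxAttempts)) {
    const job = await getJob(id);
    if (job.status === 'succeeded' || job.status === 'failed') return job;
    await sleep(delay);
  }
  throw new Error(`Job ${id} did not finish after ${maxAttempts} polls`);
}

// backoffDelays(6) → [500, 1000, 2000, 4000, 5000, 5000]
```

Injecting sleep keeps the loop testable without real timers; in production you would leave the default.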

Sync ingest (wait: true)

Useful for demos, one-shot scripts, or any code where you want the result inline:
const job = await client.memories.ingest(
  {
    messages: [{ role: 'user', content: 'I am vegetarian.' }],
    user_id: 'alice',
    conv_id: 'conv_2026_05_16',
  },
  { wait: true },
);

if (job.status === 'succeeded') {
  console.log('Inline result:', job.result?.memories_created);
} else if (job.status === 'failed') {
  console.error('Extraction failed:', job.error);
} else {
  // Sync budget elapsed (30s) — fell back to async; poll job.id as above.
  console.log('Polling required:', job.id);
}
The server holds the connection for up to 30 seconds. If extraction finishes in that window the response is terminal (succeeded or failed). If the budget elapses, you get a pending/running job back and have to poll — same as async mode.
Use sync mode for interactive demos and CLI tools; use async mode for production agent loops where you want to dispatch ingest and continue working.
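The three outcomes of a sync call (succeeded, failed, or still pending/running after the budget) can be folded into one helper that always hands back a terminal job. The sketch below is illustrative only: IngestJob is a simplified assumed shape, and ingestSync and poll are hypothetical callbacks you would wire to your client.

```typescript
// Simplified job shape; field names follow the examples on this page.
type JobStatus = 'pending' | 'running' | 'succeeded' | 'failed';

interface IngestJob {
  id: string;
  status: JobStatus;
  result?: { memories_created: { id: string; type: string; text: string }[] };
  error?: { code: string; message: string };
}

// A job is terminal once extraction has either succeeded or failed.
function isTerminal(job: IngestJob): boolean {
  return job.status === 'succeeded' || job.status === 'failed';
}

// Try the sync path first; if the sync budget elapsed, fall back to polling.
// ingestSync and poll are hypothetical callbacks supplied by the caller.
async function ingestAndWait(
  ingestSync: () => Promise<IngestJob>,
  poll: (jobId: string) => Promise<IngestJob>,
): Promise<IngestJob> {
  const job = await ingestSync();
  return isTerminal(job) ? job : poll(job.id);
}
```

With a helper like this, callers only branch on succeeded versus failed, and the fallback to polling stays in one place.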

What gets extracted

You pass messages; you don’t pre-decide what’s a fact vs an artifact vs an episode. The server’s extraction pipeline decides:
Type     | Triggered when
Fact     | The default. A semantic claim in a turn (“User likes X”, “User works at Y”).
Artifact | The conversation references a structured object (a doc, code snippet, or summary) that’s worth storing standalone. Requires extract_artifacts: true on ingest.
Episode  | A stretch of turns gets summarized into a session-level memory. Server-driven; no client knob.
The result.memories_created array tells you what landed; each entry is a thin reference ({id, type, text}). For the full row, call client.memories.get(id).
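Because each entry carries its type, you can bucket the references before deciding which ones are worth a full fetch. A minimal sketch follows; the MemoryRef interface mirrors the {id, type, text} shape described above, and the lowercase type strings in the usage comment are an assumption about how the server spells them.

```typescript
// Thin reference as described above: {id, type, text}.
interface MemoryRef {
  id: string;
  type: string;
  text: string;
}

// Groups references by their type field, preserving input order within each group.
function groupByType(refs: MemoryRef[]): Record<string, MemoryRef[]> {
  const groups: Record<string, MemoryRef[]> = {};
  for (const ref of refs) {
    if (!groups[ref.type]) groups[ref.type] = [];
    groups[ref.type].push(ref);
  }
  return groups;
}

// Usage (type strings are assumed, check a real response):
// const byType = groupByType(done.result?.memories_created ?? []);
// console.log(Object.keys(byType));
```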

What’s in metadata

Anything you put in metadata is stored verbatim on every memory extracted from this ingest, and each key becomes an indexed payload field filterable on search:
await client.memories.ingest({
  messages: [/* ... */],
  user_id: 'alice',
  conv_id: 'conv_2026_05_16',
  metadata: {
    project:  'atlas',
    channel:  'support',
    priority: 'high',
  },
});

// Later:
await client.memories.search({
  query: 'thai food',
  filters: { user_id: 'alice', project: 'atlas' },
});
Reserved internal keys (tag1–tag5, kb_type, org_id, etc.) are silently stripped, so metadata stored under those names will not be available as search filters.
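Since reserved keys disappear without an error, a client-side guard can surface the problem early. The sketch below is hypothetical and incomplete by design: it covers only the names mentioned on this page, it assumes the tag keys run tag1 through tag5, and the server's real reserved list (the "etc.") is not known here.

```typescript
// Patterns for reserved keys named on this page. Incomplete: the full list is server-defined.
// The tag1..tag5 range is an assumption about how the tag keys are numbered.
const KNOWN_RESERVED_KEYS: RegExp[] = [/^tag[1-5]$/, /^kb_type$/, /^org_id$/];

// Returns the metadata keys that match a known reserved name and would be dropped.
function findReservedKeys(metadata: Record<string, unknown>): string[] {
  return Object.keys(metadata).filter((key) =>
    KNOWN_RESERVED_KEYS.some((pattern) => pattern.test(key)),
  );
}

// Usage: warn before ingesting.
// const clashes = findReservedKeys({ project: 'atlas', org_id: 'acme' });
// if (clashes.length > 0) console.warn('These metadata keys will be stripped:', clashes);
```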

Failure modes

Extraction can fail for various reasons — upstream LLM hiccup, content that doesn’t yield extractable facts, rate limits. The job lands in status: "failed" with an error.code and error.message. Retry by submitting the same body again; we don’t auto-retry server-side. Common failure codes:
Code                | Meaning
ingest_failed       | Generic extraction error; check error.message
rate_limit_exceeded | Org quota hit; wait and retry
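Because retries are left to the client, a bounded retry wrapper is a natural fit. The sketch below is one possible shape, not a recommended policy: ingestWithRetry and the runIngest callback are hypothetical, and the wait times (1 s base for ingest_failed, 5 s base for rate_limit_exceeded, scaled by attempt number) are assumptions chosen for illustration.

```typescript
interface JobError {
  code: string;
  message: string;
}

// Terminal job as returned after polling or a completed sync call (simplified).
interface FinishedJob {
  status: 'succeeded' | 'failed';
  error?: JobError;
}

// Resubmits the same body until it succeeds or maxAttempts is reached.
// runIngest is a hypothetical callback that submits the body and waits for a terminal job.
async function ingestWithRetry<T extends FinishedJob>(
  runIngest: () => Promise<T>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  if (maxAttempts < 1) throw new Error('maxAttempts must be at least 1');
  let last = await runIngest();
  for (let attempt = 1; attempt < maxAttempts && last.status === 'failed'; attempt++) {
    // Wait longer when the org quota was hit; both base delays are assumed values.
    const baseMs = last.error?.code === 'rate_limit_exceeded' ? 5000 : 1000;
    await sleep(baseMs * attempt);
    last = await runIngest();
  }
  return last; // May still be failed; the caller decides what to do next.
}
```

Keeping the attempt cap small matters here: content that yields no extractable facts will fail the same way on every retry, so the wrapper returns the last failed job rather than looping.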

See also

  • Searching memories — query what you just ingested
  • API Reference → Memories → Ingest — full request/response schemas