Lagos MUG - Memory is the substrate

Aletheia
May 16, 2026

Transcript

  1. [ lagos.mug / 2026-05 ] // agentic-memory.mongo Memory is the substrate.

    Building an agentic memory system on MongoDB for a legal-domain knowledge base.
    Luca Bianchi · CTO @ Mesa Group · AWS Serverless Hero · Cursor Ambassador
    bianchiluca.com · theagenticstack.dev
  2. [ who is talking? ] 02 / 11 lidia / memory @ lagos.mug

    Luca Bianchi, PhD.
    Chief Technology & Innovation Officer @ MESA
    Chief Technology Officer @ LIDIA
    Chief Technology Officer @ Overnet
    AWS Hero, Cursor Ambassador, MongoDB Content Creator
    MMUG and OpenClaw Milano Meetup organizer
    LinkedIn: https://www.linkedin.com/in/lucabianchipavia/
    www: www.bianchiluca.com
  3. [ problem ] A legal associate who forgets every morning.

    // agents are stateless. memory is what you fake.
    An agent without memory is a brilliant trainee on day one, every day. In a legal workflow, that is unacceptable: cases run for months, client preferences matter, and the same article gets cited a hundred times. Memory is not a feature. It is the substrate over which reasoning happens.
    lost.context: Client asked the same question last week. Agent has no clue.
    lost.precedent: Cited case in turn 3. Forgets it in turn 12. Cites the wrong one.
    lost.profile: Knows nothing about the tenant, the jurisdiction, or the matter.
  4. [ taxonomy ] Memory is not one thing.

    // four canonical layers, borrowed from cognitive science.
    working: Single-thread state. Resumable. Short-lived. (RAM analogue)
    episodic: Past events with time. Per-entity. Recall by similarity + range. (Personal history)
    semantic: Facts, rules, profiles. Stable. Hybrid retrieval. (Knowledge)
    procedural: How to do X. Routing, tool specs, prompts. (Skills)
    + storage memory: raw documents, the substrate under everything else. Often treated as plumbing. It isn't.
  5. [ architecture / what we built ] Five tiers: what a legal agent actually needs.

    // scaling a legal knowledge base means designing the memory, not the model.
    01 Data acquisition: Crawlers, ingestion pipelines, normalisation. The faucet.
    02 Storage memory: Raw documents and chunks. Source of truth. Re-embeddable.
    03 Semantic memory: Legal facts, jurisprudence, normative graph. Stable, hybrid-searchable.
    04 Episodic memory: Past interactions, reasoning steps, what the agent did and why.
    05 Persistent memory: Per-case, per-user, per-tenant state across sessions.
  6. [ mapping / one platform ] Five tiers, one operational store.

    // every tier becomes a collection (or an index) in the same cluster.
    tier | MongoDB primitive | what it does
    Data acquisition | change streams + Atlas Stream Processing | oplog-tailing triggers + managed pipeline for AI enrichment
    Storage memory | collections + GridFS (> 16 MB) | raw text + provenance; re-embeddable when the model changes
    Semantic memory | $vectorSearch + $search + $rankFusion | ANN + Lucene full-text, RRF fusion server-side (8.0+)
    Episodic memory | $vectorSearch + filter + TTL index | pre-filtered ANN; documents auto-expire via expireAfterSeconds
    Persistent memory | namespaced collection + compound index | per-tenant cross-session state, fetched by (tenant, case_id)
    aside · working memory lives in app state. LangGraph users get it via MongoDBSaver, in the same cluster.
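The persistent-memory row of the mapping can be sketched as an index key document plus a lookup filter. The field names `tenant` and `case_id` come from the slide; the collection name `persistent_memory`, the `updatedAt` field and both helper names are hypothetical. The code only builds plain objects, so it runs without a cluster.

```javascript
// Sketch only: builds the arguments a driver call would take; nothing here talks to MongoDB.
// `persistent_memory` and `updatedAt` are hypothetical names, not taken from the talk.
const PERSISTENT_COLLECTION = "persistent_memory";

// Compound index keys: the two equality fields first, then the sort field.
function persistentIndexKeys() {
  return { tenant: 1, case_id: 1, updatedAt: -1 };
}

// Filter and options for "fetch the latest state of this case for this tenant".
function persistentLookup(tenant, caseId) {
  if (!tenant || !caseId) {
    throw new Error("tenant and caseId are both required");
  }
  return {
    filter: { tenant, case_id: caseId },
    options: { sort: { updatedAt: -1 }, limit: 1 },
  };
}

const lookup = persistentLookup("studio-rossi", "case-42");
console.log(PERSISTENT_COLLECTION, persistentIndexKeys(), lookup);
```

With a driver, the key document would go to `createIndex` and the filter and options to `find`. Requiring both fields in the helper is a design choice that keeps a query from accidentally spanning tenants.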
  7. [ retrieval ] Pure vector search fails in law.

    // embeddings are great at meaning. Terrible at identifiers.
    Where dense vectors miss
    miss.01 art. 1218 c.c.: Embedding clusters all civil code articles together.
    miss.02 Cass. 12345/2024: Case number has no semantic neighbours.
    miss.03 "Rossi v. Comune di Milano": Party names are lexical, not semantic.
    Solution: hybrid via $rankFusion
    $vectorSearch: semantic similarity (meaning, paraphrase, concept)
    $search (BM25): lexical (article numbers, case refs, party names)
    ↓ $rankFusion / RRF · MongoDB 8.0+
    one aggregation pipeline. no score normalisation. no app glue.
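To make "no score normalisation" concrete, here is a minimal sketch of reciprocal rank fusion in plain JavaScript. It illustrates the RRF idea and is not MongoDB's server-side implementation. The document ids are invented, and the constant `k = 60` is the commonly cited default, assumed here.

```javascript
// Reciprocal rank fusion over named ranked lists of document ids.
// score(d) = sum over lists of weight / (k + rank), with rank starting at 1.
// Only ranks are used, so raw cosine and BM25 scores never need a common scale.
function reciprocalRankFusion(rankedLists, weights = {}, k = 60) {
  const scores = new Map();
  for (const [name, ids] of Object.entries(rankedLists)) {
    const w = weights[name] ?? 1; // lists without an explicit weight count once
    ids.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + w / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Hypothetical result lists: the article-number hit is only second semantically,
// but first lexically, so fusion lifts it to the top.
const fused = reciprocalRankFusion({
  semantic: ["doc-contract-liability", "doc-art-1218", "doc-tort"],
  lexical: ["doc-art-1218", "doc-cass-12345"],
});
console.log(fused.map((r) => r.id));
```

A document that appears in both lists collects two reciprocal-rank terms, which is why an identifier match can outrank the best purely semantic hit.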
  8. [ scale / forget ] Pre-filter, then ANN, then forget.

    // the three tricks that keep a legal knowledge base from rotting.
    01 / filtered ANN: Pre-filter on tenant + jurisdiction
      $vectorSearch · filter: { ... }
      Filter pushed into the HNSW traversal, not applied after. Predictable recall, smaller candidate set, smaller context window.
    02 / TTL: Decay working and episodic
      createIndex(...) · expireAfterSeconds
      Working: days. Episodic: months. Semantic: indefinite. Forgetting is a feature: MongoDB does it in the index, not in app code.
    03 / consolidation: Episodic → semantic
      aggregation pipeline · $group / $merge
      Summarise clusters of episodic events into reusable semantic memories. Expire the originals. Compress on read, decay on time.
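The three tricks above can be sketched as plain stage and index documents. Everything named here is an assumption made for illustration: the index name `episodic_vec`, the fields `embedding`, `topic` and `createdAt`, the target collection `semantic_memory`, the retention periods and the over-fetch factor. The consolidation step only counts and groups events; the actual summarisation is left out.

```javascript
// Sketch only: builds documents a driver would accept; no connection is made.
const DAY = 24 * 60 * 60; // seconds

// 01 / filtered ANN: the predicate is part of the $vectorSearch stage itself.
// Filtered fields must be declared as filter fields in the vector index definition.
function filteredVectorSearch(queryVector, tenant, jurisdiction, limit = 10) {
  return {
    $vectorSearch: {
      index: "episodic_vec",     // hypothetical index name
      path: "embedding",         // hypothetical vector field
      queryVector,
      numCandidates: limit * 20, // over-fetch factor is a guess; tune it
      limit,
      filter: {
        $and: [{ tenant: { $eq: tenant } }, { jurisdiction: { $eq: jurisdiction } }],
      },
    },
  };
}

// 02 / TTL: one index spec per decaying tier; semantic memory gets no TTL index.
function ttlIndexSpecs(workingDays = 7, episodicDays = 180) {
  return {
    working: { keys: { createdAt: 1 }, options: { expireAfterSeconds: workingDays * DAY } },
    episodic: { keys: { createdAt: 1 }, options: { expireAfterSeconds: episodicDays * DAY } },
  };
}

// 03 / consolidation: group episodic events, keep dense groups, upsert into semantic memory.
function consolidationPipeline(minEvents = 5) {
  return [
    {
      $group: {
        _id: { tenant: "$tenant", case_id: "$case_id", topic: "$topic" },
        eventCount: { $sum: 1 },
        lastSeen: { $max: "$createdAt" },
        eventIds: { $push: "$_id" },
      },
    },
    { $match: { eventCount: { $gte: minEvents } } },
    { $merge: { into: "semantic_memory", on: "_id", whenMatched: "replace", whenNotMatched: "insert" } },
  ];
}

console.log(JSON.stringify(filteredVectorSearch([0.1, 0.2], "studio-rossi", "IT"), null, 2));
console.log(ttlIndexSpecs(), consolidationPipeline());
```

Each `keys`/`options` pair maps onto a `createIndex(keys, options)` call, and the pipeline would run as an aggregation on the episodic collection.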
  9. [ speed / honest ] What MongoDB actually buys you.

    // latency budget is dominated by the LLM. The win is operational.
    vendor benchmark: < 50 ms query latency at 15.3M vectors · 90–95% recall · voyage-3-large · scalar/binary quantized
    source: MongoDB Atlas Vector Search benchmark, 2025-26
    Co-located writes: Memory writes and business writes share a session. No 2PC across heterogeneous systems.
    Smaller context window: Filtered + hybrid retrieval → fewer tokens → lower LLM cost and lower wall-clock per turn.
    Less to operate: One cluster to monitor, secure, back up. Not four. This compounds as the team scales.
  10. [ takeaway ] // thank you, lagos. Design the memory layers explicitly, or your context window will design them for you.

    what to do monday
    1. map your agent's memory to the four layers, and be honest about which you've actually built.
    2. add hybrid search before you add more embeddings.
    3. put a TTL on everything you're not sure you should keep.
    Luca Bianchi · bianchiluca.com · theagenticstack.dev
  11. [ reference / primitives ] The MongoDB features in this talk.

    // the call shape for each primitive. bookmark this slide.
    $vectorSearch + filter: ANN over an HNSW index with a predicate pushed into the graph traversal.
      { $vectorSearch: { index, path, queryVector, numCandidates, limit, filter } }
    $rankFusion: Server-side RRF across N input pipelines. Hybrid search in one round-trip.
      { $rankFusion: { input: { pipelines: { semantic, lexical } }, combination: { weights } } }
    TTL index: Date-field index with expireAfterSeconds. Background sweep ~60s. Declarative forgetting.
      createIndex({ createdAt: 1 }, { expireAfterSeconds, partialFilterExpression })
    $merge: Terminal aggregation stage with upsert semantics. How consolidation writes back.
      { $merge: { into, on, whenMatched, whenNotMatched } }
    change streams: Real-time event feed off the oplog. Fan-out for embedding + downstream enrichment.
      collection.watch([ { $match: ... } ])
    Voyage AI embeddings: Embeddings generated automatically on write, with no separate embedding service.
      configure in vector index definition · public preview (May 2026)
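The `$rankFusion` call shape above can be expanded into a full hybrid pipeline. This is a sketch under stated assumptions: the index names `semantic_vec` and `semantic_text`, the fields `embedding` and `text`, the weights and the over-fetch factor are all hypothetical. It builds the pipeline as a plain array, so it runs without a cluster.

```javascript
// Sketch only: returns an aggregation pipeline document; it does not execute it.
function hybridPipeline(queryText, queryVector, limit = 10) {
  return [
    {
      $rankFusion: {
        input: {
          pipelines: {
            // Dense retrieval: meaning, paraphrase, concept.
            semantic: [
              {
                $vectorSearch: {
                  index: "semantic_vec",     // hypothetical vector index
                  path: "embedding",         // hypothetical vector field
                  queryVector,
                  numCandidates: limit * 20, // over-fetch factor is a guess
                  limit,
                },
              },
            ],
            // Lexical retrieval: article numbers, case refs, party names.
            lexical: [
              { $search: { index: "semantic_text", text: { query: queryText, path: "text" } } },
              { $limit: limit },
            ],
          },
        },
        // Illustrative weights only: favour exact identifiers over paraphrase.
        combination: { weights: { semantic: 1, lexical: 2 } },
      },
    },
    { $limit: limit },
  ];
}

const pipeline = hybridPipeline("art. 1218 c.c.", [0.12, -0.03, 0.4], 5);
console.log(JSON.stringify(pipeline, null, 2));
```

The lexical branch carries its own `$limit` because `$search` has no built-in result cap, whereas `$vectorSearch` takes `limit` as a stage field.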
  12. [ reference / boundaries ] Editions, versions, and the gaps.

    // where MongoDB runs these features, and what it still leaves to you.
    supported on
    Vector search ($vectorSearch, $rankFusion) | Atlas only · M10+ tier · 8.0+ for $rankFusion
    TTL index | Community · Enterprise · Atlas · since 2.2
    $merge | Community · Enterprise · Atlas · since 4.2
    change streams | any replica set or sharded cluster · since 3.6
    Atlas Stream Processing | Atlas only · separate add-on
    Voyage AI Automated Embeddings | Atlas only · public preview (May 2026)
    what MongoDB does not give you
    Embedding generation: BYO model or use Voyage integration. MongoDB stores and indexes embeddings; it does not produce them outside the Voyage path.
    Consolidation logic: Pattern-density thresholds, summarisation rules are application code. Schedule via Atlas Triggers or your own cron.
    Per-tenant physical isolation: Filtered ANN is logical isolation. Hard isolation (regulated workloads) means separate collections, separate clusters.
    Score-level hybrid tuning: $rankFusion does RRF. Weighted score-normalisation hybrid (BM25 cosine calibration) is still yours.
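For the score-level gap, one possible application-side approach is min-max normalisation followed by a weighted sum. This is a swapped-in illustration, not the speaker's method and not a MongoDB feature. The hit shape `{ id, score }`, the `alpha` weight and the sample scores are all assumptions.

```javascript
// Min-max normalise raw scores to [0, 1]; if all scores are equal, every hit maps to 1.
function minMax(hits) {
  const scores = hits.map((h) => h.score);
  const lo = Math.min(...scores);
  const hi = Math.max(...scores);
  return hits.map((h) => ({
    id: h.id,
    score: hi === lo ? 1 : (h.score - lo) / (hi - lo),
  }));
}

// Weighted sum of normalised scores: alpha for semantic, (1 - alpha) for lexical.
// A document missing from one list contributes 0 from that list.
function weightedHybrid(semanticHits, lexicalHits, alpha = 0.5) {
  const combined = new Map();
  for (const h of minMax(semanticHits)) {
    combined.set(h.id, alpha * h.score);
  }
  for (const h of minMax(lexicalHits)) {
    combined.set(h.id, (combined.get(h.id) ?? 0) + (1 - alpha) * h.score);
  }
  return [...combined.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, score]) => ({ id, score }));
}

// Hypothetical raw scores: cosine similarity on one side, BM25 on the other.
const ranked = weightedHybrid(
  [{ id: "a", score: 0.9 }, { id: "b", score: 0.7 }, { id: "c", score: 0.5 }],
  [{ id: "b", score: 12.0 }, { id: "d", score: 4.0 }],
);
console.log(ranked);
```

Unlike RRF, this keeps score magnitudes in play, so a single outlier score can dominate a list. That is the calibration work the slide says remains with the application.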