agentic-memory

Agentic Memory Research

Research collection on agent memory architectures, persistence patterns, and output quality maintenance for LLM-based agent systems.

Citation

If you reference this repo's summaries/analyses in academic or professional work, please cite:

@misc{lin_agentic_memory_2026,
  author       = {Leonard Lin},
  title        = {agentic-memory: Agentic Memory Research Collection (Summaries and Analyses)},
  year         = {2026},
  howpublished = {GitHub repository},
  url          = {https://github.com/lhl/agentic-memory},
}

Reference Summaries

Document Author Description
jumperz-agent-memory-stack @jumperz 31-piece memory architecture split across 3 phases (Core → Reliability → Intelligence). Complete prompt/spec breakdowns for write pipeline, read pipeline, decay, knowledge graph, episodic memory, trust scoring, echo/fizzle feedback loops. The foundational reference that others build on.
joelhooks-adr-0077-memory-system-next-phase @joelhooks ADR for joelclaw (personal AI Mac Mini). Maps existing production system (~6 days running, Qdrant 1,343 points) against jumperz's 31 pieces. Plans 3 increments: retrieval quality (score decay, query rewriting), storage quality (dedup, nightly maintenance), feedback loop (echo/fizzle). Includes detailed gap analysis.
coolmanns-openclaw-memory-architecture coolmanns 12-layer production memory stack for OpenClaw with 14 agents. SQLite+FTS5 knowledge graph (3,108 facts), llama.cpp GPU embeddings (768d, 7ms), three runtime plugins (continuity, stability, graph-memory). 100% recall on 60-query benchmark. Includes activation/decay system, domain RAG, session boot sequences.
drag88-agent-output-degradation @drag88 (Aswin) "Why Your Agent's Output Gets Worse Over Time": multi-agent convergence problem. 4-tier memory (working → episodic → semantic → procedural). 3-layer enforcement pipeline (YAML regex → Gemini LLM judge → self-learning loop). Core insight: convert expensive runtime LLM checks into free static regex rules over time.
versatly-clawvault Versatly (@drag88) ClawVault npm CLI tool: structured markdown memory vault with observation pipeline, knowledge graph, session lifecycle (wake/sleep/checkpoint), task/project primitives, Obsidian integration, OpenClaw hooks. 449+ tests. v2.6.1.
vstorm-memv vstorm-co memv (PyPI: memvee): Nemori-inspired predict-calibrate extraction + episode segmentation, plus Graphiti-style bi-temporal validity and hybrid retrieval (sqlite-vec + FTS5 + RRF) on SQLite.
supermemory Dhravya Shah / supermemoryai Supermemory memory-as-a-service API: memory versioning (linked-list chains), typed relationships (updates/extends/derives), static/dynamic profile synthesis, time-based forgetting with reason tracking, multi-model embedding storage. Critical caveat: open-source repo is frontend/SDK only; core engine is proprietary backend at api.supermemory.ai.
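
Several entries above combine a vector index with a full-text index and merge the results with Reciprocal Rank Fusion (RRF). The sketch below shows only the fusion step, under stated assumptions: the function name, the memory IDs, and the conventional constant `k = 60` are illustrative and are not taken from memv or any other listed project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into a single ranking.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. k=60 is a commonly used default (assumption).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical hits from a vector search and a keyword (FTS) search.
vector_hits = ["m3", "m1", "m7"]
keyword_hits = ["m1", "m9", "m3"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

RRF uses only rank positions, so the two retrievers do not need comparable score scales. That is one reason it is a common choice for merging BM25-style and embedding-based results.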

Paper Reference Summaries (Academic / Industry)

Document Author Description
hu-evermembench Hu et al. EverMemBench benchmark for >1M-token multi-party, multi-group interleaved conversations; diagnoses multi-hop collapse, temporal/versioning difficulty, and retrieval-bottlenecked "memory awareness".
zhang-live-evo Zhang et al. Live-Evo: online self-evolving agent memory with an experience bank + meta-guideline bank, contrastive "memory-on vs memory-off" feedback, and weight-based reinforcement/forgetting; evaluated on Prophet Arena + deep research (as reported).
shutova-structmemeval Shutova et al. StructMemEval benchmark for whether agents can organize memory into useful structures (trees/ledgers/state tracking), not just retrieve facts; includes hint vs no-hint evaluation to isolate "structure recognition" failures.
yan-gam Yan et al. GAM: just-in-time agent memory via lightweight memos + a universal page-store, plus a deep-research researcher that plans/searches/integrates/reflects over history to compile optimized context at runtime; strong long-context QA gains with higher latency (as reported).
yang-graph-based-agent-memory-taxonomy Yang et al. Graph-based Agent Memory survey: graph-centric taxonomy + lifecycle (extract/store/retrieve/evolve), storage structures (KG/temporal/hyper/hierarchical/hybrid), retrieval operators, evolution/maintenance, and resources/benchmarks; useful shared vocabulary for shisad.
zhang-survey-memory-mechanism Zhang et al. Survey on memory mechanisms for LLM agents: definitions, why memory, design axes (sources/forms/ops), evaluation approaches, and application domains; good baseline checklist alongside newer benchmarks/systems.
hu-memory-age-ai-agents Hu et al. Memory in the Age of AI Agents survey: proposes unified lenses of forms (token/parametric/latent), functions (factual/experiential/working), and dynamics (formation/evolution/retrieval), plus benchmarks/frameworks and trustworthiness frontiers.
li-locomoplus Li et al. LoCoMo-Plus: evaluates beyond-factual "cognitive memory" (latent constraints like state/goals/values) under cue–trigger semantic disconnect, using constraint-consistency + LLM-judge evaluation.
maharana-locomo Maharana et al. LoCoMo dataset + benchmark for very long-term multi-session conversations (300 turns, multimodal) grounded in personas + temporal event graphs; evaluates QA + event summarization + multimodal generation.
wu-longmemeval Wu et al. LongMemEval benchmark + design decomposition (indexing → retrieval → reading) and system optimizations (value granularity, key expansion, time-aware query expansion).
packer-memgpt Packer et al. MemGPT: OS-inspired hierarchical memory + paging between a fixed-context LLM prompt and external stores (recall + archival), with function-call memory ops and event-driven control flow; foundational baseline for external agent memory.
chhikara-mem0 Chhikara et al. Mem0: production-oriented long-term memory pipeline with explicit ops (ADD/UPDATE/DELETE/NOOP) and an optional graph memory variant; reports quality + token/latency tradeoffs on LoCoMo.
liu-simplemem Liu et al. SimpleMem: write-time semantic structured compression + online synthesis + intent-aware retrieval planning (multi-view dense/BM25/symbolic retrieval with union+dedup) to improve LoCoMo/LongMemEval quality while cutting token cost (as reported).
xu-a-mem Xu et al. A-Mem: Zettelkasten-inspired note network with LLM-driven link generation and "memory evolution" (updating older note attributes as new evidence arrives); strong LoCoMo multi-hop/temporal gains with far lower token lengths than full-context (as reported).
salama-meminsight Salama et al. MemInsight: autonomous memory augmentation that mines/annotates attributes (entity-centric + conversation-centric; turn/session granularity) and uses attribute-guided retrieval; large LoCoMo retrieval recall gains vs DPR RAG baseline (as reported).
rasmussen-zep Rasmussen et al. Zep: production memory layer built on Graphiti, a bi-temporal knowledge graph (episodes → entities/facts → communities) with validity intervals and invalidation-based corrections; evaluated on DMR + LongMemEval.
nan-nemori Nan et al. Nemori: cognitively-inspired self-organizing agent memory with semantic episode boundary detection + episodic narratives and a predict-calibrate loop that distills semantic knowledge from prediction gaps; strong LoCoMo + LongMemEvalS results (as reported).
li-memos Li et al. MemOS: OS-like memory control plane with MemCube (payload+metadata), lifecycle/scheduling, governance (ACL/TTL/audit), and multi-substrate memory (plaintext/activation/KV/parameter/LoRA).
yan-memory-r1 Yan et al. Memory-R1: reinforcement-learned memory manager (ADD/UPDATE/DELETE/NOOP) + answer agent with learned memory distillation; data-efficient RL (PPO/GRPO) training with exact-match reward.
jonelagadda-mnemosyne Jonelagadda et al. Mnemosyne: edge-friendly graph memory with substance/redundancy filters, probabilistic recall with decay/refresh, and a fixed-budget "core summary" for persona-level context.
patel-engram Patel et al. ENGRAM: lightweight typed memory (episodic/semantic/procedural) with simple dense retrieval + strict evidence budgets; strong LoCoMo + LongMemEval results with low token usage.
wei-evo-memory Wei et al. Evo-Memory: streaming benchmark + framework for self-evolving memory and experience reuse; introduces ExpRAG and ReMem (Think/Act/Refine) baselines and robustness/efficiency metrics.
cao-remember-me-refine-me Cao et al. ReMe: dynamic procedural memory lifecycle (acquire→reuse→refine) with multi-faceted distillation from success/failure trajectories, scenario-aware retrieval, and utility-based pruning; strong BFCL-V3/AppWorld results (as reported).
sarin-memoria Sarin et al. Memoria: personalization memory layer combining session summaries + KG triplets (persona) with exponential recency weighting; SQLite + ChromaDB architecture and LongMemEvals subset results.
latimer-hindsight Latimer et al. Hindsight: retain/recall/reflect architecture separating evidence vs beliefs vs summaries; temporal+entity memory graph with multi-channel retrieval fusion and belief confidence updates; very strong LongMemEval/LoCoMo results (as reported).
yu-agentic-memory Yu et al. AgeMem: RL-trained unified LTM+STM controller exposing memory ops as tool actions (add/update/delete/retrieve/summarize/filter) with a 3-stage curriculum and step-wise GRPO for credit assignment.
hu-evermemos Hu et al. EverMemOS: self-organizing "memory OS" with MemCells→MemScenes lifecycle, user profile consolidation, and necessity/sufficiency-guided recollection (verifier + query rewrite); strong LoCoMo/LongMemEval results (as reported).
li-timem Li et al. TiMem: temporal-hierarchical memory consolidation (segment→session→day→week→profile) with query-complexity recall planning + gating; strong LoCoMo/LongMemEval-S accuracy with low recalled tokens (as reported).
zhang-himem Zhang et al. HiMem: hierarchical long-term memory split (Episode Memory + Note Memory) with topic+surprise episode segmentation, note-first "best-effort" retrieval w/ sufficiency checks, and conflict-aware reconsolidation; strong LoCoMo results (as reported).
behrouz-nested-learning Behrouz et al. Nested Learning / CMS / Hope: reframes memory as multi-timescale update dynamics (continuum memory blocks updated at different frequencies) with implications for consolidation and "corrections without forgetting".
zhang-recursive-language-models Zhang et al. Recursive Language Models (RLMs): inference-time recursion + REPL state treats long prompts as an external environment; processes multi-million-token inputs with sub-calls and programmatic slicing, often beating long-context scaffolds at comparable average cost (as reported).
wang-m-plus Wang et al. M+: latent-space long-term memory extension to MemoryLLM that stores dropped memory tokens in an LTM pool and retrieves them during generation with a co-trained retriever; extends retention to >160k tokens at similar GPU memory cost (as reported).
dong-minja Dong et al. MINJA: practical memory injection attack on "memory-as-demonstrations" agents via query-only interaction (bridging steps + progressive shortening); motivates write-time gates, isolation, and safer memory representations.
sunil-memory-poisoning-attack-defense Sunil et al. Memory poisoning attack & defense: empirical MINJA follow-up in EHR agents; shows pre-existing benign memory can reduce ASR, and that trust-score defenses can fail via over-conservatism or overconfidence.
anokhin-arigraph Anokhin et al. AriGraph: knowledge-graph world model that links episodic observation nodes to extracted semantic triplets; two-stage retrieval (semantic→episodic) for planning/exploration in text-game environments.
behrouz-titans Behrouz et al. Titans: long-context architecture with an online-updated neural memory module (test-time learning) plus persistent task memory; provides explicit primitives for surprise-based salience and forgetting.
ahn-hema Ahn HEMA: hippocampus-inspired dual memory for long conversations (running compact summary + FAISS episodic vector store) with explicit prompt budgeting, pruning ("semantic forgetting"), and summary-of-summaries consolidation.
tan-membench Tan et al. MemBench: benchmark/dataset for agent memory covering participation vs observation scenarios and factual vs reflective memory, with metrics for accuracy/recall/capacity and read/write-time efficiency.
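
Several systems in this table down-weight older memories at retrieval time (for example, the exponential recency weighting noted for Memoria and the decay/refresh recall noted for Mnemosyne). The following is a minimal sketch of that general idea, not any paper's actual formula: the half-life parameterization, the field names, and the sample values are all assumptions made for illustration.

```python
import math


def recency_weight(age_days, half_life_days=30.0):
    """Exponential decay: the weight halves every `half_life_days` (assumed parameter)."""
    return math.exp(-math.log(2) * age_days / half_life_days)


def rank_by_recency(memories, half_life_days=30.0):
    """Order memory IDs by similarity multiplied by the recency weight."""
    scored = [
        (m["similarity"] * recency_weight(m["age_days"], half_life_days), m["id"])
        for m in memories
    ]
    scored.sort(key=lambda pair: -pair[0])
    return [mem_id for _, mem_id in scored]


# Hypothetical candidates: an old strong match versus newer weaker matches.
candidates = [
    {"id": "a", "similarity": 0.9, "age_days": 90},
    {"id": "b", "similarity": 0.6, "age_days": 0},
    {"id": "c", "similarity": 0.8, "age_days": 30},
]
order = rank_by_recency(candidates)
print(order)
```

With a 30-day half-life, the 90-day-old match keeps only one eighth of its similarity score, so it falls behind both newer items despite being the closest match.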

Deep Dive Analyses

Root-level critical analyses intended for synthesis work. These reference the summaries above, but focus on coherence, evidence quality, risks, and synthesis-ready claim framing.

Synthesis Based on Focus
ANALYSIS ANALYSIS-*.md + shisad docs + Mem0/Letta baselines Cross-system comparison (techniques + memory types), plus mapping to shisad and "traditional" RAG-ish memory
ANALYSIS-academic-industry paper ANALYSIS-arxiv-*.md + shisad plan Academic/industry synthesis: benchmarks vs systems vs attacks, with "what's missing in shisad" framing
Benchmarks best practices Public disputes, audits, our analysis Known pitfalls, metric confusion, dataset quality issues, per-benchmark limitations
MELT benchmark design ANALYSIS.md systems + Reality Check epistemic docs Memory Evaluation for Lifecycle Testing: session-replay benchmark testing full memory lifecycle (decay, consolidation, contradiction, core stability, inference) at 6 scale tiers over simulated time. Separate repo; draft.
Analysis Based on Focus
ANALYSIS-jumperz-agent-memory-stack references/jumperz-agent-memory-stack.md Checklist critique (semantics, failure modes, missing evaluation), synthesis-ready takeaways + claims table
ANALYSIS-joelhooks-adr-0077-memory-system-next-phase references/joelhooks-adr-0077-memory-system-next-phase.md Increment plan critique (decay, rewrite, dedup, echo/fizzle), validation plan + claims
ANALYSIS-coolmanns-openclaw-memory-architecture references/coolmanns-openclaw-memory-architecture.md + vendor/openclaw-memory-architecture/ Layered stack critique with benchmark-method verification, operational risks, doc drift notes
ANALYSIS-drag88-agent-output-degradation references/drag88-agent-output-degradation.md Convergence + enforcement pattern critique (judge→rule distillation), measurement gaps, risks
ANALYSIS-versatly-clawvault references/versatly-clawvault.md + vendor/clawvault/ Product/tooling critique (surface area, hooks, qmd dependency), security posture, missing benchmarks
ANALYSIS-vstorm-memv references/vstorm-memv.md + vendor/memv/ Implementation critique of Nemori-inspired predict-calibrate extraction + bi-temporal validity + hybrid retrieval, with gaps/risks and shisad mapping
ANALYSIS-openviking vendor/openviking/ + Hermes provider docs Open-source context database: viking:// filesystem, L0/L1/L2 tiered loading, session-commit extraction across 8 memory categories, and hierarchical typed retrieval over memory/resources/skills; strong observability with heavier operational complexity
ANALYSIS-byterover-cli vendor/byterover-cli/ + vendor/byterover-cli/paper/ Agent-native coding-agent memory/runtime: daemon + per-project agent pool, markdown context tree with explicit relations and lifecycle, 5-tier progressive retrieval with cache/OOD detection, and strong self-reported benchmarks with caveats
ANALYSIS-mira-OSS vendor/mira-OSS/ Full-stack event-driven agent (v1 rev 2): activity-day sigmoid decay, hub discovery + 3-axis linking (vector+entity+TF-IDF), Text-Based LoRA + user model synthesis with critic validation, background forage agent (sub-agent collaboration), portrait synthesis, 16 tools, context overflow remediation, immutable domain models, multi-user RLS + Vault; gaps in write gating, external benchmarks, taint tracking, and sub-agent capability scoping
ANALYSIS-claude-code-memory Source: /home/lhl/Downloads/claude-code/src Claude Code memory subsystem (Anthropic): first-party production-scale memory system; flat-file MEMORY.md + typed topic files (user/feedback/project/reference) + background extraction via forked agent with mutual exclusion + LLM-based relevance selection (Sonnet) + team memory with OAuth sync + auto dream consolidation + KAIROS daily-log mode + eval-validated prompts with case IDs + security-hardened path validation; no vector search, no graph, no decay scoring
ANALYSIS-codex-memory openai/codex Codex memory subsystem (OpenAI): first-party open-source coding agent; two-phase async pipeline (gpt-5.1-codex-mini extraction → gpt-5.3-codex consolidation) + SQLite-backed job coordination (leases/heartbeats/watermarks) + progressive disclosure layout (memory_summary → MEMORY.md → rollout_summaries → skills) + skills as procedural memory + usage-based citation-driven retention + thread-diff incremental forgetting + ~1,400 lines extraction/consolidation prompts; no vector search, no team memory, no real-time extraction
ANALYSIS-google-always-on-memory-agent vendor/always-on-memory-agent/ Official Google ADK sample: always-on daemon with multimodal ingestion (27 file types via Gemini 3.1 Flash-Lite), periodic LLM consolidation, SQLite storage, HTTP API + Streamlit dashboard; no retrieval/search (recency scan LIMIT 50), no decay/dedup/versioning; useful as ADK orchestration reference and multimodal ingestion pattern
ANALYSIS-supermemory references/supermemory.md + vendor/supermemory/ Memory-as-a-service startup: memory versioning (linked-list chains via parentMemoryId/rootMemoryId/isLatest), typed relationship ontology (updates/extends/derives), static/dynamic profile synthesis API, time-based forgetting with audit trail, multi-model embedding columns, MemoryBench framework; open-source repo is SDK/frontend only; core engine logic is proprietary hosted backend
ANALYSIS-karta vendor/karta/ Karta (rohithzr): Rust (~10.4K LOC) agentic memory library with Zettelkasten-inspired knowledge graph, 7-type dream engine (deduction/induction/abduction/consolidation/contradiction/episode digest/cross-episode digest) with inference feedback into retrieval, embedding-based query classification (6 modes), retroactive context evolution with drift protection, cross-encoder reranking with abstention, multi-hop BFS traversal, atomic fact decomposition with per-fact embeddings, foresight signals with TTL, structured episode digests; BEAM 100K: 57.7% with 243-failure root cause catalog
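
The Supermemory entry above describes memory versioning as linked-list chains via `parentMemoryId`, `rootMemoryId`, and `isLatest`. Since the core engine is noted as proprietary, the sketch below is only a guess at how such a chain could behave. Only the three field names come from the entry; the `VersionedStore` class, its methods, and the ID format are hypothetical. The camelCase attributes mirror the field names rather than Python convention.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryVersion:
    id: str
    content: str
    parentMemoryId: Optional[str]  # previous version, None for the first one
    rootMemoryId: str              # first version in the chain
    isLatest: bool = True


class VersionedStore:
    """Hypothetical in-memory store: updates append a version, nothing is deleted."""

    def __init__(self):
        self._by_id = {}
        self._ids = itertools.count(1)

    def add(self, content):
        mem_id = f"mem-{next(self._ids)}"
        self._by_id[mem_id] = MemoryVersion(mem_id, content, None, mem_id)
        return mem_id

    def update(self, mem_id, content):
        old = self._by_id[mem_id]
        if not old.isLatest:
            raise ValueError("only the latest version can be updated")
        old.isLatest = False  # the superseded version stays available for audit
        new_id = f"mem-{next(self._ids)}"
        self._by_id[new_id] = MemoryVersion(new_id, content, old.id, old.rootMemoryId)
        return new_id

    def get(self, mem_id):
        return self._by_id[mem_id]

    def history(self, mem_id):
        """Walk parent links from the given version back to the root."""
        chain = []
        current = self._by_id.get(mem_id)
        while current is not None:
            chain.append(current.content)
            current = self._by_id.get(current.parentMemoryId)
        return chain


store = VersionedStore()
v1 = store.add("Lives in Tokyo")
v2 = store.update(v1, "Lives in Osaka")
print(store.history(v2))
```

Keeping superseded versions and flipping `isLatest` lets retrieval filter to current facts while the full chain remains available for corrections and audits.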

Paper Deep Dive Analyses (Academic / Industry)

Analysis Based on Focus
ANALYSIS-arxiv-2602.01313-evermembench references/hu-evermembench.md + references/papers/arxiv-2602.01313.pdf Benchmark critique emphasizing version semantics, multi-party fragmentation, oracle diagnostics, and shisad mapping
ANALYSIS-arxiv-2602.02369-live-evo references/zhang-live-evo.md + references/papers/arxiv-2602.02369.pdf System deep dive emphasizing online experience weighting from continuous feedback, meta-guidelines for memory compilation, and memory-on vs memory-off utility measurement; shisad mapping for feedback loops + procedural memory gating
ANALYSIS-arxiv-2602.11243-structmemeval references/shutova-structmemeval.md + references/papers/arxiv-2602.11243.pdf Benchmark deep dive emphasizing memory organization/structure as a distinct capability (trees/ledgers/state), hint vs no-hint diagnostics, and implications for shisad structured-memory primitives
ANALYSIS-arxiv-2602.05665-graph-based-agent-memory-taxonomy references/yang-graph-based-agent-memory-taxonomy.md + references/papers/arxiv-2602.05665.pdf Survey deep dive providing graph-based memory taxonomy and lifecycle (extract/store/retrieve/evolve), with implications for shisad graph-as-derived-view, operator choices, and maintenance jobs
ANALYSIS-arxiv-2404.13501-survey-memory-mechanism references/zhang-survey-memory-mechanism.md + references/papers/arxiv-2404.13501.pdf Survey deep dive providing baseline taxonomy and evaluation checklists for agent memory; useful coverage reference alongside newer benchmarks/systems for shisad's roadmap
ANALYSIS-arxiv-2512.13564-memory-age-ai-agents references/hu-memory-age-ai-agents.md + references/papers/arxiv-2512.13564.pdf Survey deep dive emphasizing the Forms–Functions–Dynamics taxonomy and frontiers (RL integration, multimodal, multi-agent shared memory, trustworthiness), used as organizing frame for shisad v0.7 memory roadmap
ANALYSIS-arxiv-2402.17753-locomo references/maharana-locomo.md + references/papers/arxiv-2402.17753.pdf Dataset/benchmark critique with episodic-memory implications (event graphs, multimodal, RAG harm) and shisad mapping
ANALYSIS-arxiv-2410.10813-longmemeval references/wu-longmemeval.md + references/papers/arxiv-2410.10813.pdf Benchmark and system-design decomposition (indexing/retrieval/reading), with mapping to shisad primitives
ANALYSIS-arxiv-2310.08560-memgpt references/packer-memgpt.md + references/papers/arxiv-2310.08560.pdf System deep dive emphasizing virtual context management (OS paging), memory tiers (working/queue/recall/archival), function-call memory ops, and implications for shisad versioned corrections + write-policy hardening
ANALYSIS-arxiv-2602.10715-locomoplus references/li-locomoplus.md + references/papers/arxiv-2602.10715.pdf Beyond-factual "cognitive memory" benchmark critique (latent constraints) and implications for safe constraint/procedural memory
ANALYSIS-arxiv-2504.19413-mem0 references/chhikara-mem0.md + references/papers/arxiv-2504.19413.pdf System deep dive emphasizing explicit memory ops, graph-memory tradeoffs, deployment metrics (tokens/p95), and shisad mapping (versioned corrections vs delete)
ANALYSIS-arxiv-2601.02553-simplemem references/liu-simplemem.md + references/papers/arxiv-2601.02553.pdf System deep dive emphasizing write-time semantic structured compression, online consolidation, and intent-aware multi-view retrieval planning; mapping to shisad "derived vs raw" memory + retrieval budgeting
ANALYSIS-arxiv-2502.12110-a-mem references/xu-a-mem.md + references/papers/arxiv-2502.12110.pdf System deep dive emphasizing Zettelkasten-style notes + LLM-driven linking + memory evolution, with strong multi-hop/temporal LoCoMo gains but high versioning/audit requirements for shisad
ANALYSIS-arxiv-2503.21760-meminsight references/salama-meminsight.md + references/papers/arxiv-2503.21760.pdf System deep dive emphasizing autonomous attribute mining/annotation as a derived metadata layer to improve retrieval recall and downstream tasks; mapping to shisad schema constraints + provenance/versioning
ANALYSIS-arxiv-2511.18423-gam references/yan-gam.md + references/papers/arxiv-2511.18423.pdf System deep dive emphasizing just-in-time context compilation via memo index + universal page-store and an iterative deep-research researcher; highlights the latency/quality trade-off and mapping to shisad evidence-first episodic storage
ANALYSIS-arxiv-2501.13956-zep references/rasmussen-zep.md + references/papers/arxiv-2501.13956.pdf System deep dive emphasizing bi-temporal validity semantics, episodic+semantic+community graph tiers, hybrid retrieval (BM25/embeddings/BFS), and implications for shisad versioned memory
ANALYSIS-arxiv-2507.03724-memos references/li-memos.md + references/papers/arxiv-2507.03724.pdf System deep dive emphasizing MemCube metadata, multi-substrate memory (plaintext/KV/parameter), lifecycle/scheduling/governance, and mapping to shisad primitives
ANALYSIS-arxiv-2508.19828-memory-r1 references/yan-memory-r1.md + references/papers/arxiv-2508.19828.pdf RL deep dive emphasizing learned memory ops (ADD/UPDATE/DELETE/NOOP) + post-retrieval memory distillation, reward design, and what's required to safely adopt this in shisad
ANALYSIS-arxiv-2508.03341-nemori references/nan-nemori.md + references/papers/arxiv-2508.03341.pdf System deep dive emphasizing episode segmentation (Two-Step Alignment) + predict-calibrate semantic distillation, reported LoCoMo/LongMemEvalS gains, and implications for shisad write gating + correction semantics
ANALYSIS-arxiv-2510.08601-mnemosyne references/jonelagadda-mnemosyne.md + references/papers/arxiv-2510.08601.pdf System deep dive emphasizing edge-first graph memory, redundancy/refresh, probabilistic decay-based recall, and a fixed-budget core/persona summary; includes evaluation-rigor cautions
ANALYSIS-arxiv-2511.12960-engram references/patel-engram.md + references/papers/arxiv-2511.12960.pdf System deep dive emphasizing typed memory (episodic/semantic/procedural), deterministic routing/formatting, strict evidence budgets, and strong token/latency results; mapping to shisad primitives
ANALYSIS-arxiv-2511.20857-evo-memory references/wei-evo-memory.md + references/papers/arxiv-2511.20857.pdf Benchmark deep dive emphasizing streaming task-sequence evaluation for experience reuse, plus refine/prune mechanisms and metrics (robustness, step efficiency) for shisad's eval harness
ANALYSIS-arxiv-2512.10696-remember-me-refine-me references/cao-remember-me-refine-me.md + references/papers/arxiv-2512.10696.pdf System deep dive emphasizing procedural memory distillation + scenario-aware reuse + utility-based refinement/pruning; mapping to shisad procedural tier + versioned invalidation vs delete
ANALYSIS-arxiv-2512.12686-memoria references/sarin-memoria.md + references/papers/arxiv-2512.12686.pdf System deep dive emphasizing persona KG + session summaries with recency-weighted retrieval; highlights missing governance/versioning primitives needed for shisad
ANALYSIS-arxiv-2512.12818-hindsight references/latimer-hindsight.md + references/papers/arxiv-2512.12818.pdf System deep dive emphasizing retain/recall/reflect with four-network memory (facts/experiences/observations/beliefs), token-budgeted multi-channel retrieval fusion, and belief confidence updates; key shisad mapping
ANALYSIS-arxiv-2601.01885-agentic-memory references/yu-agentic-memory.md + references/papers/arxiv-2601.01885.pdf RL deep dive emphasizing unified LTM+STM memory ops as tool actions, 3-stage training curriculum, step-wise GRPO credit assignment, and implications for shisad's future learned memory policies
ANALYSIS-arxiv-2601.02163-evermemos references/hu-evermemos.md + references/papers/arxiv-2601.02163.pdf System deep dive emphasizing MemCell→MemScene consolidation lifecycle, user profile/foresight, and sufficiency-verified scene-guided retrieval; mapping to shisad consolidation roadmap
ANALYSIS-arxiv-2601.02845-timem references/li-timem.md + references/papers/arxiv-2601.02845.pdf System deep dive emphasizing temporal-hierarchical consolidation (TMT), query-complexity recall planning/gating, and the accuracy–token frontier; mapping to shisad temporal tiers
ANALYSIS-arxiv-2601.06377-himem references/zhang-himem.md + references/papers/arxiv-2601.06377.pdf System deep dive emphasizing Episode Memory + Note Memory hierarchy, note-first "best-effort" retrieval w/ sufficiency checks, and conflict-aware reconsolidation; mapping to shisad event→knowledge tiers + versioned updates
ANALYSIS-arxiv-2512.24695-nested-learning references/behrouz-nested-learning.md + references/papers/arxiv-2512.24695.pdf Conceptual deep dive on multi-timescale "continuum memory" and consolidation dynamics; mapping to shisad tiered memory + versioned corrections
ANALYSIS-arxiv-2512.24601-recursive-language-models references/zhang-recursive-language-models.md + references/papers/arxiv-2512.24601.pdf Architecture deep dive emphasizing RLM-style programmatic reading/compilation over arbitrarily long evidence stores (REPL + recursion + sub-calls), with implications for shisad sandboxed compilation traces and cost tail management
ANALYSIS-arxiv-2502.00592-m-plus references/wang-m-plus.md + references/papers/arxiv-2502.00592.pdf Architecture deep dive emphasizing latent-space long-term memory tokens + co-trained retrieval for >160k retention, with mapping to shisad's external evidence-first memory and retrieval diagnostics
ANALYSIS-arxiv-2503.03704-minja references/dong-minja.md + references/papers/arxiv-2503.03704.pdf Security deep dive on query-only memory injection attacks; implications for write-policy, provenance/taint, isolation, and "don't store demonstrations" patterns
ANALYSIS-arxiv-2601.05504-memory-poisoning-attack-defense references/sunil-memory-poisoning-attack-defense.md + references/papers/arxiv-2601.05504.pdf Security deep dive emphasizing ISR vs ASR under realistic memory conditions, and why trust-score sanitization can fail; concrete shisad hardening takeaways
ANALYSIS-arxiv-2407.04363-arigraph references/anokhin-arigraph.md + references/papers/arxiv-2407.04363.pdf System deep dive emphasizing episodic↔semantic memory linking, graph-structured retrieval for planning/exploration, and implications for shisad episode objects + provenance + correction semantics
ANALYSIS-arxiv-2501.00663-titans references/behrouz-titans.md + references/papers/arxiv-2501.00663.pdf Architecture deep dive emphasizing test-time-learning neural memory (surprise/momentum/forgetting), Titans MAC/MAG/MAL variants, and how to translate salience/decay ideas into shisad's external memory framework
ANALYSIS-arxiv-2504.16754-hema references/ahn-hema.md + references/papers/arxiv-2504.16754.pdf System deep dive emphasizing dual memory (summary + vector store), explicit prompt budgeting, pruning/consolidation policies, and evaluation-rigor cautions for shisad adoption
ANALYSIS-arxiv-2506.21605-membench references/tan-membench.md + references/papers/arxiv-2506.21605.pdf Benchmark deep dive emphasizing multi-scenario (participant vs observer) and multi-level (factual vs reflective) evaluation, plus latency/capacity metrics and implications for shisad eval harnesses

Source Threads & Links

Source URL
@jumperz memory stack thread https://x.com/jumperz/status/2024841165774717031
@joelhooks ADR tweet https://x.com/joelhooks/status/2024947701738262773
joelclaw ADR-0077 https://joelclaw.com/adrs/0077-memory-system-next-phase
@drag88 article https://x.com/drag88/status/2022551759491862974
supermemory docs https://supermemory.ai/docs
supermemory repo https://github.com/supermemoryai/supermemory
mempalace repo https://github.com/milla-jovovich/mempalace
karta repo https://github.com/rohithzr/karta

File Tree

agentic-memory/
├── README.md                          ← this file
├── ANALYSIS.md                        ← synthesis + comparison
├── ANALYSIS-academic-industry.md      ← academic/industry synthesis
├── ANALYSIS-jumperz-agent-memory-stack.md
├── ANALYSIS-joelhooks-adr-0077-memory-system-next-phase.md
├── ANALYSIS-coolmanns-openclaw-memory-architecture.md
├── ANALYSIS-drag88-agent-output-degradation.md
├── ANALYSIS-versatly-clawvault.md
├── ANALYSIS-vstorm-memv.md
├── ANALYSIS-mira-OSS.md
├── ANALYSIS-codex-memory.md
├── ANALYSIS-google-always-on-memory-agent.md
├── ANALYSIS-supermemory.md
├── ANALYSIS-karta.md                  ← Karta: Rust agentic memory library with dream engine
├── ANALYSIS-mempalace.md              ← not in ANALYSIS.md (claims-vs-code issues); see REVIEWED.md
├── REVIEWED.md                        ← triage log (examined but not promoted to ANALYSIS)
├── PUNCHLIST-academic-industry.md     ← tracking checklist for paper deep dives
├── templates/                         ← templates for paper analyses/summaries
│
├── references/                        ← summarized reference docs (markdown w/ frontmatter)
│   ├── 1-full-agent-memory-build.jpg  ← jumperz card 1: memory storage
│   ├── 2-feeds-into.jpg               ← jumperz card 2: memory intelligence
│   ├── jumperz-agent-memory-stack.md
│   ├── joelhooks-adr-0077-memory-system-next-phase.md
│   ├── coolmanns-openclaw-memory-architecture.md
│   ├── drag88-agent-output-degradation.md
│   ├── versatly-clawvault.md
│   ├── hu-evermembench.md
│   ├── li-locomoplus.md
│   ├── maharana-locomo.md
│   ├── wu-longmemeval.md
│   ├── chhikara-mem0.md
│   └── papers/                        ← archived PDFs + text snapshots
│       ├── README.md
│       ├── arxiv-*.pdf
│       └── arxiv-*.md
│
└── vendor/                            ← cloned source repos
    ├── mira-OSS/                      ← github.com/taylorsatula/mira-OSS (snapshot, AGPLv3)
    │   ├── README.md
    │   ├── CLAUDE.md                  ← project guide (architecture, patterns, principles)
    │   ├── main.py                    ← FastAPI entry point
    │   ├── cns/                       ← Central Nervous System (conversation orchestration)
    │   │   ├── api/                   ← FastAPI endpoints (chat, actions, data, health)
    │   │   ├── core/                  ← Domain models (Continuum, Message, Events)
    │   │   ├── services/              ← Orchestrator, subcortical, summary, collapse handler
    │   │   └── infrastructure/        ← Repositories, Valkey cache, unit of work
    │   ├── lt_memory/                 ← Long-term memory system
    │   │   ├── scoring_formula.sql    ← Multi-factor activity-day sigmoid importance scoring
    │   │   ├── models.py              ← Memory, Entity, ExtractedMemory, link types
    │   │   ├── hybrid_search.py       ← BM25 + pgvector with RRF
    │   │   ├── proactive.py           ← Dual-path retrieval (similarity + hub discovery)
    │   │   ├── hub_discovery.py       ← Entity-driven memory retrieval via pg_trgm
    │   │   └── processing/            ← Extraction, consolidation, entity GC pipelines
    │   ├── working_memory/            ← System prompt composition via trinkets
    │   ├── tools/                     ← Self-registering tool framework (11 built-in)
    │   ├── config/                    ← Pydantic config + prompt templates
    │   └── auth/                      ← WebAuthn + magic link authentication
    │
    ├── openclaw-memory-architecture/  ← github.com/coolmanns/openclaw-memory-architecture
    │   ├── README.md
    │   ├── PROJECT.md
    │   ├── CHANGELOG.md
    │   ├── docs/
    │   │   ├── ARCHITECTURE.md        ← full 12-layer technical reference
    │   │   ├── knowledge-graph.md     ← graph search pipeline, benchmarks
    │   │   ├── context-optimization.md
    │   │   ├── embedding-setup.md
    │   │   ├── benchmark-process.md
    │   │   ├── benchmark-results.md
    │   │   ├── code-search.md
    │   │   └── COMPARISON.md
    │   ├── schema/
    │   │   └── facts.sql              ← SQLite schema for knowledge graph
    │   ├── scripts/                   ← init, seed, search, ingest, decay, benchmark, telemetry
    │   ├── templates/                 ← starter files (active-context, gating-policies, etc.)
    │   └── plugin-graph-memory/       ← OpenClaw plugin (JS)
    │
    ├── karta/                         ← github.com/rohithzr/karta (submodule, MIT)
    │   ├── Cargo.toml                 ← workspace: karta-core + karta-cli
    │   ├── crates/
    │   │   └── karta-core/            ← Core engine (~6.7K LOC Rust)
    │   │       ├── src/
    │   │       │   ├── note.rs        ← MemoryNote, Provenance, NoteStatus, AtomicFact, Episode, EpisodeDigest
    │   │       │   ├── write.rs       ← Write path: index, link, evolve, foresight, facts
    │   │       │   ├── read.rs        ← Read path: classify, search, traverse, rerank, synthesize
    │   │       │   ├── rerank.rs      ← Jina/LLM/noop rerankers
    │   │       │   ├── dream/         ← Dream engine: 7 inference types
    │   │       │   ├── store/         ← LanceDB + SQLite implementations
    │   │       │   └── llm/           ← Provider trait + OpenAI + mock + prompts
    │   │       └── tests/             ← eval, beam_100k, bench_beam (~3.8K LOC)
    │   ├── findings.md                ← BEAM 100K detailed failure analysis
    │   └── plan.md                    ← Experiment plan targeting 90%+
    │
    ├── always-on-memory-agent/        ← GoogleCloudPlatform/generative-ai (official ADK sample)
    │   ├── agent.py                   ← ADK multi-agent daemon (ingest/consolidate/query)
    │   ├── dashboard.py               ← Streamlit UI
    │   └── docs/                      ← Logo/architecture assets
    │
    ├── memv/                          ← github.com/vstorm-co/memv
    │   ├── README.md
    │   ├── CHANGELOG.md
    │   ├── pyproject.toml             ← PyPI: memvee, v0.1.0
    │   ├── docs/                      ← docs site (MkDocs)
    │   ├── src/
    │   │   └── memv/                  ← segmentation, extraction, validity, retrieval, storage
    │   └── tests/
    │
    ├── supermemory/                   ← github.com/supermemoryai/supermemory (lean subset: schemas, SDK, MCP, arch docs)
    │   ├── LICENSE
    │   ├── README.md                  ← provenance + open-source vs hosted-backend split
    │   ├── packages/
    │   │   ├── validation/            ← Zod schemas (data model definitions)
    │   │   │   ├── schemas.ts
    │   │   │   └── api.ts
    │   │   ├── lib/
    │   │   │   ├── api.ts             ← reveals backend dependency (api.supermemory.ai)
    │   │   │   └── similarity.ts      ← client-side cosine sim (visualization only)
    │   │   └── tools/src/shared/
    │   │       └── memory-client.ts   ← SDK client (profile search, prompt formatting)
    │   ├── apps/mcp/src/
    │   │   └── server.ts              ← MCP server (memory/recall/whoAmI tools)
    │   └── skills/supermemory/references/
    │       └── architecture.md        ← claimed design (558 lines)
    │
    └── clawvault/                     ← github.com/Versatly/clawvault
        ├── README.md
        ├── PLAN.md                    ← issue #4: ledger, reflect, replay, archive
        ├── CHANGELOG.md
        ├── SKILL.md
        ├── package.json               ← npm: clawvault, v2.6.1
        ├── src/
        │   ├── commands/              ← archive, context, inject, observe, reflect, replay, wake, sleep, task, project, ...
        │   ├── observer/              ← compressor, reflector, router, session-watcher
        │   ├── lib/                   ← vault, memory-graph, ledger, observation-format, session-utils
        │   └── cli/
        ├── bin/                       ← CLI entry + command registration modules
        ├── hooks/                     ← OpenClaw hook handler
        ├── dashboard/                 ← web dashboard (vault parser, graph diff)
        ├── schemas/
        ├── scripts/
        ├── templates/
        └── tests/

Key Themes Across Sources

  • Phased build order matters: Core memory first (write/read/decay), reliability second (dedup/maintenance/recovery), intelligence last (graphs/trust/cross-agent). Building out of order amplifies flaws.
  • Tiered retrieval: Summary files first (fast, cheap), vector search fallback (thorough, expensive). Don't vector-search everything.
  • Score decay: final_score = relevance × exp(-λ × days), where λ is the decay rate and days is the memory's age. Recency-weighted relevance appears in every architecture reviewed here.
  • Feedback loops: Echo/fizzle (track which injected memories get used), behavior loops (extract corrections as lessons), learning loops (convert expensive LLM checks into cheap static rules).
  • SQLite over hosted vector DBs: At current scales (1K-5K entries), SQLite + FTS5 + local embeddings outperforms hosted solutions on latency, cost, and operational simplicity.
  • Multi-agent convergence: Shared memory creates homogenization pressure. Workspace isolation + file routing guards help but don't fully solve it.
  • Vault index pattern: Single scannable manifest (one-line descriptions) → load individual entries on demand. One file read instead of N.
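
To make the score-decay theme concrete, here is a minimal Python sketch of the formula final_score = relevance × exp(-λ × days). The function names, the candidate tuple layout, and the default decay rate of 0.05 per day are illustrative assumptions, not values taken from any of the reviewed systems.

```python
import math

# Assumption: lambda = 0.05/day is an arbitrary illustrative value,
# not a setting from any of the reviewed systems.
DEFAULT_DECAY_RATE = 0.05


def decayed_score(relevance: float, age_days: float,
                  decay_rate: float = DEFAULT_DECAY_RATE) -> float:
    """final_score = relevance * exp(-lambda * days)."""
    return relevance * math.exp(-decay_rate * age_days)


def rank(candidates, decay_rate: float = DEFAULT_DECAY_RATE):
    """Sort (name, relevance, age_days) tuples by decayed score, best first."""
    return sorted(
        candidates,
        key=lambda c: decayed_score(c[1], c[2], decay_rate),
        reverse=True,
    )


# Hypothetical candidates: (name, raw relevance, age in days).
candidates = [
    ("old-but-relevant", 0.90, 60.0),
    ("fresh", 0.70, 1.0),
    ("stale", 0.80, 30.0),
]
ranked = rank(candidates)
```

With these made-up inputs the fresh memory ranks first even though its raw relevance is lowest, which is the intended effect of recency weighting. Setting decay_rate to ln(2) / h gives a half-life of h days, which may be an easier knob to reason about than λ itself.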
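
The tiered-retrieval theme (summaries first, vector search only as a fallback) can be sketched in a few lines of Python. Everything here is hypothetical: the `tiered_retrieve` name, the naive keyword-overlap check, and the `vector_search` callable stand in for whatever summary matcher and vector store a real system would use.

```python
from typing import Callable


def tiered_retrieve(
    query: str,
    summaries: dict[str, str],
    vector_search: Callable[[str], list[str]],
    min_hits: int = 1,
) -> tuple[str, list[str]]:
    """Try the cheap summary tier first; fall back to vector search.

    Returns (tier_used, matching_entry_names).
    """
    # Naive keyword match (assumption): ignore very short words.
    terms = {t for t in query.lower().split() if len(t) > 2}
    hits = [
        name
        for name, text in summaries.items()
        if terms & set(text.lower().split())
    ]
    if len(hits) >= min_hits:
        return "summary", hits
    # Only pay for the expensive tier when the summaries miss.
    return "vector", vector_search(query)


# Hypothetical data and a stub standing in for a real vector store.
summaries = {
    "deploy.md": "deploy steps for staging server",
    "prefs.md": "user prefers dark mode",
}
vector_calls: list[str] = []


def fake_vector_search(query: str) -> list[str]:
    vector_calls.append(query)
    return ["vec-hit"]


cheap = tiered_retrieve("staging deploy", summaries, fake_vector_search)
fallback = tiered_retrieve("quarterly invoice totals", summaries, fake_vector_search)
```

The first query is answered from the summaries without touching the vector tier; the second misses and triggers exactly one fallback call.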

Release History

Version          Changes                             Urgency  Date
main@2026-05-09  Latest activity on main branch      High     5/9/2026
0.0.0            No release found; using repo HEAD   High     4/11/2026
main@2026-04-11  Latest activity on main branch      High     4/11/2026
main@2026-04-11  Latest activity on main branch      Medium   4/11/2026
