HippocampAI — Enterprise Memory Engine for Intelligent AI Systems


HippocampAI is a production-ready, enterprise-grade memory engine that transforms how AI systems remember, reason, and learn from interactions. It provides persistent, intelligent memory capabilities that enable AI agents to maintain context across sessions, understand user preferences, detect behavioral patterns, and deliver truly personalized experiences.

The name "HippocampAI" draws inspiration from the hippocampus, the brain region responsible for memory formation and retrieval, reflecting our mission to give AI systems human-like memory capabilities.

Current Release: v0.5.1 — Bug-fix and API expansion release: batch endpoints, deduplication endpoint, single-memory GET, Prometheus scrape in main app, remote backend URL fixes, QueryRouter stem matching, Groq retry tuning, and SQL packaging fix.


Package Structure

HippocampAI is organized into two main components for flexibility:

| Package | Description | Use Case |
|---|---|---|
| hippocampai.core | Core memory engine (no SaaS dependencies) | Library integration, embedded use |
| hippocampai.platform | SaaS platform (API, auth, Celery, monitoring) | Self-hosted SaaS deployment |

# Core library only (lightweight)
from hippocampai.core import MemoryClient, Memory, Config

# SaaS platform features
from hippocampai.platform import run_api_server, AutomationController

# Or use the main package (includes everything, backward compatible)
from hippocampai import MemoryClient

Quick Start

Docker Compose (full stack, recommended)

git clone https://github.com/rexdivakar/HippocampAI.git
cd HippocampAI

# Copy and edit environment file
cp .env.example .env
# Required: set GROQ_API_KEY (or switch LLM_PROVIDER=ollama and point LLM_BASE_URL to your instance)
# Required for production: set USER_AUTH_ENABLED=true and JWT_SECRET

docker compose up -d

Services started:

| Service | URL | Purpose |
|---|---|---|
| API | http://localhost:8000 | FastAPI + Socket.IO |
| Frontend | http://localhost:81 | React dashboard |
| Flower | http://localhost:5555 | Celery task monitor |
| Prometheus | http://localhost:9090 | Metrics collection |
| Grafana | http://localhost:3002 | Dashboards |
| Qdrant | http://localhost:6333 | Vector store |

Verify the API is up:

curl http://localhost:8000/healthz
# {"status":"ok"}

Installation

# Core library (lightweight - 10 dependencies)
pip install hippocampai

# With SaaS features (API, auth, background tasks)
pip install "hippocampai[saas]"

# With specific LLM providers
pip install "hippocampai[openai]"     # OpenAI support
pip install "hippocampai[anthropic]"  # Anthropic Claude
pip install "hippocampai[groq]"       # Groq support

# Everything (development, all features)
pip install "hippocampai[all,dev]"

Your First Memory (30 seconds)

from hippocampai import MemoryClient

# Initialize client
client = MemoryClient()

# Store a memory
memory = client.remember(
    "I prefer oat milk in my coffee and work remotely on Tuesdays",
    user_id="alice",
    type="preference"
)

# Recall memories
results = client.recall("work preferences", user_id="alice")
print(f"Found: {results[0].memory.text}")

That's it! You now have intelligent memory for your AI application.


Key Features

| Feature | Description | Learn More |
|---|---|---|
| Intelligent Memory | Hybrid search, importance scoring, semantic clustering | Features Guide |
| High Performance | 50-100x faster with Redis caching, 500-1000+ RPS | Performance |
| Advanced Search | Vector + BM25 + reranking, temporal queries | Search Guide |
| Analytics | Pattern detection, habit tracking, behavioral insights | Analytics |
| AI Integration | Works with OpenAI, Anthropic, Groq, Ollama, local models | Providers |
| Session Management | Conversation tracking, summaries, hierarchical sessions | Sessions |
| SaaS Platform | Multi-tenant auth, rate limiting, background tasks | SaaS Guide |
| Memory Quality | Health monitoring, duplicate detection, quality tracking | Memory Management |
| Background Tasks | Celery-powered async operations, scheduled jobs | Celery Guide |
| Memory Consolidation ⭐ NEW | Sleep-phase architecture with intelligent compaction | Sleep Phase |
| Multi-Agent Collaboration ⭐ NEW | Shared memory spaces for agent coordination | Collaboration |
| React Dashboard ⭐ NEW | Full-featured UI with analytics and visualization | Frontend |
| Predictive Analytics ⭐ NEW | Memory usage predictions and pattern forecasting | New Features |
| Auto-Healing ⭐ NEW | Automatic detection and repair of memory issues | New Features |
| Knowledge Graph NEW | Real-time entity/relationship extraction on every remember() | Features |
| Graph-Aware Retrieval NEW | 3-way RRF fusion: vector + BM25 + graph | Features |
| Relevance Feedback NEW | User feedback loop with exponential decay scoring | Features |
| Memory Triggers NEW | Event-driven webhook, websocket, and log actions | Features |
| Procedural Memory NEW | Self-optimizing prompts via learned behavioral rules | Features |
| Embedding Migration NEW | Safe model migration with Celery background processing | Features |
| Plugin System | Custom processors, scorers, retrievers, filters | New Features |
| Memory Namespaces | Hierarchical organization with permissions | New Features |
| Export/Import | Portable formats (JSON, Parquet, CSV) for backup | New Features |
| Offline Mode | Queue operations when backend unavailable | New Features |
| Tiered Storage | Hot/warm/cold storage tiers for efficiency | New Features |
| Framework Integrations | LangChain & LlamaIndex adapters | New Features |
| Bi-Temporal Facts | Track facts with validity periods and time-travel queries | Bi-Temporal Guide |
| Context Assembly | Automated context-pack generation with token budgeting | Context Assembly |
| Custom Schemas | Define entity/relationship types without code changes | Schema Guide |
| Benchmarks | Reproducible performance benchmarks | Benchmarks |
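
The "3-way RRF fusion" listed under Graph-Aware Retrieval refers to Reciprocal Rank Fusion over the vector, BM25, and graph result lists. Here is a generic sketch of the technique; the constant k=60 and the toy memory IDs are textbook defaults and illustrative values, not necessarily what HippocampAI uses internally:

```python
# Generic 3-way Reciprocal Rank Fusion: each ranked list contributes
# 1 / (k + rank) per document; documents ranked highly in several lists win.
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector = ["m1", "m2", "m3"]  # ranked vector-search hits (toy IDs)
bm25 = ["m2", "m1", "m4"]    # ranked BM25 hits
graph = ["m2", "m3", "m1"]   # ranked graph-neighborhood hits
fused = rrf([vector, bm25, graph])  # "m2" ranks first: top of two lists
```

The appeal of RRF is that it needs no score normalization across the three retrievers, only their rank orders.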

Why Choose HippocampAI?

vs. Traditional Vector Databases

  • Built-in Intelligence: Pattern detection, insights, behavioral analysis
  • Memory Types: Facts, preferences, goals, habits, events (not just vectors)
  • Temporal Reasoning: Native time-based queries and narratives

vs. Other Memory Platforms

  • 5-100x Faster: Redis caching, optimized retrieval
  • Deployment Flexibility: Local, self-hosted, or SaaS
  • Full Control: Complete source access and customization

vs. Building In-House

  • Ready in Minutes: pip install hippocampai
  • 102+ Methods: Complete API covering all use cases
  • Production-Tested: Battle-tested in real applications

See detailed comparison →


Documentation

Complete documentation is available in the docs/ directory.

Quick Links

| What do you want to do? | Go here |
|---|---|
| Get started in 5 minutes | Getting Started Guide \| Quickstart |
| Try interactive demo | Chat Demo Guide |
| See all 102+ functions | API Reference \| Library Reference |
| Deploy as SaaS platform | SaaS Platform Guide ⭐ NEW |
| Monitor memory quality | Memory Management ⭐ NEW |
| Set up background tasks | Celery Guide ⭐ NEW |
| Deploy to production | User Guide \| Deployment |
| Configure settings | Configuration Guide \| Providers |
| Monitor & observe | Monitoring \| Telemetry |
| Troubleshoot issues | Troubleshooting |
| Use new features | New Features Guide ⭐ NEW |
| View all documentation | Documentation Hub |

Documentation Index

Complete Documentation Index - Browse all 26 documentation files organized by topic

Core Documentation:

Advanced Topics:

Production & Operations:


Configuration

Key environment variables

All variables are read from .env (or the shell environment). The complete list is in .env.example.

Infrastructure

| Variable | Default | Description |
|---|---|---|
| QDRANT_URL | http://localhost:6333 | Qdrant endpoint. Use http://qdrant:6333 inside Docker Compose. |
| REDIS_URL | redis://localhost:6379 | Redis for caching and Celery broker |
| POSTGRES_HOST | localhost | PostgreSQL host (used for auth) |

LLM

| Variable | Default | Description |
|---|---|---|
| LLM_PROVIDER | ollama | ollama, openai, anthropic, or groq |
| LLM_MODEL | qwen2.5:7b-instruct | Model name for the selected provider |
| LLM_BASE_URL | http://localhost:11434 | Base URL for Ollama |
| ALLOW_CLOUD | false | Must be true when using cloud providers |
| GROQ_API_KEY | (unset) | Required when LLM_PROVIDER=groq |

Retrieval and scoring

| Variable | Default | Description |
|---|---|---|
| EMBED_MODEL | BAAI/bge-small-en-v1.5 | HuggingFace embedding model |
| EMBED_DIMENSION | 384 | Must match the chosen embedding model |
| TOP_K_QDRANT | 200 | Candidates pulled from vector search |
| TOP_K_FINAL | 20 | Results returned after reranking |
| WEIGHT_SIM | 0.55 | Vector similarity weight in fusion |
| WEIGHT_RERANK | 0.20 | Cross-encoder reranker weight |
| WEIGHT_RECENCY | 0.15 | Recency decay weight |
| WEIGHT_IMPORTANCE | 0.10 | Importance score weight |
| WEIGHT_GRAPH | 0.15 | Knowledge graph score weight (when enabled) |
| WEIGHT_FEEDBACK | 0.10 | Relevance feedback weight |
| DEDUP_THRESHOLD | 0.88 | Cosine similarity threshold for deduplication |
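
A minimal sketch of how these weights could combine into one fused score. A plain weighted sum is an assumption on our part; HippocampAI's actual fusion and normalization may differ, and WEIGHT_GRAPH / WEIGHT_FEEDBACK would only join the sum when those features are enabled:

```python
# Hypothetical weighted-sum fusion using the documented defaults above.
# Signal names and the formula itself are illustrative assumptions.
WEIGHTS = {
    "sim": 0.55,         # WEIGHT_SIM: vector similarity
    "rerank": 0.20,      # WEIGHT_RERANK: cross-encoder score
    "recency": 0.15,     # WEIGHT_RECENCY: recency decay
    "importance": 0.10,  # WEIGHT_IMPORTANCE: importance score
}

def fuse(signals: dict[str, float]) -> float:
    """Weighted sum of per-signal scores, each assumed normalized to [0, 1]."""
    return sum(w * signals.get(name, 0.0) for name, w in WEIGHTS.items())

score = fuse({"sim": 0.9, "rerank": 0.8, "recency": 0.5, "importance": 1.0})
# 0.55*0.9 + 0.20*0.8 + 0.15*0.5 + 0.10*1.0 ≈ 0.83
```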

Feature flags

| Variable | Default | Description |
|---|---|---|
| USER_AUTH_ENABLED | false | Enable JWT auth on all /v1/* endpoints. Set true in production. |
| ENABLE_REALTIME_GRAPH | true | Build knowledge graph on every remember() |
| GRAPH_EXTRACTION_MODE | pattern | pattern (fast, regex-based) or llm (accurate, slower) |
| ENABLE_GRAPH_RETRIEVAL | true | Include graph scores in retrieval fusion |
| ENABLE_TRIGGERS | true | Enable event-driven webhook/websocket/log triggers |
| AUTO_DEDUP_ENABLED | true | Background deduplication every 24 h |
| AUTO_CONSOLIDATION_ENABLED | false | Nightly sleep-phase memory consolidation |
| ENABLE_PROCEDURAL_MEMORY | false | Procedural memory and prompt self-optimization (beta) |
| ENABLE_PROSPECTIVE_MEMORY | false | Time/event-triggered intent system (beta) |
| HIPPOCAMPAI_ENABLE_TMS | false | Truth maintenance system: retraction and contradiction detection (beta) |
| IMPORTANCE_DECAY_ENABLED | true | Apply exponential decay to importance scores |
| AUTO_PRUNING_ENABLED | false | Automatically prune low-quality memories |

Half-lives (days)

| Variable | Default | Memory type |
|---|---|---|
| HALF_LIFE_PREFS | 90 | preferences, goals, habits |
| HALF_LIFE_FACTS | 30 | facts, context |
| HALF_LIFE_EVENTS | 14 | events |
| HALF_LIFE_PROCEDURAL | 180 | procedural rules |
| HALF_LIFE_PROSPECTIVE | 30 | prospective intents |
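
Assuming standard exponential half-life decay (the exact formula HippocampAI applies is not shown here), an importance score would halve once per elapsed half-life:

```python
# Exponential half-life decay sketch; the formula is an assumption based on
# the IMPORTANCE_DECAY_ENABLED flag and the half-life defaults above.
HALF_LIFE_DAYS = {
    "preference": 90,   # HALF_LIFE_PREFS (also goals, habits)
    "fact": 30,         # HALF_LIFE_FACTS (also context)
    "event": 14,        # HALF_LIFE_EVENTS
    "procedural": 180,  # HALF_LIFE_PROCEDURAL
    "prospective": 30,  # HALF_LIFE_PROSPECTIVE
}

def decayed_importance(importance: float, age_days: float, memory_type: str) -> float:
    """Halve the importance once per elapsed half-life."""
    return importance * 0.5 ** (age_days / HALF_LIFE_DAYS[memory_type])

# A 14-day-old event retains exactly half its original importance:
decayed_importance(1.0, 14, "event")  # 0.5
```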

Local Development (library mode)

# .env file
QDRANT_URL=http://localhost:6333      # local Qdrant; use http://qdrant:6333 inside Docker
LLM_PROVIDER=ollama
LLM_MODEL=qwen2.5:7b-instruct

# Library usage — no running server required
from hippocampai import MemoryClient

client = MemoryClient()  # reads .env automatically

memory = client.remember("I prefer dark mode", user_id="alice", type="preference")
results = client.recall("UI settings", user_id="alice", k=3)

Remote mode (library pointing at running API)

from hippocampai.backends.remote import RemoteBackend
from hippocampai import MemoryClient

client = MemoryClient(backend=RemoteBackend(
    api_url="http://localhost:8000",
    api_key="your-api-key",   # omit if USER_AUTH_ENABLED=false
    timeout=90,
))

memory = client.remember("Alice joined in January", user_id="alice")
results = client.recall("when did Alice join", user_id="alice", k=5)

Cloud/Production

from hippocampai import MemoryClient
from hippocampai.adapters import GroqLLM

client = MemoryClient(
    llm_provider=GroqLLM(api_key="your-key"),
    qdrant_url="https://your-qdrant-cluster.com",
    redis_url="redis://your-redis:6379"
)

See all configuration options →


REST API Reference

All endpoints are served by the FastAPI application on port 8000. Interactive docs are available at http://localhost:8000/docs.

Core memory operations

| Method | Path | Description |
|---|---|---|
| GET | /healthz | Health check; returns {"status":"ok"} |
| GET | /metrics | Prometheus text-format scrape endpoint (501 if prometheus-client is not installed) |
| POST | /v1/memories:remember | Store one memory; auto-classifies type if not provided |
| POST | /v1/memories:recall | Retrieve memories by semantic query |
| POST | /v1/memories:extract | Extract memories from raw conversation text |
| PATCH | /v1/memories:update | Update text, importance, tags, or TTL of an existing memory |
| DELETE | /v1/memories:delete | Delete one memory by ID |
| POST | /v1/memories:get | List memories for a user with optional filters |
| POST | /v1/memories:expire | Delete all expired memories for a user |
| GET | /v1/memories/{memory_id} | Fetch a single memory by ID; 404 if not found |
| POST | /v1/classify | Classify text into a memory type using the agentic LLM classifier |

Batch operations (new in 0.5.1)

| Method | Path | Description |
|---|---|---|
| POST | /v1/memories/batch | Store N memories; partial failures are logged, not raised |
| POST | /v1/memories/batch/get | Fetch N memories by ID list; silently skips not-found IDs |
| POST | /v1/memories/batch/delete | Delete N memories; returns {"deleted": N, "failed": M} |
| POST | /v1/memories/deduplicate | Find/remove duplicates for a user; dry_run=true by default |
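
For intuition, the deduplicate endpoint's similarity test can be pictured as a cosine comparison against DEDUP_THRESHOLD (0.88 by default). This standalone sketch uses toy 2-d vectors rather than real 384-d embeddings, and the endpoint's actual logic may differ:

```python
# Cosine-similarity duplicate check mirroring the documented
# DEDUP_THRESHOLD=0.88 default; vectors here are illustrative toys.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def is_duplicate(a: list[float], b: list[float], threshold: float = 0.88) -> bool:
    """Two memories count as duplicates when their embeddings are near-parallel."""
    return cosine(a, b) >= threshold

is_duplicate([1.0, 0.0], [1.0, 0.0])  # True: identical vectors
is_duplicate([1.0, 0.0], [0.0, 1.0])  # False: orthogonal vectors
```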

Other route groups

| Prefix | Description |
|---|---|
| /v1/sessions/ | Conversation session CRUD and summaries |
| /v1/feedback/ | Memory relevance feedback submission and statistics |
| /v1/triggers/ | Event-driven trigger CRUD and fire history |
| /v1/procedural/ | Procedural memory rule CRUD, extraction, injection |
| /v1/prospective/ | Prospective intent CRUD, parse, evaluate, expire |
| /v1/graph/ | Knowledge graph query and sync |
| /v1/consolidation/ | Sleep-phase consolidation status and control |
| /v1/namespaces/ | Memory namespace management |
| /v1/bitemporal/ | Bi-temporal fact storage and time-travel queries |
| /v1/context/ | Context assembly with token budgeting |
| /v1/migration/ | Embedding model migration management |
| /admin/ | Admin-only operations |

Deployment Options

Local Development

docker run -d -p 6333:6333 qdrant/qdrant
pip install hippocampai

Full Stack (Docker Compose)

git clone https://github.com/rexdivakar/HippocampAI.git
cd HippocampAI
cp .env.example .env
# Edit .env: set GROQ_API_KEY, optionally set USER_AUTH_ENABLED=true
docker compose up -d

Includes:

  • FastAPI server (port 8000)
  • React Dashboard (port 81)
  • Celery workers with Beat scheduler
  • Flower monitoring (port 5555)
  • Prometheus metrics (port 9090)
  • Grafana dashboards (port 3002)

React Dashboard

cd frontend
npm install
npm run dev  # Development server on port 5173

Production deployment guide →


Production Deployment

Blockers — must resolve before exposing to the internet

  1. Authentication is disabled by default (USER_AUTH_ENABLED=false in the docker-compose defaults), leaving all /v1/* endpoints open with no auth check. Set USER_AUTH_ENABLED=true and configure JWT_SECRET in .env.

  2. Default credentials in .env.example — replace before deploying:

    • HIPPOCAMPAI_API_KEY=example_key_do_not_use
    • FLOWER_PASSWORD=changeme_in_production
    • GRAFANA_ADMIN_PASSWORD=changeme_in_production
  3. GROQ_API_KEY in plain-text .env — for production, inject via Docker secrets, AWS Secrets Manager, or HashiCorp Vault. Never commit .env to git. (.env is in .gitignore.)

  4. Single uvicorn worker (--workers 1 in the docker-compose command). Under concurrent LLM-backed requests the single worker's threadpool will saturate. Run multiple workers (--workers 4) or place a process manager (gunicorn) in front.

  5. No TLS — all services communicate over plain HTTP. Add nginx or Caddy as a TLS-terminating reverse proxy in front of port 8000.

Warnings — should resolve before production load

  1. Groq free-tier rate limits — 30 RPM. High-throughput writes to /v1/memories:remember and /v1/memories:extract (which call the LLM for type classification and extraction) will hit this limit and retry. Options: upgrade to Groq Dev Tier, switch to a self-hosted LLM (LLM_PROVIDER=ollama), or pass type explicitly in remember requests to skip LLM classification.

  2. admin_ui/ directory is empty — the hippocampai-admin container (nginx on port 3001) serves 403 because the directory has no files. Either populate it or remove the service from docker-compose.

  3. Frontend Docker healthcheck reports unhealthy — Vite dev server responds on / but the healthcheck hits a different path. Change the healthcheck to curl -f http://localhost:81/ or switch to the production build.

  4. QDRANT_URL in .env — the default is http://localhost:6333 which works for local library use. When running inside Docker compose, the API container needs QDRANT_URL=http://qdrant:6333 (already set in docker-compose.yml via the service DNS name).

  5. Celery worker healthcheck disabled (healthcheck: disable: true in docker-compose). Production should enable Flower-based or custom health monitoring for Celery workers.

  6. AUTO_CONSOLIDATION_ENABLED=false — memory consolidation (sleep-phase replay) is off by default. Set to true to enable nightly automatic consolidation.

Not recommended for production yet (disabled by default, needs end-to-end testing)

  • ENABLE_PROCEDURAL_MEMORY=false — procedural memory and prompt self-optimization
  • ENABLE_PROSPECTIVE_MEMORY=false — time- and event-triggered intent system
  • HIPPOCAMPAI_ENABLE_TMS=false — truth maintenance system (retraction and contradiction detection)

Use Cases

AI Agents & Chatbots

  • Personalized assistants with context across sessions
  • Customer support with interaction history
  • Educational tutoring that adapts to students

Enterprise Applications

  • Knowledge management for teams
  • CRM enhancement with interaction intelligence
  • Compliance monitoring and audit trails

Research & Analytics

  • Behavioral pattern analysis
  • Long-term trend detection
  • User experience personalization

More use cases →


Performance

| Metric | Performance |
|---|---|
| Query Speed | 50-100x faster with caching |
| Throughput | 500-1000+ requests/second |
| Latency | 1-2 ms (cached), 5-15 ms (uncached) |
| Availability | 99.9% uptime |

See benchmarks →


Community & Support


Examples

Code Examples

Over 25 working examples in the examples/ directory:

# Basic operations
python examples/01_basic_usage.py

# Advanced features
python examples/11_intelligence_features_demo.py
python examples/13_temporal_reasoning_demo.py
python examples/14_cross_session_insights_demo.py

# New v0.4.0 features
python examples/20_collaboration_demo.py      # Multi-agent collaboration
python examples/21_predictive_analytics_demo.py  # Predictive analytics
python examples/22_auto_healing_demo.py       # Auto-healing pipeline
python examples/consolidation_demo.py         # Memory consolidation

View all examples →


Contributing

We welcome contributions! See our Contributing Guide for details.

git clone https://github.com/rexdivakar/HippocampAI.git
cd HippocampAI
pip install -e ".[dev]"
pytest

License

Apache 2.0 - Use freely in commercial and open-source projects.


Star History

If you find HippocampAI useful, please star the repo! It helps others discover the project.


Built with ❤️ by the HippocampAI team


Quick Reference Card

from hippocampai import MemoryClient

client = MemoryClient()

# Core operations
memory = client.remember("text", user_id="alice")
results = client.recall("query", user_id="alice", k=5)
client.update_memory(memory_id, text="new text")
client.delete_memory(memory_id)

# Intelligence
facts = client.extract_facts("John works at Google")
entities = client.extract_entities("Elon Musk founded SpaceX")
patterns = client.detect_patterns(user_id="alice")

# Analytics
habits = client.detect_habits(user_id="alice")
changes = client.track_behavior_changes(user_id="alice")
stats = client.get_memory_statistics(user_id="alice")

# Sessions
session = client.create_session(user_id="alice", title="Planning")
client.complete_session(session.id, generate_summary=True)

# Bi-Temporal Facts (NEW)
from datetime import datetime
from hippocampai.models.bitemporal import BiTemporalQuery
fact = client.store_bitemporal_fact(
    user_id="alice",
    subject="alice",
    predicate="works_at",
    object_value="Acme Corp",
    valid_from=datetime(2024, 1, 1),
)
facts = client.query_bitemporal_facts(BiTemporalQuery(
    user_id="alice",
    valid_at=datetime(2024, 6, 1),
))

# Context Assembly (NEW)
from hippocampai.context.models import ContextConstraints
context = client.assemble_context(
    user_id="alice",
    query="What are Alice's work preferences?",
    constraints=ContextConstraints(token_budget=4000),
)
print(context.final_context_text)

# Custom Schema Validation (NEW)
from hippocampai.schema import SchemaRegistry
registry = SchemaRegistry()
result = registry.validate_entity("person", {"name": "Alice"})

# Relevance Feedback (NEW v0.5.0)
client.rate_recall(
    memory_id=results[0].memory.id,
    user_id="alice",
    query="coffee preferences",
    feedback_type="relevant"
)

# See docs/LIBRARY_COMPLETE_REFERENCE.md for the full method reference

Full API Reference | REST API Reference

Release History

| Version | Changes | Urgency | Date |
|---|---|---|---|
| v0.5.0 | Six new intelligence features, a documentation overhaul, and a full React frontend. Highlights: real-time incremental knowledge graph (entity/fact/relationship extraction on every remember() call via a NetworkX in-memory graph with JSON persistence) and graph-aware retrieval (3-way Reciprocal Rank Fusion with the new GRAPH_HYBRID search mode). | Low | 2/11/2026 |
