HippocampAI is a production-ready, enterprise-grade memory engine that transforms how AI systems remember, reason, and learn from interactions. It provides persistent, intelligent memory capabilities that enable AI agents to maintain context across sessions, understand user preferences, detect behavioral patterns, and deliver truly personalized experiences.
The name "HippocampAI" draws inspiration from the hippocampus - the brain region responsible for memory formation and retrieval - reflecting our mission to give AI systems human-like memory capabilities.
Current Release: v0.5.1 — Bug-fix and API expansion release: batch endpoints, deduplication endpoint, single-memory GET, Prometheus scrape in main app, remote backend URL fixes, QueryRouter stem matching, Groq retry tuning, and SQL packaging fix.
HippocampAI is organized into two main components for flexibility:
| Package | Description | Use Case |
|---|---|---|
| `hippocampai.core` | Core memory engine (no SaaS dependencies) | Library integration, embedded use |
| `hippocampai.platform` | SaaS platform (API, auth, Celery, monitoring) | Self-hosted SaaS deployment |
```python
# Core library only (lightweight)
from hippocampai.core import MemoryClient, Memory, Config

# SaaS platform features
from hippocampai.platform import run_api_server, AutomationController

# Or use the main package (includes everything, backward compatible)
from hippocampai import MemoryClient
```

```bash
git clone https://github.com/rexdivakar/HippocampAI.git
cd HippocampAI

# Copy and edit environment file
cp .env.example .env
# Required: set GROQ_API_KEY (or switch LLM_PROVIDER=ollama and point LLM_BASE_URL to your instance)
# Required for production: set USER_AUTH_ENABLED=true and JWT_SECRET

docker compose up -d
```

Services started:
| Service | URL | Purpose |
|---|---|---|
| API | http://localhost:8000 | FastAPI + Socket.IO |
| Frontend | http://localhost:81 | React dashboard |
| Flower | http://localhost:5555 | Celery task monitor |
| Prometheus | http://localhost:9090 | Metrics collection |
| Grafana | http://localhost:3002 | Dashboards |
| Qdrant | http://localhost:6333 | Vector store |
Verify the API is up:

```bash
curl http://localhost:8000/healthz
# {"status":"ok"}
```

```bash
# Core library (lightweight - 10 dependencies)
pip install hippocampai

# With SaaS features (API, auth, background tasks)
pip install "hippocampai[saas]"

# With specific LLM providers
pip install "hippocampai[openai]"     # OpenAI support
pip install "hippocampai[anthropic]"  # Anthropic Claude
pip install "hippocampai[groq]"       # Groq support

# Everything (development, all features)
pip install "hippocampai[all,dev]"
```

```python
from hippocampai import MemoryClient

# Initialize client
client = MemoryClient()

# Store a memory
memory = client.remember(
    "I prefer oat milk in my coffee and work remotely on Tuesdays",
    user_id="alice",
    type="preference",
)

# Recall memories
results = client.recall("work preferences", user_id="alice")
print(f"Found: {results[0].memory.text}")
```

That's it! You now have intelligent memory for your AI application.
| Feature | Description | Learn More |
|---|---|---|
| Intelligent Memory | Hybrid search, importance scoring, semantic clustering | Features Guide |
| High Performance | 50-100x faster with Redis caching, 500-1000+ RPS | Performance |
| Advanced Search | Vector + BM25 + reranking, temporal queries | Search Guide |
| Analytics | Pattern detection, habit tracking, behavioral insights | Analytics |
| AI Integration | Works with OpenAI, Anthropic, Groq, Ollama, local models | Providers |
| Session Management | Conversation tracking, summaries, hierarchical sessions | Sessions |
| SaaS Platform | Multi-tenant auth, rate limiting, background tasks | SaaS Guide |
| Memory Quality | Health monitoring, duplicate detection, quality tracking | Memory Management |
| Background Tasks | Celery-powered async operations, scheduled jobs | Celery Guide |
| Memory Consolidation ⭐ NEW | Sleep phase architecture with intelligent compaction | Sleep Phase |
| Multi-Agent Collaboration ⭐ NEW | Shared memory spaces for agent coordination | Collaboration |
| React Dashboard ⭐ NEW | Full-featured UI with analytics and visualization | Frontend |
| Predictive Analytics ⭐ NEW | Memory usage predictions and pattern forecasting | New Features |
| Auto-Healing ⭐ NEW | Automatic detection and repair of memory issues | New Features |
| Knowledge Graph NEW | Real-time entity/relationship extraction on every remember() | Features |
| Graph-Aware Retrieval NEW | 3-way RRF fusion: vector + BM25 + graph | Features |
| Relevance Feedback NEW | User feedback loop with exponential decay scoring | Features |
| Memory Triggers NEW | Event-driven webhooks, websocket, and log actions | Features |
| Procedural Memory NEW | Self-optimizing prompts via learned behavioral rules | Features |
| Embedding Migration NEW | Safe model migration with Celery background processing | Features |
| Plugin System | Custom processors, scorers, retrievers, filters | New Features |
| Memory Namespaces | Hierarchical organization with permissions | New Features |
| Export/Import | Portable formats (JSON, Parquet, CSV) for backup | New Features |
| Offline Mode | Queue operations when backend unavailable | New Features |
| Tiered Storage | Hot/warm/cold storage tiers for efficiency | New Features |
| Framework Integrations | LangChain & LlamaIndex adapters | New Features |
| Bi-Temporal Facts | Track facts with validity periods and time-travel queries | Bi-Temporal Guide |
| Context Assembly | Automated context pack generation with token budgeting | Context Assembly |
| Custom Schemas | Define entity/relationship types without code changes | Schema Guide |
| Benchmarks | Reproducible performance benchmarks | Benchmarks |
- Built-in Intelligence: Pattern detection, insights, behavioral analysis
- Memory Types: Facts, preferences, goals, habits, events (not just vectors)
- Temporal Reasoning: Native time-based queries and narratives
- 50-100x Faster: Redis caching, optimized retrieval
- Deployment Flexibility: Local, self-hosted, or SaaS
- Full Control: Complete source access and customization
- Ready in Minutes: `pip install hippocampai`
- 102+ Methods: Complete API covering all use cases
- Production-Tested: Battle-tested in real applications
Complete documentation is available in the docs/ directory.
| What do you want to do? | Go here |
|---|---|
| Get started in 5 minutes | Getting Started Guide / Quickstart |
| Try interactive demo | Chat Demo Guide |
| See all 102+ functions | API Reference / Library Reference |
| Deploy as SaaS platform | SaaS Platform Guide ⭐ NEW |
| Monitor memory quality | Memory Management ⭐ NEW |
| Set up background tasks | Celery Guide ⭐ NEW |
| Deploy to production | User Guide / Deployment |
| Configure settings | Configuration Guide / Providers |
| Monitor & observe | Monitoring / Telemetry |
| Troubleshoot issues | Troubleshooting |
| Use new features | New Features Guide ⭐ NEW |
| View all documentation | Documentation Hub |
Complete Documentation Index - Browse all 26 documentation files organized by topic
Core Documentation:
- API Reference - All 102+ methods with examples
- Features Overview - Comprehensive feature documentation
- Architecture - System design and components
- Configuration - All configuration options
Advanced Topics:
- New Features - Plugins, namespaces, export/import, offline mode, tiered storage
- Multi-Agent Features - Agent coordination
- Intelligence Features - Fact extraction, entity recognition
- Temporal Analytics - Time-based queries
- Session Management - Conversation tracking
Production & Operations:
- User Guide - Production deployment
- Security - Security best practices
- Monitoring - Observability and metrics
- Backup & Recovery - Data protection
All variables are read from .env (or shell environment). Complete list is in .env.example.
Infrastructure
| Variable | Default | Description |
|---|---|---|
| `QDRANT_URL` | `http://localhost:6333` | Qdrant endpoint. Use `http://qdrant:6333` inside Docker compose. |
| `REDIS_URL` | `redis://localhost:6379` | Redis for caching and Celery broker |
| `POSTGRES_HOST` | `localhost` | PostgreSQL host (used for auth) |
LLM
| Variable | Default | Description |
|---|---|---|
| `LLM_PROVIDER` | `ollama` | `ollama`, `openai`, `anthropic`, or `groq` |
| `LLM_MODEL` | `qwen2.5:7b-instruct` | Model name for the selected provider |
| `LLM_BASE_URL` | `http://localhost:11434` | Base URL for Ollama |
| `ALLOW_CLOUD` | `false` | Must be `true` when using cloud providers |
| `GROQ_API_KEY` | — | Required when `LLM_PROVIDER=groq` |
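The constraints in this table can be collapsed into a small validation rule. The helper below is illustrative only (it is not part of the HippocampAI API); it encodes the documented requirements that cloud providers need `ALLOW_CLOUD=true` and that `GROQ_API_KEY` is mandatory for Groq:

```python
# Illustrative config validator mirroring the LLM table above.
CLOUD_PROVIDERS = {"openai", "anthropic", "groq"}

def validate_llm_config(provider: str, allow_cloud: bool, groq_api_key=None) -> list:
    """Return a list of configuration errors implied by the table above."""
    errors = []
    if provider not in CLOUD_PROVIDERS | {"ollama"}:
        errors.append(f"unknown LLM_PROVIDER: {provider}")
    if provider in CLOUD_PROVIDERS and not allow_cloud:
        errors.append("ALLOW_CLOUD must be true when using cloud providers")
    if provider == "groq" and not groq_api_key:
        errors.append("GROQ_API_KEY is required when LLM_PROVIDER=groq")
    return errors

validate_llm_config("ollama", allow_cloud=False)  # local provider: no errors
validate_llm_config("groq", allow_cloud=True)     # flags the missing GROQ_API_KEY
```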
Retrieval and scoring
| Variable | Default | Description |
|---|---|---|
| `EMBED_MODEL` | `BAAI/bge-small-en-v1.5` | HuggingFace embedding model |
| `EMBED_DIMENSION` | `384` | Must match the chosen embedding model |
| `TOP_K_QDRANT` | `200` | Candidates pulled from vector search |
| `TOP_K_FINAL` | `20` | Results returned after reranking |
| `WEIGHT_SIM` | `0.55` | Vector similarity weight in fusion |
| `WEIGHT_RERANK` | `0.20` | Cross-encoder reranker weight |
| `WEIGHT_RECENCY` | `0.15` | Recency decay weight |
| `WEIGHT_IMPORTANCE` | `0.10` | Importance score weight |
| `WEIGHT_GRAPH` | `0.15` | Knowledge graph score weight (when enabled) |
| `WEIGHT_FEEDBACK` | `0.10` | Relevance feedback weight |
| `DEDUP_THRESHOLD` | `0.88` | Cosine similarity threshold for deduplication |
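The `WEIGHT_*` variables combine the retrieval signals into one fused score per candidate. A minimal sketch of that weighted fusion, assuming each signal is already normalized to [0, 1] (the real pipeline's normalization and RRF details may differ):

```python
# Illustrative weighted fusion using the default weights from the table above.
# Assumption: each signal has been pre-normalized to the [0, 1] range.
WEIGHTS = {
    "sim": 0.55,         # vector similarity
    "rerank": 0.20,      # cross-encoder reranker
    "recency": 0.15,     # recency decay
    "importance": 0.10,  # importance score
}

def fused_score(signals: dict) -> float:
    """Weighted sum of normalized retrieval signals; missing signals count as 0."""
    return sum(WEIGHTS[name] * signals.get(name, 0.0) for name in WEIGHTS)

candidate = {"sim": 0.9, "rerank": 0.8, "recency": 0.5, "importance": 0.3}
score = fused_score(candidate)  # 0.55*0.9 + 0.20*0.8 + 0.15*0.5 + 0.10*0.3
```

Candidates are then sorted by this score and the top `TOP_K_FINAL` are returned.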
Feature flags
| Variable | Default | Description |
|---|---|---|
| `USER_AUTH_ENABLED` | `false` | Enable JWT auth on all `/v1/*` endpoints. Set `true` in production. |
| `ENABLE_REALTIME_GRAPH` | `true` | Build knowledge graph on every `remember()` |
| `GRAPH_EXTRACTION_MODE` | `pattern` | `pattern` (fast, regex-based) or `llm` (accurate, slower) |
| `ENABLE_GRAPH_RETRIEVAL` | `true` | Include graph scores in retrieval fusion |
| `ENABLE_TRIGGERS` | `true` | Enable event-driven webhook/websocket/log triggers |
| `AUTO_DEDUP_ENABLED` | `true` | Background deduplication every 24 h |
| `AUTO_CONSOLIDATION_ENABLED` | `false` | Nightly sleep-phase memory consolidation |
| `ENABLE_PROCEDURAL_MEMORY` | `false` | Procedural memory and prompt self-optimization (beta) |
| `ENABLE_PROSPECTIVE_MEMORY` | `false` | Time/event-triggered intent system (beta) |
| `HIPPOCAMPAI_ENABLE_TMS` | `false` | Truth maintenance system — retraction and contradiction detection (beta) |
| `IMPORTANCE_DECAY_ENABLED` | `true` | Apply exponential decay to importance scores |
| `AUTO_PRUNING_ENABLED` | `false` | Automatically prune low-quality memories |
Half-lives (days)
| Variable | Default | Memory type |
|---|---|---|
| `HALF_LIFE_PREFS` | `90` | preferences, goals, habits |
| `HALF_LIFE_FACTS` | `30` | facts, context |
| `HALF_LIFE_EVENTS` | `14` | events |
| `HALF_LIFE_PROCEDURAL` | `180` | procedural rules |
| `HALF_LIFE_PROSPECTIVE` | `30` | prospective intents |
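A half-life of H days means a memory's effective importance halves every H days. A quick sketch of the standard exponential-decay formula these settings imply (the formula is an assumption; HippocampAI's internal decay may add floors or clamps):

```python
def decayed_importance(importance: float, age_days: float, half_life_days: float) -> float:
    """Exponential decay: effective importance halves every `half_life_days`."""
    return importance * 0.5 ** (age_days / half_life_days)

# A fact (HALF_LIFE_FACTS=30) stored at importance 0.8:
decayed_importance(0.8, 30, 30)  # -> 0.4 after one half-life
decayed_importance(0.8, 90, 30)  # -> 0.1 after three half-lives
```

Longer half-lives (preferences at 90 days, procedural rules at 180) keep stable knowledge relevant while short-lived events fade quickly.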
```bash
# .env file
QDRANT_URL=http://localhost:6333   # local Qdrant; use http://qdrant:6333 inside Docker
LLM_PROVIDER=ollama
LLM_MODEL=qwen2.5:7b-instruct
```

```python
# Library usage — no running server required
from hippocampai import MemoryClient

client = MemoryClient()  # reads .env automatically
memory = client.remember("I prefer dark mode", user_id="alice", type="preference")
results = client.recall("UI settings", user_id="alice", k=3)
```

```python
from hippocampai import MemoryClient
from hippocampai.backends.remote import RemoteBackend

client = MemoryClient(backend=RemoteBackend(
    api_url="http://localhost:8000",
    api_key="your-api-key",  # omit if USER_AUTH_ENABLED=false
    timeout=90,
))
memory = client.remember("Alice joined in January", user_id="alice")
results = client.recall("when did Alice join", user_id="alice", k=5)
```

```python
from hippocampai import MemoryClient
from hippocampai.adapters import GroqLLM

client = MemoryClient(
    llm_provider=GroqLLM(api_key="your-key"),
    qdrant_url="https://your-qdrant-cluster.com",
    redis_url="redis://your-redis:6379",
)
```

See all configuration options →
All endpoints are served by the FastAPI application on port 8000. Interactive docs available at http://localhost:8000/docs.
| Method | Path | Description |
|---|---|---|
| GET | `/healthz` | Health check — returns `{"status":"ok"}` |
| GET | `/metrics` | Prometheus text-format scrape endpoint (501 if `prometheus-client` not installed) |
| POST | `/v1/memories:remember` | Store one memory; auto-classifies type if not provided |
| POST | `/v1/memories:recall` | Retrieve memories by semantic query |
| POST | `/v1/memories:extract` | Extract memories from raw conversation text |
| PATCH | `/v1/memories:update` | Update text, importance, tags, or TTL of an existing memory |
| DELETE | `/v1/memories:delete` | Delete one memory by ID |
| POST | `/v1/memories:get` | List memories for a user with optional filters |
| POST | `/v1/memories:expire` | Delete all expired memories for a user |
| GET | `/v1/memories/{memory_id}` | Fetch a single memory by ID; 404 if not found |
| POST | `/v1/classify` | Classify text into a memory type using the agentic LLM classifier |
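The endpoints above can be called with any HTTP client. A dependency-free sketch using the standard library (the payload field names mirror the library's `remember()`/`recall()` parameters and are assumptions; check the interactive docs at `/docs` for the authoritative request schemas):

```python
import json
import urllib.request

BASE = "http://localhost:8000"

def post(path: str, payload: dict) -> dict:
    """POST a JSON body to the HippocampAI API and decode the JSON response."""
    req = urllib.request.Request(
        f"{BASE}{path}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

remember_body = {"text": "I prefer oat milk", "user_id": "alice", "type": "preference"}
recall_body = {"query": "coffee preferences", "user_id": "alice", "k": 5}

# With the stack running (docker compose up -d):
# post("/v1/memories:remember", remember_body)
# post("/v1/memories:recall", recall_body)
```

When `USER_AUTH_ENABLED=true`, an `Authorization` header with your JWT would also be required.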
| Method | Path | Description |
|---|---|---|
| POST | `/v1/memories/batch` | Store N memories; partial failures are logged, not raised |
| POST | `/v1/memories/batch/get` | Fetch N memories by ID list; silently skips not-found IDs |
| POST | `/v1/memories/batch/delete` | Delete N memories; returns `{"deleted": N, "failed": M}` |
| POST | `/v1/memories/deduplicate` | Find/remove duplicates for a user; `dry_run=true` by default |
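Deduplication compares memory embeddings by cosine similarity against `DEDUP_THRESHOLD` (0.88 by default). A self-contained sketch of that check (the real endpoint operates on stored embeddings; the vectors here are toy values for illustration):

```python
import math

DEDUP_THRESHOLD = 0.88  # default from the configuration table

def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def is_duplicate(a, b, threshold=DEDUP_THRESHOLD) -> bool:
    return cosine_similarity(a, b) >= threshold

is_duplicate([1.0, 0.0], [1.0, 0.05])  # near-identical embeddings -> True
is_duplicate([1.0, 0.0], [0.0, 1.0])   # orthogonal embeddings -> False
```

With `dry_run=true` the endpoint only reports candidate pairs above the threshold rather than deleting them.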
| Prefix | Description |
|---|---|
| `/v1/sessions/` | Conversation session CRUD and summaries |
| `/v1/feedback/` | Memory relevance feedback submission and statistics |
| `/v1/triggers/` | Event-driven trigger CRUD and fire history |
| `/v1/procedural/` | Procedural memory rule CRUD, extraction, injection |
| `/v1/prospective/` | Prospective intent CRUD, parse, evaluate, expire |
| `/v1/graph/` | Knowledge graph query and sync |
| `/v1/consolidation/` | Sleep-phase consolidation status and control |
| `/v1/namespaces/` | Memory namespace management |
| `/v1/bitemporal/` | Bi-temporal fact storage and time-travel queries |
| `/v1/context/` | Context assembly with token budgeting |
| `/v1/migration/` | Embedding model migration management |
| `/admin/` | Admin-only operations |
```bash
docker run -d -p 6333:6333 qdrant/qdrant
pip install hippocampai
```

```bash
git clone https://github.com/rexdivakar/HippocampAI.git
cd HippocampAI
cp .env.example .env
# Edit .env: set GROQ_API_KEY, optionally set USER_AUTH_ENABLED=true
docker compose up -d
```

Includes:
- FastAPI server (port 8000)
- React Dashboard (port 81)
- Celery workers with Beat scheduler
- Flower monitoring (port 5555)
- Prometheus metrics (port 9090)
- Grafana dashboards (port 3002)
```bash
cd frontend
npm install
npm run dev  # Development server on port 5173
```
- Authentication is disabled by default. `USER_AUTH_ENABLED=false` in the docker-compose defaults; all `/v1/*` endpoints are open with no auth check. Set `USER_AUTH_ENABLED=true` and configure `JWT_SECRET` in `.env`.
- Default credentials in `.env.example` — replace before deploying: `HIPPOCAMPAI_API_KEY=example_key_do_not_use`, `FLOWER_PASSWORD=changeme_in_production`, `GRAFANA_ADMIN_PASSWORD=changeme_in_production`.
- `GROQ_API_KEY` in plain-text `.env` — for production, inject via Docker secrets, AWS Secrets Manager, or HashiCorp Vault. Never commit `.env` to git (`.env` is in `.gitignore`).
- Single uvicorn worker — `--workers 1` in the docker-compose `command`. Under concurrent LLM-backed requests the single worker's threadpool will saturate. Run multiple workers (`--workers 4`) or place a process manager (gunicorn) in front.
- No TLS — all services communicate over plain HTTP. Add nginx or Caddy as a TLS-terminating reverse proxy in front of port 8000.
- Groq free-tier rate limits — 30 RPM. High-throughput writes to `/v1/memories:remember` and `/v1/memories:extract` (which call the LLM for type classification and extraction) will hit this limit and retry. Options: upgrade to Groq Dev Tier, switch to a self-hosted LLM (`LLM_PROVIDER=ollama`), or pass `type` explicitly in remember requests to skip LLM classification.
- `admin_ui/` directory is empty — the `hippocampai-admin` container (nginx on port 3001) serves 403 because the directory has no files. Either populate it or remove the service from docker-compose.
- Frontend Docker healthcheck reports unhealthy — the Vite dev server responds on `/` but the healthcheck hits a different path. Change the healthcheck to `curl -f http://localhost:81/` or switch to the production build.
- `QDRANT_URL` in `.env` — the default is `http://localhost:6333`, which works for local library use. When running inside Docker compose, the API container needs `QDRANT_URL=http://qdrant:6333` (already set in docker-compose.yml via the service DNS name).
- Celery worker healthcheck disabled — `healthcheck: disable: true` in docker-compose. Production should enable Flower-based or custom health monitoring for Celery workers.
- `AUTO_CONSOLIDATION_ENABLED=false` — memory consolidation (sleep-phase replay) is off by default. Set to `true` to enable nightly automatic consolidation.
- `ENABLE_PROCEDURAL_MEMORY=false` — procedural memory and prompt self-optimization
- `ENABLE_PROSPECTIVE_MEMORY=false` — time- and event-triggered intent system
- `HIPPOCAMPAI_ENABLE_TMS=false` — truth maintenance system (retraction and contradiction detection)
AI Agents & Chatbots
- Personalized assistants with context across sessions
- Customer support with interaction history
- Educational tutoring that adapts to students
Enterprise Applications
- Knowledge management for teams
- CRM enhancement with interaction intelligence
- Compliance monitoring and audit trails
Research & Analytics
- Behavioral pattern analysis
- Long-term trend detection
- User experience personalization
| Metric | Performance |
|---|---|
| Query Speed | 50-100x faster with caching |
| Throughput | 500-1000+ requests/second |
| Latency | 1-2ms (cached), 5-15ms (uncached) |
| Availability | 99.9% uptime |
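The 50-100x cached speedup comes from a cache-aside pattern: recall results keyed by user and query are served from Redis on repeat queries instead of re-running vector search and reranking. A dependency-free sketch of the pattern (hypothetical stand-in function and an in-process dict in place of Redis):

```python
import time

cache = {}
backend_calls = 0  # counts how often the expensive path runs

def expensive_recall(user_id: str, query: str) -> list:
    """Stand-in for an uncached vector search + rerank round trip."""
    global backend_calls
    backend_calls += 1
    time.sleep(0.01)  # simulate retrieval latency
    return [f"memory for {query}"]

def cached_recall(user_id: str, query: str) -> list:
    key = (user_id, query)
    if key not in cache:                        # miss: hit the backend once
        cache[key] = expensive_recall(user_id, query)
    return cache[key]                           # hit: dictionary lookup only

cached_recall("alice", "work preferences")  # miss: backend called
cached_recall("alice", "work preferences")  # hit: served from cache
```

In production the cache would also need invalidation on `remember`/`update`/`delete`, which is where Redis keyspace design earns its keep.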
- Documentation: Complete guides
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Discord: Join our community
Over 25 working examples in the examples/ directory:
# Basic operations
python examples/01_basic_usage.py
# Advanced features
python examples/11_intelligence_features_demo.py
python examples/13_temporal_reasoning_demo.py
python examples/14_cross_session_insights_demo.py
# New v0.4.0 features
python examples/20_collaboration_demo.py # Multi-agent collaboration
python examples/21_predictive_analytics_demo.py # Predictive analytics
python examples/22_auto_healing_demo.py # Auto-healing pipeline
python examples/consolidation_demo.py # Memory consolidationWe welcome contributions! See our Contributing Guide for details.
```bash
git clone https://github.com/rexdivakar/HippocampAI.git
cd HippocampAI
pip install -e ".[dev]"
pytest
```

Apache 2.0 - Use freely in commercial and open-source projects.
If you find HippocampAI useful, please star the repo! It helps others discover the project.
Built with ❤️ by the HippocampAI team
```python
from datetime import datetime

from hippocampai import MemoryClient
from hippocampai.models.bitemporal import BiTemporalQuery
from hippocampai.context.models import ContextConstraints
from hippocampai.schema import SchemaRegistry

client = MemoryClient()

# Core operations
memory = client.remember("text", user_id="alice")
results = client.recall("query", user_id="alice", k=5)
client.update_memory(memory.id, text="new text")
client.delete_memory(memory.id)

# Intelligence
facts = client.extract_facts("John works at Google")
entities = client.extract_entities("Elon Musk founded SpaceX")
patterns = client.detect_patterns(user_id="alice")

# Analytics
habits = client.detect_habits(user_id="alice")
changes = client.track_behavior_changes(user_id="alice")
stats = client.get_memory_statistics(user_id="alice")

# Sessions
session = client.create_session(user_id="alice", title="Planning")
client.complete_session(session.id, generate_summary=True)

# Bi-Temporal Facts (NEW)
fact = client.store_bitemporal_fact(
    user_id="alice",
    subject="alice",
    predicate="works_at",
    object_value="Acme Corp",
    valid_from=datetime(2024, 1, 1),
)
facts = client.query_bitemporal_facts(BiTemporalQuery(
    user_id="alice",
    valid_at=datetime(2024, 6, 1),
))

# Context Assembly (NEW)
context = client.assemble_context(
    user_id="alice",
    query="What are Alice's work preferences?",
    constraints=ContextConstraints(token_budget=4000),
)
print(context.final_context_text)

# Custom Schema Validation (NEW)
registry = SchemaRegistry()
result = registry.validate_entity("person", {"name": "Alice"})

# Relevance Feedback (NEW v0.5.0)
client.rate_recall(
    memory_id=results[0].memory.id,
    user_id="alice",
    query="coffee preferences",
    feedback_type="relevant",
)

# See docs/LIBRARY_COMPLETE_REFERENCE.md for the full method reference
```