JRVS — Local-First AI Agent Framework

JRVS is a RAG-powered CLI AI assistant built for developers who want explicit control, predictable behavior, and extensible architecture when working with local language models.

Designed for privacy-sensitive, offline, or resource-constrained environments — not cloud-scale SaaS.

Why JRVS?

Most agent frameworks optimize for hosted APIs and rapid abstraction. JRVS is optimized for a different set of constraints:

Local inference — CPU/GPU, quantized models, limited memory
Offline or privacy-sensitive workflows — no data leaves your machine
Explicit control — over tools, memory, retrieval, and context
Extensibility without tight coupling — via MCP and UTCP protocols

Features

Category	Capability
LLM Backends	Ollama, LM Studio (switchable at runtime)
RAG Pipeline	FAISS + BGE embeddings + cross-encoder reranking + MMR diversity
Memory	SQLite persistent memory, cross-session FAISS retrieval
Web Search	Brave Search API with auto-ingest into knowledge base
Web Scraping	BeautifulSoup scraper with dedup and chunking
File Uploads	Drop files in `uploads/` and ingest into knowledge base
MCP	Full MCP client + server (17+ tools)
UTCP	Universal Tool Calling Protocol via REST API
Web UI	Optional browser-based chat interface
API Server	FastAPI server for programmatic access

Quick Start

Prerequisites

Tool	Required	Purpose
Python 3.8+	Yes	Runtime
Ollama or LM Studio	Yes	LLM backend
Node.js	Optional	MCP servers, web UI

Install

# 1. Clone the repo
git clone https://github.com/Xthebuilder/JRVS.git
cd JRVS

# 2. Create a virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate        # macOS/Linux
# venv\Scripts\activate         # Windows

# 3. Install Python dependencies
pip install -r requirements.txt

# 4. Pull a model (Ollama)
ollama pull llama3.1

# 5. Run JRVS
python main.py

Automated Setup Scripts

macOS: chmod +x setup_mac.sh && ./setup_mac.sh
Windows: setup_windows.bat

Platform Setup

Windows

Install Python 3.8+ — check "Add Python to PATH"
Install Ollama or LM Studio
Install Node.js LTS (optional, for MCP and web UI)

pip install -r requirements.txt
ollama serve                    # blocks; run in its own terminal (skip if the Ollama app is already running)
ollama pull llama3.1
python main.py

Tips:

Use a virtual environment to avoid dependency conflicts:

python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt

If torch install fails: install Visual C++ Build Tools
If FAISS fails: pip install faiss-cpu --no-cache-dir

macOS

# Install Homebrew if needed
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

brew install python@3.11 ollama node
pip3 install -r requirements.txt
ollama serve                    # blocks; run in its own terminal (skip if the Ollama app is already running)
ollama pull llama3.1
python3 main.py

Tips:

Apple Silicon (M1/M2/M3): all dependencies support ARM natively
If python is not found, use python3
If you see SSL errors (python.org installs only), run: /Applications/Python\ 3.11/Install\ Certificates.command
Add Homebrew to PATH (Apple Silicon): eval "$(/opt/homebrew/bin/brew shellenv)"

Linux

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
ollama serve &
ollama pull llama3.1
python main.py

Usage

Starting JRVS

python main.py                          # Default (Ollama)
python main.py --use-lmstudio           # Use LM Studio
python main.py --theme cyberpunk        # Set theme
python main.py --model llama3.1:8b      # Set model
python main.py --debug                  # Debug mode

All options: python main.py --help

CLI Commands

Command	Description
`/help`	Show all commands
`/models`	List available models
`/switch <model>`	Switch active model
Knowledge Base
`/scrape <url>`	Scrape a URL into the knowledge base
`/search <query>`	Search stored documents
`/websearch <query>`	Search via Brave API and auto-ingest results
`/upload list`	List files in the `uploads/` folder
`/upload ingest <file>`	Ingest a specific file into the knowledge base
`/upload ingest-all`	Ingest all files in `uploads/`
`/upload read <file>`	Read a file from `uploads/`
Brave Search
`/brave-key <key>`	Set Brave API key at runtime
`/brave-status`	Show Brave config and request usage
Calendar
`/calendar`	Show upcoming events (7 days)
`/today`	Show today's events
`/month [month] [year]`	Show ASCII calendar
MCP
`/mcp-servers`	List connected MCP servers
`/mcp-tools [server]`	List available MCP tools
System
`/stats`	Show system statistics
`/history`	Show conversation history
`/theme <name>`	Change theme (`matrix`, `cyberpunk`, `minimal`)
`/clear`	Clear screen
`/exit`	Exit JRVS

How It Works

RAG Pipeline

User query
    │
    ▼
Hybrid Search ──── FAISS (semantic) + FTS5 (keyword) ──── RRF fusion
    │
    ▼
Cross-Encoder Reranker (ms-marco-MiniLM-L-6-v2)
    │
    ▼
MMR Diversity Filter
    │
    ▼
Context injected into LLM prompt → Response
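
The "RRF fusion" step in the diagram merges the semantic and keyword result lists into one ranking. A minimal sketch of Reciprocal Rank Fusion follows; the function name `rrf_fuse`, the document IDs, and the constant `k=60` are illustrative assumptions, not taken from the JRVS source.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used constant; whether JRVS
    uses that value is an assumption.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc_id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


semantic_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. FAISS order (hypothetical IDs)
keyword_hits = ["doc_b", "doc_d", "doc_a"]   # e.g. FTS5 order (hypothetical IDs)
fused = rrf_fuse([semantic_hits, keyword_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents that rank well in both lists (here `doc_b` and `doc_a`) rise to the top, which is why RRF needs no score normalization between the two search backends.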

Ingestion: URLs/files → chunked → BGE embeddings → FAISS + FTS5
Retrieval: hybrid FAISS + keyword search → reranked → diverse context
Memory: every conversation turn is embedded and stored for cross-session recall
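
The "diverse context" in the retrieval step comes from the MMR filter. As a rough illustration, Maximal Marginal Relevance picks each next chunk by trading relevance to the query against similarity to chunks already chosen. The helper names (`cosine`, `mmr_select`), the `lambda_` weight, and the toy 2-dimensional vectors below are assumptions for the sketch; the real pipeline in rag/retriever.py may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, doc_vecs, top_k=3, lambda_=0.7):
    """Return indices of doc_vecs chosen by Maximal Marginal Relevance."""
    remaining = list(range(len(doc_vecs)))
    selected = []
    while remaining and len(selected) < top_k:
        def mmr_score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Penalize similarity to the closest already-selected document.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_ * relevance - (1 - lambda_) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 1.0]
docs = [[1.0, 0.8], [1.0, 0.75], [0.7, 1.0]]  # docs 0 and 1 are near-duplicates
picked = mmr_select(query, docs, top_k=2, lambda_=0.5)
print(picked)  # [0, 2]
```

Ranking by relevance alone would return documents 0 and 1; MMR swaps the near-duplicate for document 2, which covers a different direction in embedding space.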

Brave Search Integration

# Set your API key (one time)
export BRAVE_API_KEY="your_key_here"

# Or set at runtime
/brave-key your_key_here

# Search and auto-ingest results
/websearch what is FAISS used for

Results are automatically scraped (full page, not just snippets) and added to your knowledge base.

File Uploads

Drop any text-based file into the uploads/ folder:

cp my_notes.txt /path/to/JRVS/uploads/

Then ingest it:

jarvis❯ /upload ingest-all

Supports 30+ extensions: .txt, .md, .py, .js, .json, .csv, .yaml, .pdf, and more.
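
To preview which files an ingest run would pick up, a small script along these lines can help. It is a sketch only: `list_ingestible` is a hypothetical helper, and the extension set contains just the eight extensions named above rather than the full list JRVS accepts.

```python
from pathlib import Path

# Only the extensions named in this README; the real supported list is longer.
SUPPORTED_EXTENSIONS = {".txt", ".md", ".py", ".js", ".json", ".csv", ".yaml", ".pdf"}


def list_ingestible(upload_dir):
    """Return sorted names of files in upload_dir with a supported extension."""
    folder = Path(upload_dir)
    return sorted(
        p.name
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Calling `list_ingestible("uploads")` from the repo root returns the matching file names; subdirectories and unsupported extensions are skipped.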

Configuration

Environment Variables

Variable	Default	Description
`OLLAMA_BASE_URL`	`http://localhost:11434`	Ollama server URL
`OLLAMA_DEFAULT_MODEL`	`deepseek-r1:14b`	Default Ollama model
`LMSTUDIO_BASE_URL`	`http://127.0.0.1:1234/v1`	LM Studio server URL
`LMSTUDIO_DEFAULT_MODEL`	(auto-detect)	LM Studio model
`BRAVE_API_KEY`	—	Brave Search API key
`BRAVE_MAX_REQUESTS_PER_SESSION`	`20`	Request budget per session
`BRAVE_SEARCH_RESULTS_PER_QUERY`	`5`	Results per search
`BRAVE_AUTO_SCRAPE`	`true`	Scrape full pages vs snippets only
`EMBEDDING_MODEL`	`BAAI/bge-base-en-v1.5`	Sentence-transformer model
`SIMILARITY_THRESHOLD`	`0.35`	Min cosine similarity for retrieval
`MAX_CONTEXT_LENGTH`	`12000`	Max chars of context injected
`CONVERSATION_HISTORY_TURNS`	`8`	In-session turns sent to LLM

All variables can be set in your shell or a .env file.

Example — remote Ollama:

export OLLAMA_BASE_URL="http://192.168.1.100:11434"
export OLLAMA_DEFAULT_MODEL="llama3.1:8b"
python main.py
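
For scripts that need the same settings, the pattern of reading an environment variable and falling back to the documented default can be sketched as below. `load_settings` and its dictionary keys are hypothetical; the actual logic lives in config.py and may parse values differently.

```python
import os


def load_settings(env=None):
    """Read a few JRVS-style settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    truthy = {"1", "true", "yes"}
    return {
        "ollama_base_url": env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        "ollama_default_model": env.get("OLLAMA_DEFAULT_MODEL", "deepseek-r1:14b"),
        # Boolean parsing rule is an assumption, not taken from config.py.
        "brave_auto_scrape": env.get("BRAVE_AUTO_SCRAPE", "true").strip().lower() in truthy,
        "similarity_threshold": float(env.get("SIMILARITY_THRESHOLD", "0.35")),
        "max_context_length": int(env.get("MAX_CONTEXT_LENGTH", "12000")),
        "conversation_history_turns": int(env.get("CONVERSATION_HISTORY_TURNS", "8")),
    }


settings = load_settings({"OLLAMA_BASE_URL": "http://192.168.1.100:11434"})
print(settings["ollama_base_url"])     # http://192.168.1.100:11434
print(settings["max_context_length"])  # 12000
```

Passing a plain dict instead of `os.environ` keeps the function easy to test without touching the real environment.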

MCP Integration

MCP Client (connect JRVS to external tools)

Configure servers in mcp_gateway/client_config.json:

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/user"]
    },
    "memory": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-memory"]
    }
  }
}

JRVS auto-connects on startup. Use /mcp-servers and /mcp-tools to inspect.

Available MCP servers: filesystem, github, postgres, brave-search, memory, slack, and more.
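
Before starting JRVS, it can be useful to sanity-check the config file. The sketch below parses the same `mcpServers` layout shown above and returns the command line each entry would launch; `load_mcp_servers` is a hypothetical helper, not part of mcp_gateway.

```python
import json


def load_mcp_servers(config_text):
    """Parse client_config.json text and return {name: [command, *args]}."""
    config = json.loads(config_text)
    servers = {}
    for name, spec in config.get("mcpServers", {}).items():
        if "command" not in spec:
            raise ValueError(f"MCP server {name!r} has no 'command'")
        servers[name] = [spec["command"], *spec.get("args", [])]
    return servers


example = """
{
  "mcpServers": {
    "memory": {"command": "npx", "args": ["-y", "@modelcontextprotocol/server-memory"]}
  }
}
"""
commands = load_mcp_servers(example)
print(commands)  # {'memory': ['npx', '-y', '@modelcontextprotocol/server-memory']}
```

A malformed file fails early with a `json.JSONDecodeError` or `ValueError` instead of a silent connection failure at startup.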

MCP Server (expose JRVS to Claude or other agents)

python mcp_gateway/server.py

17 tools exposed: RAG search, web scraping, calendar, model switching, and more.

See docs/MCP_SETUP.md for full configuration.

UTCP Support

JRVS implements UTCP — a lightweight protocol for AI agents to discover and call tools directly via their native protocols (HTTP, WebSocket, CLI).

# Start the API server
python api/server.py

# Discover available tools
curl http://localhost:8000/utcp

# Call a tool directly
curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello JRVS!"}'

	UTCP	MCP
Architecture	Direct API calls	Wrapper servers
Overhead	Zero	Proxy latency
Best for	REST APIs	Stdio tools, complex workflows

See docs/UTCP_GUIDE.md for details.
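
The chat call above can also be made from Python with only the standard library. This sketch builds the same POST request as the curl example; `build_chat_request` is a hypothetical helper, and the assumption that the server replies with JSON is not confirmed by this README.

```python
import json
import urllib.request


def build_chat_request(message, base_url="http://localhost:8000"):
    """Build the POST /api/chat request shown in the curl example."""
    payload = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request("Hello JRVS!")

# To send it (requires `python api/server.py` to be running):
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(json.load(response))  # assumes a JSON response body
```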

Project Structure

JRVS/
├── main.py                  # Entry point
├── config.py                # All configuration (env vars)
├── requirements.txt
│
├── rag/
│   ├── embeddings.py        # BGE-base-en-v1.5 embeddings (768-dim)
│   ├── vector_store.py      # FAISS index (HNSW auto-upgrade at 200k vectors)
│   ├── retriever.py         # Hybrid search + MMR pipeline
│   └── reranker.py          # Cross-encoder reranker
│
├── core/
│   ├── database.py          # SQLite (aiosqlite) + FTS5
│   ├── file_handler.py      # File upload ingestion
│   └── calendar.py          # Calendar parsing
│
├── llm/
│   ├── ollama_client.py     # Ollama /api/chat integration
│   └── lmstudio_client.py   # LM Studio integration
│
├── cli/
│   ├── interface.py         # JarvisCLI main loop
│   ├── commands.py          # Command routing
│   └── themes.py            # Matrix / Cyberpunk / Minimal themes
│
├── scraper/
│   ├── web_scraper.py       # BeautifulSoup scraper
│   └── brave_search.py      # Brave Search API client
│
├── api/
│   └── server.py            # FastAPI server + UTCP endpoint
│
├── mcp_gateway/
│   ├── server.py            # MCP server (17 tools)
│   ├── client.py            # MCP client
│   └── client_config.json   # MCP server configuration
│
├── uploads/                 # Drop files here for ingestion
│
└── data/                    # Auto-generated
    ├── jarvis.db            # SQLite database
    └── faiss_index.*        # Vector index

API Integration

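import asyncio

from rag.retriever import rag_retriever
from llm.ollama_client import ollama_client

async def main():
    # Placeholder inputs for illustration; run from the repo root
    content, title, url = "FAISS is a vector similarity search library.", "FAISS notes", "https://example.com/faiss"
    query = "What is FAISS used for?"

    # Add a document to the knowledge base
    doc_id = await rag_retriever.add_document(content, title, url)

    # Retrieve relevant context
    context = await rag_retriever.retrieve_context(query)

    # Generate a response
    response = await ollama_client.generate(query, context=context)
    print(response)

asyncio.run(main())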

Troubleshooting

"Cannot connect to Ollama"

ollama serve          # Start Ollama
ollama list           # Verify models are installed
ollama pull llama3.1  # Pull a model if list is empty

Or switch to LM Studio: python main.py --use-lmstudio
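
To check connectivity from a script instead of the shell, a probe like the following can be used. It is a sketch: `ollama_is_reachable` is a hypothetical helper that requests Ollama's `/api/tags` endpoint (the model list) and treats any network or URL error as "not reachable".

```python
import urllib.request


def ollama_is_reachable(base_url="http://localhost:11434", timeout=2.0):
    """Return True if GET base_url/api/tags answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # URLError, HTTPError and timeouts are OSError subclasses;
        # ValueError covers a malformed URL.
        return False
```

Pass the value of `OLLAMA_BASE_URL` as `base_url` when Ollama runs on another machine.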

"Cannot connect to LM Studio"

Open LM Studio → enable the local server → load a model

Import errors / missing packages

pip install -r requirements.txt
python --version   # Must be 3.8+

Performance issues

Use a smaller model: ollama pull llama3.1:8b
Reduce MAX_CONTEXT_LENGTH in config.py
Clear the vector cache: rm data/faiss_index.*

FAISS dimension mismatch after upgrading

JRVS auto-detects and rebuilds the index if the embedding model changed. Old .map files are auto-migrated to SQLite.

MCP servers not connecting

Install Node.js, then verify with: node --version and npm --version
Check paths in mcp_gateway/client_config.json

Contributing

Contributions are welcome, especially:

Tool protocol extensions (MCP/UTCP)
Performance improvements to the RAG pipeline
Documentation and design feedback

Please open an issue before large changes to align on direction.

License

Educational and personal use. Respect website terms of service when scraping.

Acknowledgments

Ollama · FAISS · BGE Embeddings · Sentence Transformers · Rich · BeautifulSoup · Brave Search

Version	Changes	Urgency	Date
main@2026-05-21	Latest activity on main branch	High	5/21/2026
0.0.0	No release found — using repo HEAD	Low	2/28/2026
main@2026-02-28	Latest activity on main branch	Low	2/28/2026
