Ambient intelligence that sees what you see, hears what you hear, and acts on your behalf.
Quick Start · Docs · Privacy · Configuration · Contributing
Sinain captures your screen and audio continuously, runs OCR and transcription, and feeds a rolling context window to your agent. The agent analyzes what's happening, surfaces advice on a private HUD overlay, and can act on its own: fixing code, running commands, or spawning background tasks.
- Screen capture → OCR → context digest, updated every few seconds.
- System audio → transcription (local whisper.cpp or cloud) → real-time awareness.
- Private overlay: only you see it. Never in screenshots, recordings, or screen shares.
Sinain feeds the same screen and audio context to any MCP-compatible agent. Switch agents without losing context. Add new ones without reconfiguring.
- Tested with Claude Code, Codex, Goose, Junie, and Aider. Any MCP-compatible agent works.
- Knowledge modules travel with you: export from one machine, import on another.
- Run with an OpenClaw gateway, or use the shell harness (`sinain-agent/run.sh`) to connect your own agent.
By default, sinain uses cloud APIs (OpenRouter) for transcription and analysis. When you need tighter control, switch privacy modes: no code changes, one env var.
- Four modes in `~/.sinain/.env`: `off` → `standard` → `strict` → `paranoid`
- `paranoid` mode: Ollama + whisper.cpp, fully offline. No network calls.
- HUD overlay is invisible to screen capture (`NSWindow.sharingType = .none`)
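As a sketch, switching modes is a one-line edit to the env file, using the `PRIVACY_MODE` variable documented under Configuration (the pairing with `ANALYSIS_PROVIDER` is illustrative; `paranoid` implies local backends):

```shell
# ~/.sinain/.env — privacy is a single variable: off | standard | strict | paranoid
PRIVACY_MODE=paranoid

# paranoid mode runs fully offline; Ollama is the local analysis provider
ANALYSIS_PROVIDER=ollama
```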
```shell
npx @geravant/sinain start
```

That's it. On first run, sinain will:
- Run an interactive setup wizard: transcription backend, API key, agent, privacy mode
- Auto-download the overlay app, sck-capture binary, and Python dependencies
- Start all services: sinain-core, sense_client, overlay, and agent
Re-run the wizard anytime:
```shell
npx @geravant/sinain start --setup
```
- Node.js 18+ (nodejs.org; LTS recommended)
- Python 3.10+: `brew install python3` (macOS) or python.org
- OpenRouter API key (optional for local-only mode): openrouter.ai
Fully local? No API key needed. Ollama + whisper-cli = zero cloud. See Running Fully Local.
- System Settings → Privacy & Security → Screen Recording → add your Terminal
- System Settings → Privacy & Security → Microphone → add your Terminal
```shell
npx @geravant/sinain stop                # stop all services
npx @geravant/sinain status              # check what's running
npx @geravant/sinain start --setup       # re-run setup wizard
npx @geravant/sinain start --no-sense    # skip screen capture
npx @geravant/sinain start --no-overlay  # headless mode
```

```
┌─── Your Device ────────────────────────────────────────────────────┐
│                                                                    │
│  sck-capture (Swift)                                               │
│   ├─ system audio (PCM) ────► sinain-core :9500                    │
│   └─ screen frames (JPEG) ──► sense_client ─── POST /sense ──►     │
│                                                                    │
│  sinain-core                                                       │
│   ├─ audio pipeline → transcription                                │
│   ├─ agent loop → digest + HUD text                                │
│   ├─ escalation ───► OpenClaw Gateway (WS)                         │
│   │                  or sinain-agent (poll)                        │
│   └─ WebSocket feed                                                │
│        │                                                           │
│        ▼                                                           │
│  overlay (Flutter)                                                 │
│   private, invisible to screen capture                             │
│                                                                    │
└────────────────────────────────────────────────────────────────────┘
                    │
          ┌─────────┴─────────┐
          ▼                   ▼
  OpenClaw Gateway         sinain-agent
  (server or local)        (bare agent, no gateway)
   ├─ sinain-hud plugin
   │   └─ sinain-knowledge (curation, playbook, eval)
   └─ SITUATION.md, Telegram alerts
```
| Component | Language | What it does | Docs |
|---|---|---|---|
| sinain-core | TypeScript | Central hub: audio pipeline, agent loop, escalation, WS feed | README |
| overlay | Dart / Swift / C++ | Private HUD (macOS + Windows), 4 display modes, hotkeys | Hotkeys |
| sense_client | Python | Screen capture, SSIM diff, OCR, privacy filter | sense_client/ |
| sck-capture | Swift | ScreenCaptureKit: system audio + screen frames | tools/sck-capture/ |
| sinain-agent | Bash | Shell harness that connects any agent to sinain-core | sinain-agent/ |
| sinain-knowledge | TypeScript | Curation, playbook, eval, portable knowledge modules | Knowledge System |
| sinain-hud-plugin | TypeScript | OpenClaw plugin: lifecycle, curation, overflow watchdog | sinain-hud-plugin/ |
| sinain-mcp-server | TypeScript | MCP server exposing sinain tools to agents | sinain-mcp-server/ |
All config lives in `.env` at the project root (created by the setup wizard or `cp .env.example .env`).
The context analysis loop runs every 3–30 seconds, sending recent audio/screen context to an LLM. It produces a digest used for escalation scoring: when the score threshold is met (or always in rich mode), the digest is forwarded to the escalation agent for a full response.
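The gating step described above can be sketched as a pure function. This is an illustrative sketch, not sinain's actual API: the function name, the 0–1 score scale, and the treatment of `focus` as score-gated are assumptions; only the modes and the "rich always escalates" rule come from the docs.

```typescript
type EscalationMode = "off" | "selective" | "focus" | "rich";

// Hypothetical sketch: decide whether a digest is forwarded to the
// escalation agent, per the behavior described above.
function shouldEscalate(
  mode: EscalationMode,
  score: number,     // escalation score computed from the digest
  threshold: number, // configured cutoff
): boolean {
  if (mode === "off") return false; // never escalate
  if (mode === "rich") return true; // always forward the digest
  return score >= threshold;        // selective/focus: gate on the score
}

console.log(shouldEscalate("rich", 0.1, 0.5));      // true
console.log(shouldEscalate("selective", 0.7, 0.5)); // true
console.log(shouldEscalate("off", 0.9, 0.5));       // false
```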
| Variable | Default | Description |
|---|---|---|
| `ANALYSIS_PROVIDER` | `openrouter` | `openrouter` (cloud) or `ollama` (local, free) |
| `ANALYSIS_MODEL` | `google/gemini-2.5-flash-lite` | Primary model for text analysis |
| `ANALYSIS_VISION_MODEL` | `google/gemini-2.5-flash` | Auto-selected when screen images are present |
| `ANALYSIS_ENDPOINT` | (auto per provider) | Override for custom OpenAI-compatible endpoints |
| `ANALYSIS_API_KEY` | (from `OPENROUTER_API_KEY`) | API key; not needed for `ollama` |
| `ANALYSIS_FALLBACK_MODELS` | `gemini-2.5-flash,...` | Comma-separated fallback chain |
| Variable | Default | Description |
|---|---|---|
| `OPENROUTER_API_KEY` | (none) | Required (unless `ANALYSIS_PROVIDER=ollama` + local transcription) |
| `ESCALATION_MODE` | `selective` | `off` / `selective` / `focus` / `rich` |
| `PRIVACY_MODE` | `off` | `off` / `standard` / `strict` / `paranoid` |
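Pulling these variables together, a minimal cloud-backed `.env` might look like the sketch below (values taken from the defaults above; this is not an exhaustive file):

```shell
# Required unless running fully local
OPENROUTER_API_KEY=...

# Analysis loop
ANALYSIS_PROVIDER=openrouter
ANALYSIS_MODEL=google/gemini-2.5-flash-lite
ANALYSIS_VISION_MODEL=google/gemini-2.5-flash

# Behavior
ESCALATION_MODE=selective
PRIVACY_MODE=standard
```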
See docs/CONFIGURATION.md for the full reference.
| Mode | What it does |
|---|---|
| `off` | All data flows freely; maximum insight quality |
| `standard` | Auto-redacts credentials before cloud APIs (wizard default) |
| `strict` | Only summaries leave your machine; no raw text sent to cloud |
| `paranoid` | Fully local: Ollama + whisper.cpp. Zero network calls. |
See Privacy Threat Model for the full design.
Global hotkeys use Cmd+Shift (macOS) or Ctrl+Shift (Windows):
| Shortcut | Action |
|---|---|
| `Cmd+Shift+Space` | Toggle overlay visibility |
| `Cmd+Shift+M` | Cycle display mode |
| `Cmd+Shift+/` | Open command input |
| `Cmd+Shift+H` | Quit overlay |
See docs/HOTKEYS.md for all 15 shortcuts.
No cloud APIs needed. Local models handle everything:
```shell
# 1. Install local transcription
./setup-local-stt.sh

# 2. Install Ollama + vision model
brew install ollama && ollama pull llava

# 3. Start in local mode
./start-local.sh
```

| Model | Size | Speed | Best for |
|---|---|---|---|
| `llava` | 4.7 GB | ~2s/frame | General use (recommended) |
| `llama3.2-vision` | 7.9 GB | ~4s/frame | Best accuracy |
| `moondream` | 1.7 GB | ~1s/frame | Fastest, lower quality |
| Setup | Guide |
|---|---|
| Local OpenClaw | docs/INSTALL-LOCAL.md |
| Remote OpenClaw | docs/INSTALL-REMOTE.md |
| NemoClaw (Brev) | docs/INSTALL.md |
| Bare Agent | docs/INSTALL-BARE-AGENT.md |
| Windows | setup-windows.sh |
| From Source | `git clone`, `cp .env.example ~/.sinain/.env`, `./start.sh` |
Sinain builds a persistent knowledge graph from everything it captures: audio transcriptions, screen OCR, and agent interactions. Facts are distilled incrementally (on buffer full and at session end), stored in an EAV triplestore with graph relationships, and retrieved via hybrid search (FTS5 + tag-based + entity-graph backrefs with RRF fusion).
The integration step is fully deterministic: no LLM decides what to store. Every extracted fact is preserved.
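Reciprocal-rank fusion (the RRF step named above) is a standard way to merge ranked lists from heterogeneous retrievers. A minimal sketch follows; the function name and the conventional `k = 60` constant are illustrative, not sinain's internals:

```typescript
// Merge ranked result lists (e.g. FTS5, tag search, graph backrefs) with
// reciprocal-rank fusion: score(id) = sum over lists of 1 / (k + rank).
function rrfFuse(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, i) => {
      // rank is 1-based; items high in any list gain more score
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1]) // highest fused score first
    .map(([id]) => id);
}

// "b" ranks near the top of both lists, so it fuses first
console.log(rrfFuse([["a", "b", "c"], ["b", "d"]])); // ["b", "a", "d", "c"]
```

The appeal of RRF is that it needs only ranks, not comparable scores, so wildly different retrievers (full-text BM25, tag matches, graph hops) can be combined without score normalization.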
```shell
npx @geravant/sinain export-knowledge   # export playbook, modules, graph
npx @geravant/sinain import-knowledge ~/sinain-knowledge-export.tar.gz
```

See Knowledge System docs for architecture details.
| Topic | Doc |
|---|---|
| Knowledge System | docs/knowledge-system.md |
| Escalation Architecture | docs/clean-architecture-escalation.md |
| Personality Traits | docs/PERSONALITY-TRAITS-SYSTEM.md |
| Privacy Threat Model | docs/privacy-protection-design.md |
| HUD Skill Protocol | docs/HUD-SKILL-PROTOCOL.md |
| Full Configuration | docs/CONFIGURATION.md |
| All Hotkeys | docs/HOTKEYS.md |
See CONTRIBUTING.md.
MIT


