Your universal API proxy: one endpoint, 100+ providers, zero downtime. Now with MCP Server (25 tools), A2A Protocol, Memory/Skills Systems & Electron Desktop App.

Chat Completions • Embeddings • Image Generation • Video • Music • Audio • Reranking • Web Search • MCP Server • A2A Protocol • 100% TypeScript

Website • Quick Start • Features • Docs • Pricing • WhatsApp
Available in: English | Português (Brasil) | Español | Français | Italiano | Русский | 中文 (简体) | Deutsch | हिन्दी | ไทย | Українська | العربية | 日本語 | Tiếng Việt | Български | Dansk | Suomi | עברית | Magyar | Bahasa Indonesia | 한국어 | Bahasa Melayu | Nederlands | Norsk | Português (Portugal) | Română | Polski | Slovenčina | Svenska | Filipino | Čeština
Click to see dashboard screenshots
| Page | Screenshot |
|---|---|
| Providers | ![]() |
| Combos | ![]() |
| Analytics | ![]() |
| Health | ![]() |
| Translator | ![]() |
| Settings | ![]() |
| CLI Tools | ![]() |
| Usage Logs | ![]() |
| Endpoints | ![]() |
Connect any AI-powered IDE or CLI tool through OmniRoute, a free API gateway for unlimited coding.
OpenClaw ⭐ 205K · NanoBot ⭐ 20.9K · PicoClaw ⭐ 14.6K · ZeroClaw ⭐ 9.9K · IronClaw ⭐ 2.1K

OpenCode ⭐ 106K · Codex CLI ⭐ 60.8K · Claude Code ⭐ 67.3K · Gemini CLI ⭐ 94.7K · Kilo Code ⭐ 15.5K
All agents connect via http://localhost:20128/v1 or http://cloud.omniroute.online/v1 – one config, unlimited models and quota
Stop wasting money and hitting limits. OmniRoute solves this:

- Maximize subscriptions – Track quota, use every bit before reset
- Auto fallback – Subscription → API Key → Cheap → Free, zero downtime
- Multi-account – Round-robin between accounts per provider
- Universal – Works with Claude Code, Codex, Gemini CLI, Cursor, Cline, OpenClaw, and any CLI tool
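The tier walk behind auto fallback can be sketched in a few lines of TypeScript. This is an illustrative model only (provider names and fields are hypothetical, not OmniRoute's actual code):

```typescript
// Illustrative sketch of the 4-tier fallback walk (names are hypothetical).
type Tier = "subscription" | "api-key" | "cheap" | "free";

interface Candidate {
  provider: string;
  tier: Tier;
  quotaRemaining: number; // tokens or requests left in the current window
}

const TIER_ORDER: Tier[] = ["subscription", "api-key", "cheap", "free"];

// Walk the tiers in order and return the first provider with quota left.
function pickProvider(candidates: Candidate[]): Candidate | undefined {
  for (const tier of TIER_ORDER) {
    const hit = candidates.find((c) => c.tier === tier && c.quotaRemaining > 0);
    if (hit) return hit;
  }
  return undefined; // every tier exhausted
}

const pool: Candidate[] = [
  { provider: "claude-code", tier: "subscription", quotaRemaining: 0 },
  { provider: "deepseek", tier: "api-key", quotaRemaining: 0 },
  { provider: "glm", tier: "cheap", quotaRemaining: 1200 },
  { provider: "qwen", tier: "free", quotaRemaining: Infinity },
];

console.log(pickProvider(pool)?.provider); // falls through to "glm"
```

The real router layers budget limits, circuit breakers, and balancing strategies on top of this basic tier ordering.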
Join our community! WhatsApp Group – get help, share tips, and stay updated.
- Website: omniroute.online
- GitHub: github.com/diegosouzapw/OmniRoute
- Issues: github.com/diegosouzapw/OmniRoute/issues
- WhatsApp: Community Group
- Contributing: See CONTRIBUTING.md, open a PR, or pick a good first issue
- Original Project: 9router by decolua
When opening an issue, please run the system-info command and attach the generated file:
npm run system-info
This generates a `system-info.txt` with your Node.js version, OmniRoute version, OS details, installed CLI tools (qoder, gemini, claude, codex, antigravity, droid, etc.), Docker/PM2 status, and system packages – everything we need to reproduce your issue quickly. Attach the file directly to your GitHub issue.
```
┌─────────────┐
│  Your CLI   │  (Claude Code, Codex, Gemini CLI, OpenClaw, Cursor, Cline...)
│    Tool     │
└──────┬──────┘
       │ http://localhost:20128/v1
       ▼
┌──────────────────────────────────────────┐
│        OmniRoute (Smart Router)          │
│  • Format translation (OpenAI ↔ Claude)  │
│  • Quota tracking + Embeddings + Images  │
│  • Auto token refresh                    │
└──────┬───────────────────────────────────┘
       │
       ├── [Tier 1: SUBSCRIPTION] Claude Code, Codex, Gemini CLI
       │        ↓ quota exhausted
       ├── [Tier 2: API KEY] DeepSeek, Groq, xAI, Mistral, NVIDIA NIM, etc.
       │        ↓ budget limit
       ├── [Tier 3: CHEAP] GLM ($0.6/1M), MiniMax ($0.2/1M)
       │        ↓ budget limit
       └── [Tier 4: FREE] Qoder, Qwen, Kiro (unlimited)

Result: Never stop coding, minimal cost
```
Every developer using AI tools faces these problems daily. OmniRoute was built to solve them all: from cost overruns to regional blocks, from broken OAuth flows to protocol operations and enterprise observability.
1. "I pay for an expensive subscription but still get interrupted by limits"

Developers pay $20–200/month for Claude Pro, Codex Pro, or GitHub Copilot. Even paying, quota has a ceiling: 5h of usage, weekly limits, or per-minute rate limits. Mid-coding session, the provider stops responding and the developer loses flow and productivity.

How OmniRoute solves it:

- Smart 4-Tier Fallback – If subscription quota runs out, automatically redirects to API Key → Cheap → Free with zero manual intervention
- Provider Limits Tracking – Cached quota snapshots refresh on a server-side schedule (default `PROVIDER_LIMITS_SYNC_INTERVAL_MINUTES=70`) with manual refresh available in the UI
- Multi-Account Support – Multiple accounts per provider with auto round-robin; when one runs out, it switches to the next
- Custom Combos – Customizable fallback chains with 13 balancing strategies (priority, weighted, fill-first, round-robin, P2C, random, least-used, cost-optimized, strict-random, auto, lkgp, context-optimized, context-relay)
- Codex Business Quotas – Business/Team workspace quota monitoring directly in the dashboard
2. "I need to use multiple providers but each has a different API"

OpenAI uses one format, Claude (Anthropic) uses another, Gemini yet another. If a dev wants to test models from different providers or fall back between them, they need to reconfigure SDKs, change endpoints, and deal with incompatible formats. Custom providers (FriendLI, NIM) have non-standard model endpoints.

How OmniRoute solves it:

- Unified Endpoint – A single `http://localhost:20128/v1` serves as a proxy for all 100+ providers
- Format Translation – Automatic and transparent: OpenAI ↔ Claude ↔ Gemini ↔ Responses API
- Response Sanitization – Strips non-standard fields (`x_groq`, `usage_breakdown`, `service_tier`) that break OpenAI SDK v1.83+
- Role Normalization – Converts `developer` → `system` for non-OpenAI providers; `system` → `user` for GLM/ERNIE
- Think Tag Extraction – Extracts `<think>` blocks from models like DeepSeek R1 into a standardized `reasoning_content` field
- Structured Output for Gemini – Automatic `json_schema` → `responseMimeType`/`responseSchema` conversion
- `stream` defaults to `false` – Aligns with the OpenAI spec, avoiding unexpected SSE in Python/Rust/Go SDKs
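The think-tag extraction idea can be sketched roughly like this (a simplified illustration, not OmniRoute's actual implementation):

```typescript
// Pull a <think>…</think> block out of raw model output and surface it as a
// separate reasoning_content field, OpenAI-reasoning style.
interface NormalizedMessage {
  content: string;
  reasoning_content?: string;
}

function extractThink(raw: string): NormalizedMessage {
  const match = raw.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) return { content: raw };
  return {
    content: raw.replace(match[0], "").trim(),
    reasoning_content: match[1].trim(),
  };
}

const out = extractThink("<think>2+2 is 4</think>The answer is 4.");
// out.reasoning_content === "2+2 is 4", out.content === "The answer is 4."
```

Streaming makes the real version trickier, since a tag can be split across SSE chunks.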
3. "My AI provider blocks my region/country"

Providers like OpenAI/Codex block access from certain geographic regions. Users get errors like `unsupported_country_region_territory` during OAuth and API connections. This is especially frustrating for developers in developing countries.

How OmniRoute solves it:

- 3-Level Proxy Config – Configurable proxy at three levels: global (all traffic), per-provider (one provider only), and per-connection/key
- Color-Coded Proxy Badges – Visual indicators: 🟢 global proxy, 🟡 provider proxy, 🔵 connection proxy, always showing the IP
- OAuth Token Exchange Through Proxy – The OAuth flow also goes through the proxy, solving `unsupported_country_region_territory`
- Connection Tests via Proxy – Connection tests use the configured proxy (no more direct bypass)
- SOCKS5 Support – Full SOCKS5 proxy support for outbound routing
- TLS Fingerprint Spoofing – Browser-like TLS fingerprint via `wreq-js` to bypass bot detection
- CLI Fingerprint Matching – Reorders headers and body fields to match native CLI binary signatures, drastically reducing the risk of account flagging. The proxy IP is preserved, so you get both stealth and IP masking simultaneously
4. "I want to use AI for coding but I have no money"

Not everyone can pay $20–200/month for AI subscriptions. Students, devs from emerging countries, hobbyists, and freelancers need access to quality models at zero cost.

How OmniRoute solves it:

- Free Tier Providers Built-in – Native support for 100% free providers: Qoder (5 unlimited models via OAuth: kimi-k2-thinking, qwen3-coder-plus, deepseek-r1, minimax-m2, kimi-k2), Qwen (4 unlimited models: qwen3-coder-plus, qwen3-coder-flash, qwen3-coder-next, vision-model), Kiro (Claude + AWS Builder ID for free), Gemini CLI (180K tokens/month free)
- Ollama Cloud – Cloud-hosted Ollama models at `api.ollama.com` with a free "Light usage" tier; use the `ollamacloud/<model>` prefix
- Free-Only Combos – Chain `gc/gemini-3-flash → if/kimi-k2-thinking → qw/qwen3-coder-plus` = $0/month with zero downtime
- NVIDIA NIM Free Access – ~40 RPM dev-forever free access to 70+ models at build.nvidia.com (transitioning from credits to pure rate limits)
- Cost-Optimized Strategy – Routing strategy that automatically chooses the cheapest available provider
5. "I need to protect my AI gateway from unauthorized access"

When an AI gateway is exposed to the network (LAN, VPS, Docker), anyone with the address can consume the developer's tokens/quota. Without protection, APIs are vulnerable to misuse, prompt injection, and abuse.

How OmniRoute solves it:

- API Key Management – Generation, rotation, and scoping per provider with a dedicated `/dashboard/api-manager` page
- Model-Level Permissions – Restrict API keys to specific models (`openai/*` wildcard patterns), with an Allow All/Restrict toggle
- API Endpoint Protection – Require a key for `/v1/models` and block specific providers from the listing
- Auth Guard + CSRF Protection – All dashboard routes protected with `withAuth` middleware + CSRF tokens
- Rate Limiter – Per-IP rate limiting with configurable windows
- IP Filtering – Allowlist/blocklist for access control
- Prompt Injection Guard – Sanitization against malicious prompt patterns
- AES-256-GCM Encryption – Credentials encrypted at rest
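As an illustration of the encrypted-at-rest idea, a minimal AES-256-GCM round trip with Node's built-in `crypto` module looks like this (key handling is deliberately simplified; OmniRoute's actual key derivation and storage may differ):

```typescript
import { randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

function encrypt(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit nonce, the recommended size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  // Pack iv + auth tag + ciphertext so decryption is self-contained.
  return Buffer.concat([iv, tag, data]).toString("base64");
}

function decrypt(blob: string, key: Buffer): string {
  const buf = Buffer.from(blob, "base64");
  const iv = buf.subarray(0, 12);
  const tag = buf.subarray(12, 28);
  const data = buf.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag); // GCM authenticates as well as encrypts
  return Buffer.concat([decipher.update(data), decipher.final()]).toString("utf8");
}

const key = randomBytes(32); // 256-bit key
const token = encrypt("sk-my-provider-key", key);
console.log(decrypt(token, key)); // "sk-my-provider-key"
```

GCM's auth tag means a tampered credential blob fails to decrypt instead of silently yielding garbage.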
6. "My provider went down and I lost my coding flow"

AI providers can become unstable, return 5xx errors, or hit temporary rate limits. A dev who depends on a single provider gets interrupted, and without circuit breakers, repeated retries can crash the application.

How OmniRoute solves it:

- Per-Model Circuit Breaker – Auto-open/close with configurable thresholds and cooldown (Closed/Open/Half-Open), scoped per model to avoid cascading blocks
- Exponential Backoff – Progressive retry delays
- Anti-Thundering Herd – Mutex + semaphore protection against concurrent retry storms
- Combo Fallback Chains – If the primary provider fails, automatically falls through the chain with no intervention
- Combo Circuit Breaker – Auto-disables failing providers within a combo chain
- Health Dashboard – Uptime monitoring, circuit breaker states, lockouts, cache stats, p50/p95/p99 latency
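The Closed/Open/Half-Open breaker described above behaves roughly like this state machine (thresholds and cooldown here are illustrative, not OmniRoute's actual defaults):

```typescript
// Simplified per-model circuit breaker sketch.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;

  constructor(
    private readonly threshold = 3,      // consecutive failures before opening
    private readonly cooldownMs = 30_000 // wait before probing again
  ) {}

  canRequest(now = Date.now()): boolean {
    if (this.state === "open" && now - this.openedAt >= this.cooldownMs) {
      this.state = "half-open"; // cooldown elapsed: allow one probe request
    }
    return this.state !== "open";
  }

  onSuccess(): void {
    this.state = "closed";
    this.failures = 0;
  }

  onFailure(now = Date.now()): void {
    this.failures += 1;
    // A failed probe re-opens immediately; otherwise open at the threshold.
    if (this.state === "half-open" || this.failures >= this.threshold) {
      this.state = "open";
      this.openedAt = now;
    }
  }
}
```

Scoping one breaker instance per model (rather than per provider) is what prevents a single flaky model from blocking a provider's healthy models.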
7. "Configuring each AI tool is tedious and repetitive"

Developers use Cursor, Claude Code, Codex CLI, OpenClaw, Gemini CLI, Kilo Code... Each tool needs a different config (API endpoint, key, model), and reconfiguring when switching providers or models is a waste of time.

How OmniRoute solves it:

- CLI Tools Dashboard – Dedicated page with one-click setup for Claude Code, Codex CLI, OpenClaw, Kilo Code, Antigravity, Cline
- GitHub Copilot Config Generator – Generates `chatLanguageModels.json` for VS Code with bulk model selection
- Onboarding Wizard – Guided 4-step setup for first-time users
- One endpoint, all models – Configure `http://localhost:20128/v1` once, access 100+ providers
8. "Managing OAuth tokens from multiple providers is hell"

Claude Code, Codex, Gemini CLI, Copilot – all use OAuth 2.0 with expiring tokens. Developers need to re-authenticate constantly and deal with `client_secret is missing`, `redirect_uri_mismatch`, and failures on remote servers. OAuth on LAN/VPS is particularly problematic.

How OmniRoute solves it:

- Auto Token Refresh – OAuth tokens refresh in the background before expiration
- OAuth 2.0 (PKCE) Built-in – Automatic flow for Claude Code, Codex, Gemini CLI, Copilot, Kiro, Qwen, Qoder
- Multi-Account OAuth – Multiple accounts per provider via JWT/ID token extraction
- OAuth LAN/Remote Fix – Private IP detection for `redirect_uri` + manual URL mode for remote servers
- OAuth Behind Nginx – Uses `window.location.origin` for reverse proxy compatibility
- Remote OAuth Guide – Step-by-step guide for Google Cloud credentials on VPS/Docker
9. "I don't know how much I'm spending or where"

Developers use multiple paid providers but have no unified view of spending. Each provider has its own billing dashboard, but there's no consolidated view, and unexpected costs can pile up.

How OmniRoute solves it:

- Cost Analytics Dashboard – Per-token cost tracking and budget management per provider
- Budget Limits per Tier – Spending ceiling per tier that triggers automatic fallback
- Per-Model Pricing Configuration – Configurable prices per model
- Usage Statistics per API Key – Request count and last-used timestamp per key
- Analytics Dashboard – Stat cards, model usage chart, provider table with success rates and latency
10. "I can't diagnose errors and problems in AI calls"

When a call fails, the dev doesn't know if it was a rate limit, an expired token, a wrong format, or a provider error. Logs are fragmented across different terminals. Without observability, debugging is trial and error.

How OmniRoute solves it:

- Unified Logs Dashboard – 4 tabs: Request Logs, Proxy Logs, Audit Logs, Console
- Console Log Viewer – Real-time terminal-style viewer with color-coded levels, auto-scroll, search, and filter
- SQLite Proxy Logs – Persistent logs that survive server restarts
- Translator Playground – 4 debugging modes: Playground (format translation), Chat Tester (round-trip), Test Bench (batch), Live Monitor (real-time)
- Request Telemetry – p50/p95/p99 latency + X-Request-Id tracing
- File-Based Logging with Rotation – App logs rotate by size, retention days, and archive count; call log artifacts rotate by retention days and file count
- System Info Report – `npm run system-info` generates `system-info.txt` with your full environment (Node version, OmniRoute version, OS, CLI tools, Docker/PM2 status). Attach it when reporting issues for instant triage.
11. "Deploying and maintaining the gateway is complex"

Installing, configuring, and maintaining an AI proxy across different environments (local, VPS, Docker, cloud) is labor-intensive. Problems like hardcoded paths, `EACCES` on directories, port conflicts, and cross-platform builds add friction.

How OmniRoute solves it:

- npm global install – `npm install -g omniroute && omniroute` – done
- Docker Multi-Platform – AMD64 + ARM64 native (Apple Silicon, AWS Graviton, Raspberry Pi)
- Docker Compose Profiles – `base` (no CLI tools) and `cli` (with Claude Code, Codex, OpenClaw)
- Electron Desktop App – Native app for Windows/macOS/Linux with system tray, auto-start, and offline mode
- Split-Port Mode – API and Dashboard on separate ports for advanced scenarios (reverse proxy, container networking)
- Cloud Sync – Config synchronization across devices via Cloudflare Workers
- DB Backups – Automatic backup, restore, export, and import of all settings, with `DISABLE_SQLITE_AUTO_BACKUP` for externally managed backups
12. "The interface is English-only and my team doesn't speak English"

Teams in non-English-speaking countries, especially in Latin America, Asia, and Europe, struggle with English-only interfaces. Language barriers reduce adoption and increase configuration errors.

How OmniRoute solves it:

- Dashboard i18n – 30 Languages – All 500+ keys translated, including Arabic, Bulgarian, Danish, German, Spanish, Finnish, French, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Malay, Dutch, Norwegian, Polish, Portuguese (PT/BR), Romanian, Russian, Slovak, Swedish, Thai, Ukrainian, Vietnamese, Chinese, Filipino, and English
- RTL Support – Right-to-left support for Arabic and Hebrew
- Multi-Language READMEs – 30 complete documentation translations
- Language Selector – Globe icon in the header for real-time switching
13. "I need more than chat – I need embeddings, images, audio"

AI isn't just chat completion. Devs need to generate images, transcribe audio, create embeddings for RAG, rerank documents, and moderate content. Each API has a different endpoint and format.

How OmniRoute solves it:

- Embeddings – `/v1/embeddings` with 6 providers and 9+ models
- Image Generation – `/v1/images/generations` with 10 providers and 20+ models (OpenAI, xAI, Together, Fireworks, Nebius, Hyperbolic, NanoBanana, Antigravity, SD WebUI, ComfyUI)
- Text-to-Video – `/v1/videos/generations` – ComfyUI (AnimateDiff, SVD) and SD WebUI
- Text-to-Music – `/v1/music/generations` – ComfyUI (Stable Audio Open, MusicGen)
- Audio Transcription – `/v1/audio/transcriptions` – Whisper + Nvidia NIM, HuggingFace, Qwen3
- Text-to-Speech – `/v1/audio/speech` – ElevenLabs, Nvidia NIM, HuggingFace, Coqui, Tortoise, Qwen3, Inworld, Cartesia, PlayHT, + existing providers
- Moderations – `/v1/moderations` – Content safety checks
- Reranking – `/v1/rerank` – Document relevance reranking
- Responses API – Full `/v1/responses` support for Codex
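For orientation, these endpoints follow the familiar OpenAI-compatible request shape. A hypothetical helper that builds two of the payloads might look like this (model names and field coverage are placeholders, not a complete spec):

```typescript
// Illustrative payload builders for the unified endpoint (field names follow
// the OpenAI-compatible convention; supported options vary per provider).
const BASE = "http://localhost:20128/v1";

function embeddingsRequest(model: string, input: string[]) {
  return {
    url: `${BASE}/embeddings`,
    body: { model, input },
  };
}

function imageRequest(model: string, prompt: string, n = 1) {
  return {
    url: `${BASE}/images/generations`,
    body: { model, prompt, n },
  };
}

const req = embeddingsRequest("openai/text-embedding-3-small", ["hello world"]);
// req.url === "http://localhost:20128/v1/embeddings"
```

POSTing these bodies with your OmniRoute API key in the `Authorization` header works the same as against the upstream APIs.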
14. "I have no way to test and compare quality across models"

Developers want to know which model is best for their use case (code, translation, reasoning), but comparing manually is slow, and no integrated eval tools exist.

How OmniRoute solves it:

- LLM Evaluations – Golden-set testing with 10 pre-loaded cases covering greetings, math, geography, code generation, JSON compliance, translation, markdown, and safety refusal
- 4 Match Strategies – `exact`, `contains`, `regex`, `custom` (JS function)
- Translator Playground Test Bench – Batch testing with multiple inputs and expected outputs, plus cross-provider comparison
- Chat Tester – Full round trip with visual response rendering
- Live Monitor – Real-time stream of all requests flowing through the proxy
15. "I need to scale without losing performance"

As request volume grows, the same questions generate duplicate costs without caching. Without idempotency, duplicate requests waste processing, and per-provider rate limits must be respected.

How OmniRoute solves it:

- Semantic Cache – Two-tier cache (signature + semantic) reduces cost and latency
- Request Idempotency – 5s deduplication window for identical requests
- Rate Limit Detection – Per-provider RPM, min-gap, and max-concurrent tracking
- Editable Rate Limits – Configurable defaults in Settings → Resilience, with persistence
- API Key Validation Cache – 3-tier cache for production performance
- Health Dashboard with Telemetry – p50/p95/p99 latency, cache stats, uptime
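The signature tier of such a cache (and the short idempotency window) can be sketched as follows; this is a simplified illustration, and OmniRoute's actual cache keys and TTLs may differ:

```typescript
import { createHash } from "node:crypto";

// Identical payloads within a short TTL window reuse the stored response.
class SignatureCache<T> {
  private store = new Map<string, { value: T; expires: number }>();

  constructor(private readonly ttlMs = 5_000) {}

  // Hash the serialized payload so the key is small and uniform.
  private sign(payload: unknown): string {
    return createHash("sha256").update(JSON.stringify(payload)).digest("hex");
  }

  get(payload: unknown, now = Date.now()): T | undefined {
    const hit = this.store.get(this.sign(payload));
    if (!hit || hit.expires < now) return undefined; // miss or expired
    return hit.value;
  }

  set(payload: unknown, value: T, now = Date.now()): void {
    this.store.set(this.sign(payload), { value, expires: now + this.ttlMs });
  }
}
```

The semantic tier goes further by embedding the prompt and matching on similarity rather than exact bytes, which this sketch does not attempt.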
16. "I want to control model behavior globally"

Some developers want all responses in a specific language or tone, or want to limit reasoning tokens. Configuring this in every tool and every request is impractical.

How OmniRoute solves it:

- System Prompt Injection – Global prompt applied to all requests
- Thinking Budget Validation – Reasoning token allocation control per request (passthrough, auto, custom, adaptive)
- 9 Routing Strategies – Global strategies that determine how requests are distributed
- Wildcard Router – `provider/*` patterns route dynamically to any provider
- Combo Enable/Disable Toggle – Toggle combos directly from the dashboard
- Manual Combo Ordering – Drag combo cards by their handle; the order persists in SQLite
- Provider Toggle – Enable/disable all connections for a provider with one click
- Blocked Providers – Exclude specific providers from the `/v1/models` listing
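Wildcard matching like `provider/*` can be sketched as a simple prefix check (an illustration only; the real router may support richer patterns and priorities):

```typescript
// Match a model id against either an exact pattern or a provider/* wildcard.
function matchesPattern(pattern: string, model: string): boolean {
  if (pattern === model) return true;
  if (pattern.endsWith("/*")) {
    const prefix = pattern.slice(0, -1); // drop "*", keep the trailing slash
    return model.startsWith(prefix);
  }
  return false;
}

matchesPattern("openai/*", "openai/gpt-4o");      // true
matchesPattern("openai/*", "groq/llama-3.3-70b"); // false
matchesPattern("glm/glm-4.7", "glm/glm-4.7");     // true
```

The same check works for model-level API key permissions, where a key allowed `openai/*` can call any OpenAI model but nothing else.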
17. "I need MCP tools as first-class product capabilities"

Many AI gateways expose MCP only as a hidden implementation detail. Teams need a visible, manageable operation layer.

How OmniRoute solves it:

- MCP appears in the dashboard navigation and the endpoint protocol tab
- Dedicated MCP management page with process, tools, scopes, and audit
- Built-in quick start for `omniroute --mcp` and client onboarding
18. "I need A2A orchestration with sync + stream task paths"

Agent workflows need both direct replies and long-running streamed execution with lifecycle control.

How OmniRoute solves it:

- A2A JSON-RPC endpoint (`POST /a2a`) with `message/send` and `message/stream`
- SSE streaming with terminal-state propagation
- Task lifecycle APIs for `tasks/get` and `tasks/cancel`
19. "I need real MCP process health, not guessed status"
Operational teams need to know if MCP is actually alive, not just whether an API is reachable.
How OmniRoute solves it:
- Runtime heartbeat file with PID, timestamps, transport, tool count, and scope mode
- MCP status API combining heartbeat + recent activity
- UI status cards for process/uptime/heartbeat freshness
20. "I need auditable MCP tool execution"
When tools mutate config or trigger ops actions, teams need forensic traceability.
How OmniRoute solves it:
- SQLite-backed audit logging for MCP tool calls
- Filters by tool, success/failure, API key, and pagination
- Dashboard audit table + stats endpoints for automation
21. "I need scoped MCP permissions per integration"
Different clients should have least-privilege access to tool categories.
How OmniRoute solves it:
- 10 granular MCP scopes for controlled tool access
- Scope enforcement and visibility in MCP management UI
- Safe default posture for operational tooling
22. "I need operational controls without redeploying"
Teams need quick runtime changes during incidents or cost events.
How OmniRoute solves it:
- Switch combo activation directly from MCP dashboard
- Apply resilience profiles from pre-defined policy packs
- Reset circuit breaker state from the same operations panel
23. "I need live A2A task lifecycle visibility and cancellation"
Without lifecycle visibility, task incidents become hard to triage.
How OmniRoute solves it:
- Task listing/filtering by state/skill with pagination
- Drill-down on task metadata, events, and artifacts
- Task cancellation endpoint and UI action with confirmation
24. "I need active stream metrics for A2A load"
Streaming workflows require operational insight into concurrency and live connections.
How OmniRoute solves it:
- Active stream counters integrated into A2A status
- Last task timestamp and per-state counts
- A2A dashboard cards for real-time ops monitoring
25. "I need standard agent discovery for clients"
External clients and orchestrators need machine-readable metadata for onboarding.
How OmniRoute solves it:
- Agent Card exposed at `/.well-known/agent.json`
- Capabilities and skills shown in the management UI
- A2A status API includes discovery metadata for automation
26. "I need protocol discoverability in the product UX"
If users cannot discover protocol surfaces, adoption and support quality drop.
How OmniRoute solves it:
- Consolidated Endpoints page with tabs for Proxy, MCP, A2A, and API Endpoints
- Inline service status toggles (Online/Offline) for MCP and A2A
- Links from overview to dedicated management tabs
27. "I need end-to-end protocol validation with real clients"
Mock tests are not enough to validate protocol compatibility before release.
How OmniRoute solves it:
- E2E suite that boots app and uses real MCP SDK client transport
- A2A client tests for discovery, send, stream, get, and cancel flows
- Cross-check assertions against MCP audit and A2A tasks APIs
28. "I need unified observability across all interfaces"
Splitting observability by protocol creates blind spots and longer MTTR.
How OmniRoute solves it:
- Unified dashboards/logs/analytics in one product
- Health + audit + request telemetry across OpenAI, MCP, and A2A layers
- Operational APIs for status and automation
29. "I need one runtime for proxy + tools + agent orchestration"
Running many separate services increases operational cost and failure modes.
How OmniRoute solves it:
- OpenAI-compatible proxy, MCP server, and A2A server in one stack
- Shared auth, resilience, data store, and observability
- Consistent policy model across all interaction surfaces
30. "I need to ship agentic workflows without glue-code sprawl"
Teams lose velocity when stitching multiple ad-hoc services and scripts.
How OmniRoute solves it:
- Unified endpoint strategy for clients and agents
- Built-in protocol management UIs and smoke validation paths
- Production-ready foundations (security, logging, resilience, backup)
Playbook A: Maximize paid subscription + cheap backup
```
Combo: "maximize-claude"
  1. cc/claude-opus-4-6
  2. glm/glm-4.7
  3. if/kimi-k2-thinking

Monthly cost: $20 + small backup spend
Outcome: higher quality, near-zero interruption
```
Playbook B: Zero-cost coding stack
```
Combo: "free-forever"
  1. gc/gemini-3-flash
  2. if/kimi-k2-thinking
  3. qw/qwen3-coder-plus

Monthly cost: $0
Outcome: stable free coding workflow
```

Playbook C: 24/7 always-on fallback chain
```
Combo: "always-on"
  1. cc/claude-opus-4-6
  2. cx/gpt-5.2-codex
  3. glm/glm-4.7
  4. minimax/MiniMax-M2.1
  5. if/kimi-k2-thinking

Outcome: deep fallback depth for deadline-critical workloads
```
Playbook D: Agent ops with MCP + A2A
1. Start the MCP transport (`omniroute --mcp`) for tool-driven operations
2. Run A2A tasks via `message/send` and `message/stream`
3. Observe via `/dashboard/endpoint` (MCP and A2A tabs)
4. Toggle services via the inline status controls
Setup AI coding in minutes at $0/month. Connect these free accounts and use the built-in Free Stack combo.
| Step | Action | Providers Unlocked |
|---|---|---|
| 1 | Connect Kiro (AWS Builder ID OAuth) | Claude Sonnet 4.5, Haiku 4.5 – unlimited |
| 2 | Connect Qoder (Google OAuth) | kimi-k2-thinking, qwen3-coder-plus, deepseek-r1... – unlimited |
| 3 | Connect Qwen (Device Code) | qwen3-coder-plus, qwen3-coder-flash... – unlimited |
| 4 | Connect Gemini CLI (Google OAuth) | gemini-3-flash, gemini-2.5-pro – 180K tokens/mo free |
| 5 | `/dashboard/combos` → Free Stack ($0) template | Round-robin all free providers automatically |

Point any IDE/CLI to `http://localhost:20128/v1` · API Key: `any-string` · Done.

Optional extra coverage (also free): Groq API key (30 RPM free), NVIDIA NIM (40 RPM free, 70+ models), Cerebras (1M tok/day), LongCat API key (50M tokens/day!), Cloudflare Workers AI (10K Neurons/day, 50+ models).
```
npm install -g omniroute
omniroute
```
pnpm users: Run `pnpm approve-builds -g` after install to enable the native build scripts required by `better-sqlite3` and `@swc/core`:

```
pnpm install -g omniroute
pnpm approve-builds -g   # Select all packages and approve omniroute
```

The Dashboard opens at `http://localhost:20128` and the API base URL is `http://localhost:20128/v1`.

| Command | Description |
|---|---|
| `omniroute` | Start server (`PORT=20128`, API and dashboard on the same port) |
| `omniroute --port 3000` | Set the canonical/API port to 3000 |
| `omniroute --mcp` | Start the MCP server (stdio transport) |
| `omniroute --no-open` | Don't auto-open the browser |
| `omniroute --help` | Show help |

Optional split-port mode:

```
PORT=20128 DASHBOARD_PORT=20129 omniroute
# API:       http://localhost:20128/v1
# Dashboard: http://localhost:20129
```
When you no longer need OmniRoute, we provide two quick scripts for a clean removal:
| Command | Action |
|---|---|
| `npm run uninstall` | Removes the system app but keeps your DB and configurations in `~/.omniroute` |
| `npm run uninstall:full` | Removes the app AND permanently erases all configurations, keys, and databases |

Note: To run these commands, navigate to the OmniRoute project folder (if you cloned it). Alternatively, if globally installed, you can simply run `npm uninstall -g omniroute`.

For most deployments, you only need:
| Variable | Default | Purpose |
|---|---|---|
| `REQUEST_TIMEOUT_MS` | `600000` | Shared baseline for upstream fetch, hidden Undici timeouts, TLS fingerprint requests, and API bridge request/proxy timeouts |
| `STREAM_IDLE_TIMEOUT_MS` | inherits `REQUEST_TIMEOUT_MS` | Maximum gap between streaming chunks before OmniRoute aborts the SSE stream |

Backward compatibility is preserved: the existing `FETCH_TIMEOUT_MS`, `API_BRIDGE_PROXY_TIMEOUT_MS`, and other per-layer timeout vars still work and override the shared baseline.

Advanced overrides are available if you need finer control:

| Variable | Default | Purpose |
|---|---|---|
| `FETCH_TIMEOUT_MS` | inherits `REQUEST_TIMEOUT_MS` | Total upstream request timeout used by the main fetch abort signal |
| `FETCH_HEADERS_TIMEOUT_MS` | inherits `FETCH_TIMEOUT_MS` | Undici time limit for receiving upstream response headers |
| `FETCH_BODY_TIMEOUT_MS` | inherits `FETCH_TIMEOUT_MS` | Undici time limit between upstream body chunks (`0` disables it) |
| `FETCH_CONNECT_TIMEOUT_MS` | `30000` | Undici TCP connect timeout |
| `FETCH_KEEPALIVE_TIMEOUT_MS` | `4000` | Undici idle keep-alive socket timeout |
| `TLS_CLIENT_TIMEOUT_MS` | inherits `FETCH_TIMEOUT_MS` | Timeout for TLS fingerprint requests made through `wreq-js` |
| `API_BRIDGE_PROXY_TIMEOUT_MS` | inherits `REQUEST_TIMEOUT_MS` or `30000` | Timeout for `/v1` proxy forwarding from the API port to the dashboard port |
| `API_BRIDGE_SERVER_REQUEST_TIMEOUT_MS` | `max(API_BRIDGE_PROXY_TIMEOUT_MS, 300000)` | Incoming request timeout on the API bridge server |
| `API_BRIDGE_SERVER_HEADERS_TIMEOUT_MS` | `60000` | Incoming header timeout on the API bridge server |
| `API_BRIDGE_SERVER_KEEPALIVE_TIMEOUT_MS` | `5000` | Keep-alive timeout on the API bridge server |
| `API_BRIDGE_SERVER_SOCKET_TIMEOUT_MS` | `0` | Socket inactivity timeout on the API bridge server (`0` disables it) |

If you run OmniRoute behind Nginx, Caddy, Cloudflare, or another reverse proxy, make sure the proxy's timeouts are also higher than your OmniRoute stream/fetch timeouts.
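Putting the baseline variables together, a minimal `.env` sketch might look like this (the values are illustrative, not recommendations; every variable comes from the tables above):

```
# Shared baseline: 10 minutes for upstream requests
REQUEST_TIMEOUT_MS=600000

# Abort an SSE stream if no chunk arrives for 2 minutes
STREAM_IDLE_TIMEOUT_MS=120000

# Fail fast on unreachable upstream hosts
FETCH_CONNECT_TIMEOUT_MS=15000
```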
1. Open Dashboard → Providers and connect at least one provider (OAuth or API key).
2. Open Dashboard → Endpoints and create an API key.
3. (Optional) Open Dashboard → Combos and set your fallback chain.
```
Base URL: http://localhost:20128/v1
API Key:  [copy from the Endpoint page]
Model:    if/kimi-k2-thinking   (or any provider/model prefix)
```
Works with Claude Code, Codex CLI, Gemini CLI, Cursor, Cline, OpenClaw, OpenCode, and OpenAI-compatible SDKs.
MCP (for tool-driven operations):
```
omniroute --mcp
```

Then connect your MCP client over `stdio` and test tools like `omniroute_get_health` and `omniroute_list_combos`.
A2A (for agent-to-agent workflows):
```
curl http://localhost:20128/.well-known/agent.json
```

```
curl -X POST http://localhost:20128/a2a \
  -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":"quickstart","method":"message/send","params":{"skill":"quota-management","messages":[{"role":"user","content":"Give me a short quota summary."}]}}'
```
```
npm run test:protocols:e2e
```
This suite validates real MCP and A2A client flows against a running app.
```
cp .env.example .env
npm install
PORT=20128 DASHBOARD_PORT=20129 NEXT_PUBLIC_BASE_URL=http://localhost:20129 npm run dev
```
Void Linux (`xbps-src` template)
For Void Linux users, you can build a native package using `xbps-src`. Save this block as `srcpkgs/omniroute/template`:

```
# Template file for 'omniroute'
pkgname=omniroute
version=3.4.1
revision=1
hostmakedepends="nodejs python3 make"
depends="openssl"
short_desc="Universal AI gateway with smart routing for multiple LLM providers"
maintainer="zenobit <zenobit@disroot.org>"
license="MIT"
homepage="https://github.com/diegosouzapw/OmniRoute"
distfiles="https://github.com/diegosouzapw/OmniRoute/archive/refs/tags/v${version}.tar.gz"
checksum=009400afee90a9f32599d8fe734145cfd84098140b7287990183dde45ae2245b
system_accounts="_omniroute"
omniroute_homedir="/var/lib/omniroute"

export NODE_ENV=production
export npm_config_engine_strict=false
export npm_config_loglevel=error
export npm_config_fund=false
export npm_config_audit=false

do_build() {
	# Determine target CPU arch for node-gyp
	local _gyp_arch
	case "$XBPS_TARGET_MACHINE" in
		aarch64*) _gyp_arch=arm64 ;;
		armv7*|armv6*) _gyp_arch=arm ;;
		i686*) _gyp_arch=ia32 ;;
		*) _gyp_arch=x64 ;;
	esac

	# 1) Install all deps, skipping scripts (no network in do_build; native
	#    modules are compiled separately below; better-sqlite3 is a
	#    serverExternalPackage so Next.js does not execute it during next build)
	NODE_ENV=development npm ci --ignore-scripts

	# 2) Build the Next.js standalone bundle
	npm run build

	# 3) Copy static assets into standalone
	cp -r .next/static .next/standalone/.next/static
	[ -d public ] && cp -r public .next/standalone/public || true

	# 4) Compile the better-sqlite3 native binding for the target architecture.
	#    Use node-gyp directly so CC/CXX from the xbps-src cross-toolchain are
	#    used without npm altering them.
	local _node_gyp=/usr/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js
	(cd node_modules/better-sqlite3 && node "$_node_gyp" rebuild --arch="$_gyp_arch")

	# 5) Place the compiled binding into the standalone bundle
	local _bs3_release=.next/standalone/node_modules/better-sqlite3/build/Release
	mkdir -p "$_bs3_release"
	cp node_modules/better-sqlite3/build/Release/better_sqlite3.node "$_bs3_release/"

	# 6) Remove arch-specific sharp bundles: upstream sets images.unoptimized=true
	#    so sharp is not used at runtime; x64 .so files would break aarch64 strip
	rm -rf .next/standalone/node_modules/@img

	# 7) Copy pino runtime deps omitted by Next.js static analysis:
	#    pino-abstract-transport (required by pino's worker thread)
	#    split2 (dep of pino-abstract-transport)
	#    process-warning (dep of pino itself)
	for _mod in pino-abstract-transport split2 process-warning; do
		cp -r "node_modules/$_mod" .next/standalone/node_modules/
	done
}

do_check() {
	npm run test:unit
}

do_install() {
	vmkdir usr/lib/omniroute/.next
	vcopy .next/standalone/. usr/lib/omniroute/.next/standalone

	# Prevent removal of empty Next.js app router dirs by the post-install hook
	for _d in \
		.next/standalone/.next/server/app/dashboard \
		.next/standalone/.next/server/app/dashboard/settings \
		.next/standalone/.next/server/app/dashboard/providers; do
		touch "${DESTDIR}/usr/lib/omniroute/${_d}/.keep"
	done

	cat > "${WRKDIR}/omniroute" <<'EOF'
#!/bin/sh
export PORT="${PORT:-20128}"
export DATA_DIR="${DATA_DIR:-${XDG_DATA_HOME:-${HOME}/.local/share}/omniroute}"
export APP_LOG_TO_FILE="${APP_LOG_TO_FILE:-false}"
mkdir -p "${DATA_DIR}"
exec node /usr/lib/omniroute/.next/standalone/server.js "$@"
EOF
	vbin "${WRKDIR}/omniroute"
}

post_install() {
	vlicense LICENSE
}
```
OmniRoute is available as a public Docker image on Docker Hub.
Quick run:
```bash
docker run -d \
  --name omniroute \
  --restart unless-stopped \
  --stop-timeout 40 \
  -p 20128:20128 \
  -v omniroute-data:/app/data \
  diegosouzapw/omniroute:latest
```
With environment file:
```bash
# Copy and edit .env first
cp .env.example .env

docker run -d \
  --name omniroute \
  --restart unless-stopped \
  --stop-timeout 40 \
  --env-file .env \
  -p 20128:20128 \
  -v omniroute-data:/app/data \
  diegosouzapw/omniroute:latest
```

Using Docker Compose:
```bash
# Base profile (no CLI tools)
docker compose --profile base up -d

# CLI profile (Claude Code, Codex, OpenClaw built-in)
docker compose --profile cli up -d
```
Dashboard support for Docker deployments now includes a one-click Cloudflare Quick Tunnel on Dashboard → Endpoints. The first enable downloads `cloudflared` only when needed, starts a temporary tunnel to your current `/v1` endpoint, and shows the generated `https://*.trycloudflare.com/v1` URL directly below your normal public URL.

Notes:
- Quick Tunnel URLs are temporary and change after every restart.
- Quick Tunnels are not auto-restored after an OmniRoute or container restart. Re-enable them from the dashboard when needed.
- Managed install currently supports Linux, macOS, and Windows on x64/arm64.
- Managed Quick Tunnels default to HTTP/2 transport to avoid noisy QUIC UDP buffer warnings in constrained container environments. Set `CLOUDFLARED_PROTOCOL=quic` or `auto` if you want a different transport.
- Docker images bundle system CA roots and pass them to managed `cloudflared`, which avoids TLS trust failures when the tunnel bootstraps inside the container.
- SQLite runs in WAL mode. `docker stop` should be allowed to finish so OmniRoute can checkpoint the latest changes back into `storage.sqlite`.
- The bundled Compose files already set a 40s stop grace period. If you run the image directly, keep `--stop-timeout 40` (or similar) so manual stops do not cut off shutdown cleanup.
- Set `CLOUDFLARED_BIN=/absolute/path/to/cloudflared` if you want OmniRoute to use an existing binary instead of downloading one.
Using Docker Compose with Caddy (HTTPS Auto-TLS):
OmniRoute can be securely exposed using Caddy's automatic SSL provisioning. Ensure your domain's DNS A record points to your server's IP.
```yaml
services:
  omniroute:
    image: diegosouzapw/omniroute:latest
    container_name: omniroute
    restart: unless-stopped
    volumes:
      - omniroute-data:/app/data
    environment:
      - PORT=20128
      - NEXT_PUBLIC_BASE_URL=https://your-domain.com

  caddy:
    image: caddy:latest
    container_name: caddy
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    command: caddy reverse-proxy --from https://your-domain.com --to http://omniroute:20128

volumes:
  omniroute-data:
```
| Image | Tag | Size | Description |
|---|---|---|---|
| `diegosouzapw/omniroute` | `latest` | ~250MB | Latest stable release |
| `diegosouzapw/omniroute` | `3.6.2` | ~250MB | Current version |
๐ NEW! OmniRoute is now available as a native desktop application for Windows, macOS, and Linux.
Run OmniRoute as a standalone desktop app โ no terminal, no browser, no internet required for local models. The Electron-based app includes:
- ๐ฅ๏ธ Native Window โ Dedicated app window with system tray integration
- ๐ Auto-Start โ Launch OmniRoute on system login
- ๐ Native Notifications โ Get alerts for quota exhaustion or provider issues
- โก One-Click Install โ NSIS (Windows), DMG (macOS), AppImage (Linux)
- ๐ Offline Mode โ Works fully offline with bundled server
```bash
# Development mode
npm run electron:dev

# Build for your platform
npm run electron:build        # Current platform
npm run electron:build:win    # Windows (.exe)
npm run electron:build:mac    # macOS (.dmg) - x64 & arm64
npm run electron:build:linux  # Linux (.AppImage)
```
When minimized, OmniRoute lives in your system tray with quick actions:
- Open dashboard
- Change server port
- Quit application
Full documentation: `electron/README.md`
| Tier | Provider | Cost | Quota Reset | Best For |
|---|---|---|---|---|
| SUBSCRIPTION | Claude Code (Pro) | $20/mo | 5h + weekly | Already subscribed |
| SUBSCRIPTION | Codex (Plus/Pro) | $20-200/mo | 5h + weekly | OpenAI users |
| SUBSCRIPTION | Gemini CLI | FREE | 180K/mo + 1K/day | Everyone! |
| SUBSCRIPTION | GitHub Copilot | $10-19/mo | Monthly | GitHub users |
| API KEY | NVIDIA NIM | FREE (dev forever) | ~40 RPM | 70+ open models |
| API KEY | Cerebras | FREE (1M tok/day) | 60K TPM / 30 RPM | World's fastest |
| API KEY | Groq | FREE (30 RPM) | 14.4K RPD | Ultra-fast Llama/Gemma |
| API KEY | DeepSeek V3.2 | $0.27/$1.10 per 1M | None | Best price/quality reasoning |
| API KEY | xAI Grok-4 Fast | $0.20/$0.50 per 1M | None | Fastest + tool calling, ultra-low cost |
| API KEY | xAI Grok-4 (standard) | $0.20/$1.50 per 1M | None | Reasoning flagship from xAI |
| API KEY | Mistral | Free trial + paid | Rate limited | European AI |
| API KEY | OpenRouter | Pay-per-use | None | 100+ models aggregated |
| CHEAP | GLM-5 (via Z.AI) | $0.5/1M | Daily 10AM | 128K output, newest flagship |
| CHEAP | GLM-4.7 | $0.6/1M | Daily 10AM | Budget backup |
| CHEAP | MiniMax M2.5 | $0.3/1M input | 5-hour rolling | Reasoning + agentic tasks |
| CHEAP | MiniMax M2.1 | $0.2/1M | 5-hour rolling | Cheapest option |
| CHEAP | Kimi K2.5 (Moonshot API) | Pay-per-use | None | Direct Moonshot API access |
| CHEAP | Kimi K2 | $9/mo flat | 10M tokens/mo | Predictable cost |
| FREE | Qoder | $0 | Unlimited | 5 models unlimited |
| FREE | Qwen | $0 | Unlimited | 4 models unlimited |
| FREE | Kiro | $0 | Unlimited | Claude Sonnet/Haiku (AWS Builder) |
| FREE | LongCat Flash-Lite | $0 (50M tok/day) | 1 RPS | Largest free quota on Earth |
| FREE | Pollinations AI | $0 (no key needed) | 1 req/15s | GPT-5, Claude, DeepSeek, Llama 4 |
| FREE | Cloudflare Workers AI | $0 (10K Neurons/day) | ~150 resp/day | 50+ models, global edge |
| FREE | Scaleway AI | $0 (1M tokens total) | Rate limited | EU/GDPR, Qwen3 235B, Llama 70B |

New models added (Mar 2026): Grok-4 Fast family at $0.20/$0.50/M (benchmarked at 1143ms, 30% faster than Gemini 2.5 Flash), GLM-5 via Z.AI with 128K output, MiniMax M2.5 reasoning, DeepSeek V3.2 updated pricing, Kimi K2.5 via Moonshot direct API.
๐ก $0 Combo Stack โ The Complete Free Setup:
```text
# Ultimate Free Stack 2026 - 11 Providers, $0 Forever

Kiro (kr/)           → Claude Sonnet/Haiku UNLIMITED
Qoder (if/)          → kimi-k2-thinking, qwen3-coder-plus, deepseek-r1 UNLIMITED
LongCat Lite (lc/)   → LongCat-Flash-Lite - 50M tokens/day
Pollinations (pol/)  → GPT-5, Claude, DeepSeek, Llama 4 - no key needed
Qwen (qw/)           → qwen3-coder-plus, qwen3-coder-flash, qwen3-coder-next UNLIMITED
Gemini (gemini/)     → Gemini 2.5 Flash - 1,500 req/day free API key
Cloudflare AI (cf/)  → Llama 70B, Gemma 3, Mistral - 10K Neurons/day
Scaleway (scw/)      → Qwen3 235B, Llama 70B - 1M free tokens (EU)
Groq (groq/)         → Llama/Gemma ultra-fast - 14.4K req/day
NVIDIA NIM (nvidia/) → 70+ open models - 40 RPM forever
Cerebras (cerebras/) → Llama/Qwen world-fastest - 1M tok/day
```

Zero cost. Never stops coding. Configure this as one OmniRoute combo and all fallbacks happen automatically, with no manual switching ever.
All models below are 100% free with zero credit card required. OmniRoute auto-routes between them when one quota runs out โ combine them all for an unbreakable $0 combo.
**Kiro:**

| Model | Prefix | Limit | Rate Limit |
|---|---|---|---|
| `claude-sonnet-4.5` | `kr/` | Unlimited | No reported daily cap |
| `claude-haiku-4.5` | `kr/` | Unlimited | No reported daily cap |
| `claude-opus-4.6` | `kr/` | Unlimited | Latest Opus via Kiro |

**Qoder:**

| Model | Prefix | Limit | Rate Limit |
|---|---|---|---|
| `kimi-k2-thinking` | `if/` | Unlimited | No reported cap |
| `qwen3-coder-plus` | `if/` | Unlimited | No reported cap |
| `deepseek-r1` | `if/` | Unlimited | No reported cap |
| `minimax-m2.1` | `if/` | Unlimited | No reported cap |
| `kimi-k2` | `if/` | Unlimited | No reported cap |

Recommended connection method: Personal Access Token + `qoder` CLI. Browser OAuth is experimental and disabled by default unless `QODER_OAUTH_*` environment variables are configured.

**Qwen:**

| Model | Prefix | Limit | Rate Limit |
|---|---|---|---|
| `qwen3-coder-plus` | `qw/` | Unlimited | No reported cap |
| `qwen3-coder-flash` | `qw/` | Unlimited | No reported cap |
| `qwen3-coder-next` | `qw/` | Unlimited | No reported cap |
| `vision-model` | `qw/` | Unlimited | Multimodal (images) |

**Gemini CLI:**

| Model | Prefix | Limit | Rate Limit |
|---|---|---|---|
| `gemini-3-flash-preview` | `gc/` | 180K tok/month + 1K/day | Monthly reset |
| `gemini-2.5-pro` | `gc/` | 180K/month (shared pool) | High quality |

**NVIDIA NIM:**

| Tier | Daily Limit | Rate Limit | Notes |
|---|---|---|---|
| Free (Dev) | No token cap | ~40 RPM | 70+ models; transitioning to pure rate limits mid-2025 |

Popular free models:
`moonshotai/kimi-k2.5` (Kimi K2.5), `z-ai/glm4.7` (GLM 4.7), `deepseek-ai/deepseek-v3.2` (DeepSeek V3.2), `nvidia/llama-3.3-70b-instruct`, `deepseek/deepseek-r1`

**Cerebras:**

| Tier | Daily Limit | Rate Limit | Notes |
|---|---|---|---|
| Free | 1M tokens/day | 60K TPM / 30 RPM | World's fastest LLM inference; resets daily |

Available free:
`llama-3.3-70b`, `llama-3.1-8b`, `deepseek-r1-distill-llama-70b`

**Groq:**

| Tier | Daily Limit | Rate Limit | Notes |
|---|---|---|---|
| Free | 14.4K RPD | 30 RPM per model | No credit card; 429 on limit, not charged |

Available free:
`llama-3.3-70b-versatile`, `gemma2-9b-it`, `mixtral-8x7b`, `whisper-large-v3`

**LongCat:**

| Model | Prefix | Daily Free Quota | Notes |
|---|---|---|---|
| `LongCat-Flash-Lite` | `lc/` | 50M tokens | Largest free quota ever |
| `LongCat-Flash-Chat` | `lc/` | 500K tokens | Multi-turn chat |
| `LongCat-Flash-Thinking` | `lc/` | 500K tokens | Reasoning / CoT |
| `LongCat-Flash-Thinking-2601` | `lc/` | 500K tokens | Jan 2026 version |
| `LongCat-Flash-Omni-2603` | `lc/` | 500K tokens | Multimodal |

100% free while in public beta. Sign up at longcat.chat with email or phone. Resets daily at 00:00 UTC.
**Pollinations AI:**

| Model | Prefix | Rate Limit | Provider Behind |
|---|---|---|---|
| `openai` | `pol/` | 1 req/15s | GPT-5 |
| `claude` | `pol/` | 1 req/15s | Anthropic Claude |
| `gemini` | `pol/` | 1 req/15s | Google Gemini |
| `deepseek` | `pol/` | 1 req/15s | DeepSeek V3 |
| `llama` | `pol/` | 1 req/15s | Meta Llama 4 Scout |
| `mistral` | `pol/` | 1 req/15s | Mistral AI |

Zero friction: no signup, no API key. Add the Pollinations provider with an empty key field and it works immediately.
**Cloudflare Workers AI:**

| Tier | Daily Neurons | Equivalent Usage | Notes |
|---|---|---|---|
| Free | 10,000 | ~150 LLM resp / 500s audio / 15K embeds | Global edge, 50+ models |

Popular free models:
`@cf/meta/llama-3.3-70b-instruct`, `@cf/google/gemma-3-12b-it`, `@cf/openai/whisper-large-v3-turbo` (free audio!), `@cf/qwen/qwen2.5-coder-15b-instruct`

Requires an API Token + Account ID from dash.cloudflare.com. Store the Account ID in the provider settings.
**Scaleway AI:**

| Tier | Free Quota | Location | Notes |
|---|---|---|---|
| Free | 1M tokens | Paris, EU | No credit card needed within limits |

Available free:
`qwen3-235b-a22b-instruct-2507` (Qwen3 235B!), `llama-3.1-70b-instruct`, `mistral-small-3.2-24b-instruct-2506`, `deepseek-v3-0324`

EU/GDPR compliant. Get an API key at console.scaleway.com.
๐ก The Ultimate Free Stack (11 Providers, $0 Forever):
```text
Kiro (kr/)           → Claude Sonnet/Haiku UNLIMITED
Qoder (if/)          → kimi-k2-thinking, qwen3-coder-plus, deepseek-r1 UNLIMITED
LongCat Lite (lc/)   → LongCat-Flash-Lite - 50M tokens/day
Pollinations (pol/)  → GPT-5, Claude, DeepSeek, Llama 4 - no key needed
Qwen (qw/)           → qwen3-coder models UNLIMITED
Gemini (gemini/)     → Gemini 2.5 Flash - 1,500 req/day free
Cloudflare AI (cf/)  → 50+ models - 10K Neurons/day
Scaleway (scw/)      → Qwen3 235B, Llama 70B - 1M free tokens (EU)
Groq (groq/)         → Llama/Gemma - 14.4K req/day ultra-fast
NVIDIA NIM (nvidia/) → 70+ open models - 40 RPM forever
Cerebras (cerebras/) → Llama/Qwen world-fastest - 1M tok/day
```

Transcribe any audio/video for $0: Deepgram leads with $200 free credit, AssemblyAI is the $50 fallback, and Groq Whisper serves as the unlimited emergency backup.
| Provider | Free Credits | Best Model | Rate Limit |
|---|---|---|---|
| Deepgram | $200 free (signup) | `nova-3` (best accuracy, 30+ languages) | No RPM limit on free credits |
| AssemblyAI | $50 free (signup) | `universal-3-pro` (chapters, sentiment, PII) | No RPM limit on free credits |
| Groq | Free forever | `whisper-large-v3` (OpenAI Whisper) | 30 RPM (rate limited) |

Suggested combo in `/dashboard/combos`:

```text
Name: free-transcription
Strategy: Priority
Nodes:
  [1] deepgram/nova-3             → uses $200 free first
  [2] assemblyai/universal-3-pro  → fallback when Deepgram credits run out
  [3] groq/whisper-large-v3       → free forever, emergency fallback
```

Then in
`/dashboard/media` → Transcription tab: upload any audio or video file → select your combo endpoint → get the transcription in supported formats.

OmniRoute v3.6 is built as an operational platform, not just a relay proxy.
| Feature | What It Does |
|---|---|
| Uninstall / Full Uninstall | `npm run uninstall` keeps data, `npm run uninstall:full` removes everything; clean removal scripts for all install methods |
| OAuth Env Repair | One-click "Repair env" action for OAuth providers restores missing environment variables and fixes broken auth state |
| Graceful Electron Shutdown | Electron `before-quit` now shuts down Next.js gracefully, preventing SQLite WAL database locks on desktop app close |
| Model Visibility Toggle | Per-model visibility toggle (eye icon) with search filter and active-count badge (`N/M active`) on provider pages |
| Email Privacy Masking | OAuth account emails masked in the provider dashboard (`di*****@g****.com`), full address visible on hover |
| Context Relay Strategy | Combo strategy that preserves session continuity via structured handoff summaries when accounts rotate mid-conversation |
| Proxy Hardening | Token health check, API key validation, and undici dispatcher all honor proxy config; no more bypass in restricted envs |
| Node.js 24 Login Warning | Login page proactively detects incompatible Node.js versions and shows a clear warning banner with instructions |
| Gemini PDF Attachments | PDF files attached in chat messages are now correctly routed to Gemini via `inline_data` and generic base64 detection |
| CodeQL Security Hardening | Resolved SSRF, insecure randomness, polynomial ReDoS, and incomplete URL sanitization alerts |

| Feature | What It Does |
|---|---|
| Grok-4 Fast Family | xAI models at $0.20/$0.50/M; benchmarked at 1143ms (30% faster than Gemini 2.5 Flash) |
| GLM-5 via Z.AI | 128K output context, $0.5/1M; newest flagship from the GLM family |
| MiniMax M2.5 | Reasoning + agentic tasks at $0.30/1M; significant upgrade from M2.1 |
| toolCalling Flag per Model | Per-model `toolCalling: true/false` in the registry; AutoCombo skips non-tool-capable models |
| Multilingual Intent Detection | PT/ZH/ES/AR keywords in AutoCombo scoring; better model selection for non-English content |
| Benchmark-Driven Fallbacks | Real p95 latency from live requests feeds combo scoring; AutoCombo learns from actual data |
| Request Deduplication | Content-hash based dedup window; multi-agent safe, prevents duplicate charges |
| Pluggable RouterStrategy | Extensible `RouterStrategy` interface; add custom routing logic as plugins |

| Feature | What It Does |
|---|---|
| Model Playground | Dashboard page to test any model directly: provider/model/endpoint selectors, Monaco Editor, streaming, abort, timing |
| CLI Fingerprint Matching | Per-provider header/body ordering to match native CLI signatures; toggle per provider in Settings > Security. Your proxy IP is preserved |
| ACP Support (Agent Client Protocol) | CLI agent discovery (Codex, Claude, Goose, Gemini CLI, OpenClaw + 9 more), process spawner, `/api/acp/agents` endpoint |
| ACP Agents Dashboard | Debug › Agents page: grid of 14 agents with install status, version, and a custom agent form for any CLI tool. OpenCode users get a "Download opencode.json" button that auto-generates a ready-to-use config with all available models |
| Custom Model apiFormat Routing | Custom models with `apiFormat: "responses"` now correctly route to the Responses API translator |
| Codex Workspace Isolation | Multiple Codex workspaces per email; OAuth correctly separates connections by workspace ID |
| Electron Auto-Update | Desktop app checks for updates + auto-install on restart |

| Feature | What It Does |
|---|---|
| MCP Server (25 tools) | IDE/agent tools via 3 transports: stdio, SSE (`/api/mcp/sse`), Streamable HTTP (`/api/mcp/stream`). 18 core + 3 memory + 4 skill tools |
| A2A Server (JSON-RPC + SSE) | Agent-to-agent task execution with sync and streaming flows |
| Consolidated Endpoints Page | Tabbed management page with Endpoint Proxy, MCP, A2A, and API Endpoints tabs |
| Service Enable/Disable Toggles | ON/OFF switches for MCP and A2A with settings persistence (default: OFF) |
| MCP Runtime Heartbeat | Real process status (pid, uptime, heartbeat age, transport, scope mode) |
| MCP Audit Trail | Filterable audit logs with success/failure and key attribution |
| MCP Scope Enforcement | 10 granular scope permissions for controlled tool access |
| A2A Task Lifecycle Management | List/filter tasks, inspect events/artifacts, cancel running tasks |
| Agent Card Discovery | `/.well-known/agent.json` for client auto-discovery |
| Protocol E2E Test Harness | Real MCP SDK + A2A client flows in `test:protocols:e2e` |
| Operational Controls | Switch combo, apply resilience profiles, reset breakers from one control surface |

| Feature | What It Does |
|---|---|
| Smart 4-Tier Fallback | Auto-route: Subscription → API Key → Cheap → Free |
| Real-Time Quota Tracking | Live token count + reset countdown per provider |
| Format Translation | OpenAI ↔ Claude ↔ Gemini ↔ Responses with schema-safe conversions |
| Multi-Account Support | Multiple accounts per provider with intelligent selection |
| Auto Token Refresh | OAuth tokens refresh automatically with retry |
| Custom Combos | 13 balancing strategies + fallback chain control |
| Context Relay | Session continuity handoffs when account rotation happens mid-session |
| Wildcard Router | `provider/*` dynamic routing |
| Thinking Budget Controls | Passthrough, auto, custom, and adaptive reasoning limits |
| Model Aliases | Built-in + custom model aliasing and migration safety |
| Background Degradation | Route low-priority background tasks to cheaper models |
| Task-Aware Smart Routing | Auto-select model by content type (coding/vision/analysis/summarization) |
| A2A Agent Workflows | Deterministic FSM orchestrator for stateful multi-step agent executions |
| Adaptive Routing | Dynamic strategy override based on token volume and prompt complexity |
| Provider Diversity | Shannon entropy scoring balancing auto-combo traffic distribution |
| System Prompt Injection | Global behavior controls applied consistently |
| Responses API Compatibility | Full `/v1/responses` support for Codex and advanced agentic workflows |

| Feature | What It Does |
|---|---|
| Image Generation | `/v1/images/generations` with cloud and local backends |
| Embeddings | `/v1/embeddings` for search and RAG pipelines |
| Audio Transcription | `/v1/audio/transcriptions`: 7 providers (Deepgram Nova 3, AssemblyAI, Groq Whisper, HuggingFace, ElevenLabs, OpenAI, Azure), auto-language detection, MP4/MP3/WAV support |
| Text-to-Speech | `/v1/audio/speech`: 10 providers (ElevenLabs, OpenAI, Deepgram, Cartesia, PlayHT, HuggingFace, Nvidia NIM, Inworld, Coqui, Tortoise) with correct error messages |
| Video Generation | `/v1/videos/generations` (ComfyUI + SD WebUI workflows) |
| Music Generation | `/v1/music/generations` (ComfyUI workflows) |
| Moderations | `/v1/moderations` safety checks |
| Reranking | `/v1/rerank` for relevance scoring |
| Web Search | `/v1/search`: 5 providers (Serper, Brave, Perplexity, Exa, Tavily), 6,500+ free/month, auto-failover, cache |

| Feature | What It Does |
|---|---|
| Circuit Breakers | Per-model trip/recover with threshold controls |
| Endpoint-Aware Models | Custom models declare supported endpoints + API format |
| Anti-Thundering Herd | Mutex + semaphore protections on retry/rate events |
| Semantic + Signature Cache | Cost/latency reduction with two cache layers |
| Request Idempotency | Duplicate protection window |
| TLS Fingerprint Spoofing | Browser-like TLS fingerprint; reduces bot detection and account flagging |
| CLI Fingerprint Matching | Matches native CLI request signatures; reduces ban risk while preserving the proxy IP |
| IP Filtering | Allowlist/blocklist control for exposed deployments |
| Editable Rate Limits | Configurable global/provider-level limits with persistence |
| Graceful Degradation | Multi-layer capability fallbacks protecting core gateway operations |
| Config Audit Trail | Diff-based change tracking preventing operational drift with simple rollbacks |
| Provider Health Sync | Proactive token expiration monitoring triggering alerts before authorization failures |
| Auto-Disable Banned Accounts | Operational circuit breaker sealing permanently blocked token accounts automatically |
| API Key Management + Scoping | Secure key issuance/rotation and model/provider controls |
| Scoped API Key Reveal | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| Protected `/models` | Optional auth gating and provider hiding for the model catalog |

| Feature | What It Does |
|---|---|
| Request + Proxy Logging | Full request/response and proxy logging |
| Streamed Detailed Logs | Reconstructs SSE payload streams cleanly into the UI |
| Unified Logs Dashboard | Request, proxy, audit, and console views in one page |
| Request Telemetry | p50/p95/p99 latency and request tracing |
| Health Dashboard | Uptime, breaker states, lockouts, cache stats |
| Cost Tracking | Budget controls and per-model pricing visibility |
| Analytics Visualizations | Model/provider usage insights and trend views |
| Evaluation Framework | Golden set testing with configurable match strategies |
| Live Diagnostics | Semantic cache bypass for accurate combo live testing |

| Feature | What It Does |
|---|---|
| Deploy Anywhere | Localhost, VPS, Docker, Cloud environments |
| Cloudflare Tunnel | One-click Quick Tunnel integration from the dashboard |
| API Key Model Filtering | Native `/v1/models` response filtered via assigned Bearer context roles |
| Smart Cache Bypass | Configurable TTL heuristics and forced refetch controls |
| Backup/Restore | Export/import and disaster recovery flows |
| Onboarding Wizard | First-run guided setup |
| CLI Tools Dashboard | One-click setup for popular coding tools |
| Model Playground | Test any provider/model/endpoint from the dashboard |
| CLI Fingerprint Toggle | Per-provider fingerprint matching in Settings > Security |
| i18n (30 languages) | Full dashboard + docs language support with RTL coverage |
| Clear All Models | One-click model list clearing in provider details |
| Sidebar Controls | Hide components and integrations from Appearance Settings |
| Issue Templates | Standardized GitHub templates for bugs and features |
| Custom Data Directory | `DATA_DIR` override for the storage location |

```text
Combo: "my-coding-stack"
1. cc/claude-opus-4-6
2. nvidia/llama-3.3-70b
3. glm/glm-4.7
4. if/kimi-k2-thinking
```
When a quota, rate limit, or health check fails, OmniRoute automatically moves to the next candidate without manual switching.
- MCP + A2A are discoverable in UI and docs (not hidden)
- Protocol status APIs expose live operational data (`/api/mcp/*`, `/api/a2a/*`)
- Dashboards include actions for day-2 ops (combo toggles, breaker resets, task cancellation)
The Translator area includes:
- Playground: request transformation checks
- Chat Tester: full request/response round-trip
- Test Bench: multiple cases in one run
- Live Monitor: real-time traffic view
Plus protocol validation with real clients via `npm run test:protocols:e2e`.

MCP Server README: tool reference, IDE configs, and client examples
A2A Server README: skills, JSON-RPC methods, streaming, and task lifecycle
OmniRoute includes a built-in evaluation framework to test LLM response quality against a golden set. Access it via Analytics → Evals in the dashboard.
The pre-loaded "OmniRoute Golden Set" contains test cases for:
- Greetings, math, geography, code generation
- JSON format compliance, translation, markdown generation
- Safety refusal (harmful content), counting, boolean logic
| Strategy | Description | Example |
|---|---|---|
| `exact` | Output must match exactly | `"4"` |
| `contains` | Output must contain substring (case-insensitive) | `"Paris"` |
| `regex` | Output must match regex pattern | `"1.*2.*3"` |
| `custom` | Custom JS function returns true/false | `(output) => output.length > 10` |
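The four match strategies map naturally onto small predicate functions. A minimal sketch (the strategy names come from the table above; OmniRoute's own matchers are JavaScript, and this Python translation is illustrative):

```python
import re
from typing import Callable

def make_matcher(strategy: str, expected) -> Callable[[str], bool]:
    """Return a predicate that checks a model output against the golden value."""
    if strategy == "exact":
        return lambda out: out == expected
    if strategy == "contains":
        # case-insensitive substring check, as described in the table
        return lambda out: expected.lower() in out.lower()
    if strategy == "regex":
        return lambda out: re.search(expected, out) is not None
    if strategy == "custom":
        return expected  # already a callable taking the output string
    raise ValueError(f"unknown strategy: {strategy}")

assert make_matcher("exact", "4")("4")
assert make_matcher("contains", "Paris")("The capital is paris.")
assert make_matcher("regex", "1.*2.*3")("count: 1, then 2, then 3")
assert make_matcher("custom", lambda out: len(out) > 10)("hello world!")
```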
๐งฉ MCP Setup (Model Context Protocol)
Start the MCP transport in stdio mode:

```bash
omniroute --mcp
```
Recommended validation flow:
- Connect your MCP client over stdio.
- Run `omniroute_get_health`.
- Run `omniroute_list_combos`.
- Open `/dashboard/mcp` to confirm heartbeat, activity, and audit.
Useful APIs for automation:
- `GET /api/mcp/status`
- `GET /api/mcp/tools`
- `GET /api/mcp/audit`
- `GET /api/mcp/audit/stats`
๐ค A2A Setup (Agent2Agent)
Discover the agent:
```bash
curl http://localhost:20128/.well-known/agent.json
```
Send a task:
```bash
curl -X POST http://localhost:20128/a2a \
  -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":"setup-a2a","method":"message/send","params":{"skill":"quota-management","messages":[{"role":"user","content":"Summarize quota status."}]}}'
```
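The same `message/send` body can be built programmatically before posting it to `/a2a`. A sketch of the JSON-RPC 2.0 envelope, with the field layout taken directly from the curl example above (no network call is made here):

```python
import json

def a2a_message_send(task_id: str, skill: str, text: str) -> str:
    """Build the JSON-RPC 2.0 body for an A2A message/send call."""
    payload = {
        "jsonrpc": "2.0",
        "id": task_id,
        "method": "message/send",
        "params": {
            "skill": skill,
            "messages": [{"role": "user", "content": text}],
        },
    }
    return json.dumps(payload)

body = a2a_message_send("setup-a2a", "quota-management", "Summarize quota status.")
# POST this string to http://localhost:20128/a2a with content-type: application/json
print(body)
```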
Manage lifecycle:
- `GET /api/a2a/status`
- `GET /api/a2a/tasks`
- `GET /api/a2a/tasks/:id`
- `POST /api/a2a/tasks/:id/cancel`
Operational UI:
`/dashboard/a2a` for task/state/stream observability and smoke actions
๐งช End-to-end protocol validation
Validate both protocols with real clients:
```bash
npm run test:protocols:e2e
```
This verifies:
- MCP SDK client connect/list/call
- A2A discovery/send/stream/get/cancel
- Cross-check data in MCP audit and A2A task management APIs
๐ณ Subscription Providers
```text
Dashboard → Providers → Connect Claude Code → OAuth login
  → Auto token refresh → 5-hour + weekly quota tracking

Models:
  cc/claude-opus-4-6
  cc/claude-sonnet-4-5-20250929
  cc/claude-haiku-4-5-20251001
```
Pro Tip: Use Opus for complex tasks, Sonnet for speed. OmniRoute tracks quota per model!
```text
Dashboard → Providers → Connect Codex → OAuth login (port 1455)
  → 5-hour + weekly reset

Models:
  cx/gpt-5.2-codex
  cx/gpt-5.1-codex-max
```
Each Codex account now has policy toggles in Dashboard → Providers:

- `5h` (ON/OFF): enforce the 5-hour window threshold policy.
- `Weekly` (ON/OFF): enforce the weekly window threshold policy.
- Threshold behavior: when an enabled window reaches >=90% usage, that account is skipped.
- Rotation behavior: OmniRoute routes to the next eligible Codex account automatically.
- Reset behavior: when the provider `resetAt` time passes, the account becomes eligible again automatically.
Scenarios:
- `5h ON` + `Weekly ON`: the account is skipped when either window reaches its threshold.
- `5h OFF` + `Weekly ON`: only weekly usage can block the account.
- `5h ON` + `Weekly OFF`: only 5-hour usage can block the account.
- `resetAt` passed: the account re-enters rotation automatically (no manual re-enable).
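Taken together, the rules above amount to: skip an account whenever any *enabled* window sits at >=90% usage, unless its `resetAt` has already passed. A hedged sketch of that eligibility check (the function and field names are invented for illustration; only the thresholds and toggle semantics come from the rules above):

```python
import time

def account_eligible(usage_5h: float, usage_weekly: float,
                     enforce_5h: bool, enforce_weekly: bool,
                     reset_at: float, now: float) -> bool:
    """usage_* are fractions of the window quota (0.0 .. 1.0)."""
    if now >= reset_at:
        return True  # window reset passed: account re-enters rotation
    if enforce_5h and usage_5h >= 0.90:
        return False  # 5-hour window at threshold and its toggle is ON
    if enforce_weekly and usage_weekly >= 0.90:
        return False  # weekly window at threshold and its toggle is ON
    return True

now = time.time()
# 5h ON + Weekly ON, 5-hour window at 95% -> skipped
print(account_eligible(0.95, 0.10, True, True, reset_at=now + 3600, now=now))   # False
# Same usage but the 5h toggle is OFF -> only weekly can block, so eligible
print(account_eligible(0.95, 0.10, False, True, reset_at=now + 3600, now=now))  # True
```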
```text
Dashboard → Providers → Connect Gemini CLI → Google OAuth
  → 180K completions/month + 1K/day

Models:
  gc/gemini-3-flash-preview
  gc/gemini-2.5-pro
```
Best Value: Huge free tier! Use this before paid tiers.
```text
Dashboard → Providers → Connect GitHub → OAuth via GitHub
  → Monthly reset (1st of month)

Models:
  gh/gpt-5
  gh/claude-4.5-sonnet
  gh/gemini-3.1-pro-preview
```
๐ API Key Providers
- Sign up: build.nvidia.com
- Get free API key (1000 inference credits included)
- Dashboard → Add Provider → NVIDIA NIM:
  - API Key: `nvapi-your-key`
Models:
`nvidia/llama-3.3-70b-instruct`, `nvidia/mistral-7b-instruct`, and 50+ more

Pro Tip: OpenAI-compatible API, so it works seamlessly with OmniRoute's format translation!
- Sign up: platform.deepseek.com
- Get API key
- Dashboard → Add Provider → DeepSeek
Models:
`deepseek/deepseek-chat`, `deepseek/deepseek-coder`

- Sign up: console.groq.com
- Get API key (free tier included)
- Dashboard → Add Provider → Groq
Models:
`groq/llama-3.3-70b`, `groq/mixtral-8x7b`

Pro Tip: Ultra-fast inference, best for real-time coding!
- Sign up: openrouter.ai
- Get API key
- Dashboard → Add Provider → OpenRouter
Models: Access 100+ models from all major providers through a single API key.
Dashboard behavior: OpenRouter models are managed from Available Models. Manual add, import, and auto-sync all update the same list.
๐ฐ Cheap Providers (Backup)
- Sign up: Zhipu AI
- Get API key from Coding Plan
- Dashboard → Add API Key:
  - Provider: `glm`
  - API Key: `your-key`
Use:
`glm/glm-4.7`

Pro Tip: The Coding Plan offers 3× the quota at 1/7 the cost! Resets daily at 10:00 AM.
- Sign up: MiniMax
- Get API key
- Dashboard → Add API Key
Use:
`minimax/MiniMax-M2.1`

Pro Tip: Cheapest option for long context (1M tokens)!
- Subscribe: Moonshot AI
- Get API key
- Dashboard → Add API Key
Use:
`kimi/kimi-latest`

Pro Tip: A fixed $9/month for 10M tokens = $0.90/1M effective cost!
๐ FREE Providers (Emergency Backup)
```text
Dashboard → Connect Qoder → Qoder OAuth login → Unlimited usage

Models:
  if/kimi-k2-thinking
  if/qwen3-coder-plus
  if/glm-4.7
  if/minimax-m2
  if/deepseek-r1
```

```text
Dashboard → Connect Qwen → Device code authorization → Unlimited usage

Models:
  qw/qwen3-coder-plus
  qw/qwen3-coder-flash
```

```text
Dashboard → Connect Kiro → AWS Builder ID or Google/GitHub → Unlimited usage

Models:
  kr/claude-sonnet-4.5
  kr/claude-haiku-4.5
```
๐จ Create Combos
```text
Dashboard → Combos → Create New

Name: premium-coding
Models:
  1. cc/claude-opus-4-6    (Subscription primary)
  2. glm/glm-4.7           (Cheap backup, $0.6/1M)
  3. minimax/MiniMax-M2.1  (Cheapest fallback, $0.20/1M)
Use in CLI: premium-coding
```

```text
Name: free-combo
Models:
  1. gc/gemini-3-flash-preview  (180K free/month)
  2. if/kimi-k2-thinking        (unlimited)
  3. qw/qwen3-coder-plus        (unlimited)
Cost: $0 forever!
```

🔧 CLI Integration
```text
Settings → Models → Advanced:
  OpenAI API Base URL: http://localhost:20128/v1
  OpenAI API Key: [from OmniRoute dashboard]
  Model: cc/claude-opus-4-6
```

Use the CLI Tools page in the dashboard for one-click configuration, or edit `~/.claude/settings.json` manually.

```bash
export OPENAI_BASE_URL="http://localhost:20128"
export OPENAI_API_KEY="your-omniroute-api-key"
codex "your prompt"
```
Option 1 โ Dashboard (recommended):
```text
Dashboard → CLI Tools → OpenClaw → Select Model → Apply
```

Option 2: Manual. Edit `~/.openclaw/openclaw.json`:

```json
{
  "models": {
    "providers": {
      "omniroute": {
        "baseUrl": "http://127.0.0.1:20128/v1",
        "apiKey": "sk_omniroute",
        "api": "openai-completions"
      }
    }
  }
}
```

Note: OpenClaw only works with local OmniRoute. Use `127.0.0.1` instead of `localhost` to avoid IPv6 resolution issues.

```text
Settings → API Configuration:
  Provider: OpenAI Compatible
  Base URL: http://localhost:20128/v1
  API Key: [from OmniRoute dashboard]
  Model: if/kimi-k2-thinking
```

Step 1: Add OmniRoute as a custom provider:
```bash
opencode /connect
# Select "Other" → Enter ID: "omniroute" → Enter your OmniRoute API key
```

Step 2: Create/edit `opencode.json` in your project root:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "omniroute": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "OmniRoute",
      "options": { "baseURL": "http://localhost:20128/v1" },
      "models": {
        "cc/claude-sonnet-4-20250514": { "name": "Claude Sonnet 4" },
        "gg/gemini-2.5-pro": { "name": "Gemini 2.5 Pro" },
        "if/kimi-k2-thinking": { "name": "Kimi K2 (Free)" }
      }
    }
  }
}
```

Step 3: Select the model in OpenCode:

```bash
/models
# Select any OmniRoute model from the list
```

Tip: Add any model available from your OmniRoute `/v1/models` endpoint to the `models` section. Use the `provider/model-id` format from your OmniRoute dashboard.
Click to expand troubleshooting guide
"Language model did not provide messages"
- Provider quota exhausted → check the dashboard quota tracker
- Solution: use a combo fallback or switch to a cheaper tier
Rate limiting
- Subscription quota exhausted → fall back to GLM/MiniMax
- Add combo: `cc/claude-opus-4-6` → `glm/glm-4.7` → `if/kimi-k2-thinking`
OAuth token expired
- Auto-refreshed by OmniRoute
- If issues persist: Dashboard → Provider → Reconnect
High costs
- Check usage stats in Dashboard → Costs
- Switch primary model to GLM/MiniMax
- Use free tier (Gemini CLI, Qoder) for non-critical tasks
Dashboard/API ports are wrong
- `PORT` is the canonical base port (and the API port by default)
- `API_PORT` overrides only the OpenAI-compatible API listener
- `DASHBOARD_PORT` overrides only the dashboard/Next.js listener
- Set `NEXT_PUBLIC_BASE_URL` to your dashboard/public URL (for OAuth callbacks)
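The port precedence can be expressed as a tiny resolver: `PORT` is the base, and each override applies to its own listener only. A sketch under those rules (this is not OmniRoute's actual startup code; the function name and return shape are invented):

```python
def resolve_ports(env: dict) -> tuple[int, int]:
    """Return (api_port, dashboard_port) following the precedence above."""
    base = int(env.get("PORT", 20128))                 # canonical base port
    api = int(env.get("API_PORT", base))               # overrides only the API listener
    dashboard = int(env.get("DASHBOARD_PORT", base))   # overrides only the dashboard
    return api, dashboard

print(resolve_ports({}))                                    # (20128, 20128)
print(resolve_ports({"PORT": "3000", "API_PORT": "3001"}))  # (3001, 3000)
```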
Cloud sync errors
- Verify `BASE_URL` points to your running instance
- Verify `CLOUD_URL` points to your expected cloud endpoint
- Keep `NEXT_PUBLIC_*` values aligned with the server-side values
First login not working
- Check `INITIAL_PASSWORD` in `.env`
- If unset, the fallback password is `123456`
No request logs
- Request artifacts are written to `DATA_DIR/call_logs/` as one JSON file per request
- Enable pipeline capture from Dashboard → Logs → Request Logs if you need detailed per-stage payloads
- Set `APP_LOG_TO_FILE=true` if you also want application console logs in `logs/application/app.log`
- Adjust `APP_LOG_MAX_FILE_SIZE`, `APP_LOG_RETENTION_DAYS`, `APP_LOG_MAX_FILES`, and `CALL_LOG_MAX_ENTRIES` as needed
Connection test shows "Invalid" for OpenAI-compatible providers
- Many providers don't expose a `/models` endpoint
- OmniRoute v1.0.6+ includes fallback validation via chat completions
- Ensure the base URL includes the `/v1` suffix
โ ๏ธ Important for users running OmniRoute on a VPS, Docker, or any remote serverThe Antigravity and Gemini CLI providers use Google OAuth 2.0. Google requires the
redirect_uriin the OAuth flow to exactly match one of the pre-registered URIs in the app's Google Cloud Console.The OAuth credentials bundled in OmniRoute are registered for
localhostonly. When you access OmniRoute on a remote server (e.g.https://omniroute.myserver.com), Google rejects the authentication with:Error 400: redirect_uri_mismatchYou need to create an OAuth 2.0 Client ID in Google Cloud Console with your server's URI.
1. Open Google Cloud Console
Go to: https://console.cloud.google.com/apis/credentials
2. Create a new OAuth 2.0 Client ID
- Click "+ Create Credentials" → "OAuth client ID"
- Application type: "Web application"
- Name: anything you like (e.g. `OmniRoute Remote`)
3. Add Authorized Redirect URIs
In the "Authorized redirect URIs" field, add:
https://your-server.com/callbackReplace
your-server.comwith your server's domain or IP (include the port if needed, e.g.http://45.33.32.156:20128/callback).4. Save and copy the credentials
After creating the client, Google will show the Client ID and Client Secret.

**5. Set environment variables**

In your `.env` (or Docker environment variables):

```env
# For Antigravity:
ANTIGRAVITY_OAUTH_CLIENT_ID=your-client-id.apps.googleusercontent.com
ANTIGRAVITY_OAUTH_CLIENT_SECRET=GOCSPX-your-secret

# For Gemini CLI:
GEMINI_OAUTH_CLIENT_ID=your-client-id.apps.googleusercontent.com
GEMINI_OAUTH_CLIENT_SECRET=GOCSPX-your-secret
GEMINI_CLI_OAUTH_CLIENT_SECRET=GOCSPX-your-secret
```

**6. Restart OmniRoute**

```bash
# npm:
npm run dev

# Docker:
docker restart omniroute
```
**7. Try connecting again**

Dashboard → Providers → Antigravity (or Gemini CLI) → OAuth

Google will now redirect correctly to `https://your-server.com/callback`.
If you don't want to set up your own credentials right now, you can still use the manual URL flow:

1. OmniRoute opens the Google authorization URL
2. After you authorize, Google tries to redirect to `localhost` (which fails on the remote server)
3. Copy the full URL from your browser's address bar (even if the page doesn't load)
4. Paste that URL into the field shown in the OmniRoute connection modal
5. Click "Connect"

This works because the authorization code in the URL is valid regardless of whether the redirect page loaded.
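The manual flow works because the `code` query parameter survives in the address bar even when the page fails to load. A sketch of what parsing the pasted URL might look like (illustrative, not OmniRoute's actual parser):

```python
from urllib.parse import urlparse, parse_qs

def extract_auth_code(pasted_url):
    """Pull the OAuth authorization code out of a pasted redirect URL,
    even when the redirect page itself never loaded."""
    query = parse_qs(urlparse(pasted_url).query)
    codes = query.get("code")
    return codes[0] if codes else None
```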
Click to expand tech stack details
- Runtime: Node.js 18–22 LTS (⚠️ Node.js 24+ is not supported: `better-sqlite3` native binaries are incompatible)
- Language: TypeScript 5.9, 100% TypeScript across `src/` and `open-sse/` (zero `any` in core modules since v2.0)
- Framework: Next.js 16 + React 19 + Tailwind CSS 4
- Database: better-sqlite3 (SQLite) + LowDB (JSON legacy) for domain state, proxy logs, MCP audit, routing decisions, memory, skills
- Schemas: Zod (MCP tool I/O validation, API contracts)
- Protocols: MCP (stdio/HTTP) + A2A v0.3 (JSON-RPC 2.0 + SSE)
- Streaming: Server-Sent Events (SSE)
- Auth: OAuth 2.0 (PKCE) + JWT + API Keys + MCP Scoped Authorization
- Testing: Node.js test runner + Vitest (900+ tests including unit, integration, E2E)
- CI/CD: GitHub Actions (auto npm publish + Docker Hub on release)
- Website: omniroute.online
- Package: npmjs.com/package/omniroute
- Docker: hub.docker.com/r/diegosouzapw/omniroute
- Resilience: Circuit breaker, exponential backoff, anti-thundering herd, TLS spoofing, auto-combo self-healing
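As one example of the resilience features listed above, exponential backoff with jitter (the usual remedy for a thundering herd of synchronized retries) can be sketched like this. This is an illustration of the technique, not OmniRoute's implementation:

```python
import random

def backoff_delays(retries, base=0.5, cap=30.0, rng=random.random):
    """Exponential backoff with full jitter: each delay is drawn uniformly
    from [0, min(cap, base * 2**attempt)], de-synchronizing retrying clients."""
    return [rng() * min(cap, base * (2 ** attempt)) for attempt in range(retries)]
```

Passing `rng` in makes the schedule deterministic under test while staying random in production.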
| Document | Description |
|---|---|
| User Guide | Providers, combos, CLI integration, deployment |
| API Reference | All endpoints with examples |
| MCP Server | 25 MCP tools, IDE configs, Python/TS/Go clients |
| A2A Server | JSON-RPC 2.0 protocol, skills, streaming, task mgmt |
| Auto-Combo Engine | 6-factor scoring, mode packs, self-healing |
| Context Relay | Session handoff strategy for account rotation |
| Troubleshooting | Common problems and solutions |
| Architecture | System architecture and internals |
| Codebase Documentation | Beginner-friendly codebase walkthrough |
| Uninstall Guide | Clean removal for all install methods |
| Contributing | Development setup and guidelines |
| OpenAPI Spec | OpenAPI 3.0 specification |
| Security Policy | Vulnerability reporting and security practices |
| VM Deployment | Complete guide: VM + nginx + Cloudflare setup |
| Features Gallery | Visual dashboard tour with screenshots |
| Release Checklist | Pre-release validation steps |
OmniRoute has 210+ features planned across multiple development phases. Here are the key areas:
| Category | Planned Features | Highlights |
|---|---|---|
| Routing & Intelligence | 25+ | Lowest-latency routing, tag-based routing, quota preflight, P2C account selection |
| Security & Compliance | 20+ | SSRF hardening, credential cloaking, rate-limit per endpoint, management key scoping |
| Observability | 15+ | OpenTelemetry integration, real-time quota monitoring, cost tracking per model |
| Provider Integrations | 20+ | Dynamic model registry, provider cooldowns, multi-account Codex, Copilot quota parsing |
| Performance | 15+ | Dual cache layer, prompt cache, response cache, streaming keepalive, batch API |
| Ecosystem | 10+ | WebSocket API, config hot-reload, distributed config store, commercial mode |

- OpenCode Integration – Native provider support for the OpenCode AI coding IDE
- TRAE Integration – Full support for the TRAE AI development framework
- Batch API – Asynchronous batch processing for bulk requests
- Tag-Based Routing – Route requests based on custom tags and metadata
- Lowest-Cost Strategy – Automatically select the cheapest available provider
Full feature specifications are available in `docs/new-features/` (217 detailed specs).
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
See CONTRIBUTING.md for detailed guidelines.
```bash
# Create a release – npm publish happens automatically
gh release create v2.0.0 --title "v2.0.0" --generate-notes
```
Special thanks to 9router by decolua – the original project that inspired this fork. OmniRoute builds upon that incredible foundation with additional features, multi-modal APIs, and a full TypeScript rewrite.

Special thanks to CLIProxyAPI – the original Go implementation that inspired this JavaScript port.
MIT License - see LICENSE for details.
Built with ❤️ for developers who code 24/7
omniroute.online










