deltallm
Route, manage, and analyze your LLM requests across multiple providers with a unified API interface
DeltaLLM is a self-hosted AI gateway that gives you a single OpenAI-compatible API for 100+ LLM providers, with enterprise controls like routing, budgets, guardrails, and team management built in.
from openai import OpenAI

# Before: Direct to OpenAI
client = OpenAI(api_key="sk-...")

# After: Through DeltaLLM
client = OpenAI(
    base_url="http://localhost:4002/v1",  # <- Just change this
    api_key="sk-deltallm-key",
)

That's it. Your existing code works unchanged, now with routing, spend tracking, and guardrails.
Manage all your model deployments, API keys, teams, and usage from a clean web interface.
- Unified API: One OpenAI-compatible endpoint for 100+ LLM providers
- Virtual API Keys: Scoped keys with budgets, rate limits, and model restrictions
- MCP Gateway: Register external MCP servers, expose approved tools safely
- Routing & Failover: Multiple strategies with automatic retries
- Guardrails: Built-in PII detection and prompt injection protection
- Spend Tracking: Per-key, per-team, per-model cost attribution
- RBAC: Role-based access at platform, organization, and team levels
- Admin Dashboard: Full-featured web UI for managing everything
- Response Caching: Memory, Redis, or S3 backends for lower latency and cost
- Observability: Prometheus metrics, request logging, and spend analytics
Docs: https://deltallm.readthedocs.io/en/latest
- Option 1: Docker Compose: fastest way to run DeltaLLM locally for evaluation.
- Option 2: Kubernetes From a Released Chart: install with Helm from the public chart repository without cloning the repository.
- Option 3: Local Development From the Repo: best path for contributors and for local backend or UI work.
Use Docker Compose if you want the fastest working setup.
git clone https://github.com/deltawi/deltallm.git
cd deltallm
cp config.example.yaml config.yaml

For the quickest first successful request, enable one-time model bootstrap in config.yaml:
general_settings:
  model_deployment_source: db_only
  model_deployment_bootstrap_from_config: true

This seeds the sample model_list into the database on first startup. After the first successful boot, set model_deployment_bootstrap_from_config back to false.
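Turning the bootstrap flag back off after the first boot is easy to forget, so it can help to script it. The following is a minimal sketch, assuming the flag appears on its own line in config.yaml exactly as in the snippet above; `set_bootstrap_flag` is a hypothetical helper and not part of DeltaLLM.

```python
import re

# Flag name taken from the config snippet above.
FLAG = "model_deployment_bootstrap_from_config"


def set_bootstrap_flag(config_text: str, enabled: bool) -> str:
    """Return config_text with the bootstrap flag set to true or false.

    Hypothetical helper: edits the text line in place so that comments and
    key order in the rest of the file are left untouched.
    """
    pattern = re.compile(rf"^([ \t]*{FLAG}[ \t]*:[ \t]*)(?:true|false)\b", re.MULTILINE)
    value = "true" if enabled else "false"
    new_text, count = pattern.subn(lambda m: m.group(1) + value, config_text)
    if count == 0:
        raise ValueError(f"{FLAG} not found in config")
    return new_text
```

A typical use would be to read config.yaml, call `set_bootstrap_flag(text, False)`, write the result back, and restart the gateway.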
DeltaLLM will not start with placeholder values such as change-me.
python3 -c 'import secrets; print("DELTALLM_MASTER_KEY=sk-" + secrets.token_hex(20) + "A1")'
python3 -c 'import secrets; print("DELTALLM_SALT_KEY=" + secrets.token_hex(32))'

Create a .env file in the project root:
DELTALLM_MASTER_KEY=sk-your-generated-master-key
DELTALLM_SALT_KEY=your-generated-salt-key
OPENAI_API_KEY=sk-your-openai-key
PLATFORM_BOOTSTRAP_ADMIN_EMAIL=admin@example.com
PLATFORM_BOOTSTRAP_ADMIN_PASSWORD=ChangeMe123!

The sample config uses OPENAI_API_KEY. If you want a different provider, edit config.yaml before starting.
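Because the gateway refuses to start with placeholder values, a quick pre-flight check on the .env file can save a failed boot. The sketch below generates the two secrets with the same recipe as the one-liners above and flags values that still look like placeholders. The marker list and all function names are assumptions for illustration; the README only names `change-me`, and DeltaLLM's actual startup validation may differ.

```python
import secrets

# Assumption: illustrative markers. Only "change-me" is named in the README;
# the others match the sample .env values shown above.
PLACEHOLDER_MARKERS = ("change-me", "your-generated", "your-openai")


def generate_secrets() -> dict[str, str]:
    """Generate both secrets using the same recipe as the shell one-liners."""
    return {
        "DELTALLM_MASTER_KEY": "sk-" + secrets.token_hex(20) + "A1",
        "DELTALLM_SALT_KEY": secrets.token_hex(32),
    }


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def find_placeholders(env: dict[str, str]) -> list[str]:
    """Return the names of variables whose value still looks like a placeholder."""
    return [
        key
        for key, value in env.items()
        if any(marker in value.lower() for marker in PLACEHOLDER_MARKERS)
    ]
```

For example, `find_placeholders(parse_env(open(".env").read()))` returns the variable names that still need real values before running Docker Compose.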
docker compose --profile single up -d --build

If you want the full Presidio engine for guardrails instead of the default regex fallback:

INSTALL_PRESIDIO=true docker compose --profile single up -d --build

This starts:
- DeltaLLM on http://localhost:4002
- PostgreSQL
- Redis
Check liveliness:
curl http://localhost:4002/health/liveliness

List available models:
curl http://localhost:4002/v1/models \
-H "Authorization: Bearer $DELTALLM_MASTER_KEY"

If this list is empty, you did not bootstrap a model and must either:
- set model_deployment_bootstrap_from_config: true and restart once, or
- create a model deployment in the Admin UI before sending requests
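To check for an empty model list in a script rather than by eye, the response body can be parsed directly. This is a small sketch that assumes the gateway returns the OpenAI-style list shape (`{"object": "list", "data": [{"id": ...}]}`), which the README implies by describing the API as OpenAI-compatible but does not show. `model_ids` is a hypothetical helper.

```python
import json


def model_ids(response_text: str) -> list[str]:
    """Extract model IDs from a /v1/models response body.

    Assumes the OpenAI-style shape: a JSON object with a "data" array
    whose entries each carry an "id" field.
    """
    payload = json.loads(response_text)
    return [item["id"] for item in payload.get("data", [])]
```

An empty return value corresponds to the "list is empty" case above, meaning no model deployment has been bootstrapped or created yet.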
curl http://localhost:4002/v1/chat/completions \
-H "Authorization: Bearer $DELTALLM_MASTER_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [
{"role": "user", "content": "Hello from DeltaLLM"}
]
}'

Open http://localhost:4002.
If you set PLATFORM_BOOTSTRAP_ADMIN_EMAIL and PLATFORM_BOOTSTRAP_ADMIN_PASSWORD, you can log in with that initial admin account. You can also keep using the master key for gateway calls.
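The curl request above can also be issued from Python with only the standard library. The sketch below mirrors that call: same URL, headers, and JSON body. The function names are hypothetical, and the response handling in the usage note assumes the standard OpenAI chat completion shape.

```python
import json
import urllib.request

# Base URL taken from the Docker Compose quick start above.
BASE_URL = "http://localhost:4002"


def build_chat_request(api_key: str, model: str, prompt: str,
                       base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/chat/completions."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the stack running, `send(build_chat_request(os.environ["DELTALLM_MASTER_KEY"], "gpt-4o-mini", "Hello from DeltaLLM"))` should return the completion as a dictionary; under the OpenAI response shape, the text would sit at `reply["choices"][0]["message"]["content"]`.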
Released Helm charts and matching values overlays are published to the public Helm repository at https://deltawi.github.io/deltallm.
helm repo add deltallm https://deltawi.github.io/deltallm
helm repo update

Generate the required secrets first:

export DELTALLM_MASTER_KEY="$(python3 -c 'import secrets; print("sk-" + secrets.token_hex(20) + "A1")')"
export DELTALLM_SALT_KEY="$(openssl rand -hex 32)"

Quick-start install with bundled PostgreSQL and Redis:
helm install deltallm deltallm/deltallm \
--version <chart-version> \
--namespace deltallm \
--create-namespace \
-f https://deltawi.github.io/deltallm/values-eval-<chart-version>.yaml \
--set secret.values.masterKey="$DELTALLM_MASTER_KEY" \
--set secret.values.saltKey="$DELTALLM_SALT_KEY" \
--set-string env[0].name=PLATFORM_BOOTSTRAP_ADMIN_EMAIL \
--set-string env[0].value=admin@example.com \
--set-string env[1].name=PLATFORM_BOOTSTRAP_ADMIN_PASSWORD \
--set-string env[1].value='ChangeMe123!'

If you want the Presidio-enabled image variant from the same chart release:
helm install deltallm deltallm/deltallm \
--version <chart-version> \
--namespace deltallm \
--create-namespace \
-f https://deltawi.github.io/deltallm/values-eval-<chart-version>.yaml \
--set secret.values.masterKey="$DELTALLM_MASTER_KEY" \
--set secret.values.saltKey="$DELTALLM_SALT_KEY" \
--set-string env[0].name=PLATFORM_BOOTSTRAP_ADMIN_EMAIL \
--set-string env[0].value=admin@example.com \
--set-string env[1].name=PLATFORM_BOOTSTRAP_ADMIN_PASSWORD \
--set-string env[1].value='ChangeMe123!' \
--set image.tag=v<chart-version>-presidio

values-eval-<chart-version>.yaml is the self-contained quick-start profile. Use values-production-<chart-version>.yaml with external PostgreSQL and Redis for production.
Use the latest GitHub Release version for <chart-version>. The exact copy-paste install commands for each release live in the release notes.
For full Kubernetes examples and values, see docs/deployment/kubernetes.md.
Use this path if you want to work on the backend or UI locally instead of running the full Compose stack.
- Python 3.11+
- Node.js 20+
- PostgreSQL 15+
- Redis 7+ (optional)
uv is the recommended backend installer because the repo includes uv.lock.
uv sync --dev

If you want the full Presidio engine locally for guardrails:

uv sync --dev --extra guardrails-presidio

In another shell for the UI:
cd ui
npm ci
cd ..

export DATABASE_URL="postgresql://postgres:postgres@localhost:5432/deltallm"
export DELTALLM_CONFIG_PATH=./config.yaml
export DELTALLM_MASTER_KEY="$(python3 -c 'import secrets; print("sk-" + secrets.token_hex(20) + "A1")')"
export DELTALLM_SALT_KEY="$(openssl rand -hex 32)"
export OPENAI_API_KEY="sk-your-openai-key"
export PLATFORM_BOOTSTRAP_ADMIN_EMAIL="admin@example.com"
export PLATFORM_BOOTSTRAP_ADMIN_PASSWORD="ChangeMe123!"

If Redis is available:

export REDIS_URL="redis://localhost:6379/0"

cp config.example.yaml config.yaml

For a fresh database, enable one-time bootstrap in config.yaml if you want the sample model available immediately:
general_settings:
  model_deployment_source: db_only
  model_deployment_bootstrap_from_config: true

uv run prisma generate --schema=./prisma/schema.prisma
uv run prisma py fetch
uv run prisma db push --schema=./prisma/schema.prisma

uv run uvicorn src.main:app --host 0.0.0.0 --port 8000 --reload

cd ui
npm run dev

The local development UI runs at http://localhost:5000 and proxies API requests to the backend on http://localhost:8000.
- Docker quick start
- Local installation
- Gateway usage examples
- Configuration reference
- Model configuration
- Authentication
uv run pytest

⭐ Star this repo if you find it useful!
- Report issues
- Request features
- Read the docs
- PRs welcome: see Local Development to get started
See LICENSE.
Release History
| Version | Changes | Urgency | Date |
|---|---|---|---|
| v0.1.24-rc2 | Release candidate for v0.1.24 including OpenAI Batch API compatibility fixes for LiteLLM. | High | 5/6/2026 |
| v0.1.24 | # v0.1.24 This release focuses on batch reliability, chat batch support, model access governance, upstream/provider correctness, and admin UI polish. Compared against `v0.1.23`. ## Highlights ### Batch API and worker reliability - Added chat batch support with concurrent execution for OpenAI-compatible providers, `sync_microbatch` configuration, safe per-item fallback, cancellation/finalization hardening, and model UI support for chat batching parameters. (#150) - Added structured batch ret | High | 5/6/2026 |
| v0.1.23 | ## What's Changed * Include boto3 in runtime dependencies by @deltawi in https://github.com/deltawi/deltallm/pull/118 **Full Changelog**: https://github.com/deltawi/deltallm/compare/v0.1.22...v0.1.23 | High | 4/25/2026 |
| v0.1.20-rc3 | ## What's Changed * hotfix: make audit detail lookup text-safe by @deltawi in https://github.com/deltawi/deltallm/pull/114 **Full Changelog**: https://github.com/deltawi/deltallm/compare/v0.1.21-rc2...v0.1.20-rc3 | High | 4/24/2026 |
| v0.1.21-rc1 | ## What's Changed * Fix deployment health classification for upstream request failures by @deltawi in https://github.com/deltawi/deltallm/pull/101 **Full Changelog**: https://github.com/deltawi/deltallm/compare/v0.1.20-rc2...v0.1.21-rc1 | High | 4/18/2026 |
| v0.1.20-rc2 | ## What's Changed * Fix batch embedding output compatibility and refactor worker internals by @deltawi in https://github.com/deltawi/deltallm/pull/99 **Full Changelog**: https://github.com/deltawi/deltallm/compare/v0.1.20-rc1...v0.1.20-rc2 | High | 4/18/2026 |
| v0.1.20-rc1 | ## v0.1.20-rc1 v0.1.20-rc1 is a release candidate focused on a major new asynchronous embeddings batch pipeline, plus deployment hardening, audit-log improvements, and a few UI/documentation updates. ### Highlights #### New: Embeddings Batch API DeltaLLM now supports asynchronous embeddings batches end to end. What's included: - Upload JSONL batch input files with purpose=batch - Create batches through /v1/batches - Queue, execute, finalize, and retain ba | High | 4/15/2026 |
| v0.1.19 | ## Highlights - **Named credentials** for provider connection settings: share one credential across many model deployments and rotate it once. (#65, fixes #58) - **Embedding batch overhaul**: three phases of reliability, streaming/throughput, and operator hardening, plus upstream **microbatching** to coalesce eligible embedding requests into fewer provider round-trips. (#66, #69, #70, #76, #77, #78) - **Custom upstream auth headers** for OpenAI-compatible providers that don't use `Authorizati | High | 4/11/2026 |
| v0.1.18 | This release improves request observability across the dashboard and usage logs. ### Fixed - Failed gateway requests are now included in the Home Dashboard Total Requests metric, so totals reflect all traffic instead of only successful requests. This gives operators a more accurate view of system activity and health. Issue: #52 (https://github.com/deltawi/deltallm/issues/52) - Failed gateway requests now appear in Usage → Request Logs in addition to Audit Logs. These entries in | Medium | 4/1/2026 |
| v0.1.17 | ## What's Changed * fix: copy helm values from source workspace by @deltawi in https://github.com/deltawi/deltallm/pull/50 **Full Changelog**: https://github.com/deltawi/deltallm/compare/v0.1.16...v0.1.17 | Medium | 3/28/2026 |
| v0.1.16 | ## What's Changed * fix: publish runnable helm quickstart overlays by @deltawi in https://github.com/deltawi/deltallm/pull/49 **Full Changelog**: https://github.com/deltawi/deltallm/compare/v0.1.15...v0.1.16 | Medium | 3/28/2026 |
| v0.1.15 | ## What's Changed * feat: publish helm repo to github pages by @deltawi in https://github.com/deltawi/deltallm/pull/48 **Full Changelog**: https://github.com/deltawi/deltallm/compare/v0.1.14...v0.1.15 | Medium | 3/27/2026 |
| v0.1.14 | ## What's Changed * feat: publish helm charts on releases by @deltawi in https://github.com/deltawi/deltallm/pull/47 **Full Changelog**: https://github.com/deltawi/deltallm/compare/v0.1.13...v0.1.14 | Medium | 3/27/2026 |
| v0.1.13 | ## What's Changed * Fix/helm callback settings by @deltawi in https://github.com/deltawi/deltallm/pull/46 **Full Changelog**: https://github.com/deltawi/deltallm/compare/v0.1.12...v0.1.13 | Medium | 3/27/2026 |
| v0.1.12 | ## What's Changed * fix: remove dead guardrails imports by @deltawi in https://github.com/deltawi/deltallm/pull/45 **Full Changelog**: https://github.com/deltawi/deltallm/compare/v0.1.11...v0.1.12 | Medium | 3/27/2026 |
| v0.1.11 | ## What's Changed * Feat/helm release dockerhub by @deltawi in https://github.com/deltawi/deltallm/pull/43 * Feat/guardrails scoped search by @deltawi in https://github.com/deltawi/deltallm/pull/44 **Full Changelog**: https://github.com/deltawi/deltallm/compare/v0.1.10...v0.1.11 | Medium | 3/27/2026 |
| v0.1.10 | ## What's Changed * fix: tighten guardrails presets and presidio setup by @deltawi in https://github.com/deltawi/deltallm/pull/42 **Full Changelog**: https://github.com/deltawi/deltallm/compare/v0.1.9...v0.1.10 | Medium | 3/27/2026 |
| v0.1.9 | ## What's Changed * Fix/team self service create flow by @deltawi in https://github.com/deltawi/deltallm/pull/38 * fix: tighten admin ui access and navigation density by @deltawi in https://github.com/deltawi/deltallm/pull/39 * fix: tighten people access provisioning flows by @deltawi in https://github.com/deltawi/deltallm/pull/40 * Feat/sidebar redesign by @deltawi in https://github.com/deltawi/deltallm/pull/41 We've got also a playground functionality to test models **Full Chan | Medium | 3/26/2026 |
| v0.1.8 | ## What's Changed * The docs flow failed on first boot because the container ran prisma d… by @deltawi in https://github.com/deltawi/deltallm/pull/35 * Feat/36 email lifecycle phase0 4 by @deltawi in https://github.com/deltawi/deltallm/pull/37 **Full Changelog**: https://github.com/deltawi/deltallm/compare/v0.1.7...v0.1.8 | Medium | 3/25/2026 |
| v0.1.7 | ## What's Changed * Feat/self service keys by @deltawi in https://github.com/deltawi/deltallm/pull/34 **Full Changelog**: https://github.com/deltawi/deltallm/compare/v0.1.6...v0.1.7 | Medium | 3/24/2026 |
| v0.1.6 | ## What's Changed * Feat/group adv tab by @deltawi in https://github.com/deltawi/deltallm/pull/25 * Harden routing policy runtime behavior by @deltawi in https://github.com/deltawi/deltallm/pull/26 * Upgrade rate-limiting by @deltawi in https://github.com/deltawi/deltallm/pull/27 **Full Changelog**: https://github.com/deltawi/deltallm/compare/v0.1.5...v0.1.6 | Medium | 3/24/2026 |
| v0.1.5 | ## What's Changed * enhancing docs + git visiblity by @deltawi in https://github.com/deltawi/deltallm/pull/23 * Fix/runtime state and caching by @deltawi in https://github.com/deltawi/deltallm/pull/24 **Full Changelog**: https://github.com/deltawi/deltallm/compare/v0.1.4...v0.1.5 | Low | 3/18/2026 |
| v0.1.4 | ## What's Changed * fixing forward slash inside model name being removed by normalizer by @deltawi in https://github.com/deltawi/deltallm/pull/14 * cache pricing was set and wasn't calculated by @deltawi in https://github.com/deltawi/deltallm/pull/15 * Reporting and query layer optimization by @deltawi in https://github.com/deltawi/deltallm/pull/16 * normalizing TTS and STT billing by @deltawi in https://github.com/deltawi/deltallm/pull/17 * mcp support with governance by @deltawi in https: | Low | 3/17/2026 |
| v0.1.3 | ## What's Changed * adding service account concept in keys by @deltawi in https://github.com/deltawi/deltallm/pull/12 * Fix/model names slash by @deltawi in https://github.com/deltawi/deltallm/pull/13 **Full Changelog**: https://github.com/deltawi/deltallm/compare/0.1.2...v0.1.3 | Low | 3/9/2026 |
| 0.1.2 | ## What's Changed * Harden provider uniformity by @deltawi in https://github.com/deltawi/deltallm/pull/2 * Feat/pagination by @deltawi in https://github.com/deltawi/deltallm/pull/3 * feat: add embeddings batch APIs and harden batch auth/cancel flows by @deltawi in https://github.com/deltawi/deltallm/pull/4 * Feat/batch page by @deltawi in https://github.com/deltawi/deltallm/pull/5 * Feat/audit logs by @deltawi in https://github.com/deltawi/deltallm/pull/6 * Docs/installation by @deltawi in | Low | 3/8/2026 |
| v0.1.1rc | ## What's Changed * Harden provider uniformity by @deltawi in https://github.com/deltawi/deltallm/pull/2 * Feat/pagination by @deltawi in https://github.com/deltawi/deltallm/pull/3 * feat: add embeddings batch APIs and harden batch auth/cancel flows by @deltawi in https://github.com/deltawi/deltallm/pull/4 * Feat/batch page by @deltawi in https://github.com/deltawi/deltallm/pull/5 * Feat/audit logs by @deltawi in https://github.com/deltawi/deltallm/pull/6 * Docs/installation by @deltawi in | Low | 3/4/2026 |
| v0.1 | We're excited to announce the first release of DeltaLLM, an open-source LLM gateway that provides a unified OpenAI-compatible API for multiple LLM providers with enterprise-grade features. Unified LLM Proxy OpenAI-compatible API supporting chat completions, embeddings, image generation, text-to-speech, speech-to-text, and reranking Multi-provider support: OpenAI, Anthropic, Azure OpenAI, Groq, and more Drop-in replacement: just change your base_url in any OpenAI SDK client Routing & Re | Low | 2/23/2026 |

