# deltallm

Route, manage, and analyze your LLM requests across multiple providers with a unified API interface.
DeltaLLM is a self-hosted AI gateway that gives you a single OpenAI-compatible API for 100+ LLM providers — with enterprise controls like routing, budgets, guardrails, and team management built in.
```python
# Before: direct to OpenAI
client = OpenAI(api_key="sk-...")

# After: through DeltaLLM
client = OpenAI(
    base_url="http://localhost:4002/v1",  # ← just change this
    api_key="sk-deltallm-key"
)
```

That's it. Your existing code works unchanged — now with routing, spend tracking, and guardrails.
Manage all your model deployments, API keys, teams, and usage from a clean web interface:
- Unified API — One OpenAI-compatible endpoint for 100+ LLM providers
- Virtual API Keys — Scoped keys with budgets, rate limits, and model restrictions
- MCP Gateway — Register external MCP servers, expose approved tools safely
- Routing & Failover — Multiple strategies with automatic retries
- Guardrails — Built-in PII detection and prompt injection protection
- Spend Tracking — Per-key, per-team, per-model cost attribution
- RBAC — Role-based access at platform, organization, and team levels
- Admin Dashboard — Full-featured web UI for managing everything
- Response Caching — Memory, Redis, or S3 backends for lower latency and cost
- Observability — Prometheus metrics, request logging, and spend analytics
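Response caching in the list above rests on a simple principle: identical requests can be served from a cache keyed by the canonicalized request body. The sketch below is illustrative only, not DeltaLLM's implementation; it shows the idea behind a memory backend with a hypothetical `get_or_call` helper:

```python
import hashlib
import json

# Illustrative in-memory response cache keyed by the request payload.
_cache: dict[str, dict] = {}

def cache_key(payload: dict) -> str:
    # Canonicalize the JSON body so semantically identical requests collide.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def get_or_call(payload: dict, call_provider) -> dict:
    key = cache_key(payload)
    if key not in _cache:
        _cache[key] = call_provider(payload)
    return _cache[key]

calls = []
def fake_provider(payload):
    calls.append(payload)
    return {"choices": [{"message": {"content": "hi"}}]}

req = {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}]}
first = get_or_call(req, fake_provider)
second = get_or_call(req, fake_provider)  # served from cache; provider called once
```

A real backend would also bound the cache size and set a TTL; Redis and S3 backends trade latency for shared, durable storage.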
Docs: https://deltallm.readthedocs.io/en/latest
- Option 1: Docker Compose: fastest way to run DeltaLLM locally for evaluation.
- Option 2: Kubernetes From a Released Chart: install with Helm from the public chart repository without cloning the repository.
- Option 3: Local Development From the Repo: best path for contributors and for local backend or UI work.
Use Docker Compose if you want the fastest working setup.
```shell
git clone https://github.com/deltawi/deltallm.git
cd deltallm
cp config.example.yaml config.yaml
```

For the quickest first successful request, enable one-time model bootstrap in config.yaml:
```yaml
general_settings:
  model_deployment_source: db_only
  model_deployment_bootstrap_from_config: true
```

This seeds the sample model_list into the database on first startup. After the first successful boot, set model_deployment_bootstrap_from_config back to false.
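Since the flag has to be flipped back after the first boot, a small stdlib-only helper can do it; this is an illustrative sketch (a text editor works just as well), and it assumes the key appears once, as in the sample config:

```python
import os
import tempfile
from pathlib import Path

def disable_bootstrap(config_path: str) -> None:
    """Flip model_deployment_bootstrap_from_config from true to false in place.

    Naive line-based edit; assumes the key appears exactly once.
    """
    path = Path(config_path)
    lines = path.read_text().splitlines(keepends=True)
    for i, line in enumerate(lines):
        if "model_deployment_bootstrap_from_config:" in line:
            lines[i] = line.replace("true", "false")
    path.write_text("".join(lines))

# Demonstrate on a throwaway copy of the sample settings.
sample = (
    "general_settings:\n"
    "  model_deployment_source: db_only\n"
    "  model_deployment_bootstrap_from_config: true\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".yaml", delete=False) as f:
    f.write(sample)
disable_bootstrap(f.name)
result = Path(f.name).read_text()
os.unlink(f.name)
```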
DeltaLLM will not start with placeholder values such as change-me.
```shell
python3 -c 'import secrets; print("DELTALLM_MASTER_KEY=sk-" + secrets.token_hex(20) + "A1")'
python3 -c 'import secrets; print("DELTALLM_SALT_KEY=" + secrets.token_hex(32))'
```

Create a .env file in the project root:
```
DELTALLM_MASTER_KEY=sk-your-generated-master-key
DELTALLM_SALT_KEY=your-generated-salt-key
OPENAI_API_KEY=sk-your-openai-key
PLATFORM_BOOTSTRAP_ADMIN_EMAIL=admin@example.com
PLATFORM_BOOTSTRAP_ADMIN_PASSWORD=ChangeMe123!
```

The sample config uses OPENAI_API_KEY. If you want a different provider, edit config.yaml before starting.
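The two secrets can also be generated and sanity-checked from Python directly. This sketch mirrors the shell one-liners above; the placeholder check is a hypothetical illustration (the documented requirement is only that values such as change-me are rejected):

```python
import re
import secrets

def make_master_key() -> str:
    # Mirrors the shell one-liner: "sk-" + 40 hex chars + "A1".
    return "sk-" + secrets.token_hex(20) + "A1"

def make_salt_key() -> str:
    # 32 random bytes rendered as 64 hex characters.
    return secrets.token_hex(32)

def looks_like_placeholder(value: str) -> bool:
    # Hypothetical sanity check, not DeltaLLM's actual validation.
    return value.lower() in {"change-me", "changeme", ""} or len(value) < 16

master = make_master_key()
salt = make_salt_key()
```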
```shell
docker compose --profile single up -d --build
```

If you want the full Presidio engine for guardrails instead of the default regex fallback:

```shell
INSTALL_PRESIDIO=true docker compose --profile single up -d --build
```

This starts:
- DeltaLLM on http://localhost:4002
- PostgreSQL
- Redis
Check liveness:

```shell
curl http://localhost:4002/health/liveliness
```

List available models:
```shell
curl http://localhost:4002/v1/models \
  -H "Authorization: Bearer $DELTALLM_MASTER_KEY"
```

If this list is empty, you did not bootstrap a model and must either:

- set model_deployment_bootstrap_from_config: true and restart once, or
- create a model deployment in the Admin UI before sending requests
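The emptiness check can be scripted, since /v1/models returns the OpenAI list envelope. The payload shape below is the OpenAI convention, assumed here rather than taken from DeltaLLM's docs, and the sample response is hard-coded so the sketch runs without a gateway:

```python
import json

# A sample /v1/models response in the OpenAI list format (assumed shape).
raw = '{"object": "list", "data": [{"id": "gpt-4o-mini", "object": "model"}]}'

def model_ids(body: str) -> list[str]:
    """Extract deployment ids from a models-list response body."""
    return [m["id"] for m in json.loads(body).get("data", [])]

ids = model_ids(raw)
if not ids:
    print("No models bootstrapped: enable bootstrap or add one in the Admin UI.")
```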
```shell
curl http://localhost:4002/v1/chat/completions \
  -H "Authorization: Bearer $DELTALLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Hello from DeltaLLM"}
    ]
  }'
```

Open http://localhost:4002.
If you set PLATFORM_BOOTSTRAP_ADMIN_EMAIL and PLATFORM_BOOTSTRAP_ADMIN_PASSWORD, you can log in with that initial admin account. You can also keep using the master key for gateway calls.
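For example, the same chat completion as the curl call above can be made from Python with only the standard library. The request is built but not sent in this sketch, so it works even without a running gateway; the URL, key, and model are the placeholders from above:

```python
import json
import urllib.request

payload = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello from DeltaLLM"}],
}
req = urllib.request.Request(
    "http://localhost:4002/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer sk-deltallm-key",
        "Content-Type": "application/json",
    },
    method="POST",
)
# To actually send it against a running gateway:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```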
Released Helm charts and matching values overlays are published to the public Helm repository at https://deltawi.github.io/deltallm.
```shell
helm repo add deltallm https://deltawi.github.io/deltallm
helm repo update
```

Generate the required secrets first:
```shell
export DELTALLM_MASTER_KEY="$(python3 -c 'import secrets; print("sk-" + secrets.token_hex(20) + "A1")')"
export DELTALLM_SALT_KEY="$(openssl rand -hex 32)"
```

Quick-start install with bundled PostgreSQL and Redis:
```shell
helm install deltallm deltallm/deltallm \
  --version <chart-version> \
  --namespace deltallm \
  --create-namespace \
  -f https://deltawi.github.io/deltallm/values-eval-<chart-version>.yaml \
  --set secret.values.masterKey="$DELTALLM_MASTER_KEY" \
  --set secret.values.saltKey="$DELTALLM_SALT_KEY" \
  --set-string env[0].name=PLATFORM_BOOTSTRAP_ADMIN_EMAIL \
  --set-string env[0].value=admin@example.com \
  --set-string env[1].name=PLATFORM_BOOTSTRAP_ADMIN_PASSWORD \
  --set-string env[1].value='ChangeMe123!'
```

If you want the Presidio-enabled image variant from the same chart release:
```shell
helm install deltallm deltallm/deltallm \
  --version <chart-version> \
  --namespace deltallm \
  --create-namespace \
  -f https://deltawi.github.io/deltallm/values-eval-<chart-version>.yaml \
  --set secret.values.masterKey="$DELTALLM_MASTER_KEY" \
  --set secret.values.saltKey="$DELTALLM_SALT_KEY" \
  --set-string env[0].name=PLATFORM_BOOTSTRAP_ADMIN_EMAIL \
  --set-string env[0].value=admin@example.com \
  --set-string env[1].name=PLATFORM_BOOTSTRAP_ADMIN_PASSWORD \
  --set-string env[1].value='ChangeMe123!' \
  --set image.tag=v<chart-version>-presidio
```

values-eval-<chart-version>.yaml is the self-contained quick-start profile. Use values-production-<chart-version>.yaml with external PostgreSQL and Redis for production.
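The --set/--set-string flags above can equivalently live in a small values overlay passed with an extra -f. This fragment mirrors the flags one-to-one; the key paths (secret.values.*, env[], image.tag) are inferred from the command itself, not from the chart's documented schema:

```yaml
secret:
  values:
    masterKey: sk-your-generated-master-key
    saltKey: your-generated-salt-key
env:
  - name: PLATFORM_BOOTSTRAP_ADMIN_EMAIL
    value: admin@example.com
  - name: PLATFORM_BOOTSTRAP_ADMIN_PASSWORD
    value: "ChangeMe123!"
image:
  tag: v<chart-version>-presidio  # only for the Presidio variant
```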
Use the latest GitHub Release version for <chart-version>. The exact copy-paste install commands for each release live in the release notes.
For full Kubernetes examples and values, see docs/deployment/kubernetes.md.
Use this path if you want to work on the backend or UI locally instead of running the full Compose stack.
- Python 3.11+
- Node.js 20+
- PostgreSQL 15+
- Redis 7+ (optional)
uv is the recommended backend installer because the repo includes uv.lock.
```shell
uv sync --dev
```

If you want the full Presidio engine locally for guardrails:

```shell
uv sync --dev --extra guardrails-presidio
```

In another shell for the UI:
```shell
cd ui
npm ci
cd ..
```

```shell
export DATABASE_URL="postgresql://postgres:postgres@localhost:5432/deltallm"
export DELTALLM_CONFIG_PATH=./config.yaml
export DELTALLM_MASTER_KEY="$(python3 -c 'import secrets; print("sk-" + secrets.token_hex(20) + "A1")')"
export DELTALLM_SALT_KEY="$(openssl rand -hex 32)"
export OPENAI_API_KEY="sk-your-openai-key"
export PLATFORM_BOOTSTRAP_ADMIN_EMAIL="admin@example.com"
export PLATFORM_BOOTSTRAP_ADMIN_PASSWORD="ChangeMe123!"
```

If Redis is available:

```shell
export REDIS_URL="redis://localhost:6379/0"
```

```shell
cp config.example.yaml config.yaml
```

For a fresh database, enable one-time bootstrap in config.yaml if you want the sample model available immediately:
```yaml
general_settings:
  model_deployment_source: db_only
  model_deployment_bootstrap_from_config: true
```

```shell
uv run prisma generate --schema=./prisma/schema.prisma
uv run prisma py fetch
uv run prisma db push --schema=./prisma/schema.prisma
```

```shell
uv run uvicorn src.main:app --host 0.0.0.0 --port 8000 --reload
```

```shell
cd ui
npm run dev
```

The local development UI runs at http://localhost:5000 and proxies API requests to the backend on http://localhost:8000.
- Docker quick start
- Local installation
- Gateway usage examples
- Configuration reference
- Model configuration
- Authentication
```shell
uv run pytest
```

⭐ Star this repo if you find it useful!
- 🐛 Report issues
- 💡 Request features
- 📖 Read the docs
- 🤝 PRs welcome — see Local Development to get started
See LICENSE.
Release History
| Version | Changes | Urgency | Date |
|---|---|---|---|
| v0.1.20-rc2 | Fix batch embedding output compatibility and refactor worker internals by @deltawi in https://github.com/deltawi/deltallm/pull/99. Full changelog: https://github.com/deltawi/deltallm/compare/v0.1.20-rc1...v0.1.20-rc2 | High | 4/18/2026 |
| v0.1.19 | Named credentials for provider connection settings: share one credential across many model deployments and rotate it once (#65, fixes #58). Embedding batch overhaul covering reliability, streaming/throughput, and operator hardening, plus upstream microbatching to coalesce eligible embedding requests into fewer provider round-trips (#66, #69, #70, #76, #77, #78). Custom upstream auth headers for OpenAI-compatible providers that don't use `Authorization`. | High | 4/11/2026 |

