deltallm

Route, manage, and analyze your LLM requests across multiple providers with a unified API interface

README

DeltaLLM is a self-hosted AI gateway that gives you a single OpenAI-compatible API for 100+ LLM providers — with enterprise controls like routing, budgets, guardrails, and team management built in.

One Line Change

# Before: Direct to OpenAI
client = OpenAI(api_key="sk-...")

# After: Through DeltaLLM
client = OpenAI(
    base_url="http://localhost:4002/v1",  # ← Just change this
    api_key="sk-deltallm-key"
)

That's it. Your existing code works unchanged — now with routing, spend tracking, and guardrails.

Admin UI

Manage all your model deployments, API keys, teams, and usage from a clean web interface.

Key Features

  • Unified API — One OpenAI-compatible endpoint for 100+ LLM providers
  • Virtual API Keys — Scoped keys with budgets, rate limits, and model restrictions
  • MCP Gateway — Register external MCP servers, expose approved tools safely
  • Routing & Failover — Multiple strategies with automatic retries
  • Guardrails — Built-in PII detection and prompt injection protection
  • Spend Tracking — Per-key, per-team, per-model cost attribution
  • RBAC — Role-based access at platform, organization, and team levels
  • Admin Dashboard — Full-featured web UI for managing everything
  • Response Caching — Memory, Redis, or S3 backends for lower latency and cost
  • Observability — Prometheus metrics, request logging, and spend analytics

Docs: https://deltallm.readthedocs.io/en/latest

Choose Your Install Path

  • Option 1: Docker Compose: fastest way to run DeltaLLM locally for evaluation.
  • Option 2: Kubernetes From a Released Chart: install with Helm from the public chart repository without cloning the repository.
  • Option 3: Local Development From the Repo: best path for contributors and for local backend or UI work.

Option 1: Docker Compose

Use Docker Compose if you want the fastest working setup.

1. Clone the repository

git clone https://github.com/deltawi/deltallm.git
cd deltallm

2. Create a local config

cp config.example.yaml config.yaml

For the quickest first successful request, enable one-time model bootstrap in config.yaml:

general_settings:
  model_deployment_source: db_only
  model_deployment_bootstrap_from_config: true

This seeds the sample model_list into the database on first startup. After the first successful boot, set model_deployment_bootstrap_from_config back to false.

3. Generate required secrets

DeltaLLM will not start with placeholder values such as change-me.

python3 -c 'import secrets; print("DELTALLM_MASTER_KEY=sk-" + secrets.token_hex(20) + "A1")'
python3 -c 'import secrets; print("DELTALLM_SALT_KEY=" + secrets.token_hex(32))'

Create a .env file in the project root:

DELTALLM_MASTER_KEY=sk-your-generated-master-key
DELTALLM_SALT_KEY=your-generated-salt-key
OPENAI_API_KEY=sk-your-openai-key
PLATFORM_BOOTSTRAP_ADMIN_EMAIL=admin@example.com
PLATFORM_BOOTSTRAP_ADMIN_PASSWORD=ChangeMe123!

The sample config uses OPENAI_API_KEY. If you want a different provider, edit config.yaml before starting.
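
The two secret-generation commands and the .env template above can be combined into one short script. The following Python sketch uses only the standard library; the key names and file layout come from the example above, and the provider key is left as a placeholder:

```python
import secrets

def make_env(openai_key: str = "sk-your-openai-key") -> str:
    """Build .env contents with freshly generated DeltaLLM secrets."""
    master_key = "sk-" + secrets.token_hex(20) + "A1"  # same shape as the one-liner above
    salt_key = secrets.token_hex(32)
    lines = [
        f"DELTALLM_MASTER_KEY={master_key}",
        f"DELTALLM_SALT_KEY={salt_key}",
        f"OPENAI_API_KEY={openai_key}",
        "PLATFORM_BOOTSTRAP_ADMIN_EMAIL=admin@example.com",
        "PLATFORM_BOOTSTRAP_ADMIN_PASSWORD=ChangeMe123!",
    ]
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    # Write the file in the project root, next to config.yaml
    with open(".env", "w") as f:
        f.write(make_env())
```

Remember to replace the placeholder OPENAI_API_KEY and the bootstrap admin password before starting the stack.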

4. Start DeltaLLM

docker compose --profile single up -d --build

If you want the full Presidio engine for guardrails instead of the default regex fallback:

INSTALL_PRESIDIO=true docker compose --profile single up -d --build

This starts:

  • DeltaLLM on http://localhost:4002
  • PostgreSQL
  • Redis

5. Verify the gateway

Check liveliness:

curl http://localhost:4002/health/liveliness

List available models:

curl http://localhost:4002/v1/models \
  -H "Authorization: Bearer $DELTALLM_MASTER_KEY"

If this list is empty, you did not bootstrap a model and must either:

  • set model_deployment_bootstrap_from_config: true and restart once, or
  • create a model deployment in the Admin UI before sending requests
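
The /v1/models endpoint returns an OpenAI-style list object, so the emptiness check can be made explicit with a few lines of standard-library Python. This is a sketch that assumes the conventional `{"object": "list", "data": [{"id": ...}]}` response shape; the model id in the sample is illustrative:

```python
import json

def model_ids(models_response: str) -> list[str]:
    """Extract deployment ids from a /v1/models JSON response."""
    payload = json.loads(models_response)
    return [m["id"] for m in payload.get("data", [])]

# Illustrative response body, as returned by the curl command above
sample = '{"object": "list", "data": [{"id": "gpt-4o-mini", "object": "model"}]}'
ids = model_ids(sample)
if not ids:
    print("No models bootstrapped; create a deployment first")
```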

6. Send your first request

curl http://localhost:4002/v1/chat/completions \
  -H "Authorization: Bearer $DELTALLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Hello from DeltaLLM"}
    ]
  }'
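
The same first request can be sent from Python with only the standard library. This sketch mirrors the curl call above (same endpoint, model name, and master-key variable) and assumes the gateway is running locally:

```python
import json
import os
import urllib.request

def build_chat_request(model: str, content: str,
                       base_url: str = "http://localhost:4002/v1") -> urllib.request.Request:
    """Assemble the same chat completion request as the curl example."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ.get('DELTALLM_MASTER_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

if __name__ == "__main__":
    req = build_chat_request("gpt-4o-mini", "Hello from DeltaLLM")
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```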

7. Open the Admin UI

Open http://localhost:4002.

If you set PLATFORM_BOOTSTRAP_ADMIN_EMAIL and PLATFORM_BOOTSTRAP_ADMIN_PASSWORD, you can log in with that initial admin account. You can also keep using the master key for gateway calls.

Option 2: Kubernetes From a Released Chart

Released Helm charts and matching values overlays are published to the public Helm repository at https://deltawi.github.io/deltallm.

helm repo add deltallm https://deltawi.github.io/deltallm
helm repo update

Generate the required secrets first:

export DELTALLM_MASTER_KEY="$(python3 -c "import secrets; print('sk-' + secrets.token_hex(20) + 'A1')")"
export DELTALLM_SALT_KEY="$(openssl rand -hex 32)"

Quick-start install with bundled PostgreSQL and Redis:

helm install deltallm deltallm/deltallm \
  --version <chart-version> \
  --namespace deltallm \
  --create-namespace \
  -f https://deltawi.github.io/deltallm/values-eval-<chart-version>.yaml \
  --set secret.values.masterKey="$DELTALLM_MASTER_KEY" \
  --set secret.values.saltKey="$DELTALLM_SALT_KEY" \
  --set-string env[0].name=PLATFORM_BOOTSTRAP_ADMIN_EMAIL \
  --set-string env[0].value=admin@example.com \
  --set-string env[1].name=PLATFORM_BOOTSTRAP_ADMIN_PASSWORD \
  --set-string env[1].value='ChangeMe123!'

If you want the Presidio-enabled image variant from the same chart release:

helm install deltallm deltallm/deltallm \
  --version <chart-version> \
  --namespace deltallm \
  --create-namespace \
  -f https://deltawi.github.io/deltallm/values-eval-<chart-version>.yaml \
  --set secret.values.masterKey="$DELTALLM_MASTER_KEY" \
  --set secret.values.saltKey="$DELTALLM_SALT_KEY" \
  --set-string env[0].name=PLATFORM_BOOTSTRAP_ADMIN_EMAIL \
  --set-string env[0].value=admin@example.com \
  --set-string env[1].name=PLATFORM_BOOTSTRAP_ADMIN_PASSWORD \
  --set-string env[1].value='ChangeMe123!' \
  --set image.tag=v<chart-version>-presidio

values-eval-<chart-version>.yaml is the self-contained quick-start profile. Use values-production-<chart-version>.yaml with external PostgreSQL and Redis for production.

Use the latest GitHub Release version for <chart-version>. The exact copy-paste install commands for each release live in the release notes.

For full Kubernetes examples and values, see docs/deployment/kubernetes.md.

Option 3: Local Development From the Repo

Use this path if you want to work on the backend or UI locally instead of running the full Compose stack.

Requirements

  • Python 3.11+
  • Node.js 20+
  • PostgreSQL 15+
  • Redis 7+ (optional)

1. Install dependencies

uv is the recommended backend installer because the repo includes uv.lock.

uv sync --dev

If you want the full Presidio engine locally for guardrails:

uv sync --dev --extra guardrails-presidio

In another shell for the UI:

cd ui
npm ci
cd ..

2. Export environment variables

export DATABASE_URL="postgresql://postgres:postgres@localhost:5432/deltallm"
export DELTALLM_CONFIG_PATH=./config.yaml
export DELTALLM_MASTER_KEY="$(python3 -c "import secrets; print('sk-' + secrets.token_hex(20) + 'A1')")"
export DELTALLM_SALT_KEY="$(openssl rand -hex 32)"
export OPENAI_API_KEY="sk-your-openai-key"
export PLATFORM_BOOTSTRAP_ADMIN_EMAIL="admin@example.com"
export PLATFORM_BOOTSTRAP_ADMIN_PASSWORD="ChangeMe123!"

If Redis is available:

export REDIS_URL="redis://localhost:6379/0"

3. Create config and enable one-time bootstrap if needed

cp config.example.yaml config.yaml

For a fresh database, enable one-time bootstrap in config.yaml if you want the sample model available immediately:

general_settings:
  model_deployment_source: db_only
  model_deployment_bootstrap_from_config: true

4. Initialize Prisma and the database

uv run prisma generate --schema=./prisma/schema.prisma
uv run prisma py fetch
uv run prisma db push --schema=./prisma/schema.prisma

5. Start the backend

uv run uvicorn src.main:app --host 0.0.0.0 --port 8000 --reload

6. Start the UI

cd ui
npm run dev

The local development UI runs at http://localhost:5000 and proxies API requests to the backend on http://localhost:8000.

Testing

uv run pytest

Support & Contribute

Star this repo if you find it useful!

License

See LICENSE.

Release History

v0.1.20-rc2 (urgency: High, 4/18/2026)

What's Changed
  • Fix batch embedding output compatibility and refactor worker internals by @deltawi in https://github.com/deltawi/deltallm/pull/99

Full Changelog: https://github.com/deltawi/deltallm/compare/v0.1.20-rc1...v0.1.20-rc2

v0.1.19 (urgency: High, 4/11/2026)

Highlights
  • Named credentials for provider connection settings — share one credential across many model deployments and rotate it once. (#65, fixes #58)
  • Embedding batch overhaul — three phases of reliability, streaming/throughput, and operator hardening, plus upstream microbatching to coalesce eligible embedding requests into fewer provider round-trips. (#66, #69, #70, #76, #77, #78)
  • Custom upstream auth headers for OpenAI-compatible providers that don't use `Authorizati…`


Similar Packages

  • hybrid-orchestrator: 🤖 Implement hybrid human-AI orchestration patterns in Python to coordinate agents, manage sessions, and enable smooth AI-human handoffs. (master@2026-04-21)
  • comfy-pilot: 🤖 Create and modify workflows effortlessly with ComfyUI's AI assistant, enabling natural conversations with agents like Claude and Gemini. (main@2026-04-21)
  • claude-api-cost-optimization: 💰 Optimize your Claude API usage to save 50-90% on costs with batching techniques and efficient request management. (main@2026-04-21)
  • arcade-mcp: The best way to create, deploy, and share MCP Servers (main@2026-04-21)
  • mcp-mesh: Enterprise-grade distributed AI agent framework | Develop → Deploy → Observe | K8s-native | Dynamic DI | Auto-failover | Multi-LLM | Python + Java + TypeScript (v1.3.4)