freshcrate
Skin:/

llmio

LLM API load-balancing gateway. LLM API 负载均衡网关.

Why this rank:Strong adoptionRecent releaseHealthy release cadence

Description

LLM API load-balancing gateway. LLM API 负载均衡网关.

README

LLMIO

English | 中文

LLMIO is a Go-based LLM load‑balancing gateway that provides a unified REST API, weighted scheduling, logging, and a modern admin UI for LLM clients (openclaw / claude code / codex / gemini cli / cherry studio / open webui). It helps you integrate OpenAI, Anthropic, Gemini, and other model capabilities in a single service.

QQ group: 1083599685

Architecture

LLMIO Architecture

Features

  • Unified API: Compatible with OpenAI Chat Completions, OpenAI Responses, Gemini Native, and Anthropic Messages. Supports both streaming and non‑streaming passthrough.
  • Weighted scheduling: balancers/ provides two strategies (random by weight / priority by weight). You can route based on tool calling, structured output, and multimodal capability.
  • Admin Web UI: React + TypeScript + Tailwind + Vite console for providers, models, associations, logs, and metrics.
  • Rate limiting & failure handling: Built‑in rate‑limit fallback and provider connectivity checks for fault isolation.
  • Local persistence: Pure Go SQLite (db/llmio.db) for config and request logs, ready to use out of the box.

Deployment

Docker Compose (Recommended)

services:
  llmio:
    image: atopos31/llmio:latest
    ports:
      - 7070:7070
    volumes:
      - ./db:/app/db
    environment:
      - GIN_MODE=release
      - TOKEN=<YOUR_TOKEN>
      - TZ=Asia/Shanghai
docker compose up -d

Docker

docker run -d \
  --name llmio \
  -p 7070:7070 \
  -v $(pwd)/db:/app/db \
  -e GIN_MODE=release \
  -e TOKEN=<YOUR_TOKEN> \
  -e TZ=Asia/Shanghai \
  atopos31/llmio:latest

Local Run

Download the release package for your OS/arch from releases (version > 0.5.13). Example for linux amd64:

wget https://github.com/atopos31/llmio/releases/download/v0.5.13/llmio_0.5.13_linux_amd64.tar.gz

Extract:

tar -xzf ./llmio_0.5.13_linux_amd64.tar.gz

Start:

GIN_MODE=release TOKEN=<YOUR_TOKEN> ./llmio

The service will create ./db/llmio.db in the current directory as the SQLite persistence file.

Environment Variables

Variable Description Default Notes
TOKEN Console login and API auth for /openai /anthropic /gemini /v1 None Required for public access
GIN_MODE Gin runtime mode debug Use release in production
LLMIO_SERVER_PORT Server listen port 7070 Service listen port
TZ Timezone for logs and scheduling Host default Recommend explicit setting in containers (e.g. Asia/Shanghai)
DB_VACUUM Run SQLite VACUUM on startup Disabled Set to true to reclaim space

Development

Clone:

git clone https://github.com/atopos31/llmio.git
cd llmio

Build frontend (pnpm required):

make webui

Run backend (Go >= 1.26.1):

TOKEN=<YOUR_TOKEN> make run

Web UI: http://localhost:7070/

API Endpoints

LLMIO provides a multi‑provider REST API with the following endpoints:

Provider Path Method Description Auth
OpenAI /openai/v1/models GET List available models Bearer Token
OpenAI /openai/v1/chat/completions POST Create chat completion Bearer Token
OpenAI /openai/v1/responses POST Create response Bearer Token
Anthropic /anthropic/v1/models GET List available models x-api-key
Anthropic /anthropic/v1/messages POST Create message x-api-key
Anthropic /anthropic/v1/messages/count_tokens POST Count tokens x-api-key
Gemini /gemini/v1beta/models GET List available models x-goog-api-key
Gemini /gemini/v1beta/models/{model}:generateContent POST Generate content x-goog-api-key
Gemini /gemini/v1beta/models/{model}:streamGenerateContent POST Stream content x-goog-api-key
Generic /v1/models GET List models (compat) Bearer Token
Generic /v1/chat/completions POST Create chat completion (compat) Bearer Token
Generic /v1/responses POST Create response (compat) Bearer Token
Generic /v1/messages POST Create message (compat) x-api-key
Generic /v1/messages/count_tokens POST Count tokens (compat) x-api-key

Authentication

LLMIO uses different auth headers depending on the endpoint:

1. OpenAI‑style endpoints (Bearer Token)

Applies to /openai/v1/* and OpenAI‑compatible endpoints under /v1/*.

curl -H "Authorization: Bearer YOUR_TOKEN" http://localhost:7070/openai/v1/models

2. Anthropic‑style endpoints (x-api-key)

Applies to /anthropic/v1/* and Anthropic‑compatible endpoints under /v1/*.

curl -H "x-api-key: YOUR_TOKEN" http://localhost:7070/anthropic/v1/messages

3. Gemini Native endpoints (x-goog-api-key)

Applies to /gemini/v1beta/* endpoints.

curl -H "x-goog-api-key: YOUR_TOKEN" http://localhost:7070/gemini/v1beta/models

For claude code or codex, use these environment variables:

export OPENAI_API_KEY=<YOUR_TOKEN>
export ANTHROPIC_API_KEY=<YOUR_TOKEN>
export GEMINI_API_KEY=<YOUR_TOKEN>

Note: /v1/* paths are kept for compatibility. Prefer the provider‑specific routes.

Project Structure

.
├─ main.go              # HTTP server entry and routes
├─ handler/             # REST handlers
├─ service/             # Business logic and load‑balancing
├─ middleware/          # Auth, rate limit, streaming middleware
├─ providers/           # Provider adapters
├─ balancers/           # Weight and scheduling strategies
├─ models/              # GORM models and DB init
├─ common/              # Shared helpers
├─ webui/               # React + TypeScript admin UI
└─ docs/                # Ops & usage docs

Screenshots

Dashboard

Associations

Logs

License

This project is released under the MIT License.

Star History

Stargazers over time

Release History

VersionChangesUrgencyDate
v0.8.12## Changelog * f5e97e3fde56e5b02d45ed6f1bc5ec295dca2413 fix: add max_tokens=1 for anthropic test High5/29/2026
v0.8.11## Changelog * c1a04b04c4681c4885d3882fc54dfca8acd4adb4 fix: omit session_id from upstream body High5/20/2026
v0.8.9## Changelog * a6dcae0050e27738bfe2cddd033066f663725fee fix: anthropci total tokens显示修正,代码格式化 * 7a56d4870cde7ef9d854df1a53655843521102de docs: 添加会话 IO 截图 * a0429117c6ed3b52b7f7a6b0b7d2db7519976fdc docs: 优化 README 截图区块,改为 2x2 布局并添加图片说明 * 27a73841750e97d878a2edc40f79389d147b0eb7 docs: 在 README 中补充可观测性功能介绍 High5/14/2026
v0.8.8## Changelog * b780b46d6efd8ffe7cfb0f2a6e4e38b4655e9368 fix(billing): 修复价格字段读取与表单体验问题 High5/8/2026
v0.8.5## Changelog * a7265f96b8b3e5dcb392166266a3a9704fa30aa4 feat(logs): 添加 session_id 字段记录与检索 High5/7/2026
v0.8.4## Changelog * 9fd55da43521cd0433aab331adde3b8cb58892dd feat: update tsconfig * b576e2574ea3006be6d7d1ae0b5fb90231195ab3 chore: add ignoreDeprecations to tsconfig * 5ebccd9dcd5b5313e71bbf52a01a1678edd25846 feat(config): 添加日志清理执行历史查看功能 * 06df4a137b68a5419b0a2dd013011b799f1776e1 docs: 全面更新 CLAUDE.md,补充 Gemini、熔断器、Before 管道等架构说明 High4/24/2026
v0.8.3## Changelog * 24e8376631bb445cf4eed9ae13ebb0c41dfb73d8 feat(logs): 日志列表展示 ID 并支持按 ID 搜索 High4/21/2026
v0.8.2## Changelog * 67367be66326b7f09b21e0fe996b6015aab395b4 fix(logs): 日志页筛选和分页状态同步到 URL High4/16/2026
v0.8.1## Changelog * 73015a49261fe64354d3677a70b4898443a89baa fix(logs): assistant tool_calls 消息正确解析展示 High4/14/2026
v0.8.0## Changelog * c8af04067935e36cd4530dfe7962e1caa1b4481f feat(logs): 重写 ChatIO 详情页,按 API Style 解析展示日志 * 9bb510d67a4860e75833ddf976de6f2f33c90a5a ci: speed up docker build caching High4/14/2026
v0.7.11## Changelog * d6f8247b585cf084cbedc7b80e66f5af93fa15df feat(logs): add scheduled log cleanup config High4/5/2026
v0.7.10## Changelog * 3c1aa474ee5c4a8e168874e1606a0893ea3ff9a0 feat: IOlog开关移动到apiKey 优化Key显示页面 Medium4/4/2026
v0.7.9## Changelog * 6bd24119fab6371933b9cf3755a1b41fa245a021 feat: 新增模型关联 ExtraBody 额外请求体参数 (#66) Medium4/3/2026
v0.7.8## Changelog * d942d8bf19b83ffde8983da009aa395a04e3c43f feat: auth key返回详细响应消息 * 66796a2bd4571edb8178d15fa75c97b6696df1a3 feat: balancer 书写 Medium3/27/2026
v0.7.7## Changelog * 833c10e1e0219efc7b907ef916289fa933e9ec2b feat: pop err加入traceID Medium3/27/2026
v0.7.6## Changelog * 70c637fd2dc275abc97f9910f56862d113e2ff7b feat(webui): quickstart 页面新增流式输出开关 * 69a596124f20287c254a6d6e06ac96db43f7de9a fix: 调整测试超时时间为6分钟 Medium3/24/2026
v0.7.5## Changelog * ef57d878a9dd00fddc988c1adaf152c517763c8b fix: 修正快速开始页面主题切换修复 Medium3/23/2026
v0.7.4## Changelog * b5f86e33a7c5cff3fb2911403a02d15079092945 fix: 修正anthropic curl 加入/v1 Low3/22/2026
v0.7.3## Changelog * c0e911c56dbca78c257fefa8b119b789d1fed6da fix(webui): 优化 quickstart 页面移动端布局 Low3/22/2026
v0.7.2## Changelog * d6c9702dddfff6cc161cb6ac45e9283fdabe7fef fix: 修正anthropic baseUrl Low3/22/2026
v0.7.1## Changelog * d9893f1e9805d08267d3356fe3ffcf2f2e131691 chore(webui): trim quickstart copy Low3/22/2026
v0.7.0## Changelog * b7365a0e2c6e8d03f985a609e6f42312a0e2bc22 feat(webui): add quickstart examples page Low3/22/2026
v0.6.38## Changelog * bc1d8f8a1bb9d451415d8b7f9120f3d646f5146b fix: 错误返回加入trace ID * 66ae20b1d1832986c5255035fa5f30ceda215db2 fix: 回复原有错误重试写法 * fef582aa19bf3e1e39e232dc629fc8338d487a5b feat: 使用lo库 精简代码 * 966b786aa27d299d52c162925cc648fd3958adfd feat: 错误时返回trace ID * 9711a065719c1e9bad2dac5879f852ebe8417b6b feat: 精简代码 * 5f33d0b81effeda75a68144fb68ad436c1dffd87 feat: 模型关联页面默认权重降序排列 * d7c0822f4a68c2609f2272f9bf5b8816387671db fix: 恢复原有函数 * 7db7b36ab4eb7559dd8ded1526fb9dc324cc3fdb feat(service): add genericLow3/21/2026
v0.6.37## Changelog * a9a6e05a86bc5a2cd8530bf459a33cb7b3bba5ec fix(webui): constrain model test error layout Low3/11/2026
v0.6.36## Changelog * 6a1ab0a37cafaa6fcd8898d392a810f7f5b1474e fix(webui): refine model test error display * 22322456f76730dd6e216886d7b3c21f5ad2c13a dcos: update 1.26.1 * 2771cfad2e7308f5c2a6b11dc5a3b22240b8f04f dcos: add oopenclaw Low3/11/2026
v0.6.35## Changelog * 0fd62044535c83cc615ad9c98c7c5201c9c70768 dcos: default english readme * e4d2c751722c3e1d4d0c47ffa59c65549a5f0ebd feat: 新增i18n国际化支持 Low3/11/2026
v0.6.34## Changelog * 68141a6c94a0866614a571aa1c4399a06687a199 feat: 新增traceID快速追踪log Low3/11/2026
v0.6.33## Changelog * 46fadc3e1b4f09140c2a67c4fd7755b46228ed6d feat: TPS计算改为全部时间 Low3/10/2026
v0.6.32## Changelog * a87f56b6c87227171847e8774af0b5ddc7da0bbd feat: TPS计算改为仅计算输出tokens Low3/10/2026
v0.6.31## Changelog * f93b038a0f11a7d30afbb5c333b802afc42fb81b feat: 适配go1.26新特性new * bc228a8d20bf1d240fe85b34c9cfde311bde5606 fix(webui): 统一移动端日志耗时显示逻辑 * f4c095c3cf3e9dba4933939cc9dc3ccf928a7d5d feat(webui): 移除模型管理页面顶部提示文字 * 938eb5330e0944392687a21d028f962f033c12a1 fix(webui): 补回模型管理添加模型入口(补充 #61) * 867793b8a03b73c8eca4255e3c64e730b0a3a837 feat: 合并模型与关联管理并支持拖动排序持久化 Low3/8/2026
v0.6.30## Changelog * 56968b48a4de6d5600ed671dda559b7122024165 fix: Dockerfile RUN cmd err * a6e7320af9878b5187453d261b295b16fb50a56e fix: Dockerfile go proxy err * 4f7135bde987a3e8175bebbca20968554f80647c Merge pull request #59 from enoch-robinson/fix/model-provider-status-exclude-running * 61f4057f2ab357dea17e8141934bfb1e2173065c fix: exclude running status from model provider status query Low3/1/2026
v0.6.29## Changelog * f767d13d57281c5197fb118b2ab20ff7afb4bc63 Merge pull request #58 from enoch-robinson/fix/sse-response * d3df38522a221160150f9173e81857c40728f296 fix: 实现流式响应自动 Flush,优化 SSE 首字延迟 Low2/25/2026
v0.6.28## Changelog * 7d0f4f6b17270083eef67e1ee6dfddeda0c75082 Merge pull request #57 from FlintyLemming/master * 634ee79b73e7aa67218284bc4698e38afecfba6d feat(provider): support body error matcher for 200 responses * 67f32855c68b4be094226255926dc61245297ca5 fix: 修复搜索残留问题 * 6940b54e3dfe915c5e168f40e152e2b0fadd322a Merge pull request #56 from goehou/master * bb3838a17245ed49c32ff78232e34369a8e0ce39 feat: 重构 providers 和 model-providers 页面组件 减少代码太长的问题 Low2/24/2026
v0.6.27## Changelog * 597eaef08e92cd23f69fb22027e0bdee5d2bce6e feat: 提供商管理加入模型获取失败错误提示 Low2/11/2026
v0.6.26## Changelog * 41c309a98596259ee16e993f9e57d83982446073 feat: 日志新增running状态 优化客户端主动取消错误显示 * 2b87bbbe9af9de8ba563a608cb198595a1198666 fix: 使用lo优化代码行数 * 45e5d2331d062b5776a202c50efec6f1a22d6018 fix: 移动setwebui位置 * b6f1b3e734a9b24058ddf1add4dc5ecdf6c578a9 fix: 修改函数名 Low2/11/2026
v0.6.25## Changelog * d369fe45d5409b6434ea215055f392af9dcd3799 Merge pull request #53 from FlintyLemming/master * 4b3903478c79b46ead5b4f39dd9eea46d2fb8285 feat: add per-provider HTTP proxy support * 170bcf66017afa07eb0f7aa9bb1d0c0ef537b952 feat: 封装token生成 * 75c43765ce0ca5bace988c312fa071434de1a5c5 feat: 封装env读取 * 904cf41b27a2f0071db5ee26cdebffb21dfcf349 feat: 移动跨域func 到middleware Low2/8/2026
v0.6.24## Changelog * 2bba3fbf0406bb1d92dd4600c1466083370b0b36 feat: 移除QWEN.md * 8a1d2902ca4f397fe2a3661d52a32416ba78e8b4 feat: 会话详情页面添加内容复制按钮 Low2/5/2026
v0.6.23## Changelog * f314657d4bb4761078214da7ce513f9307c8ed9e feat: 新增DB_VACUUM用于控制启动时是否执行VACUUM Low1/17/2026
v0.6.22## Changelog * ceae4600b70f08bbb9c58f9a9a4a7074c9584a5a feat: 优化路由鉴权 Low1/16/2026
v0.6.21## Changelog * 1c9bdf4fb544da148816256f953c26a45cb7be17 feat: 跨域访问允许Authorization Low1/16/2026
v0.6.20## Changelog * ec9f87757d8b12daa8964486254d1c28450286af feat: 加入跨域访问支持 * 2195d573091b5aaccf891abd83b9584ea1439bec fix: 重命名变量 Low1/16/2026
v0.6.19## Changelog * bce80b4278b338826cc3f0a59c02c6a67f0b8cbe feat: 根据不同key权限返回模型列表 Low1/8/2026
v0.6.18## Changelog * 997572785f990508c9a3c911a608c05c18b2546b feat: 提供商列表改为id倒序 Low1/3/2026
v0.6.17## Changelog * e85000a1880186c9224ff26ae9c10b0edc987b9a feat: 兼容claude code的日志上报接口 * 73d26e957a3b5bafb287431476753680b6fce111 feat: 调整请求日志中 会话管理的图标 Low12/31/2025
v0.6.16## Changelog * 396a24c48ba70b587ef9c45ca83e9057d135e9b1 feat: 重命名API Key管理为密钥管理 * 0859c91e775c3d320e61d37ec801f3faf1117d0f feat: 重命名模型提供商关联为关联管理 * a785aa8f9b7509f7601230b079723cfd19f730d2 fix: 修复关联管理表单报错问题 Low12/31/2025
v0.6.15## Changelog * 9bb0e1d16a255b6b3bf2b4cd8033eca7d422281e feat: 版本更新跳过dev检查 Low12/30/2025
v0.6.14## Changelog * ec414a50357f8a5ab018a669e05145f0371d9e81 feat: 版本更新提醒优化 * b539fbb0587f44f72ef2aa2a6b0876138958057d feat: 加入版本更新检测 Low12/30/2025
v0.6.13## Changelog * 55dbe0c44f0a2b80caa72a8b4b7532306012410a feat: 新增版本显示 Low12/22/2025
v0.6.12## Changelog * 6692b6b095183de5455bd323ef9cec66b718b355 fix: 注释 * 84c894dc19e833d713bd5883d494ed846320a8c1 fix: 统一操作图标 * 580be31bb5aec3e7d68f63b986e3c141e207f5ee fix: 规范写法 Low12/22/2025
v0.6.11## Changelog * 0ffa92c002ff64a3052444c9cf5b0a25d997184e feat: 优化连通性测试提示词 * db9356961326df8df6feda690c984725a3c27d77 feat: 优化提供商管理的图标 Low12/22/2025
v0.6.10## Changelog * cad19e4dbaa91ebb25ddb7686ef96ae1653ea78a feat: 优化连通性测试 Low12/22/2025
v0.6.9## Changelog * bc61846bc48f50eefd8cdc2dac310d1832a80f8f feat: 调整主页显示页面 Low12/21/2025
v0.6.8## Changelog * a202a45d0a58cdc3787783e35672c689bb471d25 fix: 修复前端构建报错 Low12/20/2025
v0.6.7## Changelog * 55ae3516e844d87db71479c27f19bb58f1a94e98 feat: gemini路由注释 * b52bb1a69317de3351841a9348633579ece05a0c feat: 移除gemini auth兼容 * 580fa5ffbba1105197b8f7641bd62aa98144b070 feat: AuthKey更新改为异步 * f6a4a00b6f4235b698a638d1896c8b06d3847d13 feat: 更新README.md * 39ea70e737c850b81594246721938dcf395ec048 feat: 加入对gemini支持 * 9d16167c2e9601488420b31045a3b51e1ea50eba fix: 优化mu 使用defer * e373d561da5902983746c8b8c31db9ff024f48e1 fix: 优化代码风格 * 54bfe2f17db1b2ebed414f70f9ba666b175fc117 fix: 优化熔断节点初始化 * fLow12/19/2025
v0.6.6## Changelog * dddc9f2b3ddf394c04be56571a679b6169cb077a feat: 加入熔断开关 * 01c595517813e7b11e58e509c474a9ee6da554fa feat: 加入熔断实现 Low12/18/2025
v0.6.5## Changelog * ef09d36fc787a1948b41221b03cd2c22ee6f48ff feat: 更改日志列表耗时显示为总和 * 5c882714f9777c5a6dd1d4ed56142891b6a3dcb3 feat: 格式化代码 * 5d39b43040c4f6c0360633d541f7ed23e7f07e28 feat: 扩展负载均衡接口实现 Low12/18/2025

Dependencies & License Audit

Loading dependencies...

Similar Packages

awesome-ai-tools🔴 VERY LARGE AI TOOL LIST! 🔴 Curated list of AI Tools - Updated 2026main@2026-06-05
APIParkCloud native, ultra-high performance AI&API gateway, LLM API management, distribution system, open platform, supporting all AI APIs.🦄云原生、超高性能 AI&API网关,LLM API 管理、分发系统、开放平台,支持所有AI API,不限于OpenAI、Azure、v1.9.6-beta
claude-code-proxyMonitor and visualize your Claude Code API interactions with Claude Code Proxy. Easily set up a transparent proxy and live dashboard. 🛠️🚀main@2026-06-06
claude-code-guideClaude Code Guide - Setup, Commands, workflows, agents, skills & tips-n-tricks go from beginner to power user!main@2026-06-06
outputThe open-source TypeScript framework for building AI workflows and agents. Designed for Claude Code describe what you want, Claude builds it, with all the best practices already in place.main@2026-06-05

More in Infrastructure

tensorzeroTensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.
planoPlano is an AI-native proxy and data plane for agentic apps — with built-in orchestration, safety, observability, and smart LLM routing so you stay focused on your agents core logic.
modelsThis repository contains comprehensive pricing and configuration data for LLMs. It powers cost attribution for 200+ enterprises running 400B+ tokens through Portkey AI Gateway every day.
edgeeOpen-source AI gateway written in Rust, with token compression for Claude Code, Codex... and any other LLM client.