
coding-proxy

A High-Availability, Transparent, and Smart Multi-Vendor Proxy for Claude Code. Support Claude Plans, GitHub Copilot, Google Antigravity, ZAI/GLM, MiniMax, Qwen, Xiaomi, Kimi, Doubao...

Why this rank: Recent release, strong adoption, healthy release cadence

README

⚡ coding-proxy

A High-Availability, Transparent, and Smart Multi-Vendor Proxy for Claude Code

💡 Why Do We Even Need coding-proxy?

When you're deeply immersed in your coding "zone" with Claude Code (or any AI assistant relying on Anthropic's Messages API), there's nothing quite as soul-crushing as having your flow violently interrupted by:

  • 🛑 Rate Limiting: High-frequency pings trigger the dreaded 429 rate_limit_error. Forced to stare at the screen and rethink your life choices.
  • 💸 Usage Cap: Aggressive code generation drains your daily/monthly quota, slamming you with a cold, heartless 403 error.
  • 🌋 Overloaded Servers: Anthropic's official servers melt down during peak hours, tossing back a merciless 529 overloaded_error.

coding-proxy was forged in the developer fires to terminate these exact pain points. Serving as a purely transparent intermediate layer, it blesses your Claude Code with millisecond-level "N-tier chained fallback disaster recovery." When your primary vendor goes belly up, it seamlessly and instantly switches your requests to the next smartest available fallback (like GitHub Copilot, Google Antigravity, or even Zhipu GLM)—with zero manual intervention, and zero perceived interruption.


🌟 Core Features

  • ⛓️ N-tier Chained Failover: Automatic, priority-ordered fallback across Claude's official plans and the Coding Plans from GitHub Copilot, ZAI/GLM, MiniMax, Alibaba Qwen, Xiaomi, Kimi, Doubao, etc.
  • 🛡️ Smart Resilience & Quota Guardians: Every single vendor node comes fully armed with an independent Circuit Breaker and Quota Guard to proactively dodge avalanches without breaking a sweat.
  • 👻 Phantom-like Transparency: 100% transparent to the client! No code tweaks required. Overwrite ANTHROPIC_BASE_URL with a single line, and you're good to go.
  • 🔄 Universal Alchemy (Formats & Models): Native support for two-way request/streaming (SSE) translations between Anthropic ←→ Gemini. Plus, auto/DIY model name mapping (e.g., effortlessly morphing claude-* into glm-*).
  • 📊 Extreme Observability: Built-in, zero-BS local monitoring backed by SQLite in WAL mode. The CLI's usage command (coding-proxy usage) prints a detailed token usage dashboard.
  • ⚡ Featherweight Standalone Deployment: A fully asynchronous architecture (FastAPI + httpx). Zero dependency on Redis, message queues, or other heavy machinery—absolutely no extra baggage for your dev rig.
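
To make the model name mapping idea concrete, here is a minimal sketch of glob-style rules where the first match wins. The rule table, the `map_model` helper, and the `glm-fallback` target name are all hypothetical illustrations, not coding-proxy's actual configuration format or code.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: (glob pattern, target model). First match wins,
# so more specific patterns go first. Target names are placeholders.
MODEL_RULES = [
    ("claude-opus-*", "glm-5.1"),
    ("claude-*", "glm-fallback"),
]


def map_model(requested: str, rules=MODEL_RULES) -> str:
    """Return the mapped model name, or the requested name if no rule matches."""
    for pattern, target in rules:
        if fnmatchcase(requested, pattern):
            return target
    return requested
```

Ordering the rules from most to least specific keeps the lookup predictable: a catch-all such as `claude-*` only applies when nothing narrower matched first.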

🚀 Quick Start

1. Prerequisite Checks

Make sure your rig has Python 3.12+ and the uv package manager installed (highly recommended, because life is too short for slow package managers).

2. Lightning Install

uv add coding-proxy

3. Ignite the Proxy Server

# (Optional) Highly recommended: enable Zhipu GLM. Pass your key via an environment variable instead of hardcoding it
# export ZHIPU_API_KEY="your-api-key-here"

# Start coding-proxy with the default configuration
# The default config lives at: ~/.coding-proxy/config.yaml.
uv run coding-proxy start

# Use the `-c` flag to point to a custom config path
# uv run coding-proxy start -c ./coding-proxy.yaml

# INFO:     Started server process [1403]
# INFO:     Waiting for application startup.
# ...
# INFO:     coding-proxy started: host=127.0.0.1 port=8046
# INFO:     Application startup complete.
# INFO:     Uvicorn running on http://127.0.0.1:8046 (Press CTRL+C to quit)

4. Seamless Claude Code Integration

Open a fresh terminal tab and route your traffic through coding-proxy before firing up Claude Code:

export ANTHROPIC_BASE_URL=http://127.0.0.1:8046

# Enjoy blissful, silky-smooth, and uninterrupted coding nirvana:
claude
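
If Claude Code cannot connect, a quick sanity check is whether anything is listening on the proxy port at all. The sketch below only tests TCP reachability. It assumes the default host and port from the startup log above (127.0.0.1:8046), and `proxy_is_listening` is a hypothetical helper, not part of the coding-proxy CLI.

```python
import socket


def proxy_is_listening(host: str = "127.0.0.1", port: int = 8046,
                       timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "up" if proxy_is_listening() else "not reachable"
    print(f"coding-proxy on 127.0.0.1:8046 is {state}")
```

A successful connection only shows that some process owns the port. For real health information, `coding-proxy status` remains the tool to use.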

🛠️ The CLI Console Guide

coding-proxy comes equipped with a badass suite of CLI tools to help you boss around your proxy state.

| Command | Description | Example Usage |
| --- | --- | --- |
| `start` | Fire up the proxy server. Supports custom ports and configuration paths. | `coding-proxy start -p 8080 -c ~/config.yaml` |
| `status` | Check proxy health. Shows circuit breaker states (OPEN/CLOSED) and quota status across all tiers. | `coding-proxy status` |
| `usage` | Token Stats Dashboard. Stalks every single token consumed, failovers triggered, and latency across day/vendor/model dimensions. | `coding-proxy usage -d 7 -v anthropic` |
| `reset` | The emergency flush button. Force-reset all circuit breakers and quotas instantly when you've confirmed the main vendor is back from the dead. | `coding-proxy reset` |
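
For a feel of how a local SQLite ledger can back a per-vendor usage view, here is a small sketch. The `usage_demo` table, its columns, and the helper functions are hypothetical and deliberately simplified. They are not coding-proxy's real schema.

```python
import sqlite3


def open_usage_db(path: str) -> sqlite3.Connection:
    """Open (or create) a usage database with write-ahead logging enabled."""
    conn = sqlite3.connect(path)
    # WAL lets readers keep querying while a writer appends new rows.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        """CREATE TABLE IF NOT EXISTS usage_demo (
               day TEXT, vendor TEXT, model TEXT,
               input_tokens INTEGER, output_tokens INTEGER)"""
    )
    return conn


def record(conn, day, vendor, model, input_tokens, output_tokens):
    """Append one usage row inside a transaction."""
    with conn:
        conn.execute(
            "INSERT INTO usage_demo VALUES (?, ?, ?, ?, ?)",
            (day, vendor, model, input_tokens, output_tokens),
        )


def totals_by_vendor(conn) -> dict:
    """Return {vendor: total tokens} summed over all recorded rows."""
    rows = conn.execute(
        "SELECT vendor, SUM(input_tokens + output_tokens) "
        "FROM usage_demo GROUP BY vendor ORDER BY vendor"
    ).fetchall()
    return dict(rows)
```

Grouping by `day` or `model` instead of `vendor` would give the other dimensions the `usage` command reports.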

📐 Architectural Panorama

When a request inevitably hits the fan, the RequestRouter slides gracefully down the N-tier tree, juggling circuit breakers and token quotas to decide the ultimate destination:

graph RL
    %% Style definitions (high-contrast colors that work in both light and dark modes)
    classDef client fill:#1E3A8A,stroke:#60A5FA,stroke-width:2px,color:#EFF6FF,rx:8,ry:8
    classDef router fill:#4C1D95,stroke:#A78BFA,stroke-width:2px,color:#F5F3FF,rx:8,ry:8
    classDef gateway fill:#7C2D12,stroke:#FB923C,stroke-width:2px,color:#FFF7ED
    classDef api fill:#14532D,stroke:#4ADE80,stroke-width:2px,color:#F0FDF4
    classDef fallback fill:#27272A,stroke:#A1A1AA,stroke-width:2px,color:#F4F4F5

    Client["💻<br/>Client (Claude Code)"]:::client

    subgraph CodingProxy["⚡ coding-proxy"]
        direction RL
        
        Router["RequestRouter<br/><code>routing/router.py</code>"]:::router

        Router -->NTier

        subgraph NTier["N-tier"]
            direction TB

            subgraph Tier0 ["Tier 0: Anthropic"]
                direction RL
                G0{"CB / Quota"}:::gateway -- "✅ Pass" --> API0(("Anthropic API")):::api
            end

            subgraph Tier1 ["Tier 1: GitHub Copilot"]
                direction RL
                G1{"CB / Quota"}:::gateway -- "✅ Pass" --> API1(("Copilot API")):::api
            end

            subgraph Tier2 ["Tier 2: Google Antigravity"]
                direction RL
                G2{"CB / Quota"}:::gateway -- "✅ Pass" --> API2(("Gemini API")):::api
            end

            subgraph TierN ["Tier N: Zhipu"]
                direction RL
                APIN(("GLM API")):::fallback
            end

            Tier0 -. "❌ Blocked / API Error" .-> Tier1
            Tier1 -. "❌ Blocked / API Error" .-> Tier2
            Tier2 -. "🆘 Safety Net Downgrade" .-> TierN
        end

    end    

    Client -->|"POST /v1/messages"| CodingProxy

For a deep dive into the architecture and under-the-hood wizardry, consult framework.md (Currently in Chinese).


📚 Detailed Documentation Map

To ensure this project outlives us all (long-term maintainability), we offer exhaustive, Evidence-Based documentation:

  • 📖 User Guide — From installation and bare-minimum configs to the semantic breakdown of every config.yaml field and common troubleshooting manuals. (Currently in Chinese)
  • 🏗️ Architecture Framework — A meticulous decoding of underlying design patterns (Template Method, Circuit Breaker, State Machine, etc.), targeted at devs who want to peek into the matrix or contribute new vendors. (Currently in Chinese)
  • 🤝 Engineering Guidelines (AGENTS.md) — The systemic context mindset and AI Agent collaboration protocol. It preaches refactoring, reuse, and orthogonal abstractions and serves as the ultimate guiding light for all development in this repository.

💡 Inspiration & Acknowledgements

During our chaotic yet rewarding exploration of engineering practices, we were heavily inspired by cutting-edge tech ecosystems and brilliant designs. Special shoutouts:

  • A massive thank you to Claude Code for sparking our obsession with crafting the ultimate, seamless programming assistant experience.
  • Endless gratitude to the open-source community's myriad of API Proxy projects. Your trailblazing in reverse proxies, high-availability setups (circuit breakers/streaming proxies), and dynamic routing provided the rock-solid theoretical foundation for coding-proxy's elastic N-Tier mechanisms.

Built with 🧠, ❤️, and an absurd amount of coffee by ThreeFish-AI

Release History

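| Version | Changes | Urgency | Date |
| --- | --- | --- | --- |
| v0.5.1a7 | feat(dashboard): extend Model Calling real-time monitoring to all vendors and fix the invisible queue length (#253); fix(session-title): strip the tag prefix and raise the title storage length to 600 characters (#257); feat(session-title): multi-level fallback title extraction with deferred backfill, eliminating empty titles in the Dashboard (#258); feat(session-routing): add configurable automatic vendor binding based on Session title prefixes (#260) | High | 6/7/2026 |
| v0.5.1a4 | feat(dashboard): extend Model Calling real-time monitoring to all vendors and fix the invisible queue length (#253); fix(session-title): strip the tag prefix and raise the title storage length to 600 characters (#257) | High | 5/31/2026 |
| v0.4.1a6 | feat(zhipu): add exponential-backoff retry recovery for 429 Rate Limit (#242); fix(antigravity): fix the v1internal mode detection logic and add E2E tests (#234); fix(routes): fix incorrect attribute access to target_vendor.name in the count_tokens route (#235); fix(vendor-channels): fix the tool_use/tool_result pairing bug in the zhipu→anthropic channel (#236); fix(native-api): fix a compatibility issue where %3A URL encoding in Gemini :verb paths caused upstream 400s (#237); fix(zhipu): diagnose the preferred-tier semantic-rejection downgrade issue, improve observability, and extract shared cross-vendor sanitization functions (#243) | High | 5/23/2026 |
| 0.4.1a4 | fix(antigravity): fix the v1internal mode detection logic and add E2E tests (#234); fix(routes): fix incorrect attribute access to target_vendor.name in the count_tokens route (#235); fix(vendor-channels): fix the tool_use/tool_result pairing bug in the zhipu→anthropic channel (#236); fix(native-api): fix a compatibility issue where %3A URL encoding in Gemini :verb paths caused upstream 400s (#237) | High | 5/16/2026 |
| v0.4.1a2 | fix(antigravity): fix the v1internal mode detection logic and add E2E tests (#234); fix(routes): fix incorrect attribute access to target_vendor.name in the count_tokens route (#235); fix(vendor-channels): fix the tool_use/tool_result pairing bug in the zhipu→anthropic channel (#236); fix(native-api): fix a compatibility issue where %3A URL encoding in Gemini :verb paths caused upstream 400s (#237) | High | 5/13/2026 |
| v0.4.0 | **Important: 🚀 Session-level dedicated routing policy!** Assign a dedicated vendor to each Session and dynamically balance LLM traffic across vendors. ![session](assets/session-v0.4.0.png) Highlights: feat(session-policy): add a Session-level dedicated routing policy (#219); feat(dashboard): add a session activity panel (#222). More features: refactor(logging): remove redundant DEBUG logs already covered by the ModelCall summary line (#203); style(dashboard): widen chart tooltips so model names and usage values fit on one line (#211); fix(usage-parser): add model_served extraction for the OpenAI/Gemini SSE streaming branch (#214); fix(usage-parser): handle SSE chunks whose usage field is n… | High | 5/1/2026 |
| v0.3.1a1 | refactor(logging): remove redundant DEBUG logs already covered by the ModelCall summary line | High | 4/26/2026 |
| v0.3.0 | **Important: 🚀 Native OpenAI, Anthropic, and Gemini APIs land in Coding Proxy!** It no longer serves only Claude Code: any client compatible with the OpenAI, Anthropic, or Gemini API protocols can funnel its outbound LLM traffic through Coding Proxy. Highlights: feat(native-api): add `/api/{openai,gemini,anthropic}/**` full catch-all pass-through channels for native LLM APIs; feat(dashboard): add a real-time Web Dashboard page that aggregates traffic and usage statistics; feat(usage): `usage` distinguishes the Claude Code scenario (`'cc'`) from the native API scenario (`'api'`); refactor(vendor-channels): refactor vendor conversion channels from target-specific to a source→target binding model; docs(user-guide): add POST /v1/mes… | High | 4/20/2026 |
| v0.3.0a3 | feat(native-api): add `/api/{openai,gemini,anthropic}/**` full catch-all pass-through channels for native LLM APIs; feat(usage): `usage_log` gains four columns, `client_category` / `operation` / `endpoint` / `extra_usage_json`, which distinguish the Claude Code scenario (`'cc'`) from the native API scenario (`'api'`) and carry normalized operation names and non-standard token fields (reasoning / audio / thoughts / server_tool_use, etc.); refactor(vendor-channels): refactor vendor conversion channels from target-specific to a source→target binding model; refactor(thinking-strip): make thinking block stripping conditional, triggered only in cross-vendor scenarios; style(dashboard): shrink the cost figure on the "Today's estimated cost" card so it fits on one line | High | 4/20/2026 |
| v0.3.0a2 | feat(native-api): add `/api/{openai,gemini,anthropic}/**` full catch-all pass-through channels for native LLM APIs; feat(usage): `usage_log` gains four columns, `client_category` / `operation` / `endpoint` / `extra_usage_json`, which distinguish the Claude Code scenario (`'cc'`) from the native API scenario (`'api'`) and carry normalized operation names and non-standard token fields (reasoning / audio / thoughts / server_tool_use, etc.); refactor(vendor-channels): refactor vendor conversion channels from target-specific to a source→target binding model; refactor(thinking-strip): make thinking block stripping conditional, triggered only in cross-vendor scenarios; style(dashboard): shrink the cost figure on the "Today's estimated cost" card so it fits on one line | High | 4/19/2026 |
| v0.2.3 | feat(dashboard): add a real-time Web Dashboard page that aggregates traffic and usage statistics; ![dashboard](assets/dashboard-v0.2.3.png) docs(user-guide): add a complete API reference for POST /v1/messages; fix(request-normalizer): relocate misplaced tool_result blocks instead of stripping them, fixing Anthropic recovery failures after a cross-vendor downgrade | High | 4/16/2026 |
| v0.2.2 | feat(reset): the CLI reset command gains a -v/--vendor option that supports reordering the N-tier chain at runtime (comma-separated vendor list); fix(logging): fix uvicorn.error log lines being written twice to the log file | High | 4/13/2026 |
| v0.2.1 | feat(logging): dual log output (console + local file), with log files rotated automatically at 5MB and backups gzip-compressed; ModelCall logs demoted to DEBUG level; feat(circuit-breaker): add vendor context to circuit breaker state-transition logs | High | 4/11/2026 |
| v0.2.0 | **Important: 🚀 A massive vendor expansion × a fully evolved usage dashboard, a double strike!** Stuck under one vendor's quota ceiling? You now hold **nine lives**: five new reinforcements (MiniMax, Xiaomi MiMo, Alibaba Qwen, Kimi, Doubao) all speak Anthropic natively and plug straight into N-tier. No idea where your tokens are burning? The new `usage` command unlocks day/week/month/all-time views, compares vendors side by side, and sums it all up in one summary row. **Fuller reserves, clearer bills; from now on, outages are somebody else's story.** Highlights: **Five vendors join at once**: MiniMax, Xiaomi MiMo, Alibaba Qwen, Kimi, and Doubao (Volcano Engine) officially enter N-tier. The number of fallback channels doubles, so congestion is no longer a worry; **`usage` command fully upgraded**: evolves from "days only" to four time dimensions, **day / week / month / all-time** (`-d 7` / `-w` / `-m` / `-t`). Supports multi-value filters: `-v anthropic,kimi` or `--model claude-opus-4-6,glm-5.1`, separated by commas… | Medium | 4/10/2026 |
| v0.1.3 | **Important: 🔥 A cross-vendor "identity crisis" and a circuit breaker "playing dead", both taken down!** Zhipu's thinking blocks smuggled into Anthropic and caught on the spot → an infinite 400 downgrade loop? Slain. A circuit breaker that says "I'm fine" after a 429 while lying flat on the floor? Fixed. Two stealthy bugs wiped out in one go; smooth cross-vendor switching says goodbye to "Schrödinger's availability". Highlights: **Thinking Blocks "security checkpoint"**: Anthropic deep-copies the request body and then **precisely strips** `thinking` / `redacted_thinking` blocks from assistant messages. Historical thinking signatures no longer cross the border during a Zhipu → Anthropic migration, the 400 `invalid_request_error` is eradicated, and other vendors are unaffected; **Circuit breaker Force-Open lightning response**: adds a `force_open` parameter to `record_failure()`; when a 429/403 is detected carrying `retry_after_seco… | Medium | 4/7/2026 |
| v0.1.2 | **Important: 🔓 count_tokens finally stops "playing favorites" with Anthropic!** It fully embraces generalized multi-vendor pass-through. Paired with global active-vendor state tracking, it follows the current vendor wherever it shifts. Circuit-breaker downgrade? Seamless switching, zero perceived interruption! Highlights: **Global active-vendor state tracking**: the Router gains an active Vendor attribute, and the Executor automatically writes the name of the currently active vendor after every successful streaming or non-streaming request. It pinpoints "who is doing the work right now" and fits dynamic switching scenarios such as circuit-breaker downgrades. More features: **CI three-in-one fix**: cures the ruff lint F821/F401 import ghosts and aligns the formatter's line-length rules in one shot, so CI stays green! | Medium | 4/6/2026 |
| v0.1.1 | **Important: 🎉 The coding-proxy MVP makes its debut!** Set a single environment variable and your Claude Code is plugged into a "never-down" multi-source smart engine. Primary vendor dozing off? It switches to a fallback channel automatically within milliseconds, guarding your coding flow around the clock and saying a loud no to interruptions! Highlights: **N-tier high-availability relay**: arrange vendor priorities as you like; the built-in default downgrade chain is `Claude → GitHub Copilot → Antigravity → GLM`, so if the sky falls, the proxy holds it up; **Self-healing smart circuit breaking**: a microsecond-level state machine guards against the "avalanche effect" and is paired with exponential-backoff retries; once the primary recovers, it quietly heals and switches back; **Nemesis of surprise bills**: a geek-grade local SQLite ledger plus a multi-dimensional CLI dashboard (by day/model/vendor) breaks token consumption down to the last bit; **Smooth OAuth2 integration**: native GitHub Device Flow and Google OAuth. No more stale keys: tokens rotate automatically on expiry, so you can focus on writing code; **Multi… | Medium | 4/5/2026 |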
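
The thinking-block stripping described in the v0.1.3 notes (deep-copy the request body, then drop `thinking` / `redacted_thinking` blocks from assistant messages) can be sketched roughly as follows. The function name and overall shape are assumptions for illustration only; the project's actual implementation may differ.

```python
import copy

STRIPPED_TYPES = {"thinking", "redacted_thinking"}


def strip_thinking_blocks(body: dict) -> dict:
    """Return a copy of a Messages API body without assistant thinking blocks."""
    cleaned = copy.deepcopy(body)  # never mutate the caller's request
    for message in cleaned.get("messages", []):
        content = message.get("content")
        # Only assistant messages with block-list content are affected;
        # plain string content and user messages pass through untouched.
        if message.get("role") != "assistant" or not isinstance(content, list):
            continue
        message["content"] = [
            block for block in content
            if not (isinstance(block, dict) and block.get("type") in STRIPPED_TYPES)
        ]
    return cleaned
```

Working on a deep copy keeps the original body intact, so the same request can still be forwarded unchanged to a vendor that accepts those blocks. Note that this sketch does not handle an assistant message left with an empty content list.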
