# smg

> Engine-agnostic LLM gateway in Rust. Full OpenAI & Anthropic API compatibility across SGLang, vLLM, TRT-LLM, OpenAI, Gemini & more. Industry-first gRPC pipeline, KV cache-aware routing, chat history, tokenization caching, Responses API, embeddings, WASM plugins, MCP, and multi-tenant auth.

- **URL**: https://www.freshcrate.ai/projects/smg
- **Author**: lightseekorg
- **Category**: MCP Servers
- **Latest version**: `v1.4.1` (2026-04-09)
- **License**: Apache-2.0
- **Source**: https://github.com/lightseekorg/smg
- **Homepage**: https://lightseek.org/smg
- **Language**: Rust
- **GitHub**: 179 stars, 50 forks
- **Registry**: github
- **Tags**: `anthropic`, `anthropic-api`, `chat`, `claude`, `gemini`, `inference-gateway`, `lightseek`, `llm`, `rust`

## Description

Engine-agnostic LLM gateway in Rust. Full OpenAI & Anthropic API compatibility across SGLang, vLLM, TRT-LLM, OpenAI, Gemini & more. Industry-first gRPC pipeline, KV cache-aware routing, chat history, tokenization caching, Responses API, embeddings, WASM plugins, MCP, and multi-tenant auth.

## Recent releases

| Version | Date | Urgency | Changes |
| --- | --- | --- | --- |
| `v1.4.1` | 2026-04-09 | High | **🚀 Shepherd Model Gateway v1.4.1 Released.** Patch release with **mesh HA stability fix**, **DP rank scheduling**, **reasoning parser fixes**, and engine version bumps. **Mesh HA Stability Fix:** fixed premature worker removal during rolling deploys. Workers synced via mesh with `health: false` were being removed by the health checker before they had a chance to pass local health checks. Fix: the health checker now only removes workers whose health check actually failed this ti… |
| `v1.4.0` | 2026-04-02 | High | **🚀 Shepherd Model Gateway v1.4.0 Released.** The biggest SMG release yet, with **Kubernetes-native deployment via Helm**, a **terminal dashboard**, **200x mesh memory reduction**, **7-11x faster multimodal preprocessing**, **native Completion API over gRPC**, and **per-model retry configuration**. **Kubernetes-Native Deployment with Helm:** production-ready Helm chart for deploying SMG on Kubernetes. **One-command deployment:** `helm install smg oci://ghcr.io/lightseekorg/smg-hel…` |
| `v1.3.3` | 2026-03-21 | Low | **🚀 Shepherd Model Gateway v1.3.3 Released.** Major performance release with **7x faster mesh synchronization** and critical bug fixes. **⚡ Mesh Performance Revolution:** switched mesh serialization from JSON to bincode with dramatic performance improvements. **Benchmark results** (production workload, 1024 operations, 4000 tokens): serialization 7.1x faster (35.5ms → 5.0ms); deserialization 14.8x faster (63.4ms → 4.3ms); wire size 4.3x smaller (67.9MB → 15.7… |
| `v1.3.2` | 2026-03-17 | Low | **🚀 Shepherd Model Gateway v1.3.2 Released.** Feature release adding **multimodal support to Messages API** and Python mesh bindings. **🎨 Multimodal Support for Messages API:** complete vision/image support in the Messages API gRPC pipeline. Native image processing for Messages API requests; works across all gRPC backends (SGLang, vLLM, TensorRT-LLM); full feature parity with Anthropic's Messages API including vision. **Impact**: Messages API now supports both text and vision… |
| `v1.3.1` | 2026-03-16 | Low | **🚀 Shepherd Model Gateway v1.3.1 Released.** Minor release with operational improvements and bug fixes. **🛠️ New Features** (operational improvements): `--remove-unhealthy-workers` flag, which automatically removes workers that fail health checks; `disable_tokenizer_autoload` support, which skips automatic tokenizer loading for custom configurations. **🐛 Bug Fixes:** **Gateway**: index external workers by all discovered models (not just the primary model); **CI**: added VERSION_OVERR… |
| `v1.3.0` | 2026-03-15 | Low | **🚀 Shepherd Model Gateway v1.3.0 Released.** We're excited to announce **Shepherd Model Gateway v1.3.0**, a major release bringing **native Messages API support** and expanding our agentic workload capabilities. **🎯 Messages API: First-Class Implementation.** Native Messages API implementation with core protocol support (credits to @CatherineSue): **true first-class support** (direct protocol implementation, not a translation layer); **extended thinking** (native ThinkingCon… |
| `v1.2.0` | 2026-03-10 | Low | **🚀 Shepherd Model Gateway v1.2.0 Released!** We're thrilled to announce **Shepherd Model Gateway v1.2.0**, a transformative release featuring **enhanced event-driven cache-aware routing**, **production-ready client SDKs**, **Google Gemini integration**, and **vLLM gRPC server adoption**! **⚡ Enhanced Event-Driven Cache-Aware Routing:** **inspired by Amazon Dynamo's distributed caching principles**, SMG extends its existing cache-aware routing with real-time KV cache event subscript… |
| `v1.1.0` | 2026-02-23 | Low | **🚀 Shepherd Model Gateway v1.1.0 Released!** We're excited to announce **Shepherd Model Gateway v1.1.0**, a major feature release bringing **universal multimodal support**, **Messages API MCP integration**, and **critical production hardening** across the entire stack! **🎨 Universal Multimodal Support 🔥:** industry-leading multimodal processing across all major inference engines. **SGLang gRPC**: full multimodal pipeline with vision processing; **vLLM gRPC**: fetch +… |
| `v1.0.1` | 2026-02-13 | Low | **🎉 Introducing Shepherd Model Gateway v1.0.1!** We're thrilled to announce **Shepherd Model Gateway v1.0.1**, formerly SGLang Model Gateway. This major release marks a new chapter with a complete architectural overhaul, new enterprise features, and production-grade improvements! **🐑 Welcome to Shepherd:** **SGLang Model Gateway is now Shepherd Model Gateway (SMG)**. **Truly Engine-Agnostic Architecture**: Shepherd is your universal gateway supporting all major inference engin… |
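
The `v1.4.1` note describes a health checker that should remove a worker only when its health check actually failed, not merely because the worker arrived via mesh with `health: false`. The Rust sketch below illustrates that distinction under stated assumptions: `Probe`, `Worker`, and `apply_probes` are hypothetical names invented for this example and are not taken from the SMG codebase, whose real implementation may differ.

```rust
use std::collections::HashMap;

/// Result of probing one worker during the current health-check pass.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Probe {
    Passed,
    Failed,
    /// No local probe has completed yet, e.g. the worker was just synced via mesh.
    NotRun,
}

/// Hypothetical worker record; `healthy` stands in for the `health` flag
/// mentioned in the release note.
#[derive(Debug, Clone)]
struct Worker {
    url: String,
    healthy: bool,
}

/// Applies this pass's probe results and returns the URLs of removed workers.
/// A worker is removed only if its probe failed in this pass; a worker that is
/// merely marked unhealthy but has not been probed yet is kept.
fn apply_probes(workers: &mut Vec<Worker>, probes: &HashMap<String, Probe>) -> Vec<String> {
    let mut removed = Vec::new();
    workers.retain_mut(|w| {
        match probes.get(&w.url).copied().unwrap_or(Probe::NotRun) {
            Probe::Passed => {
                w.healthy = true;
                true
            }
            Probe::Failed => {
                w.healthy = false;
                removed.push(w.url.clone());
                false
            }
            // Not probed yet: leave the synced state untouched and keep the worker.
            Probe::NotRun => true,
        }
    });
    removed
}
```

With this shape, a worker synced as unhealthy survives until a local probe either passes (and marks it healthy) or fails (and removes it), which is the behavior the note attributes to the fix.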

## Citation

- HTML: https://www.freshcrate.ai/projects/smg
- Markdown: https://www.freshcrate.ai/projects/smg.md
- Dependencies JSON: https://www.freshcrate.ai/api/projects/smg/deps

_Generated by freshcrate.ai. Indexes GitHub releases for AI-agent ecosystem packages._