inference-gateway

An open-source, cloud-native, high-performance gateway unifying multiple LLM providers, from local solutions like Ollama to major cloud providers such as OpenAI, Groq, Cohere, Anthropic, Cloudflare an

Inference Gateway

The Inference Gateway is a proxy server that provides access to multiple language model APIs through a unified interface. It simplifies configuration and the process of sending requests to, and receiving responses from, multiple LLMs, which makes it easier to use a Mixture of Experts approach.

Key Features
Overview
Installation
Middleware Control and Bypass Mechanisms
Model Context Protocol (MCP) Integration
Metrics and Observability
Supported APIs
Configuration
Examples
SDKs
CLI Tool
Contributing
License

Key Features

📜 Open Source: Available under the MIT License.
🚀 Unified API Access: Proxy requests to multiple language model APIs, including OpenAI, Ollama, Ollama Cloud, Groq, Cohere, and others.
⚙️ Environment Configuration: Easily configure API keys and URLs through environment variables.
🔧 Tool-use Support: Enable function calling capabilities across supported providers with a unified API.
🌐 MCP Support: Full Model Context Protocol integration - automatically discover and expose tools from MCP servers to LLMs without client-side tool management.
🌊 Streaming Responses: Stream tokens in real-time as they're generated from language models.
🖼️ Vision/Multimodal Support: Process images alongside text with vision-capable models.
🐳 Docker Support: Use Docker and Docker Compose for easy setup and deployment.
☸️ Kubernetes Support: Ready for deployment in Kubernetes environments.
📊 OpenTelemetry: Monitor and analyze performance.
🛡️ Production Ready: Built with production in mind, with configurable timeouts and TLS support.
🌿 Lightweight: Includes only essential libraries and runtime, resulting in a small binary of about 10.8 MB.
📉 Minimal Resource Consumption: Designed to consume minimal resources and have a lower footprint.
📚 Documentation: Well documented with examples and guides.
🧪 Tested: Extensively tested with unit tests and integration tests.
🛠️ Maintained: Actively maintained and developed.
📈 Scalable: Easily scalable and can be used in a distributed environment with HPA in Kubernetes.
🔒 Compliance and Data Privacy: This project does not collect data or analytics, ensuring compliance and data privacy.
🏠 Self-Hosted: Can be self-hosted for complete control over the deployment environment.
⌨️ CLI Tool: Improved command-line interface for managing and interacting with the Inference Gateway.

Overview

You can horizontally scale the Inference Gateway to handle multiple requests from clients. The Inference Gateway will forward the requests to the respective provider and return the response to the client.

Note: MCP middleware components can be easily toggled on/off via environment variables (MCP_ENABLE) or bypassed per-request using headers (X-MCP-Bypass), giving you full control over which capabilities are active.

Note: Vision/multimodal support is disabled by default for security and performance. To enable image processing with vision-capable models (GPT-4o, Claude 4.5, Gemini 2.5, etc.), set ENABLE_VISION=true in your environment configuration.

The following diagram illustrates the flow:

%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#326CE5', 'primaryTextColor': '#fff', 'lineColor': '#5D8AA8', 'secondaryColor': '#006100' }, 'fontFamily': 'Arial', 'flowchart': {'nodeSpacing': 50, 'rankSpacing': 70, 'padding': 15}}}%%


graph TD
    %% Client nodes
    A["👥 Clients / 🤖 Agents"] --> |POST /v1/chat/completions| Auth

    %% Auth node
    Auth["🔒 Optional OIDC"] --> |Auth?| IG1
    Auth --> |Auth?| IG2
    Auth --> |Auth?| IG3

    %% Gateway nodes
    IG1["🖥️ Inference Gateway"] --> P
    IG2["🖥️ Inference Gateway"] --> P
    IG3["🖥️ Inference Gateway"] --> P

    %% Middleware Processing and Direct Routing
    P["🔌 Proxy Gateway"] --> MCP["🌐 MCP Middleware"]
    P --> |"Direct routing bypassing middleware"| Direct["🔌 Direct Providers"]
    MCP --> |"Middleware chain complete"| Providers["🤖 LLM Providers"]

    %% MCP Tool Servers
    MCP --> MCP1["📁 File System Server"]
    MCP --> MCP2["🔍 Search Server"]
    MCP --> MCP3["🌐 Web Server"]

    %% LLM Providers (Middleware Enhanced)
    Providers --> C1["🦙 Ollama"]
    Providers --> D1["🚀 Groq"]
    Providers --> E1["☁️ OpenAI"]

    %% Direct Providers (Bypass Middleware)
    Direct --> C["🦙 Ollama"]
    Direct --> D["🚀 Groq"]
    Direct --> E["☁️ OpenAI"]
    Direct --> G["⚡ Cloudflare"]
    Direct --> H1["💬 Cohere"]
    Direct --> H2["🧠 Anthropic"]
    Direct --> H3["🐋 DeepSeek"]

    %% Define styles
    classDef client fill:#9370DB,stroke:#333,stroke-width:1px,color:white;
    classDef auth fill:#F5A800,stroke:#333,stroke-width:1px,color:black;
    classDef gateway fill:#326CE5,stroke:#fff,stroke-width:1px,color:white;
    classDef provider fill:#32CD32,stroke:#333,stroke-width:1px,color:white;
    classDef mcp fill:#FF69B4,stroke:#333,stroke-width:1px,color:white;

    %% Apply styles
    class A client;
    class Auth auth;
    class IG1,IG2,IG3,P gateway;
    class C,D,E,G,H1,H2,H3,C1,D1,E1,Providers provider;
    class MCP,MCP1,MCP2,MCP3 mcp;
    class Direct direct;

The client sends:

curl -X POST http://localhost:8080/v1/chat/completions \
  -d '{
    "model": "openai/gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a pirate."
      },
      {
        "role": "user",
        "content": "Hello, world! How are you doing today?"
      }
    ]
  }'

Internally, the request is proxied to OpenAI. The Inference Gateway infers the provider from the model name.

You can also send the request explicitly using ?provider=openai or any other supported provider in the URL.

Finally, the client receives:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Ahoy, matey! 🏴‍☠️ The seas be wild, the sun be bright, and this here pirate be ready to conquer the day! What be yer business, landlubber? 🦜",
        "role": "assistant"
      }
    }
  ],
  "created": 1741821109,
  "id": "chatcmpl-dc24995a-7a6e-4d95-9ab3-279ed82080bb",
  "model": "N/A",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 0,
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
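The same exchange can be sketched from Python using only the standard library. This is a minimal illustration, not an official SDK: `build_chat_request` and `extract_reply` are hypothetical helper names, `GATEWAY_URL` assumes the local address from the curl example, and the `Content-Type` header is the conventional choice for JSON bodies rather than something this README specifies.

```python
import json
import urllib.request
from typing import Dict, List, Optional
from urllib.parse import urlencode

GATEWAY_URL = "http://localhost:8080"  # assumed: the local address used in the curl example


def build_chat_request(
    model: str,
    messages: List[Dict[str, str]],
    provider: Optional[str] = None,
) -> urllib.request.Request:
    """Build a POST /v1/chat/completions request without sending it."""
    url = f"{GATEWAY_URL}/v1/chat/completions"
    if provider:
        # Explicit routing through the ?provider= query parameter.
        url += "?" + urlencode({"provider": provider})
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response: dict) -> str:
    """Return the assistant text from the first choice of a completion response."""
    return response["choices"][0]["message"]["content"]
```

Passing the request to `urllib.request.urlopen` and decoding the body with `json.load` would send it to a running gateway; omitting `provider` leaves routing to the model name.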

To stream tokens, add "stream": true to the request body.
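A streamed response has to be reassembled on the client. The sketch below assumes the stream follows the OpenAI-style server-sent events convention (`data: {...}` lines carrying `choices[].delta.content`, terminated by `data: [DONE]`); this README does not spell out the wire format, so treat that as an assumption. `iter_stream_text` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

Joining the yielded fragments reconstructs the full assistant message; printing them as they arrive gives the token-by-token effect.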

Installation

Recommended: For production deployments, run the Inference Gateway as a container. This provides better isolation, easier updates, and simplified configuration management. See the Docker or Kubernetes deployment examples.

The Inference Gateway can also be installed as a standalone binary using the provided install script or by downloading pre-built binaries from GitHub releases.

Using Install Script

The easiest way to install the Inference Gateway is using the automated install script:

Install latest version:

curl -fsSL https://raw.githubusercontent.com/inference-gateway/inference-gateway/main/install.sh | bash

Install specific version:

curl -fsSL https://raw.githubusercontent.com/inference-gateway/inference-gateway/main/install.sh | VERSION=v0.22.3 bash

Install to custom directory:

# Install to custom location
curl -fsSL https://raw.githubusercontent.com/inference-gateway/inference-gateway/main/install.sh | INSTALL_DIR=~/.local/bin bash

# Install to current directory
curl -fsSL https://raw.githubusercontent.com/inference-gateway/inference-gateway/main/install.sh | INSTALL_DIR=. bash

What the script does:

Automatically detects your operating system (Linux/macOS) and architecture (x86_64/arm64/armv7)
Downloads the appropriate binary from GitHub releases
Extracts and installs to /usr/local/bin (or custom directory)
Verifies the installation

Supported platforms:

Linux: x86_64, arm64, armv7
macOS (Darwin): x86_64 (Intel), arm64 (Apple Silicon)

Manual Download

Download pre-built binaries directly from the releases page:

Download the appropriate archive for your platform

Extract the binary:

tar -xzf inference-gateway_<OS>_<ARCH>.tar.gz

Move to a directory in your PATH:

sudo mv inference-gateway /usr/local/bin/
sudo chmod +x /usr/local/bin/inference-gateway

Verify Installation

inference-gateway --version

Running the Gateway

Once installed, start the gateway with your configuration:

# Set required environment variables
export OPENAI_API_KEY="your-api-key"

# Start the gateway
inference-gateway

For detailed configuration options, see the Configuration section below.

Middleware Control and Bypass Mechanisms

The Inference Gateway uses middleware to process requests and add capabilities like MCP (Model Context Protocol). Clients can control which middlewares are active using bypass headers:

Bypass Headers

X-MCP-Bypass: Skip MCP middleware processing

Client Control Examples

# Use only standard tool calls (skip MCP)
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "X-MCP-Bypass: true" \
  -d '{
    "model": "anthropic/claude-3-haiku",
    "messages": [{"role": "user", "content": "Connect to external agents"}]
  }'

# Skip the MCP middleware for direct provider access
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "X-MCP-Bypass: true" \
  -d '{
    "model": "groq/llama-3-8b",
    "messages": [{"role": "user", "content": "Simple chat without tools"}]
  }'

When to Use Bypass Headers

For Performance:

Skip middleware processing when you don't need tool capabilities
Reduce latency for simple chat interactions

For Selective Features:

Use only standard tool calls (skip MCP): Add X-MCP-Bypass: true
Direct provider access

For Development:

Test middleware behavior in isolation
Debug tool integration issues
Ensure backward compatibility with existing applications

How It Works Internally

The MCP middleware uses the same header internally to prevent infinite loops during its operation:

MCP Processing:

When tools are detected in a response, the MCP agent makes up to 10 follow-up requests
Each follow-up request includes X-MCP-Bypass: true to skip middleware re-processing
This allows the agent to iterate without creating circular calls

Note: These bypass headers only affect middleware processing. The core chat completions functionality remains available regardless of header values.
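The loop-prevention idea can be illustrated with a small simulation. This is a simplified sketch, not the gateway's actual implementation: `run_tool_loop`, `send`, and `call_tool` are hypothetical names, and the message shapes are assumed to resemble OpenAI-style tool calls. Only the follow-up limit of 10 and the `X-MCP-Bypass: true` header come from the description above.

```python
from typing import Callable, Dict, List

MAX_FOLLOW_UPS = 10  # follow-up limit stated above
BYPASS_HEADERS = {"X-MCP-Bypass": "true"}

Message = Dict[str, object]


def run_tool_loop(
    send: Callable[[List[Message], Dict[str, str]], Message],
    call_tool: Callable[[dict], str],
    messages: List[Message],
) -> Message:
    """Resolve tool calls until the model answers or the follow-up budget runs out."""
    reply = send(messages, {})  # initial request, no bypass header
    for _ in range(MAX_FOLLOW_UPS):
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            break  # the model produced a final answer
        messages = messages + [reply]  # copy so the caller's list is untouched
        for call in tool_calls:
            messages.append(
                {"role": "tool", "tool_call_id": call["id"], "content": call_tool(call)}
            )
        # Follow-ups carry the bypass header so they are not re-processed.
        reply = send(messages, BYPASS_HEADERS)
    return reply
```

Because every follow-up carries the bypass header, a request issued by the loop can never trigger another loop, and the fixed budget bounds the total number of provider calls at 11 per client request in this sketch.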

Model Context Protocol (MCP) Integration

Enable MCP to automatically provide tools to LLMs without requiring clients to manage them:

# Enable MCP and connect to tool servers
export MCP_ENABLE=true
export MCP_SERVERS="http://filesystem-server:3001/mcp,http://search-server:3002/mcp"

# LLMs will automatically discover and use available tools
curl -X POST http://localhost:8080/v1/chat/completions \
  -d '{
    "model": "openai/gpt-4",
    "messages": [{"role": "user", "content": "List files in the current directory"}]
  }'

The gateway automatically injects available tools into requests and handles tool execution, making external capabilities seamlessly available to any LLM.
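Before starting the gateway, the two variables can be sanity-checked with a short script. This is a pre-flight sketch under assumptions: `mcp_servers` is a hypothetical helper, and the whitespace trimming and case-insensitive `true` check are choices made here, not documented gateway behavior.

```python
import os
from typing import List, Mapping
from urllib.parse import urlparse


def mcp_servers(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the configured MCP server URLs, or an empty list when MCP is off."""
    if env.get("MCP_ENABLE", "").lower() != "true":
        return []
    # MCP_SERVERS is a comma-separated list of server URLs.
    urls = [u.strip() for u in env.get("MCP_SERVERS", "").split(",") if u.strip()]
    for url in urls:
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an HTTP(S) URL: {url!r}")
    return urls
```

Calling `mcp_servers()` with no arguments reads the current process environment, which makes it easy to catch a mistyped URL before the gateway tries to connect.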

Learn more: Model Context Protocol Documentation | MCP Integration Example

Metrics and Observability

The Inference Gateway provides comprehensive OpenTelemetry metrics for monitoring performance, usage, and function/tool call activity. Metrics are automatically exported in Prometheus format and available on port 9464 by default.

Enabling Metrics

# Enable telemetry and set metrics port (default: 9464)
export TELEMETRY_ENABLE=true
export TELEMETRY_METRICS_PORT=9464

# Access metrics endpoint
curl http://localhost:9464/metrics

Available Metrics

Token Usage Metrics

Track token consumption across different providers and models:

llm_usage_prompt_tokens_total - Counter for prompt tokens consumed
llm_usage_completion_tokens_total - Counter for completion tokens generated
llm_usage_total_tokens_total - Counter for total token usage

Labels: provider, model

# Total tokens used by OpenAI models in the last hour
sum(increase(llm_usage_total_tokens_total{provider="openai"}[1h])) by (model)
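Without a Prometheus server, the counters can also be read straight from the metrics endpoint. The sketch below sums one counter across all label sets in Prometheus text exposition output; `sum_metric` is a hypothetical helper, and the parsing is deliberately minimal (it ignores escaped braces inside label values).

```python
def sum_metric(exposition: str, name: str) -> float:
    """Sum every sample of one metric in Prometheus text exposition output."""
    total = 0.0
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            metric, _, rest = line.partition("{")
            _, _, tail = rest.rpartition("}")  # text after the label set
        else:
            metric, _, tail = line.partition(" ")
        if metric != name:
            continue
        total += float(tail.split()[0])  # first field is the value
    return total
```

Feeding it the body of `http://localhost:9464/metrics` and the name `llm_usage_total_tokens_total` gives the total tokens recorded since the gateway started, across all providers and models.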

Request/Response Metrics

Monitor API performance and reliability:

llm_requests_total - Counter for total requests processed
llm_responses_total - Counter for responses by HTTP status code
llm_request_duration - Histogram for end-to-end request duration (milliseconds)

Labels: provider, request_method, request_path, status_code (responses only)

# 95th percentile request latency by provider
histogram_quantile(0.95, sum(rate(llm_request_duration_bucket{provider=~"openai|anthropic"}[5m])) by (provider, le))

# Error rate percentage by provider
100 * sum(rate(llm_responses_total{status_code!~"2.."}[5m])) by (provider) / sum(rate(llm_responses_total[5m])) by (provider)
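The arithmetic behind the error-rate query can be checked by hand. The sketch below applies the same rule (anything whose status code does not match `2..` counts as an error) to a plain mapping of status code to response count; `error_rate_percent` is a hypothetical helper and the numbers in the usage note are illustrative, not measured.

```python
import re
from typing import Mapping


def error_rate_percent(responses_by_status: Mapping[str, float]) -> float:
    """Percentage of responses whose status code is not 2xx."""
    total = sum(responses_by_status.values())
    if total == 0:
        return 0.0  # choice made here; PromQL would yield NaN for 0/0
    errors = sum(
        count
        for status, count in responses_by_status.items()
        if not re.fullmatch(r"2..", status)  # PromQL regex matches are fully anchored
    )
    return 100 * errors / total
```

For example, 95 responses with status 200, 3 with 429, and 2 with 500 give 100 * 5 / 100, an error rate of 5 percent.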

Function/Tool Call Metrics

Comprehensive tracking of tool executions for MCP and standard function calls:

llm_tool_calls_total - Counter for total function/tool calls executed
llm_tool_calls_success_total - Counter for successful tool executions
llm_tool_calls_failure_total - Counter for failed tool executions
llm_tool_call_duration - Histogram for tool execution duration (milliseconds)

Labels: provider, model, tool_type, tool_name, error_type (failures only)

Tool Types:

mcp - Model Context Protocol tools (prefix: mcp_)
standard_tool_use - Other function calls

# Tool call success rate by type
100 * sum(rate(llm_tool_calls_success_total[5m])) by (tool_type) / sum(rate(llm_tool_calls_total[5m])) by (tool_type)

# Average tool execution time by provider
sum(rate(llm_tool_call_duration_sum[5m])) by (provider) / sum(rate(llm_tool_call_duration_count[5m])) by (provider)

# Most frequently used tools
topk(10, sum(increase(llm_tool_calls_total[1h])) by (tool_name))

Monitoring Setup

Docker Compose Example

Complete monitoring stack with Grafana dashboards:

cd examples/docker-compose/monitoring/
cp .env.example .env  # Configure your API keys
docker compose up -d

# Access Grafana at http://localhost:3000 (admin/admin)

Kubernetes Example

Production-ready monitoring with Prometheus Operator:

cd examples/kubernetes/monitoring/
task deploy-infrastructure
task deploy-inference-gateway

# Access via port-forward or ingress
kubectl port-forward svc/grafana-service 3000:3000

Grafana Dashboard

The included Grafana dashboard provides:

Real-time Metrics: 5-second refresh rate for immediate feedback
Tool Call Analytics: Success rates, duration analysis, and failure tracking
Provider Comparison: Performance metrics across all supported providers
Usage Insights: Token consumption patterns and cost analysis
Error Monitoring: Failed requests and tool call error classification

Learn more: Docker Compose Monitoring | Kubernetes Monitoring | OpenTelemetry Documentation

Supported APIs

Configuration

The Inference Gateway can be configured using environment variables. The following environment variables are supported.

Vision/Multimodal Support

To enable vision capabilities for processing images alongside text:

ENABLE_VISION=true

Supported Providers with Vision:

OpenAI (GPT-4o, GPT-5, GPT-4.1, GPT-4 Turbo)
Anthropic (Claude 3, Claude 4, Claude 4.5 Sonnet, Claude 4.5 Haiku)
Google (Gemini 2.5)
Cohere (Command A Vision, Aya Vision)
Ollama (LLaVA, Llama 4, Llama 3.2 Vision)
Groq (vision models)
Mistral (Pixtral)

Note: Vision support is disabled by default for performance and security reasons. When disabled, requests with image content will be rejected even if the model supports vision.
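With vision enabled, an image travels inside the message content. The sketch below assumes the gateway accepts OpenAI-style content parts (a `text` part plus an `image_url` part holding a data URL); the exact schema is not shown in this README, so verify it against the OpenAPI specification. `image_message` is a hypothetical helper name.

```python
import base64
from typing import Dict, List


def image_message(text: str, image: bytes, mime_type: str = "image/png") -> Dict[str, object]:
    """Build a user message carrying text plus one inline image (assumed schema)."""
    encoded = base64.b64encode(image).decode("ascii")
    content: List[Dict[str, object]] = [
        {"type": "text", "text": text},
        {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{encoded}"}},
    ]
    return {"role": "user", "content": content}
```

The returned dictionary goes into the `messages` array of a chat completions request. With ENABLE_VISION unset, such a request is rejected, as the note above describes.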

Examples

Using Docker Compose
- Basic setup - Simple configuration with a single provider
- MCP Integration - Model Context Protocol with multiple tool servers
- Hybrid deployment - Multiple providers (cloud + local)
- Authentication - OIDC authentication setup
- Tools - Tool integration examples
Using Kubernetes
- Basic setup - Simple Kubernetes deployment
- MCP Integration - Model Context Protocol in Kubernetes
- Agent deployment - Standalone agent deployment
- Hybrid deployment - Multiple providers in Kubernetes
- Authentication - OIDC authentication in Kubernetes
- Monitoring - Observability and monitoring setup
- TLS setup - TLS/SSL configuration
Using standard REST endpoints

SDKs

More SDKs could be generated using the OpenAPI specification. The following SDKs are currently available:

CLI Tool

The Inference Gateway CLI provides a powerful command-line interface for managing and interacting with the Inference Gateway. It offers tools for configuration, monitoring, and management of inference services.

CLI Key Features

Status Monitoring: Check gateway health and resource usage
Interactive Chat: Chat with models using an interactive interface
Configuration Management: Manage gateway settings via YAML config
Project Initialization: Set up local project configurations
Tool Execution: LLMs can execute whitelisted commands and tools

CLI Installation

Using Go Install

go install github.com/inference-gateway/cli@latest

Using CLI Install Script

curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash

Manual CLI Download

Download the latest release from the releases page.

Quick Start

Initialize project configuration:
```
infer init
```
Check gateway status:
```
infer status
```
Start an interactive chat:
```
infer chat
```

For more details, see the CLI documentation.

License

This project is licensed under the MIT License.

Contributing

Found a bug, missing provider, or have a feature in mind?
You're more than welcome to submit pull requests or open issues for any fixes, improvements, or new ideas!

Please read the CONTRIBUTING.md for more details.

Motivation

My motivation is to build AI Agents without being tied to a single vendor. By avoiding vendor lock-in and supporting self-hosted LLMs from a single interface, organizations gain both portability and data privacy. You can choose to consume LLMs from a cloud provider or run them entirely offline with Ollama.

Release History

| Version | Changes | Urgency | Date |
| --- | --- | --- | --- |
| v0.37.0 | Features: implement POST /v1/messages Anthropic Messages API (schemas v0.11.1) (#482, 4fa28a6) | High | 7/21/2026 |
| v0.31.0 | Features: release: add Windows release assets to goreleaser build matrix (#440, 12aa00a). Improvements: codegen: Consolidate OpenAPI Go type generation on oapi-codegen (#437) … | Medium | 7/9/2026 |
| v0.27.0 | Features: add opt-in OTLP metrics push endpoint and migrate metrics to GenAI semconv (#409, 455a745), closes #404. Miscellaneous: … | High | 7/4/2026 |
| v0.26.1 | CI: claude: centralize claude.yml via reusable workflow (#398, 34e7226); deps: upgrade actions/checkout from v6.0.3 to v7.0.0 across workflows (0a4bf3d) … | High | 6/23/2026 |
| v0.26.0 | Features: add MCP tools selector via include/exclude lists (#380, 1f4aa7f); re-vendor openapi schema and regenerate chat types (#390) … | High | 6/18/2026 |
| v0.25.2 | Bug Fixes: examples: correct broken links, env vars, and setup docs (#360, b7a38ce). Documentation: examples: add docker-compose authentication example (#366) … | High | 6/11/2026 |
| v0.24.6 | Bug Fixes: mcp: Keep gateway running when all MCP servers are unreachable at startup (#306, a10910b), closes #304. Miscellaneous: … | High | 5/21/2026 |
| v0.24.4 | Miscellaneous: deps: Bump SDK to v1.16.3 and refresh generated types (#299, baa564b), closes #298 | High | 5/13/2026 |
| v0.24.3 | Bug Fixes: dependabot: Replace assignees with CODEOWNERS (f360f03). Miscellaneous: dependabot: Simplify config to root gomod, docker, and github-actions (644e958) … | High | 5/13/2026 |
| v0.24.1 | CI: deps: Bump golangci-lint to latest (6924c38); Update Claude Code action version and refine system prompt instructions (3e2fae3); Update golangci-lint installatio… | High | 5/7/2026 |
| v0.24.0 | Features: Add Google's extra content for thought_signature passing (#275, b31659b). Bug Fixes: ci: Infer workflow should be skipped when the actor was a bot (327a86c) … | High | 4/28/2026 |
| v0.23.6 | Miscellaneous: Bump dev dependencies and tool versions (#266, d851119); deps(examples): Bump @hono/node-server (#267, ec87055) … | High | 4/8/2026 |
| v0.23.5 | Miscellaneous: Add stale issues workflow to auto-close inactive issues (0989d22); Bump CI and dev containers dependencies (#255, aca7a05) … | Medium | 4/1/2026 |
| v0.23.4 | Bug Fixes: Do not run infer agent if the comment was made by dependabot (2ea9aa9) | Low | 3/5/2026 |
| v0.23.3 | Improvements: Remove deprecated web UI (#233, 1738332). Miscellaneous: deps: Bump go.opentelemetry.io/otel/sdk (#242) … | Low | 3/5/2026 |
| v0.23.2 | Bug Fixes: examples: Add missing ghcr.io prefix to oci images (02a84b8) | Low | 1/23/2026 |
| v0.23.1 | Improvements: Replace interface{} with any and add multimodal type support (#232, bfae43d) | Low | 1/23/2026 |
| v0.23.0 | Features: providers: Add Moonshot AI provider (#225, 43e5816). Miscellaneous: deps: Bump hono in /examples/docker-compose/mcp/pizza-server (#228) … | Low | 1/22/2026 |
| v0.22.10 | Miscellaneous: deps: Update qs to 6.14.1 to resolve security vulnerability (6f67514); deps: Bump @modelcontextprotocol/sdk (#226, e0802fd) … | Low | 1/7/2026 |
| v0.22.9 | Bug Fixes: install: Prefix the path with the INSTALL_DIR variable (78ed26b) | Low | 12/14/2025 |
| v0.22.8 | Improvements: Gracefully handle images sent to non-vision models (#223, 06fb970). CI: Setup infer workflow (#222, 2c235be) … | Low | 12/12/2025 |
| v0.22.7 | Documentation: Add AGENTS.md for AI agent guidance (91c2131). Miscellaneous: deps: Bump infer CLI version to its latest (df3f040); deps: Bump semantic-re… | Low | 12/11/2025 |
| v0.22.6 | Miscellaneous: deps: Bump github.com/quic-go/quic-go from 0.54.1 to 0.57.0 (#221, af3fbb5); deps: Bump github.com/quic-go/quic-go from 0.54.1 to 0.57.0 in /examples//mcp/ (#220) … | Low | 12/11/2025 |
| v0.22.5 | Bug Fixes: docs: Remove v prefix (135b3b6) | Low | 12/4/2025 |
| v0.22.4 | Documentation: Move installation right after the overview section (a3fda4a). Miscellaneous: Add markdownlint and improve docs (#218, f351330) … | Low | 12/4/2025 |
| v0.22.3 | Improvements: Propagate the actual error from the provider to the client (e3a9feb). Miscellaneous: deps: Bump golang.org/x/crypto from 0.40.0 to 0.45.0 in the examples (b475adf) … | Low | 12/4/2025 |
| v0.22.2 | Improvements: providers: Improve Claude 4 vision model detection (#213, 6a64730) | Low | 11/29/2025 |
| v0.22.1 | Improvements: providers: Improve vision model detection (#212, cc2b619). Miscellaneous: deps: Bump golang.org/x/crypto (#210) … | Low | 11/29/2025 |
| v0.22.0 | Features: config: Add DISALLOWED_MODELS configuration option (#209, 189a57a), closes #208. Bug Fixes: providers: Add Ollam… | Low | 11/21/2025 |
| v0.21.0 | Features: providers: Add support for Ollama Cloud provider (#205, bd6cf2d), closes #204. Miscellaneous: ci: Update Claude… | Low | 11/20/2025 |
| v0.20.2 | Documentation: readme: Add installation guide and version/help flags (#198, 8d03fd5) | Low | 11/17/2025 |
| v0.20.1 | Improvements: config: Add missing fields to startup debug log (#197, 354cdc7) | Low | 11/15/2025 |
| v0.20.0 | Features: api: Add multimodal image content support to Chat Completion API (#177, 6882aa2), closes #176. Miscellaneous: Add mi… | Low | 11/15/2025 |
v0.19.8	## [0.19.8](https://github.com/inference-gateway/inference-gateway/compare/v0.19.7...v0.19.8) (2025-11-15) ### 🔧 Miscellaneous * ci: Add Docker image description label to GoReleaser config ([#194](https://github.com/inference-gateway/inference-gateway/issues/194)) ([61733d4](https://github.com/inference-gateway/inference-gateway/commit/61733d4627da295617aac0ee7968b9ec00d69225)) * ci: Improve GoReleaser configuration for reproducibility and OCI compliance ([#195](https://github.com/inf	Low	11/15/2025
v0.19.7	## [0.19.7](https://github.com/inference-gateway/inference-gateway/compare/v0.19.6...v0.19.7) (2025-11-15) ### 🔧 Miscellaneous * ci: Migrate GoReleaser to dockers_v2 format ([#193](https://github.com/inference-gateway/inference-gateway/issues/193)) ([a491149](https://github.com/inference-gateway/inference-gateway/commit/a4911498bcb19f711035a5708f671a86151f4a89)), closes [#192](https://github.com/inference-gateway/inference-gateway/issues/192)	Low	11/15/2025
v0.19.6	## [0.19.6](https://github.com/inference-gateway/inference-gateway/compare/v0.19.5...v0.19.6) (2025-11-15) ### 🔧 Miscellaneous * docs: Update Kubernetes examples to use k3s v1.34.1 and ingress-nginx v4.14.0 ([#191](https://github.com/inference-gateway/inference-gateway/issues/191)) ([65a1d53](https://github.com/inference-gateway/inference-gateway/commit/65a1d53e51e2d5d8fa0103de27412d79e14ad944))	Low	11/15/2025
v0.19.5	## [0.19.5](https://github.com/inference-gateway/inference-gateway/compare/v0.19.4...v0.19.5) (2025-11-15) ### 🔧 Miscellaneous * deps: Update cosign-installer to v4.0.0 in artifacts workflow ([#190](https://github.com/inference-gateway/inference-gateway/issues/190)) ([fae964c](https://github.com/inference-gateway/inference-gateway/commit/fae964c6b5a811d3523337f416160bb61552d796))	Low	11/15/2025
v0.19.4	## [0.19.4](https://github.com/inference-gateway/inference-gateway/compare/v0.19.3...v0.19.4) (2025-11-15) ### 🔧 Miscellaneous * deps: Update dependencies and delete claude code review workflow ([#186](https://github.com/inference-gateway/inference-gateway/issues/186)) ([184ed0d](https://github.com/inference-gateway/inference-gateway/commit/184ed0d91370365d51a0c322106e02c09c40ac3a)) * deps: Update quic-go dependency to v0.54.1 across all modules ([#189](https://github.com/inference-ga	Low	11/15/2025
v0.19.3	## [0.19.3](https://github.com/inference-gateway/inference-gateway/compare/v0.19.2...v0.19.3) (2025-09-29) ### ♻️ Improvements * a2a: Remove A2A middleware and all related components ([#183](https://github.com/inference-gateway/inference-gateway/issues/183)) ([a32c7e4](https://github.com/inference-gateway/inference-gateway/commit/a32c7e4a08eae01868b645dc93521130db79de17)) ### 🐛 Bug Fixes * a2a: Prevent gateway crash when A2A agents fail to initialize ([#179](https://github.com/infer	Low	9/29/2025
v0.19.2	## [0.19.2](https://github.com/inference-gateway/inference-gateway/compare/v0.19.1...v0.19.2) (2025-09-02) ### 🐛 Bug Fixes * a2a: Improve handleStreamingTaskSubmission to process text parts ([#180](https://github.com/inference-gateway/inference-gateway/issues/180)) ([a02db25](https://github.com/inference-gateway/inference-gateway/commit/a02db2597eae918fc60911d92bd9def02f84d4f4)) ### 🔧 Miscellaneous * cli: Set owner to 'inference-gateway' in config.yaml ([3b1d35e](https://github.com	Low	9/2/2025
v0.19.2-rc.2	## [0.19.2-rc.2](https://github.com/inference-gateway/inference-gateway/compare/v0.19.2-rc.1...v0.19.2-rc.2) (2025-08-29) ### 🐛 Bug Fixes * a2a: Improve handleStreamingTaskSubmission to support both SSE and raw JSON formats for event parsing ([26dd6e5](https://github.com/inference-gateway/inference-gateway/commit/26dd6e5a65bcba1abb31678ca27958ffb68bd969))	Low	8/29/2025
v0.19.2-rc.1	## [0.19.2-rc.1](https://github.com/inference-gateway/inference-gateway/compare/v0.19.1...v0.19.2-rc.1) (2025-08-29) ### 🐛 Bug Fixes * a2a: Improve handleStreamingTaskSubmission to process text parts from task status updates ([2ea7276](https://github.com/inference-gateway/inference-gateway/commit/2ea72761f07cd95bf0de63f095b7bbc124ab592d)) ### 🔧 Miscellaneous * cli: Set owner to 'inference-gateway' in config.yaml ([3b1d35e](https://github.com/inference-gateway/inference-gateway/comm	Low	8/29/2025
v0.19.1	## [0.19.1](https://github.com/inference-gateway/inference-gateway/compare/v0.19.0...v0.19.1) (2025-08-23) ### 👷 CI * Update golangci-lint installation script to version v2.4.0 in CI workflows ([5f00ebe](https://github.com/inference-gateway/inference-gateway/commit/5f00ebecfb2f04ab5fc4afd415e6a814a143408f)) ### 📚 Documentation * Add Google models to REST endpoints documentation ([658aaf9](https://github.com/inference-gateway/inference-gateway/commit/658aaf9734ce78a3dcf92644cf5c5c6433eee9cd	Low	8/23/2025
v0.19.0	## [0.19.0](https://github.com/inference-gateway/inference-gateway/compare/v0.18.0...v0.19.0) (2025-08-07) ### ✨ Features * providers: Add Mistral AI as a provider ([#173](https://github.com/inference-gateway/inference-gateway/issues/173)) ([65d46dd](https://github.com/inference-gateway/inference-gateway/commit/65d46dd8971043dce320e21da7d3abe4c7062509))	Low	8/7/2025
v0.18.0	## [0.18.0](https://github.com/inference-gateway/inference-gateway/compare/v0.17.2...v0.18.0) (2025-08-02) ### ✨ Features * debug: Improve debug logging for better development experience ([#171](https://github.com/inference-gateway/inference-gateway/issues/171)) ([4bb1a47](https://github.com/inference-gateway/inference-gateway/commit/4bb1a47f056151c3d32e2fa632c2500a7b37d46f)) ### 🔧 Miscellaneous * Update inference-gateway image to version 0.17.2 ([a54adb4](https://github.com/inference-g	Low	8/2/2025
v0.18.0-rc.1	## [0.18.0-rc.1](https://github.com/inference-gateway/inference-gateway/compare/v0.17.2...v0.18.0-rc.1) (2025-08-02) ### ✨ Features * debug: Refactor debug logging for better development experience ([ae399ae](https://github.com/inference-gateway/inference-gateway/commit/ae399aee9c4e5251c796b92d8299cab1a611dfda)) ### ♻️ Improvements * debug: Run task generate ([c4c8b69](https://github.com/inference-gateway/inference-gateway/commit/c4c8b69ec144001c5756fed7fd074d3a45c0feca)) ### 🔧 Mis	Low	8/2/2025
v0.17.2	## [0.17.2](https://github.com/inference-gateway/inference-gateway/compare/v0.17.1...v0.17.2) (2025-08-01) ### ♻️ Improvements * a2a: Refactor service discovery to use Agent CRDs instead of A2A ([#169](https://github.com/inference-gateway/inference-gateway/issues/169)) ([d827be7](https://github.com/inference-gateway/inference-gateway/commit/d827be79afe39e10ee80a5c77276ced86b1286ed)), closes [#168](https://github.com/inference-gateway/inference-gateway/issues/168) ### 🔧 Miscellaneous * A	Low	8/1/2025
v0.17.1	## [0.17.1](https://github.com/inference-gateway/inference-gateway/compare/v0.17.0...v0.17.1) (2025-07-30) ### ♻️ Improvements * Update import paths from a2a to adk for consistency ([#167](https://github.com/inference-gateway/inference-gateway/issues/167)) ([99f57ed](https://github.com/inference-gateway/inference-gateway/commit/99f57ed6c66d627bc0bf01ce31747ea4a41030a9))	Low	7/30/2025
v0.17.0	## [0.17.0](https://github.com/inference-gateway/inference-gateway/compare/v0.16.1...v0.17.0) (2025-07-29) ### ✨ Features * a2a: Add Kubernetes service discovery for A2A agents ([#166](https://github.com/inference-gateway/inference-gateway/issues/166)) ([9be30ff](https://github.com/inference-gateway/inference-gateway/commit/9be30ff8c4a22fa26c4aef9bd00579e3cbb9b44a)), closes [#142](https://github.com/inference-gateway/inference-gateway/issues/142) ### 📚 Documentation * Add a section to e	Low	7/29/2025
v0.16.1	## [0.16.1](https://github.com/inference-gateway/inference-gateway/compare/v0.16.0...v0.16.1) (2025-07-26) ### ♻️ Improvements * workflow: Remove security-events permission and scan_containers job ([ff0f482](https://github.com/inference-gateway/inference-gateway/commit/ff0f482feaf143ebfe8da036acdde9f948f3e5e2)) ### 🔧 Miscellaneous * deps: Add Dependabot configuration for gomod, docker, and GitHub Actions ([48d368a](https://github.com/inference-gateway/inference-gateway/commit/48d368	Low	7/26/2025
v0.16.0	## [0.16.0](https://github.com/inference-gateway/inference-gateway/compare/v0.15.1...v0.16.0) (2025-07-26) ### ✨ Features * mcp: Add health checks and retry mechanisms ([#165](https://github.com/inference-gateway/inference-gateway/issues/165)) ([f1ce6af](https://github.com/inference-gateway/inference-gateway/commit/f1ce6af42a04cb6cc450702873862db17e509b3d)), closes [#164](https://github.com/inference-gateway/inference-gateway/issues/164) ### 👷 CI * Change permissions for Claude Code Rev	Low	7/26/2025
v0.15.1	## [0.15.1](https://github.com/inference-gateway/inference-gateway/compare/v0.15.0...v0.15.1) (2025-07-26) ### ♻️ Improvements * Download the latest A2A schema and generate new Go types ([#163](https://github.com/inference-gateway/inference-gateway/issues/163)) ([3c88346](https://github.com/inference-gateway/inference-gateway/commit/3c88346610b625f0858babf5e30bbde533aea7a6)) * Download the latest MCP schema, generate new Go types and update the code ([#162](https://github.com/inference-ga	Low	7/26/2025
v0.15.0	## [0.15.0](https://github.com/inference-gateway/inference-gateway/compare/v0.14.1...v0.15.0) (2025-07-26) ### ✨ Features * providers: Add Google OpenAI-compatible API provider ([#161](https://github.com/inference-gateway/inference-gateway/issues/161)) ([d367fe9](https://github.com/inference-gateway/inference-gateway/commit/d367fe924091291a62013bbe5e1cd3d83a4a6082)), closes [#146](https://github.com/inference-gateway/inference-gateway/issues/146) ### ♻️ Improvements * auth: Rename au	Low	7/26/2025
v0.14.1	## [0.14.1](https://github.com/inference-gateway/inference-gateway/compare/v0.14.0...v0.14.1) (2025-07-25) ### ♻️ Improvements * config: Refactor authentication config to use AUTH_ prefix ([#159](https://github.com/inference-gateway/inference-gateway/issues/159)) ([c97bdd1](https://github.com/inference-gateway/inference-gateway/commit/c97bdd15a14f067a5d46eb501dabc22e3f816cb6)) * docs: Remove outdated section on function/tool call metrics from README ([9ec242c](https://github.com/infere	Low	7/25/2025
v0.14.0	## [0.14.0](https://github.com/inference-gateway/inference-gateway/compare/v0.13.0...v0.14.0) (2025-07-25) ### ✨ Features * config: Add configurable TELEMETRY_METRICS_PORT setting ([#152](https://github.com/inference-gateway/inference-gateway/issues/152)) ([daa066e](https://github.com/inference-gateway/inference-gateway/commit/daa066e08e45c5e17a61319e0fcf3724ddf79259)), closes [#151](https://github.com/inference-gateway/inference-gateway/issues/151) * mcp: Add "mcp_" prefix to MCP tool	Low	7/25/2025
v0.13.0	## [0.13.0](https://github.com/inference-gateway/inference-gateway/compare/v0.12.0...v0.13.0) (2025-07-25) ### ✨ Features * a2a: Implement retry mechanism for agent connections ([#140](https://github.com/inference-gateway/inference-gateway/issues/140)) ([54033e8](https://github.com/inference-gateway/inference-gateway/commit/54033e8ef4a5489bb6212b715e7f34a7e1d3931a)), closes [#139](https://github.com/inference-gateway/inference-gateway/issues/139) * Implement A2A agent status polling with b	Low	7/25/2025
v0.13.0-rc.1	## [0.13.0-rc.1](https://github.com/inference-gateway/inference-gateway/compare/v0.12.0...v0.13.0-rc.1) (2025-06-24) ### ✨ Features * Add support for configuration file path and update related documentation ([e0bceb6](https://github.com/inference-gateway/inference-gateway/commit/e0bceb6042999cba3ee36923f86f42207af94237)) * Implement SIGHUP signal handling for dynamic config reload ([470e7f3](https://github.com/inference-gateway/inference-gateway/commit/470e7f3992ae8799700a366f2cf233f7ad873c07)	Low	6/24/2025
v0.12.0	## [0.12.0](https://github.com/inference-gateway/inference-gateway/compare/v0.11.2...v0.12.0) (2025-06-18) ### ✨ Features * A2A - Add ListAgentsHandler and GetAgentHandler endpoint to retrieve specific agent details by ID ([#129](https://github.com/inference-gateway/inference-gateway/issues/129)) ([2250fba](https://github.com/inference-gateway/inference-gateway/commit/2250fbabb15746a21a508806f73d5941d7588091)) ### 📚 Documentation * Update README with new A2A integration examples for Docker	Low	6/18/2025
v0.11.2	## [0.11.2](https://github.com/inference-gateway/inference-gateway/compare/v0.11.1...v0.11.2) (2025-06-15) ### ♻️ Improvements * Move all MCP related logic to the Middleware ([#127](https://github.com/inference-gateway/inference-gateway/issues/127)) ([6e4375e](https://github.com/inference-gateway/inference-gateway/commit/6e4375eb0e890ce770958023b4fca0cf3e7294e2))	Low	6/15/2025
v0.11.1	## [0.11.1](https://github.com/inference-gateway/inference-gateway/compare/v0.11.0...v0.11.1) (2025-06-15) ### ♻️ Improvements * Rename internal headers to use 'Bypass' terminology ([#126](https://github.com/inference-gateway/inference-gateway/issues/126)) ([c93c75c](https://github.com/inference-gateway/inference-gateway/commit/c93c75c975565e0190f0ff0ee00763a31a582dd7))	Low	6/15/2025
v0.11.0	## [0.11.0](https://github.com/inference-gateway/inference-gateway/compare/v0.10.2...v0.11.0) (2025-06-12) ### ✨ Features * a2a: Implement A2A streaming mode and response handling ([#124](https://github.com/inference-gateway/inference-gateway/issues/124)) ([269c74f](https://github.com/inference-gateway/inference-gateway/commit/269c74f38ceeb8bbab89c516df2a7810a29d534d))	Low	6/12/2025

Similar Packages

voidllm: Privacy-first LLM proxy and AI gateway: load balancing, multi-provider routing, API key management, usage tracking, rate limiting. Self-hosted. Zero knowledge of your prompts. (v0.0.23)

Agenvoy: Agentic framework | Self-improving memory | Pluggable tool extensions | Sandbox execution (v0.28.27)

helix: ♾️ Private Agent Fleet with Spec Coding. Each agent gets their own GPU-accelerated desktop. Run Claude, Codex, Gemini and open models on a full private AI Stack ♾️ (2.11.57)

ai-gateway: One API for 25+ LLMs, OpenAI, Anthropic, Bedrock, Azure. Caching, guardrails & cost controls. Go-native LiteLLM & Kong AI Gateway alternative. (v1.3.2)

axonhub: ⚡️ Open-source AI Gateway: Use any SDK to call 100+ LLMs. Built-in failover, load balancing, cost control & end-to-end tracing. (v1.0.0-beta5)

More in Infrastructure

llm7.io: LLM7.io offers a single API gateway that connects you to a wide array of leading AI models from various providers.

models: This repository contains comprehensive pricing and configuration data for LLMs. It powers cost attribution for 200+ enterprises running 400B+ tokens through Portkey AI Gateway every day.

control-layer: The world’s fastest AI model gateway (450x less overhead than LiteLLM). Unified access to LLMs across endpoints (OpenAI, self-hosted, etc.) behind a single authentication layer - with API key generati

chak-ai: A simple, yet handy, LLM gateway.