| 2026.6.0 | > [!CAUTION] > **Security Advisory** > > This release fixed a high-risk vulnerability affecting the TensorZero Gateway. > > Please refer to the security advisory for more details: https://github.com/tensorzero/tensorzero/security/advisories/GHSA-824w-x939-6cmc | High | 6/4/2026 |
| 2026.5.2 | **New Features** - Accept both strings and array of strings for `stop` in the OpenAI-compatible inference endpoint (thanks @pragnyanramtha). - Emit additional OpenInference attributes for Arize compatibility. | High | 5/20/2026 |
| 2026.5.1 | **Bug Fixes** - Treat SSE body decoding errors as fatal. | High | 5/15/2026 |
| 2026.5.0 | > [!CAUTION] > **Breaking Changes** > > - The UI will now require authentication when the gateway requires authentication. Previously, the UI only required authentication for gateway usage. **New Features** - Improve error handling (e.g. status code propagation) and logging for complex streaming inferences (e.g. fallbacks). _& multiple under-the-hood and UI improvements (thanks @arisp)_ | High | 5/8/2026 |
| 2026.4.1 | > [!CAUTION] > **Breaking Changes** > > - The gateway now defaults to async observability writes to reduce tail latency: inferences are sent to the client before they are persisted in the database. To restore the previous behavior, set `observability.async_writes = false`. **[[docs]](https://www.tensorzero.com/docs/gateway/configuration-reference)** > [!WARNING] > **Deprecations** > > - Removed the TensorZero Autopilot "Sessions" page from the UI. We recently added a TensorZero MCP that | High | 4/24/2026 |
| 2026.4.0 | **New Features** - Add an MCP server to the gateway exposing its API in `/mcp`. - Report provider prompt caching statistics via API and UI. - Report usage statistics (e.g. tokens, latency, cost) for inference evaluations via CLI tool, API, and UI. - Add the Prometheus metrics `tensorzero_input_tokens_total` and `tensorzero_output_tokens_total`. - Add configuration field `content_type_overrides` to handle file inputs for long-tail providers. _& multiple under-the-hood and UI improvement | High | 4/2/2026 |
| 2026.3.4 | > [!WARNING] > **Planned Deprecations** > > - The configuration for inference evaluations should be nested under the relevant functions moving forward **[[docs]](https://www.tensorzero.com/docs/evaluations/inference-evaluations/tutorial)**. You can run evaluations by providing a function name and a list of evaluators. The legacy format will be removed in a future release. > ``` > [functions.write_haiku.evaluators.exact_match] > type = "exact_match" > ``` > - The legacy implementa | Medium | 3/26/2026 |
| 2026.3.3 | **Bug Fixes** - Fixed two edge cases affecting batch inference. - Fixed a UI bug affecting "Try with..." with inputs that include base64 files. - Removed assistant message prefill for JSON functions + Anthropic (deprecated by Anthropic). **New Features** - Added an implementation of GEPA (automated prompt engineering) based on durable workflows. - Allow users to specify duplicate tool calls in `all_of` tool evaluators to evaluate parallel tool calling. - Allow users to specify an ex | Low | 3/18/2026 |
| 2026.3.2 | **Bug Fixes** - Fixed a UI issue that prevented certain pages from rendering when depending on historical configuration. **New Features** - Added Postgres as an alternative observability backend to ClickHouse. Postgres is the simplest way to get started; we recommend ClickHouse if you're handling >100 RPS. - Added the `openrouter::xxx` shorthand for embedding models. - Added support for per-session API keys in the browser (instead of a global environment variable) when auth is enabl | Low | 3/13/2026 |
| 2026.3.1 | > [!WARNING] > **Completed Deprecations** > > - Removed the deprecated `model_provider_name` filter for `extra_body` and `extra_headers`. Please use `model_name` and `provider_name` instead. > - Removed the legacy experimental `list_inferences` endpoint and method. Please use the new endpoint instead. **[[docs]](https://www.tensorzero.com/docs/observability/query-historical-inferences)** > - Removed several long-deprecated types and methods from the TensorZero Python SDK. > [!WARNING] > | Low | 3/5/2026 |
| 2026.3.0 | > [!WARNING] > **Completed Deprecations** > > - The deprecated Prometheus metric `tensorzero_inference_latency_overhead_seconds_histogram` was removed. Use `tensorzero_inference_latency_overhead_seconds` instead. > [!WARNING] > **Planned Deprecations** > > - The configuration for experimentation (e.g. `static_weights`, `track_and_stop`) was simplified. The old notation will be removed in a future release. See **[Run adaptive A/B tests](https://www.tensorzero.com/docs/experimentation/run | Low | 3/4/2026 |
| 2026.2.2 | > [!CAUTION] > **Breaking Changes** > > - The `--config-file` globbing behavior has changed: single-level wildcards (`*`) no longer match files across directory boundaries. To match files across directory boundaries, use recursive wildcards (`**`). This aligns the behavior with standard glob semantics. For example: > - `--config-file *.toml` matches `tensorzero.toml`, but not `subdir/tensorzero.toml`. > - `--config-file **/*.toml` matches both `tensorzero.toml` and `subdir/tensorzero.to | Low | 2/26/2026 |
| 2026.2.1 | > [!CAUTION] > **Breaking Changes** > > - The default value for `cache_options.enabled` changed from `write_only` to `off`. **New Features** - Support reasoning models from Groq, Mistral, and vLLM. - Support multi-turn reasoning with Gemini and OpenAI-compatible models. - Support embedding models from Together AI. - Add configurable `total_ms` timeout to streaming inferences. - Display charts with top-k evaluation results in the TensorZero Autopilot UI. - Add "Ask Autopilot" button | Low | 2/16/2026 |
| 2026.2.0 | > [!WARNING] > **Planned Deprecations** > > - Anthropic's structured output feature is out of beta, so the TensorZero configuration field `beta_structured_outputs` is now ignored and deprecated. It'll be removed in a future release. **Bug Fixes** - Fix a regression in the `aws_bedrock` provider that affected long-term bearer API keys. - Fix a horizontal overflow issue for tool calls and results in the inference detail UI page. **New Features** - Add YOLO Mode for TensorZero Autop | Low | 2/5/2026 |
| 2026.1.8 | **Bug Fixes** - Fix a race condition in the TensorZero Autopilot UI that could disable the chat input. - Increase timeouts for slow tool calls triggered by TensorZero Autopilot (e.g. evaluations). _& multiple under-the-hood and UI improvements!_ | Low | 1/30/2026 |
| 2026.1.7 | **New Features** - [Preview] TensorZero Autopilot: an automated AI engineer that analyzes LLM observability data, optimizes prompts and models, sets up evals, and runs A/B tests. **[Learn more →](https://www.tensorzero.com/)** **[Join the waitlist →](https://tensorzero.com/autopilot-waitlist)** - Support multi-turn reasoning for xAI (`reasoning_content` only). _& multiple under-the-hood and UI improvements!_ | Low | 1/30/2026 |
| 2026.1.6 | > [!CAUTION] > **Breaking Changes** > > - Moving forward, TensorZero will use the OpenAI API's error format (`{"error": {"message": "Bad!"}}`) instead of TensorZero's error format (`{"error": "Bad!"}`) in the OpenAI-compatible endpoints. > [!WARNING] > **Planned Deprecations** > > - When using `unstable_error_json` with the OpenAI-compatible inference endpoint, use `tensorzero_error_json` instead of `error_json`. For now, TensorZero will emit both fields with identical data. The TensorZe | Low | 1/30/2026 |
| 2026.1.5 | > [!CAUTION] > **Breaking Changes** > > - TensorZero will normalize the reported `usage` from different model providers. Moving forward, `input_tokens` and `output_tokens` include all token variations (provider prompt caching, reasoning, etc.), just like OpenAI. Tokens cached by TensorZero remain excluded. You can still access the raw usage reported by providers with `include_raw_usage`. > [!WARNING] > **Planned Deprecations** > > - Migrate `include_original_response` to `include_raw_re | Low | 1/24/2026 |
| 2026.1.2 | **New Features** - Support appending to arrays with `extra_body` using the `/my_array/-` notation. - Handle cross-model thought signatures in GCP Vertex AI Gemini and Google AI Studio. _& multiple under-the-hood and UI improvements (thanks @ecalifornica!)_ | Low | 1/15/2026 |
| 2026.1.1 | > [!WARNING] > **Planned Deprecations** > > - In a future release, the parameter `model` will be required when initializing `DICLOptimizationConfig`. The parameter remains optional (defaults to `openai::gpt-5-mini`) in the meantime. **Bug Fixes** - Stop buffering `raw_usage` when streaming with the OpenAI-compatible inference endpoint; instead, emit `raw_usage` as soon as possible, just like in the native endpoint. - Stop reporting zero usage in every chunk when streaming a cached infe | Low | 1/14/2026 |
| 2026.1.0 | > [!CAUTION] > **Breaking Changes** > > - The Prometheus metric `tensorzero_inference_latency_overhead_seconds` will report a histogram instead of a summary. You can customize the buckets using `gateway.metrics.tensorzero_inference_latency_overhead_seconds_buckets` in the configuration (default: 1ms, 10ms, 100ms). > [!WARNING] > **Planned Deprecations** > > - Deprecate the `TENSORZERO_CLICKHOUSE_URL` environment variable from the UI. Moving forward, the UI will query data through the ga | Low | 1/10/2026 |
| 2025.12.6 | > [!CAUTION] > **Breaking Changes** > > - Migrated the following optimization fields from the TensorZero Python SDK to the configuration: > - **`DICLOptimizationConfig`:** removed `credential_location`. > - **`FireworksSFTConfig`:** moved `account_id` to `[provider_types.fireworks.sft]`; removed `api_base` and `credential_location`. > - **`GCPVertexGeminiSFTConfig`:** moved `bucket_name`, `bucket_path_prefix`, `kms_key_name`, `project_id`, `region`, and `service_account` to `[prov | Low | 12/26/2025 |
| 2025.12.5 | > [!WARNING] > **Planned Deprecations** > > - The variant type `experimental_chain_of_thought` will be deprecated in `2026.2+`. As reasoning models are becoming prevalent, please use their native reasoning capabilities. > - The `timeout_s` configuration field for best/mixture-of-N variants will be deprecated in `2026.2+`. Please use the `[timeouts]` block in the configuration for their candidates instead. **New Features** - Expand the dataset builder in the UI to support complex querie | Low | 12/23/2025 |
| 2025.12.3 | **Bug Fixes** - Fix a bug where negative tag filters (e.g. `user_id != 1`) matched inferences and datapoints without that tag. - Fix a bug where metric filters covering default values (e.g. `exact_match = false`) matched inferences without that metric. - Fix a regression affecting the logger in the UI. **New Features** - Improve the performance of the inference and datapoint list pages in the UI. - Support filtering inferences by whether they have a demonstration. _& multiple unde | Low | 12/17/2025 |
| 2025.12.2 | **Bug Fixes** - Fix a performance regression affecting the inference table in the UI. **New Features** - Allow users to customize the log level in the UI (`TENSORZERO_UI_LOG_LEVEL`). _& multiple under-the-hood and UI improvements_ | Low | 12/12/2025 |
| 2025.12.1 | **Bug Fixes** - Fixed a regression that broke the dataset builder in the UI. _& multiple under-the-hood and UI improvements_ | Low | 12/12/2025 |
| 2025.12.0 | > [!CAUTION] > **Breaking Changes** > > - Unknown content blocks now return the scope as `model_name` and `provider_name` instead of the fully-qualified `model_provider_name`. > [!WARNING] > **Planned Deprecations** > > - The TensorZero UI now reads the configuration from the gateway (instead of reading directly from the filesystem). The environment variables `TENSORZERO_UI_CONFIG_PATH` and `TENSORZERO_UI_DEFAULT_CONFIG` are deprecated and ignored. You no longer need to mount the config | Low | 12/11/2025 |
| 2025.11.6 | **Bug Fixes** - Handle a regression in ClickHouse `latest` that affected the endpoint for deleting datapoints. **New Features** - Support running evaluations programmatically on specific datapoints (`datapoint_ids`). - Generate `values.schema.json` for the Helm chart. (thanks @Erin-Boehmer!) | Low | 11/27/2025 |
| 2025.11.5 | > [!CAUTION] > **Breaking Changes** > > - Moving forward, explicit `tensorzero::params` will take precedence over conflicting native parameters when using the OpenAI-compatible inference endpoint. > [!WARNING] > **Planned Deprecations** > > - Rename `json_mode="implicit_tool"` to `json_mode="tool"`. > - Set `model_name` and optionally `provider_name` instead of `model_provider_name` in `extra_body` and `extra_headers` objects supplied at inference time. Alternatively, don't include a s | Low | 11/21/2025 |
| 2025.11.4 | > [!CAUTION] > **Breaking Changes** > > - Moving forward, `allowed_tools` must include dynamic tools (tools specified at inference time rather than in configuration). This matches the OpenAI API behavior. Previously, TensorZero assumed that dynamic tools were always allowed. > [!WARNING] > **Planned Deprecations** > > - Use `limit` instead of `page_size` with the programmatic observability methods. Previously, the methods mixed these two fields. > - Don't nest fields in `metadata` or ` | Low | 11/19/2025 |
| 2025.11.3 | **Bug Fixes** - Enable TLS support for Postgres connections. - Fix handling of user-defined tags in batch inference. _& multiple under-the-hood and UI improvements_ | Low | 11/11/2025 |
| 2025.11.2 | > [!CAUTION] > **Breaking Changes** > > - Moving forward, the gateway will attempt any `fallback_variants` in order rather than randomly sample them. **Bug Fixes** - Fix a bug that prevented some model inferences from being rendered correctly in the UI. - Handle non-image base64 file inputs consistently in the OpenAI-compatible inference endpoint. - Handle `raw_response` correctly for batch inference with GCP Vertex AI Gemini. **New Features** - Apply the `tensorzero::api_key_pu | Low | 11/6/2025 |
| 2025.11.1 | **Bug Fixes** - Fix a regression that prevented batch inferences from being rendered in the UI. - Handle missing Postgres credentials gracefully in the UI. **New Features** - Support rate limiting by API key (`api_key_public_id`). - Add native `service_tier` inference parameter (supported providers: Anthropic, Azure, Groq, OpenAI). `extra_body` is no longer necessary. - Add native `detail` parameter for input images (supported providers: Azure, OpenAI, xAI). `extra_body` is no longer | Low | 11/5/2025 |
| 2025.11.0 | > [!WARNING] > **Completed Deprecations** > > - Completed the planned deprecation of the configuration field `enable_template_filesystem_access` in favor of `template_filesystem_access.enabled`. **Bug Fixes** - Handle the `global` region correctly for GCP Vertex Anthropic. - Fix `output` format for JSON functions in the new endpoint for updating datapoints (`PATCH /v1/{dataset_name}/datapoints`). The `output` field now matches the inference endpoint (an object with a `raw` field; `pars | Low | 11/3/2025 |
| 2025.10.9 | > [!CAUTION] > **Notice on `2025.10.8`:** We ran into a technical issue during the release process for `2025.10.8` that resulted in a broken build for the TensorZero Python SDK on PyPI. We've yanked that release and recommend upgrading to this version. > [!CAUTION] > **Breaking Changes** > > - This release includes small breaking changes to the programmatic observability/dataset APIs (e.g. `list_datapoints`, `experimental_list_inferences`) and the underlying data schema. Moving forward, T | Low | 10/31/2025 |
| 2025.10.7 | > [!CAUTION] > **Breaking Changes** > > - The default value for `fetch_and_encode_input_files_before_inference` is changing from `true` to `false`. As a result, the gateway will no longer fetch input files before inference, but instead will fetch them in parallel with inference (for observability). In rare cases, this may cause the gateway to receive different input files than those received by model providers. > [!WARNING] > **Planned Deprecations** > > - Migrate file content blocks fr | Low | 10/23/2025 |
| 2025.10.6 | > [!WARNING] > **Planned Deprecations** > > - We're renaming "static evaluations" to "inference evaluations" and "dynamic evaluations" to "workflow evaluations". The only action needed is to update `type = "static"` in the configuration to `type = "inference"`. Both versions will be supported until `2026.2+`. **Bug Fixes** - Fix a bug that dropped tool IDs in output `tool_call` content blocks when updating datapoints. - Prefer magic bytes over the `Content-Type` HTTP response header to | Low | 10/21/2025 |
| 2025.10.5 | **Bug Fixes** - Add `FinishReason.STOP_SEQUENCE` to the TensorZero Python SDK. | Low | 10/20/2025 |
| 2025.10.4 | > [!WARNING] > **Planned Deprecations** > > - The `bulk_insert_datapoints` method (`POST /datasets/{dataset_name}/datapoints/bulk`) will be renamed to `create_datapoints` (`POST /datasets/{dataset_name}/datapoints`). Both methods will be available until `2026.2+`. (thanks @BrianLi23!) > [!WARNING] > **Completed Deprecations** > > - Concluded many small ongoing deprecations: > > - Python SDK: renamed the types `*InferenceDataset` → `*InferenceDatapoint` and `*Node` → `*Filter` > - | Low | 10/17/2025 |
| 2025.10.3 | **Bug Fixes** - Fix bug in the Playground UI that caused inferences containing static tools with custom names (`tools.my_tool.name`) to fail. | Low | 10/11/2025 |
| 2025.10.2 | > [!WARNING] > **Planned Deprecations** > > - Currently, the gateway automatically includes all dynamic tools in the list of allowed tools. In a near-future release, dynamic tools will no longer be included automatically. If you intend for your dynamic tools to be allowed, please allow them explicitly. > [!WARNING] > **Completed Deprecations** > > - Finish renaming `datapoint_name` → `task_name` for dynamic evaluations. > - Stop including `--config-file` in the `Dockerfile` for `tensor | Low | 10/10/2025 |
| 2025.10.1 | **New Features** - Increase default body limit to 100MB for `patch_openai_client`. _& multiple under-the-hood and UI improvements_ | Low | 10/4/2025 |
| 2025.10.0 | > [!WARNING] > **Planned Deprecations** > > - Configure timeouts for embedding models and embedding model providers with `timeout_ms` instead of `timeouts.non_streaming.total_ms`. The latter will be removed in a future release (`2026.1+`). > - Use the gateway CLI flags `--run-clickhouse-migrations` and `--run-postgres-migrations` instead of `--run-migrations-only`. `--run-migrations-only` requires credentials for both databases, even though Postgres is an optional dependency, so it will be r | Low | 10/2/2025 |
| 2025.9.6 | **Bug Fixes** - Implemented a workaround for an upstream bug in `opentelemetry-otlp` that caused our OTLP exporter to fail to send data to encrypted endpoints. **New Features** - Added multiple small improvements to the evaluations UI to streamline common workflows and simplify debugging. _& multiple under-the-hood and UI improvements_ | Low | 9/29/2025 |
| 2025.9.5 | **New Features** - Add model observability page to the UI with model throughput and latency analytics. - Add support for OpenInference format when exporting OpenTelemetry traces. - Expand support of UI features for the default function (e.g. "Try with model"). - Add support for supervised fine-tuning (SFT) with GCP Vertex AI Gemini in the UI. - Improve the performance of episode table in the UI. - Add an example of using the programmatic workflow for dynamic in-context learning. _& mu | Low | 9/25/2025 |
| 2025.9.4 | > [!WARNING] > **Planned Deprecations** > > - Rename types from `Dicl*` to `DICL*` in the Python SDK for consistency. Both versions work for now, and the deprecated types will be removed in a future release (`2025.12+`). **Bug Fixes** - Fix a regression in the UI that prevented `chat` datapoints from being edited. **New Features** - Expand the prompt templates and schemas functionality to support unlimited templates per function. - Support appending to existing DICL variants in t | Low | 9/16/2025 |
| 2025.9.3 | **New Features** - Add support for dynamic OTLP headers when exporting OpenTelemetry traces. - Add support for `allowed_tools` field in the OpenAI-compatible inference endpoint. - Improve performance by automatically adjusting the number of HTTP2 connections to model providers based on concurrency. _& multiple under-the-hood and UI improvements (thanks @yuria-loo!)_ | Low | 9/12/2025 |
| 2025.9.1 | **Bug Fixes** - Fix a regression that prevented rendering of inferences with `thought` content blocks in the UI. - Stop logging HTTP requests and responses twice in debug mode. **New Features** - Add a programmatic API for reinforcement fine-tuning (RFT) with OpenAI. - Provide defaults for individual fields in the `retries` configuration. - Allow users to specify the Azure provider endpoint dynamically. (thanks @Dineshm-coder!) - Improve error messages when the gateway is missing cr | Low | 9/8/2025 |
| 2025.9.0 | > [!CAUTION] > > **Breaking Changes** > > - The bug fix for `feedback_id` technically introduces a breaking change in the TensorZero Python SDK. The field is no longer incorrectly doubly nested and now matches the SDK's type annotations. > [!WARNING] > **Completed Deprecations** > > - `json_mode` is now required for JSON function variants. **Bug Fixes** - Added workarounds for two ClickHouse regressions (ClickHouse/ClickHouse#86415, ClickHouse/ClickHouse#86557) introduced in Clic | Low | 9/3/2025 |
| 2025.8.5 | **Bug Fixes** - Reduce the ClickHouse memory footprint in large deployments with human feedback for evaluations. **New Features** - Add a programmatic optimization interface for dynamic in-context learning. - Expose more hyperparameters for programmatic supervised fine-tuning with Together AI. _& many under-the-hood and UI improvements (thanks @quangIO!)_ | Low | 8/29/2025 |
| 2025.8.4 | > [!WARNING] > **Planned Deprecations** > > * The OpenAI-compatible embeddings endpoint will require the prefix `tensorzero::embedding_model_name::` for model names (e.g. `tensorzero::embedding_model_name::openai::text-embedding-3-small`). Support for unprefixed names will be removed in a future release (`2025.12+`). **Bug Fixes** - Fix a ClickHouse warning that occurred when a model inference had input tokens set to null and output tokens non-null, or vice versa. This issue only caused | Low | 8/27/2025 |
| 2025.8.3 | > [!CAUTION] > **Breaking Changes** > > * **Temporarily removing support for batching writes to ClickHouse with the embedded gateway in Python:** In the previous release, we added support for batching writes to ClickHouse to boost ingest throughput and reduce insert overhead at scale (default off). Later, we discovered that in rare scenarios, the Python GIL could interfere with this setting in embedded clients and cause a deadlock. While we investigate a solution, we are removing support for | Low | 8/21/2025 |
| 2025.8.2 | **New Features** - Add a Playground to the UI to compare variants side-by-side, iterate on prompts quickly, and replay inference requests. - Support batching writes to ClickHouse to boost ingest throughput and reduce insert overhead at scale. - Add a Jupyter notebook recipe for supervised fine-tuning with Unsloth. _& many under-the-hood and UI improvements (thanks @contrun @lblack00!)_ | Low | 8/12/2025 |
| 2025.8.1 | **New Features** * Add an OpenAI-compatible endpoint for embeddings, with support for OpenAI (& OpenAI-compatible) and Azure OpenAI Service model providers. * Add support for self-hosted replicated ClickHouse databases. * Parse `reasoning_content` from Fireworks and vLLM model providers. * Improve error messages for AWS Bedrock and AWS SageMaker model providers. **Bug Fixes** * Allow configuration to specify `description` for JSON functions. * Fix a regression where function descrip | Low | 8/11/2025 |
| 2025.8.0 | **New Features** - Add `gateway.observability.skip_completed_migrations` configuration option to reduce gateway startup time and database load. When enabled, the gateway will skip running the ClickHouse migration workflow (i.e. verifying and potentially applying every migration) on startup for migrations that are already present in a database table that tracks migration history. - Support `raw_text` content blocks in the OpenAI-compatible inference endpoint. (Thanks @hongantran3804 @pykm05 @ | Low | 8/6/2025 |
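
The 2026.1.6 entry in the table above describes a change of error format in the OpenAI-compatible endpoints: from `{"error": "Bad!"}` to the OpenAI-style `{"error": {"message": "Bad!"}}`. A client that may talk to gateways on either side of that release could read both shapes. The following is only a minimal sketch, assuming the response body has already been parsed from JSON; `extract_error_message` is a hypothetical helper name chosen for illustration and is not part of any TensorZero SDK.

```python
from typing import Any, Optional


def extract_error_message(body: Any) -> Optional[str]:
    """Return the error message from a parsed JSON error body, or None.

    Hypothetical helper (an assumption for illustration, not a TensorZero API).
    It accepts both shapes named in the 2026.1.6 notes:
      - OpenAI-style:      {"error": {"message": "Bad!"}}
      - legacy TensorZero: {"error": "Bad!"}
    """
    if not isinstance(body, dict):
        return None
    error = body.get("error")
    # Legacy format: the error is a plain string.
    if isinstance(error, str):
        return error
    # OpenAI-style format: the error is an object with a "message" field.
    if isinstance(error, dict):
        message = error.get("message")
        if isinstance(message, str):
            return message
    return None


if __name__ == "__main__":
    print(extract_error_message({"error": {"message": "Bad!"}}))
    print(extract_error_message({"error": "Bad!"}))
```

Returning `None` for any unrecognized shape keeps the helper from raising on bodies that are not error responses at all; callers can fall back to the HTTP status code in that case.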