| 2.1.601 | # Arthur Engine Release **June 3, 2026** This release delivers significant multi-tenant security hardening, a thoroughly refined onboarding tour experience, and improved agent trace visibility in the prompts playground. --- ## Multi-Tenancy & Access Control ### Security Fixes * Closed five critical **multi-tenant security and correctness gaps** including reCAPTCHA fail-open rejection, notebook ownership validation before experiment linking, org-scoped session trace pagination at the SQL | High | 6/3/2026 |
| 2.1.579 | # Arthur Engine Release **May 28, 2026** Arthur Engine 2.1.579 delivers Azure ecosystem integrations, a dedicated prompt injection validation endpoint, improved PII detection accuracy, onboarding workflows, and multi-tenant UI support, alongside quality-of-life improvements across the platform. --- ## Guardrails & Validation ### Prompt Injection * Added a new **validate endpoint** that enables easy, standalone prompt injection checks against incoming prompts (#1633) ### PII Detection | High | 5/28/2026 |
| 0.0.11-lts | # Arthur Engine Release **May 20, 2026** This release lays the groundwork for full multi-tenancy, introduces a guided onboarding experience for new users, expands model provider and observability integrations, and strengthens compliance automation, making Arthur Engine ready for larger, organization-aware deployments. --- ## Multi-Tenancy and Access Control ### Organization-Scoped Data and API Keys * **Organizations table and tenant isolation** are now supported at the database level - | High | 5/20/2026 |
| 2.1.563 | # Arthur Engine Release **May 14, 2026** This release brings powerful new capabilities for onboarding, compliance automation, and observability, including an interactive onboarding agent, Azure OpenAI support, transform version history, and expanded SDK instrumentors for popular AI frameworks. --- ## Onboarding and Getting Started ### Interactive Onboarding Agent * New **interactive onboarding CLI tool** automates setup of observability, model configuration, Python instrumentation, and | High | 5/14/2026 |
| 0.0.8-lts | # Arthur Engine Release **May 6, 2026** This release introduces the first Long-Term Support (LTS) channel for Arthur Engine, giving teams a stable, versioned deployment path alongside improvements to container security and airgapped deployment compatibility. --- ## Long-Term Support (LTS) Release Channel ### LTS Versioning and Distribution * Arthur Engine is now available through a dedicated **Long-Term Support (LTS) release channel**, providing a stable, predictable version track for p | High | 5/6/2026 |
| 2.1.548 | # Arthur Engine Release **May 5, 2026** This release strengthens deployment flexibility and security, enabling seamless operation in airgapped environments and improving container security across all supported platforms. --- ## Deployment & Infrastructure Enhancements ### Airgapped Deployment Support * **Tiktoken encodings are now cached directly on the container image**, eliminating the need for external network calls during container initialization. Users deploying in airgapped or net | High | 5/5/2026 |
| 2.1.548 | # Arthur Engine Release **May 5, 2026** This release strengthens deployment flexibility and security, enabling seamless operation in airgapped environments and improving container security across all supported platforms. --- ## Deployment & Infrastructure Enhancements ### Airgapped Deployment Support * **Tiktoken encodings are now cached directly on the container image**, eliminating the need for external network calls during container initialization. Users deploying in airgapped or net | Medium | 5/5/2026 |
| 0.0.0-lts-patch-2 | # Arthur Engine Release **May 1, 2026** This is a maintenance patch for the long-term support (LTS) branch focused on internal infrastructure improvements. There are no user-facing changes in this release. --- ## Deployment & Infrastructure Enhancements ### LTS Build Pipeline * Improved **Docker image publishing** for LTS releases by adopting a more efficient image retagging strategy, ensuring faster and more reliable delivery of patched LTS containers. This update strengthens the reli | High | 5/1/2026 |
| 2.1.529 | # Arthur Engine Release **April 17, 2026** This release strengthens compliance observability, improves trace exploration workflows, and resolves several UI and API issues that impacted pagination, task browsing, and HTTP spec compliance. --- ## Compliance and Alerting ### Accurate Violation Tracking Per Alert Rule * The **policy_alert_rule_check_count** compliance metric now reports the true number of violations per alert rule instead of always reporting 1.0, giving a more accurate pict | High | 4/17/2026 |
| 2.1.516 | # Arthur Engine Release **April 14, 2026** This release brings a redesigned Evaluate experience with unified evaluator management, bulk evaluation testing, automated compliance scheduling, and trace retention policies, giving teams more control over evaluation workflows, compliance monitoring, and data lifecycle management. --- ## Evaluation and Continuous Evals ### Unified Evaluators and Continuous Evals UX * The Evaluate section now features a **unified two-tab layout (Ev | High | 4/14/2026 |
| 2.1.496 | # Arthur Engine Release **April 2, 2026** This release introduces significant user experience improvements and enhanced tracing capabilities, while strengthening security and system reliability across the platform. --- ## User Experience Enhancements ### Interactive AI Assistant * Added **Engine Chatbot** with intelligent query capabilities for searching API documentation and managing resources * Integrated automatic model provider detection supporting Anthropic Claude, OpenAI GPT, and | High | 4/2/2026 |
| 2.1.477 | # Arthur Engine Release **March 23, 2026** This release delivers significant enhancements to experiment creation workflows, trace analysis capabilities, and user personalization while introducing the comprehensive Arthur Observability SDK v1.0 for Python developers. --- ## Arthur Observability SDK ### Python SDK Launch * Released **Arthur Observability SDK v1.0**, a comprehensive Python package for LLM application observability * Added automatic instrumentation for **33 AI | Medium | 3/23/2026 |
| 2.1.456 | # Arthur Engine Release **March 12, 2026** This release delivers a comprehensive UI modernization, enhanced evaluation workflows, and improved agent task management alongside critical security updates and performance optimizations. --- ## User Experience & Interface Enhancements ### Navigation Consolidation * Unified all major product areas into streamlined tabbed interfaces, replacing scattered navigation with intuitive single-entry points * Consolidated **RAG functionality** into unif | Low | 3/11/2026 |
| 2.1.386 | <h1>Arthur Engine Release</h1> <p><strong>February 18, 2026</strong></p> <p> This release strengthens evaluation workflows, task visibility, dataset intelligence, and enterprise deployment reliability across environments. </p> <hr /> <h2>Evaluation & Experiment Enhancements</h2> <h3>Improved Evaluation Configuration</h3> <ul> <li>Added a dedicated <strong>Evals input field</strong> for clearer configuration</li> <li>Introduced a new filtering mechanism for Conti | Low | 2/20/2026 |
| 2.1.355 | <h1>Arthur Engine Release</h1> <p><strong>January 26 to February 5, 2026</strong></p> <p> This release significantly expands experimentation, trace visibility, model provider support, and deployment flexibility across the Agent Development Lifecycle. </p> <hr /> <h2>Agent Experiments & RAG Evaluation</h2> <h3>Agent Experiments</h3> <ul> <li>Introduced <strong>Agent Experiments</strong> with UI enhancements</li> <li>Added configurable Session ID support for repro | Low | 2/17/2026 |
| 2.1.286 | **Enhancements:** - Users can now configure where GenAI models are sourced from, enabling models to be pulled from an approved, customer-managed repository instead of the public Hugging Face Hub. - Metrics can now be segmented by user ID and conversation ID for more granular analysis. - Enhanced ODBC Connector Support: Improved handling of database views, more reliable primary key detection, and configurable connection and login timeouts. - Improved GenAI model bootstrapping reliability. | Low | 1/14/2026 |
| 2.1.237 | **New Features:** - **Test & Preview Custom Metrics Before Saving:** Users can now validate their custom metrics directly within the creation and editing workflow. Users can run the metric against available datasets to preview results and confirm the logic behaves as expected before saving. **Bug fixes:** - Custom metrics: - Sketch metrics can now be created and calculated without specifying any dimension columns. - Frontend No Longer Overwrites User-Defined Metadata for Reported Me | Low | 12/5/2025 |
| 2.1.209 | Bug Fix/Enhancements: - Fixed an issue where some metrics were missing from the selection list for custom datasets. - Increased the ML engine aggregation timeout to support segmentation of larger and more complex datasets. | Low | 11/21/2025 |
| 2.1.135 | Enhancements - Enhanced the PII detection model to improve date/time identification. - Updated the Docker configuration to use Postgres version 15, ensuring compatibility and preventing initialization errors during new engine setup. | Low | 11/6/2025 |
| 2.1.94 | **Enhancements**: - Updated telemetry ORM models and migrations to enforce non-null timestamps. - Improved pagination handling for MSSQL. - Added `status_code` and `session_id` to spans. | Low | 10/15/2025 |
| 2.1.93 | **New Features** - **Custom Metrics:** You can now define and manage custom metrics using SQL. Custom metrics can be reused across models and projects, and integrate seamlessly with dashboards, alerts, and queries in the Arthur platform. Versioning ensures you can update metric logic while preserving historical data accuracy. [[Learn more](https://docs.arthur.ai/docs/custom-metrics)] **Enhancements** - **Agent Trace Viewer:** Improved filters: users can now filter by metric evaluation | Low | 10/7/2025 |
| 2.1.79 | **Enhancements** - Span Query Improvements: - New GET endpoint `/v1/spans/query`: allows filtering spans by type. - Added support for span name column: improves query flexibility and performance. - Optimized span queries: added indexes to frequently queried columns. - Improved ingestion stability: fixed batch ingestion when root spans are present. - Improved developer experience by unifying our API schema and client libraries across the GenAI & ML Engines as well as the Arthur pla | Low | 9/12/2025 |
| 2.1.71 | **New Features:** - **Agentic monitoring is now supported in the GenAI Engine**: Building on the recently added `/traces/` API, this release introduces support for monitoring agentic behavior: - Tasks now include an `is_agentic` flag to enable targeted analysis and evaluation. - Metrics and traces APIs have been upgraded to support structured outputs, trace reconstruction, and intelligent defaults. - The engine selectively computes metrics for agentic tasks, improving the precision | Low | 8/28/2025 |
| 2.1.46 | **New Features:** - Added image support for metrics and for visualizing inferences in the Arthur Platform. - Users can now optionally configure attributes to segment over when defining metrics. **Enhancements:** - Improved hallucination detection for numbered lists and other structured formats. - Introduced a configurable max-token limit for hallucination checks, helping users fine-tune thresholds for context. | Low | 6/26/2025 |
| 2.1.44 | **New Features** - Added a `/traces/` API to support ingesting OpenTelemetry traces that meet the OpenInference (https://github.com/arize-ai/openinference/) specification. This feature is in preparation for adding agentic evaluations; more details are coming soon. **Enhancements:** - Added Docker Compose health checks to improve service startup reliability. - Introduced a single script to install both the GenAI and ML engines. - Initialized the Arthur Common module with CI, linting, and unit | Low | 5/23/2025 |
| 2.1.40 | Enhancements: * Patched a PyTorch vulnerability * Configured Renovate on the Arthur Engine GitHub repository for automated dependency updates * The `FETCH_RAW_DATA_ENABLED` configuration is now exposed on the Helm Chart * Docker Compose now always pulls the container images for users of the `latest` tag * Postgres now uses a volume to persist data in Docker Compose Bug fix: * The ml-engine was not able to communicate with the genai-engine in the arthur-engine Docker Compose deployment. All servic | Low | 5/8/2025 |
| 2.1.39 | Deprecation: * Deprecated the endpoints that validate prompt and response on default rules without any task association Enhancement: * Reduced the number of configurations exposed for the first deploy experience with Docker Compose | Low | 5/2/2025 |
| 2.1.37 | Enhancements: - Open sourced the full Arthur Engine deployment scripts, covering both the `genai-engine` and the `ml-engine` components! You can now see how the `ml-engine` is deployed on Docker Compose, AWS ECS, and Kubernetes. All deployment scripts can now be found in the `/deployment` folder. - The GenAI Engine server can now start with no LLM service connected. This allows users without access to an LLM service to still use the non-LLM based evaluations. - Improved the conf | Low | 4/29/2025 |
| 2.1.23 | Enhancements: - Optimized the profanity detection function in the toxicity rule to improve latency for inferences with a large number of consecutive repeating characters. - Increased the overall concurrency of GPU deployments by using 5 Gunicorn workers by default and ensuring that the models load without encountering any race condition issues. - Improved quick deployment by adding start scripts for Docker Compose, Helm Chart, and AWS CloudFormation. Bug fix: - Disabled rules can be now a | Low | 4/16/2025 |
| 2.1.18 | We are thrilled to announce the very first release of the Arthur Engine, now available as an open source project! The Arthur Engine is a tool designed for evaluating and benchmarking machine learning models and enforcing guardrails in your LLM applications and generative AI workflows. This initial release debuts the GenAI Engine submodule and its capability to add guardrails to your LLM applications and generative AI workflows. We value your feedback and contributions. Whether you enco | Low | 3/31/2025 |