Observal

Observal is an AI agent registry with first-in-class observability and an eval framework.

Discover, share, and monitor AI coding agents with full observability built in.

If you find Observal useful, please consider giving it a star. It helps others discover the project and keeps development going.

Observal is a self-hosted AI agent registry with built-in observability. Think Docker Hub, but for AI coding agents.

Browse agents created by others, publish your own, and pull complete agent configurations — all defined in a portable YAML format that templates out to Claude Code, Kiro CLI, Cursor, Gemini CLI, and more. Every agent bundles its MCP servers, skills, hooks, prompts, and sandboxes into a single installable package. One command to install, zero manual config.

Every interaction generates traces, spans, and sessions that flow into a telemetry pipeline. The built-in eval engine scores agent sessions so you can measure performance and make your agents better over time.
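To make the relationship between spans, traces, sessions, and scores concrete, here is a minimal sketch in Python. The class names, fields, and the `success_rate` metric are all assumptions made for illustration. They are not Observal's actual schema, and the toy metric only stands in for the eval engine, which the tech stack table describes as LLM-backed.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """One unit of work, e.g. a single tool call (hypothetical fields)."""
    name: str
    duration_ms: float
    error: bool = False


@dataclass
class Trace:
    """A group of spans produced by one interaction (hypothetical)."""
    trace_id: str
    spans: list[Span] = field(default_factory=list)


@dataclass
class Session:
    """A group of traces from one agent session (hypothetical)."""
    session_id: str
    traces: list[Trace] = field(default_factory=list)


def success_rate(session: Session) -> float:
    """Toy score: the fraction of spans in the session that did not error."""
    spans = [s for t in session.traces for s in t.spans]
    if not spans:
        return 0.0
    return sum(not s.error for s in spans) / len(spans)
```

A score like this could be tracked across sessions to see whether a change to an agent helped or hurt, which is the kind of feedback loop the paragraph above describes.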

Documentation

Full docs live at observal.gitbook.io (sourced from /docs in this repo).

Start here	Go to
5-minute install and first trace	Quickstart
Understand the data model	Core Concepts
Instrument your existing MCP servers	Observe MCP traffic
Run Observal on your infrastructure	Self-Hosting
Look up a CLI command	CLI Reference

See CHANGELOG.md for recent updates.

Quick start

git clone https://github.com/BlazeUp-AI/Observal.git
cd Observal
cp .env.example .env

docker compose -f docker/docker-compose.yml up --build -d
uv tool install --editable .
observal auth login            # auto-creates admin on fresh server

Eight services start (API, web UI, Postgres, ClickHouse, Redis, worker, OTEL collector, Grafana). Full walkthrough in Quickstart; operator guide in Self-Hosting → Docker Compose setup.

Already have MCP servers in your IDE? Instrument them in one command:

observal scan                       # auto-detect, register, and instrument everything
observal pull <agent> --ide cursor  # install a complete agent

The scan command detects MCP servers from your IDE config files and wraps them with observal-shim, which adds telemetry without breaking your existing setup. A timestamped backup is created automatically. Everything happens locally; nothing is uploaded to the server.
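As a rough illustration of what wrapping an MCP server entry and keeping a timestamped backup could look like, consider the sketch below. It assumes a JSON config with a top-level `mcpServers` object, as several IDEs use. The helper names, the backup file naming, and the way the shim receives the original command (after a `--` separator) are assumptions for illustration, not observal-shim's documented interface.

```python
import json
import shutil
import time
from pathlib import Path


def wrap_server(entry: dict, shim: str = "observal-shim") -> dict:
    """Return a copy of an MCP server entry that launches through the shim.

    The "--" separator is an assumed calling convention, not the real one.
    """
    if entry.get("command") == shim:
        return dict(entry)  # already wrapped, leave unchanged
    wrapped = dict(entry)
    wrapped["command"] = shim
    wrapped["args"] = ["--", entry["command"], *entry.get("args", [])]
    return wrapped


def instrument_config(path: Path) -> Path:
    """Back up the config file, then rewrite every server to use the shim."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = path.with_name(f"{path.name}.{stamp}.bak")
    shutil.copy2(path, backup)  # keep the original before touching anything

    config = json.loads(path.read_text())
    servers = config.get("mcpServers", {})
    config["mcpServers"] = {name: wrap_server(e) for name, e in servers.items()}
    path.write_text(json.dumps(config, indent=2))
    return backup
```

Because the original command and arguments are preserved after the shim, the server starts as before, and the backup makes the change easy to undo.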

Supported IDEs

IDE	Support
Claude Code	Full — skills, hooks, MCP, rules, OTLP telemetry
Kiro CLI	Full — superpowers, hooks, MCP, steering files, OTLP telemetry
Gemini CLI	Native OTEL + shim telemetry
Codex CLI	Native OTEL + shim telemetry
GitHub Copilot	Shim telemetry
OpenCode	Shim telemetry
Cursor	MCP + shim telemetry

Compatibility matrix and per-IDE setup: Integrations.

Tech stack

Component	Technology
Frontend	Next.js 16, React 19, Tailwind CSS 4, shadcn/ui, Recharts
Backend	Python 3.11+, FastAPI, Strawberry GraphQL, Uvicorn
Databases	PostgreSQL 16 (registry), ClickHouse (telemetry)
Queue	Redis + arq
CLI	Python, Typer, Rich
Eval engine	AWS Bedrock / OpenAI-compatible LLMs
Telemetry	OpenTelemetry Collector
Deployment	Docker Compose (8 services)

Contributing

See CONTRIBUTING.md. The short version:

1. Fork and clone the repository.
2. Run make hooks to install the pre-commit hooks.
3. Create a feature branch.
4. Run make lint and make test.
5. Open a PR.

See AGENTS.md for internal codebase context.

Running tests

make test      # quick
make test-v    # verbose

All tests mock external services. No Docker needed.
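For contributors new to this style of testing, the sketch below shows the general pattern of replacing an external service with a mock so that no database or container is needed. The function `count_sessions`, the client's `query` method, and the SQL string are hypothetical and are not taken from the Observal test suite.

```python
from unittest.mock import Mock


def count_sessions(client) -> int:
    """Hypothetical helper: ask a telemetry store how many sessions exist."""
    rows = client.query("SELECT count() FROM sessions")
    return rows[0][0]


def test_count_sessions_uses_mocked_client():
    # The mock stands in for a real database client, so nothing is contacted.
    client = Mock()
    client.query.return_value = [(3,)]

    assert count_sessions(client) == 3
    client.query.assert_called_once()
```

The test exercises only the code under test and asserts on how the dependency was called, which is what lets the suite run without Docker.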

Community

Have a question, idea, or want to share what you've built? Head to GitHub Discussions. Please use Discussions for questions; open Issues for confirmed bugs and concrete feature requests.

Join the Observal Discord to chat directly with the maintainers and other community members.

Security

To report a vulnerability, please use GitHub Private Vulnerability Reporting or email contact@blazeup.app. Do not open a public issue. See SECURITY.md.

License

Apache License 2.0. See LICENSE.

Release History

Version	Changes	Urgency	Date
v1.4.4	## [1.4.4] - 2026-06-01 ### Added - allow agent owners and co-authors to generate insights (insights) ([d285cd7](https://github.com/BlazeUp-AI/Observal/commit/d285cd7c3ba68cc42fb799461b0d0b4db9f8fede)) - replace custom LLM calls with LiteLLM (insights) ([0b59845](https://github.com/BlazeUp-AI/Observal/commit/0b59845a45cf3ddfaf4aae86a6e803079e876791)) ### Changed - remove global admin Insights page from sidebar (web) ([68ef1e2](https://github.com/BlazeUp-AI/Observal/commit/68ef1e2	High	6/1/2026
v1.4.0	## [1.4.0] - 2026-05-31 ### Added - make sensitive settings write-once and retractable (security) ([b6b4fc1](https://github.com/BlazeUp-AI/Observal/commit/b6b4fc1f4f90d30c3c6bceef9d62d6faa5d214c7)) ### Fixed - correct tool count and strip ANSI from thinking blocks (ui+parser) ([cb97420](https://github.com/BlazeUp-AI/Observal/commit/cb974206eb1aead4429f2345b59a80296bcd9e3b)) - fix off-by-one in byte offset tracking (pi-extension) ([39e2135](https://github.com/BlazeUp-AI/Observal/c	High	5/31/2026
v1.0.0	## [1.0.0] - 2026-05-23 ### Added - instrument uv.lock (optic) ([28dffa7](https://github.com/BlazeUp-AI/Observal/commit/28dffa70cc8b60d3f850a35da5638276f662e135)) - instrument tests.test_optic (optic) ([c30fd8e](https://github.com/BlazeUp-AI/Observal/commit/c30fd8e4fc4dbf3ade01840b3acfdd16c9fb7a4f)) - instrument ..pyproject.toml (optic) ([b230772](https://github.com/BlazeUp-AI/Observal/commit/b2307721751fcd3cf7223c6b024eefd62b1fd180)) - instrument observal-server.worker (optic)	High	5/23/2026
v0.6.0	## [0.6.0] - 2026-05-16 ### Added - realign skill model with git-first architecture (skills) ([9600d3b](https://github.com/BlazeUp-AI/Observal/commit/9600d3ba363d1e37bcb81278275913ba1bfc91ff)) - add insight tables to migration system ([cd73e1e](https://github.com/BlazeUp-AI/Observal/commit/cd73e1e9ea20d7f5ea5c388db25bf2dda65853f8)) - add best-effort disclaimer for --git flag (mcp) ([360248e](https://github.com/BlazeUp-AI/Observal/commit/360248e0c00da9c260cfb50d64bd03130e88eac8)) - JSON	High	5/16/2026
v0.5.0	## [0.5.0] - 2026-05-13 ### Added - slim curl-install to use pre-built images only (deploy) ([630a19f](https://github.com/BlazeUp-AI/Observal/commit/630a19f0752d0560e916e6376fb6751f085362a4)) - add browser-level Playwright e2e tests for 4 UI flows ([28f0274](https://github.com/BlazeUp-AI/Observal/commit/28f0274c8a6611cca9d12bbba02c0c7c2de96285)) - user profile picture is configurable (UI) ([442bc7c](https://github.com/BlazeUp-AI/Observal/commit/442bc7ce16c1869d936c2ed590bb008bd76ec7a9)	High	5/13/2026
v0.3.4	## [0.3.4] - 2026-04-25 ### Added - add copilot-cli doctor checks and SLI repair ([fe8f91a](https://github.com/BlazeUp-AI/Observal/commit/fe8f91af37c6e3d354906c022aaba2e12eb4b816)) - add copilot-cli scan and hook injection ([8956644](https://github.com/BlazeUp-AI/Observal/commit/8956644008e824ba59675be52448880170a544e8)) - add copilot-cli server-side config generation ([7c71b94](https://github.com/BlazeUp-AI/Observal/commit/7c71b94b2bdeb3827ecfb48bda0180b065d4de8f)) - add copilot-cli hook scri	High	4/25/2026
v0.3.3	## [0.3.3] - 2026-04-24 ### Fixed - filter release artifact download to skip Docker metadata (ci) ([dbd6a1b](https://github.com/BlazeUp-AI/Observal/commit/dbd6a1b66ec1dd3a6db067ecc9bc41a17174a329))	High	4/24/2026
v0.2.0	## What's Changed * docs + refactor: READMEs, eval subpackage, repo cleanup by @Haz3-jolt in https://github.com/BlazeUp-AI/Observal/pull/322 * feat: live session updates via GraphQL subscriptions by @Haz3-jolt in https://github.com/BlazeUp-AI/Observal/pull/323 * fix(web): protect registry routes with auth guard by @Kaushik-Kumar-CEG in https://github.com/BlazeUp-AI/Observal/pull/324 * feat: unify telemetry — merge hooks, shims, and OTLP by @Haz3-jolt in https://github.com/BlazeUp-AI/Observal	High	4/21/2026
v0.0.1	Tag v0.0.1	High	4/15/2026
