bernstein

Declarative Agent Orchestration. Ship while you sleep.

Why this rank: Strong adoption · Recent release · Healthy release cadence

README

Bernstein

"To achieve great things, two things are needed: a plan and not quite enough time." - Leonard Bernstein

Orchestrate any AI coding agent. Any model. One command.

Bernstein in action: parallel AI agents orchestrated in real time

CI PyPI Python 3.12+ License

Documentation · Getting Started · Glossary · Limitations


Bernstein takes a goal, breaks it into tasks, assigns them to AI coding agents running in parallel, verifies the output, and merges the results. When agents succeed, the janitor merges verified work into main. Failed tasks retry or route to a different model.

Why deterministic coordination

LLMs write code well. They schedule work across other LLMs badly. Most agent orchestrators use an LLM as the coordinator and hit the same failure modes: non-reproducible plans, silent coordination drift, token burn on meta-decisions a 200-line event loop does reliably. Bernstein inverts that. One LLM call upfront decomposes the goal; after that, scheduling, worktree isolation, quality gates, and HMAC-chained audit replay are all deterministic Python. Every run is bit-identically replayable.

No framework to learn. No vendor lock-in. Agents are interchangeable workers. Swap any agent, any model, any provider.

pipx install bernstein
cd your-project && bernstein init
bernstein -g "Add JWT auth with refresh tokens, tests, and API docs"

$ bernstein -g "Add JWT auth"
[manager] decomposed into 4 tasks
[agent-1] claude-sonnet: src/auth/middleware.py  (done, 2m 14s)
[agent-2] codex:         tests/test_auth.py      (done, 1m 58s)
[verify]  all gates pass. merging to main.

Also available via pip, uv tool install, brew, dnf copr, and npx bernstein-orchestrator. See install options.

Supported agents

Bernstein auto-discovers installed CLI agents. Mix them in the same run. Cheap local models for boilerplate, heavier cloud models for architecture.

18 CLI agent adapters: 17 third-party wrappers plus a generic wrapper for anything with --prompt.

| Agent | Models | Install |
| --- | --- | --- |
| Claude Code | Opus 4, Sonnet 4.6, Haiku 4.5 | npm install -g @anthropic-ai/claude-code |
| Codex CLI | GPT-5, GPT-5 mini | npm install -g @openai/codex |
| OpenAI Agents SDK v2 | GPT-5, GPT-5 mini, o4 | pip install 'bernstein[openai]' |
| Gemini CLI | Gemini 2.5 Pro, Gemini Flash | npm install -g @google/gemini-cli |
| Cursor | Sonnet 4.6, Opus 4, GPT-5 | Cursor app |
| Aider | Any OpenAI/Anthropic-compatible | pip install aider-chat |
| Amp | Amp-managed | npm install -g @sourcegraph/amp |
| Cody | Sourcegraph-hosted | npm install -g @sourcegraph/cody |
| Continue | Any OpenAI/Anthropic-compatible | npm install -g @continuedev/cli (binary: cn) |
| Goose | Any provider Goose supports | See Goose docs |
| IaC (Terraform/Pulumi) | Any provider the base agent uses | Built-in |
| Kilo | Kilo-hosted | See Kilo docs |
| Kiro | Kiro-hosted | See Kiro docs |
| Ollama + Aider | Local models (offline) | brew install ollama |
| OpenCode | Any provider OpenCode supports | See OpenCode docs |
| Qwen | Qwen Code models | npm install -g @qwen-code/qwen-code |
| Cloudflare Agents | Workers AI models | bernstein cloud login |
| Generic | Any CLI with --prompt | Built-in |

Any adapter can also serve as the internal LLM that the manager uses for planning and decomposition, so the entire stack runs without depending on any single provider:

internal_llm_provider: gemini            # or qwen, ollama, codex, goose, ...
internal_llm_model: gemini-2.5-pro

Tip

Run bernstein --headless for CI pipelines. No TUI, structured JSON output, non-zero exit on failure.
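
A CI wrapper around headless mode might look like the sketch below. It relies only on what the tip states (JSON on stdout, non-zero exit on failure). The JSON schema is not documented here, so the payload is treated as opaque, and the helper names `interpret_headless` and `run_headless` are hypothetical. Combining `--headless` with `-g` is also an assumption.

```python
import json
import subprocess
from typing import Any


def interpret_headless(returncode: int, stdout: str) -> tuple[bool, Any]:
    """Turn an exit code and raw stdout into (succeeded, parsed_payload).

    The payload schema is unknown here, so it is returned as-is;
    output that is not valid JSON yields None instead of raising.
    """
    try:
        payload = json.loads(stdout) if stdout.strip() else None
    except json.JSONDecodeError:
        payload = None
    return returncode == 0, payload


def run_headless(goal: str) -> tuple[bool, Any]:
    """Invoke the CLI with flags shown in this README (assumed to combine)."""
    proc = subprocess.run(
        ["bernstein", "--headless", "-g", goal],
        capture_output=True,
        text=True,
        check=False,  # the exit code is inspected by interpret_headless
    )
    return interpret_headless(proc.returncode, proc.stdout)
```

A pipeline step would call `run_headless(...)` and fail the job when the first element of the result is `False`.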

Quick start

cd your-project
bernstein init                    # creates .sdd/ workspace + bernstein.yaml
bernstein -g "Add rate limiting"  # agents spawn, work in parallel, verify, exit
bernstein live                    # watch progress in the TUI dashboard
bernstein stop                    # graceful shutdown with drain

For multi-stage projects, define a YAML plan:

bernstein run plan.yaml           # skips LLM planning, goes straight to execution
bernstein run --dry-run plan.yaml # preview tasks and estimated cost

How it works

  1. Decompose. The manager breaks your goal into tasks with roles, owned files, and completion signals.
  2. Spawn. Agents start in isolated git worktrees, one per task. Main branch stays clean.
  3. Verify. The janitor checks concrete signals: tests pass, files exist, lint clean, types correct.
  4. Merge. Verified work lands in main. Failed tasks get retried or routed to a different model.

The orchestrator is a Python scheduler, not an LLM. Scheduling decisions are deterministic, auditable, and reproducible.
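
The four steps can be illustrated with a toy scheduling loop. This is a minimal sketch, not Bernstein's scheduler: it runs tasks sequentially rather than in parallel worktrees, and `Task`, `schedule`, and the `run_and_verify` callback are hypothetical names. What it does show is the property claimed above: the order of decisions depends only on the inputs and the verification results, never on an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Task:
    task_id: str
    models: list[str]  # preferred model first, fallbacks after
    attempts: list[str] = field(default_factory=list)
    merged: bool = False


def schedule(tasks: list[Task], run_and_verify: Callable[[Task, str], bool]) -> list[str]:
    """Process tasks in a fixed order and return an event log.

    Ordering depends only on task_id and each task's model list, so the
    same inputs and the same verification results yield the same log.
    """
    log: list[str] = []
    for task in sorted(tasks, key=lambda t: t.task_id):
        for model in task.models:
            task.attempts.append(model)
            if run_and_verify(task, model):
                task.merged = True
                log.append(f"{task.task_id}: verified on {model}, merge")
                break
            log.append(f"{task.task_id}: failed on {model}")
        else:
            # Every model in the list failed verification.
            log.append(f"{task.task_id}: exhausted models, not merged")
    return log
```

Running `schedule` twice with the same tasks and the same callback results produces identical logs, which is the kind of reproducibility a replayable audit trail depends on.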

Cloud execution (Cloudflare)

Bernstein can run agents on Cloudflare Workers instead of locally. The bernstein cloud CLI handles deployment and lifecycle.

  • Workers. Agent execution on Cloudflare's edge, with Durable Workflows for multi-step tasks and automatic retry.
  • V8 sandbox isolation. Each agent runs in its own isolate, no container overhead.
  • R2 workspace sync. Local worktree state syncs to R2 object storage so cloud agents see the same files.
  • Workers AI (experimental). Use Cloudflare-hosted models as the LLM provider, no external API keys required.
  • D1 analytics. Task metrics and cost data stored in D1 for querying.
  • Vectorize. Semantic cache backed by Cloudflare's vector database.
  • Browser rendering. Headless Chrome on Workers for agents that need to inspect web output.
  • MCP remote transport. Expose or consume MCP servers over Cloudflare's network.
bernstein cloud login      # authenticate with Bernstein Cloud
bernstein cloud deploy     # push agent workers
bernstein cloud run plan.yaml  # execute a plan on Cloudflare

A bernstein cloud init scaffold for wrangler.toml and bindings is planned.

Capabilities

Core orchestration. Parallel execution, git worktree isolation, janitor verification, quality gates (lint, types, PII scan), cross-model code review, circuit breaker for misbehaving agents, token growth monitoring with auto-intervention.
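
As an illustration of the circuit-breaker idea mentioned above, here is a minimal sketch. The class name, the consecutive-failure rule, and the default threshold are assumptions made for the example; Bernstein's actual trip conditions are not described in this README.

```python
from dataclasses import dataclass


@dataclass
class CircuitBreaker:
    """Trips after `threshold` consecutive failures from one agent."""

    threshold: int = 3
    consecutive_failures: int = 0
    tripped: bool = False

    def record(self, succeeded: bool) -> None:
        if succeeded:
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold:
            self.tripped = True  # stays tripped until reset() is called

    def allow(self) -> bool:
        """Whether the agent may be given more work."""
        return not self.tripped

    def reset(self) -> None:
        self.consecutive_failures = 0
        self.tripped = False
```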

Intelligence. Contextual bandit router for model/effort selection. Knowledge graph for codebase impact analysis. Semantic caching saves tokens on repeated patterns. Cost anomaly detection (burn-rate alerts). Behavior anomaly detection with Z-score flagging.
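
Z-score flagging can be sketched in a few lines. The function below is an illustration under assumptions: the name `zscore_flags`, the use of the population standard deviation, and the default threshold are choices made for the example, not Bernstein's documented behavior.

```python
import statistics


def zscore_flags(samples: list[float], threshold: float = 3.0) -> list[int]:
    """Return indices of samples whose |z-score| exceeds `threshold`."""
    if len(samples) < 2:
        return []
    mean = statistics.fmean(samples)
    stdev = statistics.pstdev(samples)
    if stdev == 0:
        return []  # all samples identical, nothing to flag
    return [i for i, x in enumerate(samples) if abs(x - mean) / stdev > threshold]
```

With a population standard deviation, a single outlier among n samples can reach a z-score of at most sqrt(n - 1), so small windows need a lower threshold than the conventional 3.0.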

Sandboxing. Pluggable SandboxBackend protocol: run agents in local git worktrees (default), Docker containers, E2B Firecracker microVMs, or Modal serverless containers (with optional GPU). Plugin authors can register custom backends through the bernstein.sandbox_backends entry-point group. Inspect installed backends with bernstein agents sandbox-backends.

Artifact storage. .sdd/ state can stream to pluggable ArtifactSink backends: local filesystem (default), S3, Google Cloud Storage, Azure Blob, or Cloudflare R2. BufferedSink keeps the WAL crash-safety contract by writing locally with fsync first and mirroring to the remote asynchronously.
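
The write-locally-then-mirror contract described for BufferedSink can be sketched as follows. This is an illustration of the pattern, not the real class: `BufferedSinkSketch`, the `upload` callback, and the choice to swallow upload errors are assumptions.

```python
import os
import queue
import threading
from pathlib import Path
from typing import Callable


class BufferedSinkSketch:
    """Durable local write first, remote mirror afterwards on a worker thread."""

    def __init__(self, root: Path, upload: Callable[[str, bytes], None]) -> None:
        self.root = root
        self.upload = upload  # hypothetical remote writer, e.g. an object-store put
        self.pending: queue.Queue = queue.Queue()
        threading.Thread(target=self._mirror, daemon=True).start()

    def write(self, name: str, data: bytes) -> None:
        path = self.root / name
        with open(path, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())  # file contents reach disk before returning
        self.pending.put((name, data))  # remote copy happens off the hot path

    def _mirror(self) -> None:
        while True:
            name, data = self.pending.get()
            try:
                self.upload(name, data)
            except Exception:
                pass  # local copy is still durable; a real sink would retry
            finally:
                self.pending.task_done()

    def drain(self) -> None:
        """Block until every queued upload has been attempted."""
        self.pending.join()
```

Because `write` returns only after `fsync`, a crash between the local write and the remote upload loses nothing locally; only the mirror lags.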

Skill packs. Progressive-disclosure skills (OpenAI Agents SDK pattern): only a compact skill index ships in every spawn's system prompt, agents pull full bodies via the load_skill MCP tool on demand. 17 built-in role packs plus third-party bernstein.skill_sources entry-points.
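
Progressive disclosure amounts to shipping a short index everywhere and fetching bodies only when asked. The sketch below models that split with plain Python; the `Skill` fields and `build_index` are hypothetical, and `load_skill` here is a local stand-in whose signature is assumed, not the MCP tool itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    summary: str  # one line, included in every system prompt
    body: str     # full instructions, fetched on demand


def build_index(skills: list[Skill]) -> str:
    """Compact listing intended for every spawn's system prompt."""
    ordered = sorted(skills, key=lambda s: s.name)
    return "\n".join(f"{s.name}: {s.summary}" for s in ordered)


def load_skill(skills: list[Skill], name: str) -> str:
    """Stand-in for the load_skill tool: return one full body by name."""
    for skill in skills:
        if skill.name == name:
            return skill.body
    raise KeyError(name)
```

The prompt cost of the index grows with the number of skills, while the cost of a body is paid only by the agents that request it.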

Controls. HMAC-chained audit logs, policy engine, PII output gating, WAL-backed crash recovery (experimental multi-worker safety), OAuth 2.0 PKCE. SSO/SAML/OIDC support is in progress.
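
An HMAC-chained log makes each entry's tag depend on the previous entry's tag, so editing or reordering any earlier entry invalidates everything after it. The following is a minimal sketch of that general technique using the standard `hmac` and `hashlib` modules; the entry layout and the function names are assumptions, not Bernstein's on-disk format.

```python
import hashlib
import hmac
import json


def _tag(key: bytes, prev_mac: str, event: dict) -> str:
    # Canonical JSON (sorted keys) so the same event always hashes the same way.
    payload = json.dumps(event, sort_keys=True)
    return hmac.new(key, (prev_mac + payload).encode(), hashlib.sha256).hexdigest()


def append_entry(log: list[dict], key: bytes, event: dict) -> None:
    """Append `event`, chaining its MAC to the previous entry's MAC."""
    prev_mac = log[-1]["mac"] if log else ""
    log.append({"event": event, "mac": _tag(key, prev_mac, event)})


def verify_chain(log: list[dict], key: bytes) -> bool:
    """Recompute every MAC in order; any edit or reorder breaks the chain."""
    prev_mac = ""
    for entry in log:
        expected = _tag(key, prev_mac, entry["event"])
        if not hmac.compare_digest(expected, entry["mac"]):
            return False
        prev_mac = entry["mac"]
    return True
```

Verification needs the same secret key that produced the log, which is what distinguishes this from a plain hash chain.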

Observability. Prometheus /metrics, OTel exporter presets, Grafana dashboards. Per-model cost tracking (bernstein cost). Terminal TUI and web dashboard. Agent process visibility in ps.

Ecosystem. MCP server mode, A2A protocol support, GitHub App integration, pluggy-based plugin system, multi-repo workspaces, cluster mode for distributed execution, self-evolution via --evolve (experimental).

Full feature matrix: FEATURE_MATRIX.md · Recent features: What's New

How it compares

| Feature | Bernstein | CrewAI | AutoGen [1] | LangGraph |
| --- | --- | --- | --- | --- |
| Orchestrator | Deterministic code | LLM-driven (+ code Flows) | LLM-driven | Graph + LLM |
| Works with | Any CLI agent (18 adapters) | Python SDK classes | Python agents | LangChain nodes |
| Git isolation | Worktrees per agent | No | No | No |
| Pluggable sandboxes | Worktree, Docker, E2B, Modal | No | No | No |
| Verification | Janitor + quality gates | Guardrails + Pydantic output | Termination conditions | Conditional edges |
| Cost tracking | Built-in | usage_metrics | RequestUsage | Via LangSmith |
| State model | File-based (.sdd/) | In-memory + SQLite checkpoint | In-memory | Checkpointer |
| Remote artifact sinks | S3, GCS, Azure Blob, R2 | No | No | No |
| Self-evolution | Built-in (experimental) | No | No | No |
| Declarative plans (YAML) | Yes | Yes (agents.yaml, tasks.yaml) | No | Partial (langgraph.json) |
| Model routing per task | Yes | Per-agent LLM | Per-agent model_client | Per-node (manual) |
| MCP support | Yes (client + server) | Yes | Yes (client + workbench) | Yes (client + server) |
| Agent-to-agent chat | Bulletin board | Yes (Crew process) | Yes (group chat) | Yes (supervisor, swarm) |
| Web UI | TUI + web dashboard | CrewAI AMP | AutoGen Studio | LangGraph Studio + LangSmith |
| Cloud hosted option | Yes (Cloudflare) | Yes (CrewAI AMP) | No | Yes (LangGraph Cloud) |
| Built-in RAG/retrieval | Yes (codebase FTS5 + BM25) | crewai_tools | autogen_ext retrievers | Via LangChain |

Last verified: 2026-04-19. See full comparison pages for detailed feature matrices.

The table above compares Bernstein against LLM-orchestration frameworks (they orchestrate LLM calls). The table below covers the closer category, other tools that orchestrate CLI coding agents:

| Feature | Bernstein | ComposioHQ/agent-orchestrator | emdash |
| --- | --- | --- | --- |
| Shape | Python CLI + library + MCP server | TypeScript CLI + local dashboard | Electron desktop app |
| Primary language | Python | TypeScript | TypeScript |
| Install | pipx install bernstein | npm install -g @aoagents/ao | .dmg / .msi / .AppImage |
| Agent adapters | 18 | 3 (Claude Code, Codex, Aider) | 23 |
| Git worktree per agent | Yes | Yes | Yes |
| MCP server mode (exposes self as MCP) | Yes (stdio + HTTP/SSE) | No | No |
| Coordinator | Deterministic Python scheduler | LLM-driven | Not documented |
| HMAC-chained audit replay | Yes | No | No |
| Autonomous CI-fix / PR flow | No | Yes | No |
| Visual dashboard | TUI + web | Web | Desktop app |
| Backing | Solo OSS | Funded (Composio.dev) | YC W26 |
| License | Apache 2.0 | MIT | Apache 2.0 |

Bernstein's wedge in this category: Python-native, MCP-server-first, broad adapter coverage. If your stack is TypeScript and you want a product with a dashboard, Composio's @aoagents/ao is a better fit; if you want a polished desktop ADE, emdash is. If you want a primitive that imports into Python, exposes itself over MCP to any client, and covers a wide range of agents (including Qwen, Goose, Ollama, OpenAI Agents SDK, Cloudflare Agents, and more), choose Bernstein.

Monitoring

bernstein live       # TUI dashboard
bernstein dashboard  # web dashboard
bernstein status     # task summary
bernstein ps         # running agents
bernstein cost       # spend by model/task
bernstein doctor     # pre-flight checks
bernstein recap      # post-run summary
bernstein trace <ID> # agent decision trace
bernstein run-changelog --hours 48  # changelog from agent-produced diffs
bernstein explain <cmd>  # detailed help with examples
bernstein dry-run    # preview tasks without executing
bernstein dep-impact # API breakage + downstream caller impact
bernstein aliases    # show command shortcuts
bernstein config-path    # show config file locations
bernstein init-wizard    # interactive project setup
bernstein debug-bundle   # collect logs, config, and state for bug reports
bernstein skills list    # discoverable skill packs (progressive disclosure)
bernstein skills show <name>  # print a skill body with its references
bernstein fingerprint build --corpus-dir ~/oss-corpus  # build local similarity index
bernstein fingerprint check src/foo.py                 # check generated code against the index

Install

| Method | Command |
| --- | --- |
| pip | pip install bernstein |
| pipx | pipx install bernstein |
| uv | uv tool install bernstein |
| Homebrew | brew tap chernistry/bernstein && brew install bernstein |
| Fedora / RHEL | sudo dnf copr enable alexchernysh/bernstein && sudo dnf install bernstein |
| npm (wrapper) | npx bernstein-orchestrator |

Optional extras

Provider SDKs are optional so the base install stays lean. Pick what you need:

| Extra | Enables |
| --- | --- |
| bernstein[openai] | OpenAI Agents SDK v2 adapter (openai_agents) |
| bernstein[docker] | Docker sandbox backend |
| bernstein[e2b] | E2B microVM sandbox backend (needs E2B_API_KEY) |
| bernstein[modal] | Modal sandbox backend, optional GPU (needs MODAL_TOKEN_ID / MODAL_TOKEN_SECRET) |
| bernstein[s3] | S3 artifact sink (via boto3) |
| bernstein[gcs] | Google Cloud Storage artifact sink |
| bernstein[azure] | Azure Blob artifact sink |
| bernstein[r2] | Cloudflare R2 artifact sink (S3-compatible boto3) |
| bernstein[grpc] | gRPC bridge |
| bernstein[k8s] | Kubernetes integrations |

Combine extras with brackets, e.g. pip install 'bernstein[openai,docker,s3]'.

Editor extensions: VS Marketplace · Open VSX

Contributing

PRs welcome. See CONTRIBUTING.md for setup and code style.

Support

If Bernstein saves you time: GitHub Sponsors

Contact: forte@bernstein.run

License

Apache License 2.0


Footnotes

  1. AutoGen is in maintenance mode; its successor is Microsoft Agent Framework 1.0.

Release History

VersionChangesUrgencyDate
v2.7.0# v2.7.0 Released 2026-05-24. This release focuses on making Bernstein's automation easier to verify: stricter release gates, a complete Sonar cleanup, deterministic skill authoring tools, and an opt-in maintainer-share telemetry path that stays off by default. ## Highlights - Skills are closer to end-to-end. `SKILL.md` manifests now carry a versioned schema, and the CLI has deterministic `skills init`, `skills test`, `skills diff`, and `skills bench` commands. Strict linting can block instaHigh5/24/2026
v2.4.0# v2.4.0 - Observability surfaces, single-writer run state, declarative planning gates **Release date:** 2026-05-20 **Commits since v2.3.1:** 33 ## Highlights - Unified `bernstein doctor observe` umbrella rolls the four observability backends (Sonar, GlitchTip, Dependency-Track, GitHub Code Scanning) into one aggregated table with delta-since-last-check, plus a per-PR sticky summary comment and a daily trends snapshot. Each backend soft-fails to `SKIPPED` when its env vars are unset, so a freHigh5/20/2026
v1.11.0## Lineage v1 โ€” every agent edit, signed and auditable Bernstein runs now produce a per-artefact transparency log. Two agents touching the same file no longer race silently โ€” concurrent edits surface as siblings, and the Steward writes an explicit merge entry. Compliance officers run one command to get an EU AI Act Article 12 evidence bundle. Auditors verify the bundle on an air-gapped laptop without installing Bernstein. **What's new** - `bernstein compliance pack --since โ€ฆ --until โ€ฆ --org "High5/13/2026
v1.10.7## v1.10.7 A small operator-visible release. One new CLI command โ€” `bernstein export` for shareable post-run reports โ€” and a 16-scenario expansion of the planning library. ### New: `bernstein export` โ€” shareable run reports `bernstein export` reads a finished run from `.sdd/archive/tasks.jsonl`, `.sdd/runs/{run_id}/`, and `.sdd/metrics/`, then renders either a self-contained HTML page (inline CSS, no external assets) or a Markdown summary. Output is capped at 500 KB so the artefact is paste-fHigh5/11/2026
v1.10.0## v1.10.0 ### New features - **sandbox:** vercel backend implementing SandboxBackend protocol - **sandbox:** runloop backend implementing SandboxBackend protocol - **sandbox:** daytona backend implementing SandboxBackend protocol - **sandbox:** blaxel backend implementing SandboxBackend protocol - **orchestration:** mechanical exit gates between phases with re-fire on violation - **routing:** rework-rate ledger with auto-promotion in cascade router - **planning:** per-phase artifact schemas wiHigh5/5/2026
v1.9.2## v1.9.2 โ€” per-step CLI override, six new adapters, leaf-node orchestrator delegation The cooperating-CLI-adapter count goes 31 โ†’ 37, plans can mix CLIs between stages, and there's a separate new "Bernstein orchestrates the orchestrators" delegation track. ### Per-step `cli:` in plan files (#965, closes #964) Plan steps now take a `cli:` field directly. No more inventing role-shaped wrappers in `bernstein.yaml` to switch CLIs between stages: ```yaml stages: - name: red-green-refactor High4/29/2026
v1.8.14## v1.8.14 โ€” broader coverage + the operator pack ### 31 CLI adapters Thirteen new first-class adapters: **Droid**, **GitHub Copilot**, **Hermes Agent**, **Crush**, **Auggie**, **Kimi**, **Rovo Dev**, **Cline**, **Codebuff**, **Pi**, **Mistral Vibe**, **Autohand**, **Forge**. Mix any of the 31 in one plan. ### Four new commands - **`bernstein pr`** โ€” opens a GitHub PR from the last completed session, with the janitor's gate results and a token/USD cost breakdown in the body. - **`bernstein fHigh4/23/2026
v1.8.12## v1.8.12 ### Bug fixes - **persistence:** handle Windows OSError in _pid_alive **Full changelog:** https://github.com/chernistry/bernstein/compare/v1.8.11...v1.8.12 High4/19/2026
v1.8.11## v1.8.11 โ€” fixes broken v1.8.10 **Upgrade if you're on v1.8.9 or v1.8.10.** Both shipped with missing sub-packages in the wheel. `bernstein run` crashed on install with `ModuleNotFoundError: bernstein.core.tokens`. **Cause:** `.gitignore` had `*token*` (for stray secret files). Hatchling honors `.gitignore` during wheel build, so the whole `src/bernstein/core/tokens/` package got dropped. **Fix:** - Narrowed `.gitignore` to explicit file patterns (`*.token`, `*_token.{json,yaml,txt}`, `authHigh4/19/2026
v1.8.10> โš ๏ธ **Partially broken โ€” CLI loads but `bernstein run` crashes.** v1.8.10 fixed the v1.8.9 wheel enough that `bernstein --version` works, but the task server still crashes on startup with `ModuleNotFoundError: No module named 'bernstein.core.tokens'`. A stray `.gitignore` rule (`*token*`) was matching `src/bernstein/core/tokens/**` and dropping the whole sub-package from the wheel. v1.8.11 ships the real fix. ## v1.8.10 Intended to fix the broken v1.8.9 wheel (where 18 `bernstein.core.*` sub-High4/19/2026
v1.8.9> โš ๏ธ **Broken wheel โ€” do not install.** The PyPI wheel for v1.8.9 is missing 18 `bernstein.core.*` sub-packages, so `bernstein --version` crashes with `ModuleNotFoundError: No module named 'bernstein.core.config'`. Install v1.8.10 or later. ## v1.8.9 โ€” feature drop ### OpenAI Agents SDK v2 adapter A new CLI adapter wraps OpenAI's `agents.Agent` + `Runner` so agents built on the SDK become orchestratable inside a Bernstein plan.yaml, alongside Claude, Codex, and Gemini. ```yaml steps: - cliHigh4/19/2026
v1.8.8## v1.8.8 ### CI / Infrastructure - **release:** strip internal ticket refs from generated notes ### Chores - **deps:** bump docker/login-action from 3.7.0 to 4.1.0 - **deps:** bump actions/setup-python from 5.6.0 to 6.2.0 - **deps:** bump actions/create-github-app-token from 2.2.2 to 3.1.1 - **deps:** bump docker/build-push-action from 6.19.2 to 7.1.0 - **deps:** bump reviewdog/action-actionlint **Full changelog:** https://github.com/chernistry/bernstein/compare/v1.8.7...v1.8.8 High4/19/2026
v1.8.7## v1.8.7 Architecture boundaries + CI unblock. ### Architecture - **Import-linter contracts.** A new CI gate enforces the intended boundaries between `cli/`, `core/`, `adapters/`, and their sub-packages. Violations fail the Lint job before they can land. ### CI - Fixed test collection on main โ€” `cheaper_retry` and `retry_budget` back-compat redirects pointed at `bernstein.core.cost.*` but the modules actually live in `bernstein.core.cost.planned.*`, so the Ubuntu/macOS test matrix was failinHigh4/18/2026
v1.8.6## v1.8.6 Security hardening and reliability โ€” the largest patch in the 1.8.x line. ### Security - **OAuth:** PKCE `state` parameter is validated on callback. - **SAML:** assertion signature is verified. - **Webhooks:** every POST requires a valid HMAC; stack traces stripped from hook error responses. - **MCP server:** auth required by default; bound to localhost out of the box. - **Licensing:** empty signing keys are rejected at load time. - **Agents:** real per-agent credential scoping at spHigh4/18/2026
v1.8.5## v1.8.5 Orchestration cleanup and release-pipeline hygiene โ€” the first patch of a multi-release audit sweep. ### Orchestrator - Idle recycling consolidated into a single `agent_recycling` module (previously forked across `agents/` and `orchestration/`). - Idle detection now also watches log growth, not just heartbeat timing โ€” a stuck adapter with output still flowing no longer looks "alive." - `_detect_idle_reason` folded to one implementation. ### Reliability - Orphaned task claims discoveHigh4/17/2026
v1.8.4## v1.8.4 Planning, identity, and evaluation. ### Features - **Plan-and-Execute architecture formalized.** Planning and execution are now explicit phases with typed interfaces, so you can swap planners without touching executors. - **Agent identity cards with capability enforcement.** Every spawn carries a signed identity card; the orchestrator refuses tool calls outside the card's declared capabilities. - **Built-in eval framework** with per-model accuracy reporting โ€” useful for A/B-ing plannHigh4/17/2026
v1.8.3## v1.8.3 Quality + release-pipeline polish. - Cleared four vulnerabilities and fifteen code smells flagged by SonarCloud. - Auto-release now skips the version bump when nothing outside docs/CI actually changed โ€” stops the "empty v1.8.x patch" churn. **Full changelog:** https://github.com/chernistry/bernstein/compare/v1.8.2...v1.8.3 High4/17/2026
v1.8.2## v1.8.2 Maintenance release โ€” re-tags v1.8.1 after an auto-release pipeline hiccup. No user-visible changes. **Full changelog:** https://github.com/chernistry/bernstein/compare/v1.8.1...v1.8.2 High4/16/2026
v1.8.1## v1.8.1 Follow-up bugfix on v1.8.0. - Replaced 42 float `==` comparisons in the test suite with `pytest.approx()` so CI doesn't flake on rounding (Sonar S1244). - Restored the `_render_prompt` return-type annotation so callers importing the pre-1.8.0 signature keep working. **Full changelog:** https://github.com/chernistry/bernstein/compare/v1.8.0...v1.8.1 High4/16/2026
v1.8.0## v1.8.0 Eight feature drops focused on memory, observability, and safety. ### Features - **Prompt caching for system prompts and role templates.** Repeated role-prompt runs now read from the Anthropic prompt cache, cutting input-token cost on warm paths. - **Structured memory layer.** Episodic (per-session event trace) and semantic (long-lived facts) stores, queryable from any adapter. - **Shared memory with actor-aware tagging.** Cross-agent writes are tagged with the writing agent so readsHigh4/16/2026
v1.7.4Patch release. Changes since previous version: fc29ca1e chore: auto-bump to v1.7.4High4/14/2026
v1.7.3## v1.7.3 ### New features - complete Cloudflare integration platform (cf-001 through cf-012) ### Bug fixes - remove unreachable code in MCP remote transport (vulture) - resolve SonarCloud quality gate failures in CF integration - split ambiguous regex to resolve S5850/S6395 conflict ### Documentation - update index.md โ€” use in-action.gif, fix adapter count - add Cloudflare integration documentation and update existing docs ### Chores - **deps:** bump agents **Full changelog:** https://githMedium4/14/2026
v1.7.2## v1.7.2 ### New features - improve community spotlight generator script - add spotlight auto-generator script (cherry-picked from #780) ### Bug fixes - resolve SonarCloud quality gate bugs in new code - resolve 400+ SonarCloud issues across entire codebase ### Documentation - remove retired VS Marketplace and empty codecov badges - add #783 to Alex Smith's contributor entry - fill in April 2026 Community Spotlight with real contributor data - create community spotlight template and update CMedium4/14/2026
v1.7.1Patch release. Changes since previous version: 71b4b034 chore: auto-bump to v1.7.1 1092e9e0 Merge pull request #781 from chernistry/dependabot/npm_and_yarn/packages/vscode/npm_and_yarn-85af2c71bb 4c2445c8 chore(deps-dev): bump follow-redirects 8d82e03d ci: fix npm publish when version already matches tagHigh4/14/2026
v1.7.0## v1.7.0 ### New features - add bernstein cost CLI with cache tracking and peak-hour scheduling - add API quota tracking with alerts and agent image optimization - add 'Built with Bernstein' badge to README ### Bug fixes - **ci:** eliminate tick guard race condition with event-based sync - **ci:** increase tick guard sleep for slow macOS CI VMs - cleanup workflow now deletes orphaned runs from deleted workflows ### Documentation - update README comparison table date and adapter count - updatMedium4/13/2026
v1.6.11## v1.6.11 ### SonarCloud quality gate: PASS All conditions met: - **Security Rating: A** (0 vulnerabilities, 0 unreviewed hotspots) - **Reliability Rating: A** (0 bugs on new code) - **Maintainability Rating: A** - **Duplication: 1.8%** (under 3% threshold) ### Code quality - ~300 cognitive complexity refactors across 90+ files - 14 regex simplifications, 60+ float equality fixes - HTML accessibility, JS catch clauses, CSS contrast improvements - 9 stale TODOs resolved ### Distribution AlMedium4/13/2026
v1.6.10## v1.6.10 ### Windows support Full Windows compatibility across the entire codebase, contributed by [@oldschoola](https://github.com/oldschoola): - **Agent spawn**: Windows environment variable passthrough (SYSTEMROOT, WINDIR, COMSPEC, etc.) - **Unicode safety**: `encoding='utf-8', errors='replace'` on 90+ subprocess calls - **Terminal handling**: `msvcrt` keypress detection for plan display on Windows - **Process management**: PowerShell/kernel32 fallbacks for stop command and process detecMedium4/13/2026
v1.6.9## v1.6.9 The largest internal restructuring in Bernstein's history - a full module decomposition of the monolithic `core/` directory into focused subpackages, plus 100+ CI fixes to make all 927 tests pass. ### Module decomposition The 4,000+ line god-modules have been broken into focused, maintainable subpackages: - **orchestrator.py** (4,198 lines) -> 7 sub-modules in `core/orchestration/` - **spawner.py** (2,914 lines) -> 4 sub-modules in `core/agents/` - **task_store.py** (1,853 lines) -Medium4/13/2026
v1.6.8## v1.6.8 ### Code quality Resolved all SonarCloud BLOCKER and CRITICAL issues across the codebase. - **29 BLOCKER fixes** โ€” removed redundant `response_model` params in FastAPI routes (tasks, agents, SBOM), switched to `Annotated` type hints for dependency injection, fixed a method that always returned the same value - **Cognitive complexity reduction** โ€” refactored 34 functions across 30+ modules from CC 20-85 down to <15, extracting focused helper functions while preserving all behavior - Medium4/12/2026
v1.6.7## v1.6.7 ### Dependencies - Bump `actions/labeler` from 5.0.0 to 6.0.1 - Bump `actions/download-artifact` from 4.3.0 to 8.0.1 - Bump `dependabot/fetch-metadata` to 3.0.0 Routine CI dependency updates. No functional changes.Medium4/12/2026
v1.6.6## v1.6.6 ### Multi-adapter orchestration Bernstein now runs with **any combination of CLI agents** โ€” no Claude Code dependency required. Configure per-role adapters in `bernstein.yaml`: ```yaml role_model_policy: backend: cli: qwen model: qwen3.6-plus security: cli: gemini model: gemini-3.1-pro-preview ``` The internal scheduler LLM also accepts any adapter (`internal_llm_provider: gemini`). ### 20 critical orchestration bug fixes Deep audit found and fixed 20 severe bHigh4/11/2026
v1.6.5## v1.6.5 ### Highlights **Any CLI adapter as internal LLM provider** โ€” `internal_llm_provider` in `bernstein.yaml` now accepts any registered adapter name (not just `"claude"` or `"openrouter"`). Set `internal_llm_provider: "gemini"` or `internal_llm_provider: "qwen"` and the manager/planner/decomposer will use that adapter's CLI for LLM calls. No code changes needed โ€” just config. **TUI notification center + session recorder** โ€” two new Textual panels: a notification history that surfaces oMedium4/11/2026
v1.6.4## v1.6.4 The largest patch release yet โ€” 369 files changed across cross-platform fixes, a critical server-stability bug, new workflow specs, and a security pentest harness. ### Highlights **uvicorn `--reload` disabled in production** โ€” the task server's supervisor unconditionally enabled `--reload`, so every file write by a bernstein agent triggered a uvicorn restart. On a self-modifying codebase this caused cascading failures: port collisions, dropped HTTP connections, 127-second orchestratMedium4/11/2026
v1.6.3## v1.6.3 ### CI - **Impacted-test selection for PRs** โ€” on pull requests, CI now runs `scripts/run_tests.py --affected refs/remotes/origin/$BASE_REF` which uses `git diff` to identify which test files are affected by the PR's changed source files. Unaffected tests are skipped. Reduces PR CI time by 3โ€“5ร— on focused changes while still running the full suite on push-to-main. - Added cross-platform CI assertions (`test_cross_platform_ci.py`) that verify the workflow YAML structure matches expectMedium4/9/2026
v1.6.2## v1.6.2 ### Security - **Path traversal hardening** โ€” server routes that accept file paths now validate against the workspace root before any filesystem operation. - Resolved 3 SonarCloud security hotspots: hardcoded test IPs replaced with `127.0.0.1` constants, assertion-based auth checks converted to explicit `if` guards. ### Fixed - 4 reliability bugs in test assertions flagged by SonarCloud (float equality without tolerance in `test_task_splitter`, `test_token_budget_compaction`). - SkMedium4/9/2026
v1.6.1## v1.6.1 Patch release: version infrastructure fix only. No functional changes. - Auto-release workflow was tagging but not bumping `pyproject.toml` โ€” added the version-bump step so PyPI and the CLI report the same version. **Full changelog:** https://github.com/chernistry/bernstein/compare/v1.6.0...v1.6.1Medium4/9/2026
v1.6.0## v1.6.0 ### Highlights **CLI command aliases** (#391) โ€” Type `bernstein s` instead of `bernstein status`, `bernstein r` instead of `bernstein run`. User-defined aliases via `~/.bernstein/aliases.yaml` override built-ins. **Auto-release loop fix** โ€” The GitHub App token was causing an infinite CI โ†’ release โ†’ CI loop (14 spam releases in one day). Bot commits now skip both CI and auto-release. **SonarCloud security fixes** โ€” All 16 security hotspots resolved: pinned 24 GitHub Actions to SHA Medium4/9/2026
v1.5.5## Highlights **Agent lifecycle reliability** โ€” Fixed the root cause of mass agent failures: loopback API requests were being rate-limited by our own server (28K+ 429 errors per run). Internal traffic now bypasses rate limiting. Also fixed stale claim detection using wrong timestamp, silent stderr, heartbeat race condition, and worktree failures not blocking spawn. **LinUCB bandit routing** โ€” The contextual bandit model router is now wired into the orchestrator. Learns optimal model selection Medium4/9/2026
v1.5.4## Highlights **Spawn error classification** โ€” The spawner now categorizes failures (rate limit, missing adapter, permission denied, resource exhausted) and uses the category to decide retry strategy: fail-fast for permanent errors, fallback for transient ones. (#594) **EU AI Act compliance engine** โ€” New compliance module with risk classification, conformity assessment templates, and evidence export for regulated environments. **WebSocket frontend + API versioning** โ€” Live WebSocket updates High4/8/2026
v1.5.3 (Medium, 4/8/2026)
### New
- **Config path validation**: `bernstein.yaml` paths are validated before the run starts, catching typos early (#583, contributed by @Beledarian)
### Improved
- Consistent error handling with `handle_cli_error` and `ExitCode.CONFIG`

v1.5.2 (Medium, 4/7/2026)
### Security
- Docker GitHub Actions pinned to full SHA hashes (supply chain hardening)
### Fixed
- SonarCloud reliability bugs in test scripts resolved
- CI workflow stabilization

v1.5.1 (Medium, 4/7/2026)
### New
- **Multi-registry distribution**: published to PyPI, npm (MCP server), and Docker Hub simultaneously
- MCP registry `server.json` for tool discovery
### Fixed
- Broken HOL workflow removed

v1.5.0 (Medium, 4/7/2026)
## What's New
Major release with 215 new features, atomic batch operations, community contributions, and comprehensive documentation.
### Core
- **Atomic batch ticket ingestion**: tasks are now claimed in groups; if the POST fails, no files move. Eliminates partial state on force-stop (#241-#244)
- **Parallelized test suite**: isolated per-file test runner prevents OOM in CI across 2000+ tests
- **WAL and crash recovery groundwork**: idempotent task operations for safer restarts
### Commun...

v1.4.16 (Medium, 4/5/2026)
Distribution and publish fixes.
- Fixed VS Code Marketplace publisher (alex-chernysh) and extension naming
- Fixed PyPI publish with `skip-existing` to handle duplicate versions
- Removed broken screenshot placeholder from Open VSX listing
- Stopped the extension publish step from creating GitHub releases that clutter the release list

v1.4.15 (Medium, 4/5/2026)
VS Code extension and role templates.
- VS Code extension v0.2.0: approve/reject commands, cost warnings, agent badges, status icons
- Published and revised all 17 agent role templates
- Fixed 44 Pyright errors in token_cmd using proper TokenAnalysis type

v1.4.14 (Medium, 4/4/2026)
Task lifecycle and test fixes.
- SLO cap now takes precedence over minimum agent floor
- Fixed task completion test for module move
- Added templates to CLI allowlist and fixed role validation in tests

v1.4.13 (Medium, 4/3/2026)
Bug fix.
- Fixed broken `rules.yaml` YAML parsing and missing `bernstein` label in PR templates

v1.4.12 (Medium, 4/3/2026)
Lock management and test coverage.
- Added `renew_lock()` for TTL reset during long-running memory writes
- Integration tests for convergence guard blocking spawn waves
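The v1.4.12 entry mentions `renew_lock()` for resetting a TTL during long writes. The sketch below shows the general idea of a renewable TTL lock under stated assumptions: the `TTLLock` class, its constructor, and the refuse-if-expired rule are all hypothetical, and only the method name `renew_lock` is taken from the release note. Bernstein's real signature may differ.

```python
import time


class TTLLock:
    """Hypothetical in-memory lock that expires unless it is renewed."""

    def __init__(self, ttl_s: float, clock=time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock  # injectable so the expiry logic can be tested
        self._expires_at = clock() + ttl_s

    def is_held(self) -> bool:
        return self._clock() < self._expires_at

    def renew_lock(self) -> bool:
        """Reset the TTL from now; refuse if the lock has already expired."""
        if not self.is_held():
            return False
        self._expires_at = self._clock() + self._ttl_s
        return True
```

A long-running writer would call `renew_lock()` periodically and abort if it ever returns `False`, since another holder may have taken over by then.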
v1.4.11 (Medium, 4/3/2026)
Agent lifecycle improvements.
- Fixed bridge spawn skipping local adapter loop (prevented timeout and double transition)
- Improved heartbeat monitoring and CPU handling for agent lifecycle
- Updated idle threshold values (300s/120s)

v1.4.10 (Medium, 4/2/2026)
Server and orchestrator fixes.
- Fixed IP allowlist middleware registration with dynamic config lookup
- Orchestrator skips dependency scan after `stop()` to prevent extra API calls
- Upgraded `safe-push-main.sh` with CI monitor and autorelease verification

v1.4.9 (Medium, 4/1/2026)
Repo hygiene and security.
- Purged accidentally tracked `.sdd/` files from git history
- Resolved Vulture dead code and Pyright type errors

v1.4.8 (Medium, 4/1/2026)
Smarter task timeouts.
- Complexity-based task timeout buckets: simple tasks get shorter timeouts, complex tasks get longer ones
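Complexity-based timeout buckets, as described for v1.4.8, can be sketched as a lookup table. The bucket names and the second values below are invented for illustration; the release note only states that simple tasks get shorter timeouts and complex tasks longer ones. `TIMEOUT_BUCKETS_S` and `task_timeout` are hypothetical names.

```python
# Hypothetical bucket names and durations in seconds; not Bernstein's real values.
TIMEOUT_BUCKETS_S = {"simple": 300, "medium": 900, "complex": 1800}


def task_timeout(complexity: str) -> int:
    """Look up the timeout for a complexity bucket, defaulting to the middle one."""
    return TIMEOUT_BUCKETS_S.get(complexity, TIMEOUT_BUCKETS_S["medium"])
```

Falling back to the middle bucket for unknown labels is one possible design choice; a stricter version could raise an error instead.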
v1.4.7 (Medium, 4/1/2026)
Email notifications and smart retries.
- Email notifications for task events (completion, failure, budget alerts)
- Dynamic retry limits based on failure type (transient vs permanent)

v1.4.6 (Medium, 4/1/2026)
No user-facing changes; version bump only.

v1.4.5 (Medium, 4/1/2026)
Reliability and cleanup.
- Hardened hard-stop cleanup of orphaned agent processes
- Normalized flaky tests and lockfile metadata
- Fixed repo hygiene regressions in examples and APIs
- Reframed benchmark docs around verified results only

v1.4.4 (Medium, 4/1/2026)
Telemetry and testing improvements.
- OpenTelemetry spans for task lifecycle events
- Conventional commit validation tests
- Batch test improvements and watchdog enhancements

v1.4.3 (Medium, 3/31/2026)
Test isolation fixes.
- Fixed env var leaks in CI (masked `PYTHONPATH` and CI variables in splash tests)
- Fixed `_is_process_alive` patching in heartbeat reaping tests
