bernstein

Declarative Agent Orchestration. Ship while you sleep.

a2a agent-orchestration agentic-ai agentic-engineering agentic-workflow ai-agents ai-orchestrator autonomous-agents model-context-protocol python



README

"To achieve great things, two things are needed: a plan and not quite enough time." — Leonard Bernstein

Orchestrate any AI coding agent. Any model. One command.

Bernstein in action: parallel AI agents orchestrated in real time


Bernstein takes a goal, breaks it into tasks, assigns them to AI coding agents running in parallel, and verifies the output. The janitor merges verified work into main; failed tasks are retried or routed to a different model.

Why deterministic coordination

LLMs write code well. They schedule work across other LLMs badly. Most agent orchestrators use an LLM as the coordinator and hit the same failure modes: non-reproducible plans, silent coordination drift, and tokens burned on meta-decisions that a 200-line event loop handles reliably. Bernstein inverts that. One LLM call upfront decomposes the goal; after that, scheduling, worktree isolation, quality gates, and HMAC-chained audit replay are all deterministic Python. Every run is bit-identically replayable.

No framework to learn. No vendor lock-in. Agents are interchangeable workers. Swap any agent, any model, any provider.

pipx install bernstein
cd your-project && bernstein init
bernstein -g "Add JWT auth with refresh tokens, tests, and API docs"

$ bernstein -g "Add JWT auth"
[manager] decomposed into 4 tasks
[agent-1] claude-sonnet: src/auth/middleware.py  (done, 2m 14s)
[agent-2] codex:         tests/test_auth.py      (done, 1m 58s)
[verify]  all gates pass. merging to main.

Also available via pip, uv tool install, brew, dnf copr, and npx bernstein-orchestrator. See install options.

Supported agents

Bernstein auto-discovers installed CLI agents. Mix them in the same run: cheap local models for boilerplate, heavier cloud models for architecture.

18 CLI agent adapters: 17 third-party wrappers plus a generic wrapper for anything with --prompt.

Agent	Models	Install
Claude Code	Opus 4, Sonnet 4.6, Haiku 4.5	`npm install -g @anthropic-ai/claude-code`
Codex CLI	GPT-5, GPT-5 mini	`npm install -g @openai/codex`
OpenAI Agents SDK v2	GPT-5, GPT-5 mini, o4	`pip install 'bernstein[openai]'`
Gemini CLI	Gemini 2.5 Pro, Gemini Flash	`npm install -g @google/gemini-cli`
Cursor	Sonnet 4.6, Opus 4, GPT-5	Cursor app
Aider	Any OpenAI/Anthropic-compatible	`pip install aider-chat`
Amp	Amp-managed	`npm install -g @sourcegraph/amp`
Cody	Sourcegraph-hosted	`npm install -g @sourcegraph/cody`
Continue	Any OpenAI/Anthropic-compatible	`npm install -g @continuedev/cli` (binary: `cn`)
Goose	Any provider Goose supports	See Goose docs
IaC (Terraform/Pulumi)	Any provider the base agent uses	Built-in
Kilo	Kilo-hosted	See Kilo docs
Kiro	Kiro-hosted	See Kiro docs
Ollama + Aider	Local models (offline)	`brew install ollama`
OpenCode	Any provider OpenCode supports	See OpenCode docs
Qwen	Qwen Code models	`npm install -g @qwen-code/qwen-code`
Cloudflare Agents	Workers AI models	`bernstein cloud login`
Generic	Any CLI with `--prompt`	Built-in

Any adapter also works as the internal scheduler LLM, so the entire stack can run without depending on any single provider:

internal_llm_provider: gemini            # or qwen, ollama, codex, goose, ...
internal_llm_model: gemini-2.5-pro

Tip

Run bernstein --headless for CI pipelines. No TUI, structured JSON output, non-zero exit on failure.

Quick start

cd your-project
bernstein init                    # creates .sdd/ workspace + bernstein.yaml
bernstein -g "Add rate limiting"  # agents spawn, work in parallel, verify, exit
bernstein live                    # watch progress in the TUI dashboard
bernstein stop                    # graceful shutdown with drain

For multi-stage projects, define a YAML plan:

bernstein run plan.yaml           # skips LLM planning, goes straight to execution
bernstein run --dry-run plan.yaml # preview tasks and estimated cost

How it works

Decompose. The manager breaks your goal into tasks with roles, owned files, and completion signals.
Spawn. Agents start in isolated git worktrees, one per task. Main branch stays clean.
Verify. The janitor checks concrete signals: tests pass, files exist, lint clean, types correct.
Merge. Verified work lands in main. Failed tasks get retried or routed to a different model.
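
The spawn step above can be approximated with plain `git worktree` commands. This is a minimal sketch, not Bernstein's implementation: the function name, the `.worktrees/` directory layout, and the `task/<id>` branch naming are assumptions made for illustration. It only builds the command, so it can be inspected before anything touches a repository.

```python
from pathlib import Path


def worktree_add_cmd(repo: Path, task_id: str, base: str = "main") -> list[str]:
    """Build a `git worktree add` command giving one task its own checkout.

    Hypothetical layout: <repo>/.worktrees/<task_id> on branch task/<task_id>.
    """
    worktree_dir = repo / ".worktrees" / task_id
    return [
        "git", "-C", str(repo),
        "worktree", "add",
        "-b", f"task/{task_id}",  # new branch used by this task only
        str(worktree_dir),        # separate working directory
        base,                     # branch the task starts from
    ]


cmd = worktree_add_cmd(Path("/tmp/project"), "task-1")
print(" ".join(cmd))
```

Inside a real repository the list can be passed to `subprocess.run(cmd, check=True)`. Because each task writes to its own directory and branch, the main checkout is untouched until a merge happens.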

The orchestrator is a Python scheduler, not an LLM. Scheduling decisions are deterministic, auditable, and reproducible.
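
To illustrate what "deterministic" means for a scheduler, here is a small sketch of a dependency-ordered task queue with stable tie-breaking. It is an illustration of the idea, not Bernstein's scheduler: the `schedule` function, the task names, and the choice to break ties by task id are all assumptions.

```python
import heapq


def schedule(deps: dict[str, set[str]]) -> list[str]:
    """Return a reproducible execution order for tasks.

    deps maps each task id to the ids it depends on. Ties between ready
    tasks are broken by task id, so the same input always yields the
    same order.
    """
    remaining = {task: set(d) for task, d in deps.items()}
    ready = [t for t, d in remaining.items() if not d]
    heapq.heapify(ready)
    order: list[str] = []
    while ready:
        task = heapq.heappop(ready)  # smallest id among ready tasks
        order.append(task)
        del remaining[task]
        for other, d in remaining.items():
            if task in d:
                d.remove(task)
                if not d:
                    heapq.heappush(ready, other)
    if remaining:
        # Either a cycle or a dependency on a task that was never declared.
        raise ValueError(f"unschedulable tasks: {sorted(remaining)}")
    return order


plan = {
    "docs": {"middleware"},
    "middleware": set(),
    "tests": {"middleware"},
    "models": set(),
}
print(schedule(plan))  # ['middleware', 'docs', 'models', 'tests']
```

No model call is involved: given the same task graph, the order is fixed, which is what makes a run auditable after the fact.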

Cloud execution (Cloudflare)

Bernstein can run agents on Cloudflare Workers instead of locally. The bernstein cloud CLI handles deployment and lifecycle.

Workers. Agent execution on Cloudflare's edge, with Durable Workflows for multi-step tasks and automatic retry.
V8 sandbox isolation. Each agent runs in its own isolate, no container overhead.
R2 workspace sync. Local worktree state syncs to R2 object storage so cloud agents see the same files.
Workers AI (experimental). Use Cloudflare-hosted models as the LLM provider, no external API keys required.
D1 analytics. Task metrics and cost data stored in D1 for querying.
Vectorize. Semantic cache backed by Cloudflare's vector database.
Browser rendering. Headless Chrome on Workers for agents that need to inspect web output.
MCP remote transport. Expose or consume MCP servers over Cloudflare's network.

bernstein cloud login      # authenticate with Bernstein Cloud
bernstein cloud deploy     # push agent workers
bernstein cloud run plan.yaml  # execute a plan on Cloudflare

A bernstein cloud init scaffold for wrangler.toml and bindings is planned.

Capabilities

Core orchestration. Parallel execution, git worktree isolation, janitor verification, quality gates (lint, types, PII scan), cross-model code review, circuit breaker for misbehaving agents, token growth monitoring with auto-intervention.

Intelligence. Contextual bandit router for model/effort selection. Knowledge graph for codebase impact analysis. Semantic caching saves tokens on repeated patterns. Cost anomaly detection (burn-rate alerts). Behavior anomaly detection with Z-score flagging.
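
Z-score flagging, mentioned above, can be sketched in a few lines. The function name, the threshold of 3.0, and the tokens-per-task metric are assumptions for illustration; the source does not say which metrics or thresholds Bernstein uses.

```python
from statistics import mean, stdev


def zscore_flags(history: list[float], latest: float, threshold: float = 3.0) -> bool:
    """Flag `latest` if it lies more than `threshold` sample standard
    deviations from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    sigma = stdev(history)
    if sigma == 0:
        return latest != history[0]  # any change from a constant series
    return abs(latest - mean(history)) / sigma > threshold


tokens_per_task = [1200.0, 1350.0, 1100.0, 1280.0, 1250.0]
print(zscore_flags(tokens_per_task, 1300.0))  # False
print(zscore_flags(tokens_per_task, 9000.0))  # True
```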

Sandboxing. Pluggable SandboxBackend protocol — run agents in local git worktrees (default), Docker containers, E2B Firecracker microVMs, or Modal serverless containers (with optional GPU). Plugin authors can register custom backends through the bernstein.sandbox_backends entry-point group. Inspect installed backends with bernstein agents sandbox-backends.

Artifact storage. .sdd/ state can stream to pluggable ArtifactSink backends: local filesystem (default), S3, Google Cloud Storage, Azure Blob, or Cloudflare R2. BufferedSink keeps the WAL crash-safety contract by writing locally with fsync first and mirroring to the remote asynchronously.
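
The write-locally-then-mirror contract can be sketched as follows. This is not Bernstein's `BufferedSink`; the class name `BufferedSinkSketch`, the `upload` callback, and the `flush` method are assumptions. The point it demonstrates is the ordering: `write` returns only after the local copy has been fsynced, and the remote copy is made later by a background thread.

```python
import os
import queue
import tempfile
import threading
from pathlib import Path
from typing import Callable


class BufferedSinkSketch:
    """Write locally with fsync first, then mirror in the background."""

    def __init__(self, local_dir: Path, upload: Callable[[str, bytes], None]) -> None:
        self.local_dir = local_dir
        self.upload = upload  # hypothetical remote writer, e.g. an object-store put
        self._pending: queue.Queue = queue.Queue()
        threading.Thread(target=self._mirror, daemon=True).start()

    def write(self, key: str, data: bytes) -> None:
        with open(self.local_dir / key, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())       # durable on local disk before returning
        self._pending.put((key, data))  # remote copy happens asynchronously

    def _mirror(self) -> None:
        while True:
            key, data = self._pending.get()
            try:
                self.upload(key, data)
            except Exception:
                pass  # a real sink would retry or log; the local copy still exists
            finally:
                self._pending.task_done()

    def flush(self) -> None:
        """Block until every queued upload has been attempted."""
        self._pending.join()


remote: dict[str, bytes] = {}
with tempfile.TemporaryDirectory() as tmp:
    sink = BufferedSinkSketch(Path(tmp), remote.__setitem__)
    sink.write("tasks.jsonl", b'{"id": 1}\n')
    sink.flush()
    local_copy = (Path(tmp) / "tasks.jsonl").read_bytes()
print(local_copy == remote["tasks.jsonl"])  # True
```

If the process crashes between the local write and the upload, the local file is still complete, which is the property a write-ahead log depends on.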

Skill packs. Progressive-disclosure skills (OpenAI Agents SDK pattern): only a compact skill index ships in every spawn's system prompt, and agents pull full bodies on demand via the load_skill MCP tool. 17 built-in role packs plus third-party bernstein.skill_sources entry-points.
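
Progressive disclosure boils down to two operations: a cheap index that is always sent, and a lookup that returns one full body when asked. The sketch below is illustrative only; the `Skill` dataclass, its fields, and the sample skills are assumptions, and the local `load_skill` function merely stands in for the MCP tool of the same name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    summary: str  # one line, always present in the system prompt
    body: str     # full instructions, fetched only on demand


SKILLS = {
    s.name: s
    for s in [
        Skill("backend", "Server-side code and APIs.", "Full backend guidance ..."),
        Skill("security", "Threat review and hardening.", "Full security guidance ..."),
    ]
}


def skill_index(skills: dict[str, Skill]) -> str:
    """Compact index that would be included in every spawn's system prompt."""
    return "\n".join(f"- {s.name}: {s.summary}" for s in skills.values())


def load_skill(skills: dict[str, Skill], name: str) -> str:
    """Stand-in for the load_skill tool: return one full body by name."""
    return skills[name].body


print(skill_index(SKILLS))
print(load_skill(SKILLS, "security"))
```

The prompt cost grows with the number of summaries rather than with the total size of all skill bodies.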

Controls. HMAC-chained audit logs, policy engine, PII output gating, WAL-backed crash recovery (experimental multi-worker safety), OAuth 2.0 PKCE. SSO/SAML/OIDC support is in progress.
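
An HMAC-chained log makes each entry's MAC depend on the previous entry's MAC, so editing or reordering history invalidates everything after it. The following is a minimal sketch of that technique using only the standard library; the entry layout, the genesis value, and the function names are assumptions, not Bernstein's on-disk format.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous MAC" for the first entry


def _mac(key: bytes, prev: str, event: dict) -> str:
    payload = json.dumps(event, sort_keys=True)  # canonical serialization
    return hmac.new(key, (prev + payload).encode(), hashlib.sha256).hexdigest()


def append_entry(log: list[dict], key: bytes, event: dict) -> None:
    """Append an event whose MAC covers the previous entry's MAC."""
    prev = log[-1]["mac"] if log else GENESIS
    log.append({"event": event, "prev": prev, "mac": _mac(key, prev, event)})


def verify_chain(log: list[dict], key: bytes) -> bool:
    """Recompute every MAC in order; any edit or reordering breaks the chain."""
    prev = GENESIS
    for entry in log:
        expected = _mac(key, prev, entry["event"])
        if entry["prev"] != prev or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev = entry["mac"]
    return True


key = b"demo-key-not-for-production"
log: list[dict] = []
append_entry(log, key, {"task": "auth-middleware", "status": "claimed"})
append_entry(log, key, {"task": "auth-middleware", "status": "done"})
print(verify_chain(log, key))  # True

log[0]["event"]["status"] = "failed"  # tamper with history
print(verify_chain(log, key))  # False
```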

Observability. Prometheus /metrics, OTel exporter presets, Grafana dashboards. Per-model cost tracking (bernstein cost). Terminal TUI and web dashboard. Agent process visibility in ps.

Ecosystem. MCP server mode, A2A protocol support, GitHub App integration, pluggy-based plugin system, multi-repo workspaces, cluster mode for distributed execution, self-evolution via --evolve (experimental).

Full feature matrix: FEATURE_MATRIX.md · Recent features: What's New

How it compares

Feature	Bernstein	CrewAI	AutoGen ¹	LangGraph
Orchestrator	Deterministic code	LLM-driven (+ code Flows)	LLM-driven	Graph + LLM
Works with	Any CLI agent (18 adapters)	Python SDK classes	Python agents	LangChain nodes
Git isolation	Worktrees per agent	No	No	No
Pluggable sandboxes	Worktree, Docker, E2B, Modal	No	No	No
Verification	Janitor + quality gates	Guardrails + Pydantic output	Termination conditions	Conditional edges
Cost tracking	Built-in	`usage_metrics`	`RequestUsage`	Via LangSmith
State model	File-based (.sdd/)	In-memory + SQLite checkpoint	In-memory	Checkpointer
Remote artifact sinks	S3, GCS, Azure Blob, R2	No	No	No
Self-evolution	Built-in (experimental)	No	No	No
Declarative plans (YAML)	Yes	Yes (`agents.yaml`, `tasks.yaml`)	No	Partial (`langgraph.json`)
Model routing per task	Yes	Per-agent LLM	Per-agent `model_client`	Per-node (manual)
MCP support	Yes (client + server)	Yes	Yes (client + workbench)	Yes (client + server)
Agent-to-agent chat	Bulletin board	Yes (Crew process)	Yes (group chat)	Yes (supervisor, swarm)
Web UI	TUI + web dashboard	CrewAI AMP	AutoGen Studio	LangGraph Studio + LangSmith
Cloud hosted option	Yes (Cloudflare)	Yes (CrewAI AMP)	No	Yes (LangGraph Cloud)
Built-in RAG/retrieval	Yes (codebase FTS5 + BM25)	`crewai_tools`	`autogen_ext` retrievers	Via LangChain

Last verified: 2026-04-19. See full comparison pages for detailed feature matrices.

The table above compares Bernstein against LLM-orchestration frameworks (they orchestrate LLM calls). The table below covers the closer category — other tools that orchestrate CLI coding agents:

Feature	Bernstein	ComposioHQ/agent-orchestrator	emdash
Shape	Python CLI + library + MCP server	TypeScript CLI + local dashboard	Electron desktop app
Primary language	Python	TypeScript	TypeScript
Install	`pipx install bernstein`	`npm install -g @aoagents/ao`	`.dmg` / `.msi` / `.AppImage`
Agent adapters	18	3 (Claude Code, Codex, Aider)	23
Git worktree per agent	Yes	Yes	Yes
MCP server mode (exposes self as MCP)	Yes (stdio + HTTP/SSE)	No	No
Coordinator	Deterministic Python scheduler	LLM-driven	Not documented
HMAC-chained audit replay	Yes	No	No
Autonomous CI-fix / PR flow	No	Yes	No
Visual dashboard	TUI + web	Web	Desktop app
Backing	Solo OSS	Funded (Composio.dev)	YC W26
License	Apache 2.0	MIT	Apache 2.0

Bernstein's wedge in this category: Python-native, MCP-server-first, broad adapter coverage. If your stack is TypeScript and you want a product with a dashboard, Composio's @aoagents/ao is a better fit; if you want a polished desktop ADE, emdash is. If you want a primitive that imports into Python, exposes itself over MCP to any client, and covers a wide range of agents (including Qwen, Goose, Ollama, OpenAI Agents SDK, Cloudflare Agents, and more), Bernstein is the fit.

Monitoring

bernstein live       # TUI dashboard
bernstein dashboard  # web dashboard
bernstein status     # task summary
bernstein ps         # running agents
bernstein cost       # spend by model/task
bernstein doctor     # pre-flight checks
bernstein recap      # post-run summary
bernstein trace <ID> # agent decision trace
bernstein run-changelog --hours 48  # changelog from agent-produced diffs
bernstein explain <cmd>  # detailed help with examples
bernstein dry-run    # preview tasks without executing
bernstein dep-impact # API breakage + downstream caller impact
bernstein aliases    # show command shortcuts
bernstein config-path    # show config file locations
bernstein init-wizard    # interactive project setup
bernstein debug-bundle   # collect logs, config, and state for bug reports
bernstein skills list    # discoverable skill packs (progressive disclosure)
bernstein skills show <name>  # print a skill body with its references

bernstein fingerprint build --corpus-dir ~/oss-corpus  # build local similarity index
bernstein fingerprint check src/foo.py                 # check generated code against the index

Install

Method	Command
pip	`pip install bernstein`
pipx	`pipx install bernstein`
uv	`uv tool install bernstein`
Homebrew	`brew tap chernistry/bernstein && brew install bernstein`
Fedora / RHEL	`sudo dnf copr enable alexchernysh/bernstein && sudo dnf install bernstein`
npm (wrapper)	`npx bernstein-orchestrator`

Optional extras

Provider SDKs are optional so the base install stays lean. Pick what you need:

Extra	Enables
`bernstein[openai]`	OpenAI Agents SDK v2 adapter (`openai_agents`)
`bernstein[docker]`	Docker sandbox backend
`bernstein[e2b]`	E2B microVM sandbox backend (needs `E2B_API_KEY`)
`bernstein[modal]`	Modal sandbox backend, optional GPU (needs `MODAL_TOKEN_ID` / `MODAL_TOKEN_SECRET`)
`bernstein[s3]`	S3 artifact sink (via `boto3`)
`bernstein[gcs]`	Google Cloud Storage artifact sink
`bernstein[azure]`	Azure Blob artifact sink
`bernstein[r2]`	Cloudflare R2 artifact sink (S3-compatible `boto3`)
`bernstein[grpc]`	gRPC bridge
`bernstein[k8s]`	Kubernetes integrations

Combine extras with brackets, e.g. pip install 'bernstein[openai,docker,s3]'.

Editor extensions: VS Marketplace · Open VSX

Contributing

PRs welcome. See CONTRIBUTING.md for setup and code style.

Support

If Bernstein saves you time: GitHub Sponsors

Contact: forte@bernstein.run


License

Apache License 2.0

¹ AutoGen is in maintenance mode; its successor is Microsoft Agent Framework 1.0.

Release History

Version	Changes	Urgency	Date
v2.7.0	# v2.7.0 Released 2026-05-24. This release focuses on making Bernstein's automation easier to verify: stricter release gates, a complete Sonar cleanup, deterministic skill authoring tools, and an opt-in maintainer-share telemetry path that stays off by default. ## Highlights - Skills are closer to end-to-end. `SKILL.md` manifests now carry a versioned schema, and the CLI has deterministic `skills init`, `skills test`, `skills diff`, and `skills bench` commands. Strict linting can block insta	High	5/24/2026
v2.4.0	# v2.4.0 - Observability surfaces, single-writer run state, declarative planning gates Release date: 2026-05-20 Commits since v2.3.1: 33 ## Highlights - Unified `bernstein doctor observe` umbrella rolls the four observability backends (Sonar, GlitchTip, Dependency-Track, GitHub Code Scanning) into one aggregated table with delta-since-last-check, plus a per-PR sticky summary comment and a daily trends snapshot. Each backend soft-fails to `SKIPPED` when its env vars are unset, so a fre	High	5/20/2026
v1.11.0	## Lineage v1 — every agent edit, signed and auditable Bernstein runs now produce a per-artefact transparency log. Two agents touching the same file no longer race silently — concurrent edits surface as siblings, and the Steward writes an explicit merge entry. Compliance officers run one command to get an EU AI Act Article 12 evidence bundle. Auditors verify the bundle on an air-gapped laptop without installing Bernstein. What's new - `bernstein compliance pack --since … --until … --org "	High	5/13/2026
v1.10.7	## v1.10.7 A small operator-visible release. One new CLI command — `bernstein export` for shareable post-run reports — and a 16-scenario expansion of the planning library. ### New: `bernstein export` — shareable run reports `bernstein export` reads a finished run from `.sdd/archive/tasks.jsonl`, `.sdd/runs/{run_id}/`, and `.sdd/metrics/`, then renders either a self-contained HTML page (inline CSS, no external assets) or a Markdown summary. Output is capped at 500 KB so the artefact is paste-f	High	5/11/2026
v1.10.0	## v1.10.0 ### New features - sandbox: vercel backend implementing SandboxBackend protocol - sandbox: runloop backend implementing SandboxBackend protocol - sandbox: daytona backend implementing SandboxBackend protocol - sandbox: blaxel backend implementing SandboxBackend protocol - orchestration: mechanical exit gates between phases with re-fire on violation - routing: rework-rate ledger with auto-promotion in cascade router - planning: per-phase artifact schemas wi	High	5/5/2026
v1.9.2	## v1.9.2 — per-step CLI override, six new adapters, leaf-node orchestrator delegation The cooperating-CLI-adapter count goes 31 → 37, plans can mix CLIs between stages, and there's a separate new "Bernstein orchestrates the orchestrators" delegation track. ### Per-step `cli:` in plan files (#965, closes #964) Plan steps now take a `cli:` field directly. No more inventing role-shaped wrappers in `bernstein.yaml` to switch CLIs between stages: ```yaml stages: - name: red-green-refactor	High	4/29/2026
v1.8.14	## v1.8.14 — broader coverage + the operator pack ### 31 CLI adapters Thirteen new first-class adapters: Droid, GitHub Copilot, Hermes Agent, Crush, Auggie, Kimi, Rovo Dev, Cline, Codebuff, Pi, Mistral Vibe, Autohand, Forge. Mix any of the 31 in one plan. ### Four new commands - `bernstein pr` — opens a GitHub PR from the last completed session, with the janitor's gate results and a token/USD cost breakdown in the body. - **`bernstein f	High	4/23/2026
v1.8.12	## v1.8.12 ### Bug fixes - persistence: handle Windows OSError in _pid_alive Full changelog: https://github.com/chernistry/bernstein/compare/v1.8.11...v1.8.12	High	4/19/2026
v1.8.11	## v1.8.11 — fixes broken v1.8.10 Upgrade if you're on v1.8.9 or v1.8.10. Both shipped with missing sub-packages in the wheel. `bernstein run` crashed on install with `ModuleNotFoundError: bernstein.core.tokens`. Cause: `.gitignore` had `token` (for stray secret files). Hatchling honors `.gitignore` during wheel build, so the whole `src/bernstein/core/tokens/` package got dropped. Fix: - Narrowed `.gitignore` to explicit file patterns (`.token`, `_token.{json,yaml,txt}`, `auth	High	4/19/2026
v1.8.10	> ⚠️ Partially broken — CLI loads but `bernstein run` crashes. v1.8.10 fixed the v1.8.9 wheel enough that `bernstein --version` works, but the task server still crashes on startup with `ModuleNotFoundError: No module named 'bernstein.core.tokens'`. A stray `.gitignore` rule (`token`) was matching `src/bernstein/core/tokens/*` and dropping the whole sub-package from the wheel. v1.8.11 ships the real fix. ## v1.8.10 Intended to fix the broken v1.8.9 wheel (where 18 `bernstein.core.` sub-	High	4/19/2026
v1.8.9	> ⚠️ Broken wheel — do not install. The PyPI wheel for v1.8.9 is missing 18 `bernstein.core.*` sub-packages, so `bernstein --version` crashes with `ModuleNotFoundError: No module named 'bernstein.core.config'`. Install v1.8.10 or later. ## v1.8.9 — feature drop ### OpenAI Agents SDK v2 adapter A new CLI adapter wraps OpenAI's `agents.Agent` + `Runner` so agents built on the SDK become orchestratable inside a Bernstein plan.yaml, alongside Claude, Codex, and Gemini. ```yaml steps: - cli	High	4/19/2026
v1.8.8	## v1.8.8 ### CI / Infrastructure - release: strip internal ticket refs from generated notes ### Chores - deps: bump docker/login-action from 3.7.0 to 4.1.0 - deps: bump actions/setup-python from 5.6.0 to 6.2.0 - deps: bump actions/create-github-app-token from 2.2.2 to 3.1.1 - deps: bump docker/build-push-action from 6.19.2 to 7.1.0 - deps: bump reviewdog/action-actionlint Full changelog: https://github.com/chernistry/bernstein/compare/v1.8.7...v1.8.8	High	4/19/2026
v1.8.7	## v1.8.7 Architecture boundaries + CI unblock. ### Architecture - Import-linter contracts. A new CI gate enforces the intended boundaries between `cli/`, `core/`, `adapters/`, and their sub-packages. Violations fail the Lint job before they can land. ### CI - Fixed test collection on main — `cheaper_retry` and `retry_budget` back-compat redirects pointed at `bernstein.core.cost.` but the modules actually live in `bernstein.core.cost.planned.`, so the Ubuntu/macOS test matrix was failin	High	4/18/2026
v1.8.6	## v1.8.6 Security hardening and reliability — the largest patch in the 1.8.x line. ### Security - OAuth: PKCE `state` parameter is validated on callback. - SAML: assertion signature is verified. - Webhooks: every POST requires a valid HMAC; stack traces stripped from hook error responses. - MCP server: auth required by default; bound to localhost out of the box. - Licensing: empty signing keys are rejected at load time. - Agents: real per-agent credential scoping at sp	High	4/18/2026
v1.8.5	## v1.8.5 Orchestration cleanup and release-pipeline hygiene — the first patch of a multi-release audit sweep. ### Orchestrator - Idle recycling consolidated into a single `agent_recycling` module (previously forked across `agents/` and `orchestration/`). - Idle detection now also watches log growth, not just heartbeat timing — a stuck adapter with output still flowing no longer looks "alive." - `_detect_idle_reason` folded to one implementation. ### Reliability - Orphaned task claims discove	High	4/17/2026
v1.8.4	## v1.8.4 Planning, identity, and evaluation. ### Features - Plan-and-Execute architecture formalized. Planning and execution are now explicit phases with typed interfaces, so you can swap planners without touching executors. - Agent identity cards with capability enforcement. Every spawn carries a signed identity card; the orchestrator refuses tool calls outside the card's declared capabilities. - Built-in eval framework with per-model accuracy reporting — useful for A/B-ing plann	High	4/17/2026
v1.8.3	## v1.8.3 Quality + release-pipeline polish. - Cleared four vulnerabilities and fifteen code smells flagged by SonarCloud. - Auto-release now skips the version bump when nothing outside docs/CI actually changed — stops the "empty v1.8.x patch" churn. Full changelog: https://github.com/chernistry/bernstein/compare/v1.8.2...v1.8.3	High	4/17/2026
v1.8.2	## v1.8.2 Maintenance release — re-tags v1.8.1 after an auto-release pipeline hiccup. No user-visible changes. Full changelog: https://github.com/chernistry/bernstein/compare/v1.8.1...v1.8.2	High	4/16/2026
v1.8.1	## v1.8.1 Follow-up bugfix on v1.8.0. - Replaced 42 float `==` comparisons in the test suite with `pytest.approx()` so CI doesn't flake on rounding (Sonar S1244). - Restored the `_render_prompt` return-type annotation so callers importing the pre-1.8.0 signature keep working. Full changelog: https://github.com/chernistry/bernstein/compare/v1.8.0...v1.8.1	High	4/16/2026
v1.8.0	## v1.8.0 Eight feature drops focused on memory, observability, and safety. ### Features - Prompt caching for system prompts and role templates. Repeated role-prompt runs now read from the Anthropic prompt cache, cutting input-token cost on warm paths. - Structured memory layer. Episodic (per-session event trace) and semantic (long-lived facts) stores, queryable from any adapter. - Shared memory with actor-aware tagging. Cross-agent writes are tagged with the writing agent so reads	High	4/16/2026
v1.7.4	Patch release. Changes since previous version: fc29ca1e chore: auto-bump to v1.7.4	High	4/14/2026
v1.7.3	## v1.7.3 ### New features - complete Cloudflare integration platform (cf-001 through cf-012) ### Bug fixes - remove unreachable code in MCP remote transport (vulture) - resolve SonarCloud quality gate failures in CF integration - split ambiguous regex to resolve S5850/S6395 conflict ### Documentation - update index.md — use in-action.gif, fix adapter count - add Cloudflare integration documentation and update existing docs ### Chores - deps: bump agents Full changelog: https://gith	Medium	4/14/2026
v1.7.2	## v1.7.2 ### New features - improve community spotlight generator script - add spotlight auto-generator script (cherry-picked from #780) ### Bug fixes - resolve SonarCloud quality gate bugs in new code - resolve 400+ SonarCloud issues across entire codebase ### Documentation - remove retired VS Marketplace and empty codecov badges - add #783 to Alex Smith's contributor entry - fill in April 2026 Community Spotlight with real contributor data - create community spotlight template and update C	Medium	4/14/2026
v1.7.1	Patch release. Changes since previous version: 71b4b034 chore: auto-bump to v1.7.1 1092e9e0 Merge pull request #781 from chernistry/dependabot/npm_and_yarn/packages/vscode/npm_and_yarn-85af2c71bb 4c2445c8 chore(deps-dev): bump follow-redirects 8d82e03d ci: fix npm publish when version already matches tag	High	4/14/2026
v1.7.0	## v1.7.0 ### New features - add bernstein cost CLI with cache tracking and peak-hour scheduling - add API quota tracking with alerts and agent image optimization - add 'Built with Bernstein' badge to README ### Bug fixes - ci: eliminate tick guard race condition with event-based sync - ci: increase tick guard sleep for slow macOS CI VMs - cleanup workflow now deletes orphaned runs from deleted workflows ### Documentation - update README comparison table date and adapter count - updat	Medium	4/13/2026
v1.6.11	## v1.6.11 ### SonarCloud quality gate: PASS All conditions met: - Security Rating: A (0 vulnerabilities, 0 unreviewed hotspots) - Reliability Rating: A (0 bugs on new code) - Maintainability Rating: A - Duplication: 1.8% (under 3% threshold) ### Code quality - ~300 cognitive complexity refactors across 90+ files - 14 regex simplifications, 60+ float equality fixes - HTML accessibility, JS catch clauses, CSS contrast improvements - 9 stale TODOs resolved ### Distribution Al	Medium	4/13/2026
v1.6.10	## v1.6.10 ### Windows support Full Windows compatibility across the entire codebase, contributed by [@oldschoola](https://github.com/oldschoola): - Agent spawn: Windows environment variable passthrough (SYSTEMROOT, WINDIR, COMSPEC, etc.) - Unicode safety: `encoding='utf-8', errors='replace'` on 90+ subprocess calls - Terminal handling: `msvcrt` keypress detection for plan display on Windows - Process management: PowerShell/kernel32 fallbacks for stop command and process detec	Medium	4/13/2026
v1.6.9	## v1.6.9 The largest internal restructuring in Bernstein's history - a full module decomposition of the monolithic `core/` directory into focused subpackages, plus 100+ CI fixes to make all 927 tests pass. ### Module decomposition The 4,000+ line god-modules have been broken into focused, maintainable subpackages: - orchestrator.py (4,198 lines) -> 7 sub-modules in `core/orchestration/` - spawner.py (2,914 lines) -> 4 sub-modules in `core/agents/` - task_store.py (1,853 lines) -	Medium	4/13/2026
v1.6.8	## v1.6.8 ### Code quality Resolved all SonarCloud BLOCKER and CRITICAL issues across the codebase. - 29 BLOCKER fixes — removed redundant `response_model` params in FastAPI routes (tasks, agents, SBOM), switched to `Annotated` type hints for dependency injection, fixed a method that always returned the same value - Cognitive complexity reduction — refactored 34 functions across 30+ modules from CC 20-85 down to <15, extracting focused helper functions while preserving all behavior -	Medium	4/12/2026
v1.6.7	## v1.6.7 ### Dependencies - Bump `actions/labeler` from 5.0.0 to 6.0.1 - Bump `actions/download-artifact` from 4.3.0 to 8.0.1 - Bump `dependabot/fetch-metadata` to 3.0.0 Routine CI dependency updates. No functional changes.	Medium	4/12/2026
v1.6.6	## v1.6.6 ### Multi-adapter orchestration Bernstein now runs with any combination of CLI agents — no Claude Code dependency required. Configure per-role adapters in `bernstein.yaml`: ```yaml role_model_policy: backend: cli: qwen model: qwen3.6-plus security: cli: gemini model: gemini-3.1-pro-preview ``` The internal scheduler LLM also accepts any adapter (`internal_llm_provider: gemini`). ### 20 critical orchestration bug fixes Deep audit found and fixed 20 severe b	High	4/11/2026
v1.6.5	## v1.6.5 ### Highlights Any CLI adapter as internal LLM provider — `internal_llm_provider` in `bernstein.yaml` now accepts any registered adapter name (not just `"claude"` or `"openrouter"`). Set `internal_llm_provider: "gemini"` or `internal_llm_provider: "qwen"` and the manager/planner/decomposer will use that adapter's CLI for LLM calls. No code changes needed — just config. TUI notification center + session recorder — two new Textual panels: a notification history that surfaces o	Medium	4/11/2026
v1.6.4	## v1.6.4 The largest patch release yet — 369 files changed across cross-platform fixes, a critical server-stability bug, new workflow specs, and a security pentest harness. ### Highlights uvicorn `--reload` disabled in production — the task server's supervisor unconditionally enabled `--reload`, so every file write by a bernstein agent triggered a uvicorn restart. On a self-modifying codebase this caused cascading failures: port collisions, dropped HTTP connections, 127-second orchestrat	Medium	4/11/2026
v1.6.3	## v1.6.3 ### CI - Impacted-test selection for PRs — on pull requests, CI now runs `scripts/run_tests.py --affected refs/remotes/origin/$BASE_REF` which uses `git diff` to identify which test files are affected by the PR's changed source files. Unaffected tests are skipped. Reduces PR CI time by 3–5× on focused changes while still running the full suite on push-to-main. - Added cross-platform CI assertions (`test_cross_platform_ci.py`) that verify the workflow YAML structure matches expect	Medium	4/9/2026
v1.6.2	## v1.6.2 ### Security - Path traversal hardening — server routes that accept file paths now validate against the workspace root before any filesystem operation. - Resolved 3 SonarCloud security hotspots: hardcoded test IPs replaced with `127.0.0.1` constants, assertion-based auth checks converted to explicit `if` guards. ### Fixed - 4 reliability bugs in test assertions flagged by SonarCloud (float equality without tolerance in `test_task_splitter`, `test_token_budget_compaction`). - Sk	Medium	4/9/2026
v1.6.1	## v1.6.1 Patch release: version infrastructure fix only. No functional changes. - Auto-release workflow was tagging but not bumping `pyproject.toml` — added the version-bump step so PyPI and the CLI report the same version. Full changelog: https://github.com/chernistry/bernstein/compare/v1.6.0...v1.6.1	Medium	4/9/2026
v1.6.0	## v1.6.0 ### Highlights CLI command aliases (#391) — Type `bernstein s` instead of `bernstein status`, `bernstein r` instead of `bernstein run`. User-defined aliases via `~/.bernstein/aliases.yaml` override built-ins. Auto-release loop fix — The GitHub App token was causing an infinite CI → release → CI loop (14 spam releases in one day). Bot commits now skip both CI and auto-release. SonarCloud security fixes — All 16 security hotspots resolved: pinned 24 GitHub Actions to SHA	Medium	4/9/2026
v1.5.5	## Highlights Agent lifecycle reliability — Fixed the root cause of mass agent failures: loopback API requests were being rate-limited by our own server (28K+ 429 errors per run). Internal traffic now bypasses rate limiting. Also fixed stale claim detection using wrong timestamp, silent stderr, heartbeat race condition, and worktree failures not blocking spawn. LinUCB bandit routing — The contextual bandit model router is now wired into the orchestrator. Learns optimal model selection	Medium	4/9/2026
v1.5.4	## Highlights Spawn error classification — The spawner now categorizes failures (rate limit, missing adapter, permission denied, resource exhausted) and uses the category to decide retry strategy: fail-fast for permanent errors, fallback for transient ones. (#594) EU AI Act compliance engine — New compliance module with risk classification, conformity assessment templates, and evidence export for regulated environments. WebSocket frontend + API versioning — Live WebSocket updates	High	4/8/2026
v1.5.3	## v1.5.3 ### New - Config path validation — `bernstein.yaml` paths are validated before run starts, catching typos early (#583, contributed by @Beledarian) ### Improved - Consistent error handling with `handle_cli_error` and `ExitCode.CONFIG`	Medium	4/8/2026
v1.5.2	## v1.5.2 ### Security - Docker GitHub Actions pinned to full SHA hashes (supply chain hardening) ### Fixed - SonarCloud reliability bugs in test scripts resolved - CI workflow stabilization	Medium	4/7/2026
v1.5.1	## v1.5.1 ### New - Multi-registry distribution — published to PyPI, npm (MCP server), and Docker Hub simultaneously - MCP registry `server.json` for tool discovery ### Fixed - Broken HOL workflow removed	Medium	4/7/2026
v1.5.0	## What's New Major release with 215 new features, atomic batch operations, community contributions, and comprehensive documentation. ### Core - Atomic batch ticket ingestion — tasks are now claimed in groups; if the POST fails, no files move. Eliminates partial state on force-stop (#241-#244) - Parallelized test suite — isolated per-file test runner prevents OOM in CI across 2000+ tests - WAL and crash recovery groundwork — idempotent task operations for safer restarts ### Commun	Medium	4/7/2026
v1.4.16	Distribution and publish fixes. - Fixed VS Code Marketplace publisher (alex-chernysh) and extension naming - Fixed PyPI publish with `skip-existing` to handle duplicate versions - Removed broken screenshot placeholder from Open VSX listing - Stopped extension publish from creating GitHub releases that clutter the list	Medium	4/5/2026
v1.4.15	VS Code extension and role templates. - VS Code extension v0.2.0 — approve/reject commands, cost warnings, agent badges, status icons - Published and revised all 17 agent role templates - Fixed 44 Pyright errors in token_cmd using proper TokenAnalysis type	Medium	4/5/2026
v1.4.14	Task lifecycle and test fixes. - SLO cap now takes precedence over minimum agent floor - Fixed task completion test for module move - Added templates to CLI allowlist and fixed role validation in tests	Medium	4/4/2026
v1.4.13	Bug fix. - Fixed broken `rules.yaml` YAML parsing and missing `bernstein` label in PR templates	Medium	4/3/2026
v1.4.12	Lock management and test coverage. - Added `renew_lock()` for TTL reset during long-running memory writes - Integration tests for convergence guard blocking spawn waves	Medium	4/3/2026
v1.4.11	Agent lifecycle improvements. - Fixed bridge spawn skipping local adapter loop (prevented timeout and double transition) - Improved heartbeat monitoring and CPU handling for agent lifecycle - Updated idle threshold values (300s/120s)	Medium	4/3/2026
v1.4.10	Server and orchestrator fixes. - Fixed IP allowlist middleware registration with dynamic config lookup - Orchestrator skips dependency scan after `stop()` to prevent extra API calls - Upgraded `safe-push-main.sh` with CI monitor and autorelease verification	Medium	4/2/2026
v1.4.9	Repo hygiene and security. - Purged accidentally tracked `.sdd/` files from git history - Resolved Vulture dead code and Pyright type errors	Medium	4/1/2026
v1.4.8	Smarter task timeouts. - Complexity-based task timeout buckets — simple tasks get shorter timeouts, complex tasks get longer ones	Medium	4/1/2026
v1.4.7	Email notifications and smart retries. - Email notifications for task events (completion, failure, budget alerts) - Dynamic retry limits based on failure type (transient vs permanent)	Medium	4/1/2026
v1.4.6	No user-facing changes — version bump only.	Medium	4/1/2026
v1.4.5	Reliability and cleanup. - Hardened hard-stop cleanup of orphaned agent processes - Normalized flaky tests and lockfile metadata - Fixed repo hygiene regressions in examples and APIs - Reframed benchmark docs around verified results only	Medium	4/1/2026
v1.4.4	Telemetry and testing improvements. - OpenTelemetry spans for task lifecycle events - Conventional commit validation tests - Batch test improvements and watchdog enhancements	Medium	4/1/2026
v1.4.3	Test isolation fixes. - Fixed env var leaks in CI (masked `PYTHONPATH` and CI variables in splash tests) - Fixed `_is_process_alive` patching in heartbeat reaping tests	Medium	3/31/2026
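
The v1.5.4 and v1.4.7 notes describe classifying spawn failures and picking a retry strategy from the category. The release notes do not show the implementation, so the following is only a minimal sketch of that idea. Every name in it (`SpawnFailure`, `PATTERNS`, `classify`, `retry_strategy`) is hypothetical, and the matching patterns and the choice of which categories count as transient are assumptions, not Bernstein's actual code.

```python
from enum import Enum


class SpawnFailure(Enum):
    """Failure categories named in the v1.5.4 notes, plus a catch-all."""
    RATE_LIMIT = "rate_limit"
    MISSING_ADAPTER = "missing_adapter"
    PERMISSION_DENIED = "permission_denied"
    RESOURCE_EXHAUSTED = "resource_exhausted"
    UNKNOWN = "unknown"  # assumption: not mentioned in the release notes


# Assumption: rate limits and resource exhaustion are treated as transient,
# everything else as permanent. The notes do not spell out this split.
TRANSIENT = {SpawnFailure.RATE_LIMIT, SpawnFailure.RESOURCE_EXHAUSTED}

# Hypothetical substring patterns used to recognize each category.
PATTERNS = (
    ("429", SpawnFailure.RATE_LIMIT),
    ("rate limit", SpawnFailure.RATE_LIMIT),
    ("adapter not found", SpawnFailure.MISSING_ADAPTER),
    ("permission denied", SpawnFailure.PERMISSION_DENIED),
    ("out of memory", SpawnFailure.RESOURCE_EXHAUSTED),
)


def classify(error_text: str) -> SpawnFailure:
    """Map raw error text to a failure category (first matching pattern wins)."""
    text = error_text.lower()
    for needle, category in PATTERNS:
        if needle in text:
            return category
    return SpawnFailure.UNKNOWN


def retry_strategy(category: SpawnFailure) -> str:
    """Fallback for transient failures, fail-fast for everything else."""
    return "fallback" if category in TRANSIENT else "fail_fast"


if __name__ == "__main__":
    for message in ("HTTP 429 Too Many Requests", "Permission denied: agent binary"):
        category = classify(message)
        print(message, "->", category.value, "->", retry_strategy(category))
```

Keeping classification separate from the strategy decision means the category can also feed other policies, such as the dynamic retry limits mentioned for v1.4.7, without re-parsing the error text.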

Similar Packages

Enterprise-Multi-AI-Agent-Systems-	🤖 Build and deploy scalable Multi-AI Agent systems with LangGraph and Groq LLMs to enhance intelligence across enterprise applications.	main@2026-06-07

mcp-audit	🌟 Track token consumption in real-time with MCP Audit. Diagnose context bloat and unexpected spikes across MCP servers and tools efficiently.	main@2026-06-06

mcp-rag-agent	🔍 Build a production-ready RAG system that combines LangGraph and MCP integration for precise, context-aware AI-driven question answering.	main@2026-06-06

solace-agent-mesh	An event-driven framework designed to build and orchestrate multi-agent AI systems. It enables seamless integration of AI agents with real-world data sources and systems, facilitating complex, multi-s	1.28.0

arifOS	ArifOS: Constitutional MCP kernel for governed AI execution. AAA architecture: Architect · Auditor · Agent. Built for the open-source agentic era.	v2026.05.22-birthday

More from chernistry

kotef	AI dev that actually gets things done

More in MCP Servers

PlanExe	Create a plan from a description in minutes

automagik-genie	Self-evolving AI agent orchestration framework with Model Context Protocol support

agentrove	Your own Claude Code UI, sandbox, in-browser VS Code, terminal, multi-provider support (Anthropic, OpenAI, GitHub Copilot, OpenRouter), custom skills, and MCP servers.

ProxmoxMCP-Plus	Enhanced Proxmox MCP server with advanced virtualization management and full OpenAPI integration.