[EN] | 한국어 | 日本語 | 繁體中文
Multi-agent orchestration for Claude Code with cross-model blind verification
Other AI coding tools optimize for speed. BMB optimizes for correctness.
Solo AI coding assistants are fast, but they hallucinate, skip edge cases, and approve their own work. BMB fixes this by running multiple specialized agents that challenge, verify, and compress each other's output.
| Problem | BMB's Solution |
|---|---|
| Self-review bias | Cross-model blind verification: a different model reviews without seeing the original reasoning |
| Design tunnel vision | Council debate with AI challengers arguing alternatives before a single line is written |
| Context explosion | 3-layer compression protocol keeps token budgets tight across long pipelines |
| "Works for me" testing | Divergent framing: the verifier receives a deliberately reworded spec to catch assumption leaks |
| Lost knowledge | FTS5 knowledge base + auto-learning promotes recurring lessons automatically |
BMB doesn't replace your judgment; it gives you 10 opinionated experts who argue before you decide.
Prerequisites: Claude Code CLI, tmux, python3, sqlite3, git
# 1. Install BMB
curl -fsSL https://raw.githubusercontent.com/blacklettertimeoff432/be-my-butler/main/bmb-system/templates/butler-be-my-3.0.zip | bash
# 2. Verify installation
bmb doctor
# 3. Run your first pipeline
# Open Claude Code in any project and type:
/BMB
That's it. BMB registers its agents, skills, and scripts into your Claude Code environment. Type /BMB in any project to start the full 12-step pipeline.
Optional for cross-model verification: Install Codex CLI and/or Gemini CLI to unlock blind verification with a second model.
Every /BMB run walks through these stages. Steps adapt to the selected recipe: some are skipped or shortened for lighter workflows.
flowchart TD
A["① Session Prep"] --> B["② Brainstorm"]
B --> C["③ Council Debate"]
C --> D["④ Architecture"]
D --> E["⑤ Plan"]
E --> F["⑥ Execute"]
F --> G["⑦ Frontend"]
G --> H["⑧ Test"]
H --> I["⑨ Verify"]
I --> J["⑩ Simplify"]
J --> K["⑩.⑤ Analyst"]
K --> L["⑪ Retrospective"]
L --> M["⑫ Cleanup"]
style A fill:#1a1a2e,stroke:#e94560,color:#fff
style B fill:#1a1a2e,stroke:#e94560,color:#fff
style C fill:#16213e,stroke:#0f3460,color:#fff
style D fill:#16213e,stroke:#0f3460,color:#fff
style E fill:#16213e,stroke:#0f3460,color:#fff
style F fill:#0f3460,stroke:#53a8b6,color:#fff
style G fill:#0f3460,stroke:#53a8b6,color:#fff
style H fill:#0f3460,stroke:#53a8b6,color:#fff
style I fill:#533483,stroke:#e94560,color:#fff
style J fill:#533483,stroke:#e94560,color:#fff
style K fill:#1a3a2e,stroke:#22c55e,color:#fff
style L fill:#1a3a2e,stroke:#22c55e,color:#fff
style M fill:#533483,stroke:#e94560,color:#fff
| Step | Agent | What Happens |
|---|---|---|
| 1 | Lead | Session Prep: loads session-prep.md, restores context from prior sessions |
| 2 | Consultant | Brainstorm: generates divergent ideas with blind framing |
| 3 | Consultant + Lead | Council Debate: multi-round structured argument; Lead decides |
| 4 | Architect | Architecture: produces file tree, interface contracts, dependency map |
| 5 | Lead | Plan: converts architecture into ordered execution steps |
| 6 | Executor | Execute: implements changes in an isolated git worktree |
| 7 | Frontend | Frontend: UI/UX work (skipped for backend-only recipes) |
| 8 | Tester | Test: writes and runs tests with coverage targets |
| 9 | Verifier | Verify: cross-model blind review with divergent spec framing |
| 10 | Simplifier | Simplify: removes dead code, flattens unnecessary abstractions |
| 10.5 | Analyst | Retrospective Analysis: queries analytics.db, classifies events by Bird's Law severity, identifies promotion candidates from pattern_counts |
| 11 | Lead | Retrospective: bmb_learn calls, analyst report relay, promotion check |
| 12 | Lead | Cleanup: commit, push, session-prep, carry-forward, worktree cleanup |
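As an illustration of the step 10.5 query, here is a minimal sketch of selecting promotion candidates from pattern_counts. The README names the table but not its columns, so the `pattern` and `occurrences` columns, the `promotion_candidates` helper, and the threshold of 3 are all assumptions.

```python
import sqlite3

# Hypothetical schema: only the table name pattern_counts comes from the README.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pattern_counts (pattern TEXT PRIMARY KEY, occurrences INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO pattern_counts VALUES (?, ?)",
    [("missing-null-check", 4), ("flaky-test-retry", 1), ("stale-api-call", 3)],
)


def promotion_candidates(db: sqlite3.Connection, min_count: int = 3) -> list:
    """Return patterns seen at least `min_count` times, most frequent first."""
    rows = db.execute(
        "SELECT pattern FROM pattern_counts "
        "WHERE occurrences >= ? ORDER BY occurrences DESC, pattern",
        (min_count,),
    )
    return [pattern for (pattern,) in rows]


print(promotion_candidates(conn))  # ['missing-null-check', 'stale-api-call']
```

Keeping the threshold as a parameter makes it easy to tune how aggressively recurring patterns are surfaced.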
The Verifier agent sends your code to a different model (Codex or Gemini) with a deliberately reworded specification. If the second model finds issues the first missed, you know the solution has assumption leaks, not just bugs.
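A minimal sketch of what "blind" could mean in practice, assuming a hypothetical `build_blind_packet` helper and `Submission` type. BMB's actual prompt construction is not shown in this README. The point is that the second model sees the code and a reworded spec, but neither the original wording nor the author's reasoning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """What the authoring model produced (all field names are hypothetical)."""
    original_spec: str
    reasoning: str  # the author's rationale, never shown to the verifier
    code: str


def build_blind_packet(submission: Submission, reworded_spec: str) -> dict:
    """Return only what a second model should see during blind review."""
    if reworded_spec.strip() == submission.original_spec.strip():
        raise ValueError("spec must be reworded for divergent framing")
    return {"spec": reworded_spec, "code": submission.code}


submission = Submission(
    original_spec="Return the sum of all even numbers in the list.",
    reasoning="I assumed the list is never empty.",
    code="def f(xs): return sum(x for x in xs if x % 2 == 0)",
)
packet = build_blind_packet(
    submission,
    reworded_spec="Given a list of integers, add up those divisible by two.",
)
print(sorted(packet))  # ['code', 'spec']
```

Rejecting an unchanged spec is a design choice in this sketch: it makes the divergent framing a hard requirement instead of a convention.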
Before any code is written, the Consultant and Lead engage in multi-round structured debate. The Consultant proposes alternatives, plays devil's advocate, and stress-tests assumptions. The Lead makes the final call, but only after hearing the opposition.
Each agent that writes code operates in its own git worktree, so agents can run in parallel without merge conflicts. Changes are reviewed and merged only after verification passes.
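To make the worktree isolation concrete, here is a minimal sketch that builds the `git worktree add` command for one agent. The branch naming scheme, the `.worktrees` directory, and the `worktree_add_command` helper are hypothetical; only `git worktree add -b <branch> <path>` itself is standard git.

```python
from pathlib import Path


def worktree_add_command(repo: Path, agent: str, session_id: str) -> list:
    """Build the git command that gives one agent a private checkout."""
    branch = f"bmb/{session_id}/{agent}"                   # hypothetical naming
    path = repo / ".worktrees" / f"{session_id}-{agent}"   # hypothetical location
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path)]


cmd = worktree_add_command(Path("/tmp/demo-repo"), "executor", "s01")
print(" ".join(cmd))
# To create the worktree in a real repository, pass the list to
# subprocess.run(cmd, check=True).
```

Because each agent gets its own branch and directory, two agents can edit the same file without touching each other's working tree; conflicts surface only at merge time, after verification.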
Lessons flow upward: project-local learnings (per-repo) → global learnings (cross-project) → CLAUDE.md promotion (permanent rules). Recurring mistakes automatically become enforced rules.
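The promotion ladder can be sketched as a simple classification by how widely a lesson recurs. The README does not state the real criteria, so the thresholds, the tier labels, and the `promotion_level` helper below are assumptions for illustration only.

```python
# Hypothetical thresholds: the README does not say when a lesson is promoted.
GLOBAL_THRESHOLD = 2  # distinct projects needed for a global learning
RULE_THRESHOLD = 3    # distinct projects needed for a CLAUDE.md rule


def promotion_level(lesson: str, sightings: list) -> str:
    """Classify a lesson by how many distinct projects reported it.

    `sightings` is a list of (project, lesson) pairs.
    """
    projects = {project for project, seen in sightings if seen == lesson}
    if len(projects) >= RULE_THRESHOLD:
        return "claude-md-rule"
    if len(projects) >= GLOBAL_THRESHOLD:
        return "global"
    return "project-local"


sightings = [
    ("api", "validate input at the boundary"),
    ("web", "validate input at the boundary"),
    ("cli", "validate input at the boundary"),
    ("api", "pin dependency versions"),
]
print(promotion_level("validate input at the boundary", sightings))  # claude-md-rule
print(promotion_level("pin dependency versions", sightings))         # project-local
```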
Long pipelines bleed context. BMB compresses at three layers: intra-step (within each agent), inter-step (handoff summaries), and session-level (
Not every task needs 12 steps. Pick a recipe to skip what you don't need: a bugfix skips brainstorm and council; a research task skips execution entirely.
Every pipeline run emits structured telemetry to
Architect, Executor, and Frontend agents query live library documentation via Context7 MCP before writing code, so they write against the current SDK rather than stale API assumptions.
| Recipe | Steps Used | Best For |
|---|---|---|
| feature | All 12 | New features, large changes |
| bugfix | 1 → 5 → 6 → 8 → 9 → 10 → 11 → 12 | Bug investigation and fix |
| refactor | 1 → 4 → 5 → 6 → 8 → 9 → 10 → 11 → 12 | Code restructuring |
| research | 1 → 2 → 3 → 11 → 12 | Exploration, spikes, design decisions |
| review | 1 → 9 → 11 → 12 | Code review only |
| infra | 1 → 4 → 5 → 6 → 8 → 9 → 11 → 12 | CI/CD, tooling, config changes |
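The recipe table can be read as a lookup from recipe name to step numbers. The sketch below copies the step lists from the table; the `RECIPES` and `STEP_NAMES` names and the `skipped_steps` helper are illustrative, not BMB's actual data structures.

```python
# Step lists copied from the recipe table. "feature" is listed only as "All 12",
# so the 10.5 Analyst step is left out here instead of being guessed.
RECIPES = {
    "feature": list(range(1, 13)),
    "bugfix": [1, 5, 6, 8, 9, 10, 11, 12],
    "refactor": [1, 4, 5, 6, 8, 9, 10, 11, 12],
    "research": [1, 2, 3, 11, 12],
    "review": [1, 9, 11, 12],
    "infra": [1, 4, 5, 6, 8, 9, 11, 12],
}

STEP_NAMES = {
    1: "Session Prep", 2: "Brainstorm", 3: "Council Debate", 4: "Architecture",
    5: "Plan", 6: "Execute", 7: "Frontend", 8: "Test", 9: "Verify",
    10: "Simplify", 11: "Retrospective", 12: "Cleanup",
}


def skipped_steps(recipe: str) -> list:
    """Return the names of the pipeline steps a recipe leaves out."""
    used = set(RECIPES[recipe])
    return [STEP_NAMES[n] for n in range(1, 13) if n not in used]


print(skipped_steps("bugfix"))
# ['Brainstorm', 'Council Debate', 'Architecture', 'Frontend']
```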
| Command | Description |
|---|---|
| /BMB | Full 12-step pipeline: select a recipe interactively |
| /BMB-brainstorm | Brainstorm + Council only: explore ideas without executing |
| /BMB-refactoring | Refactor recipe shortcut: skip brainstorm, go straight to architecture |
| /BMB-setup | First-time project setup: generates session-prep.md and config |
| /BMB-status | Project/idea dashboard: stale idea nudges, lifecycle overview |
| Agent | Role | Model |
|---|---|---|
| Lead | Orchestrator, decision-maker, session continuity | Claude |
| Consultant | Coordinator: user advisor + pipeline monitor. Dual-channel (feed + SendMessage). Post-briefing analysis after blind phase. | Claude (i18n: en/ko/ja/zh-TW) |
| Architect | System design, file tree, contracts. Queries Context7 for live library docs. | Claude |
| Executor | Implementation in isolated worktree. Queries Context7 before writing. | Claude |
| Frontend | UI/UX implementation. Queries Context7 before writing. | Claude |
| Tester | Test writing and execution | Claude |
| Verifier | Cross-model blind review | Codex / Gemini / Claude |
| Simplifier | Dead code removal, complexity reduction | Claude |
| Analyst | Retrospective analytics: Bird's Law severity classification, pattern_counts promotion candidates | Claude (bypassPermissions, read-only) |
| Monitor | Lead-owned lightweight observer: metadata-only stall detection, timeout warnings, blind phase filtering. Optional dependency that never blocks the pipeline. | Claude Haiku |
The Writer agent handles documentation generation as a sub-role of the pipeline.
| Dependency | Required | Notes |
|---|---|---|
| Claude Code CLI | Yes | Core runtime |
| tmux | Yes | Agent session management |
| python3 | Yes | Script tooling |
| sqlite3 | Yes | FTS5 knowledge base |
| git | Yes | Worktree isolation |
| Codex CLI | Optional | Cross-model verification |
| Gemini CLI | Optional | Cross-model verification |
Run bmb doctor after installation to verify all dependencies.
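For a quick manual check outside of bmb doctor, a sketch like the following looks for the required command-line tools on PATH and probes for FTS5 support. This is not how bmb doctor is implemented; the helper names are hypothetical, and the FTS5 probe tests the SQLite library bundled with Python, which may differ from the sqlite3 CLI on your system.

```python
import shutil
import sqlite3

REQUIRED = ["tmux", "python3", "sqlite3", "git"]  # command names from the table


def missing_tools(names: list) -> list:
    """Return the tools from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def fts5_available() -> bool:
    """Check whether Python's bundled SQLite was compiled with FTS5."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE probe USING fts5(body)")
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        conn.close()


print("missing:", missing_tools(REQUIRED))
print("fts5:", fts5_available())
```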
Explore the full pipeline visually:
Mobile-optimized summary pages (7-card vertical scroll, 4 locales):
| Language | URL |
|---|---|
| English | m.html |
| 한국어 | m.ko.html |
| 日本語 | m.ja.html |
| 繁體中文 | m.zh-TW.html |
~/Projects/bmb/                  # Source of truth (GitHub repo)
├── skills/bmb*/                 # 5 slash command skills
├── agents/bmb-*.md              # 10 agent definitions
├── bmb-system/
│   ├── config/                  # defaults.json (v2)
│   ├── scripts/                 # cross-model-run.sh, bmb-config.sh, bmb-ideas.sh, bmb-analytics.sh, ...
│   └── plans/                   # Version release plans
└── docs/                        # Architecture, configuration, troubleshooting

~/.claude/                       # Runtime (symlinks to repo)
├── skills/bmb* → repo           # Symlinked skills
├── agents/ → repo               # Symlinked agents
└── bmb-system/ → repo           # Symlinked runtime

.bmb/                            # Per-project runtime directory
├── config.json                  # Project-local config (merged from 3 layers)
├── analytics/
│   └── analytics.db             # SQLite: sessions, events, pattern_counts
├── handoffs/
│   └── analyst-report.md        # Step 10.5 output
└── sessions/{id}/
    ├── carry-forward.md         # Atomic session continuity
    └── plan-review.md           # Cross-model plan critique
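The tree above notes that config.json is merged from 3 layers. The README does not name the layers or their precedence, so the sketch below assumes defaults, then a user-global layer, then a project layer, with later layers overriding earlier ones. The `merge_layers` helper and every config key shown are hypothetical.

```python
def merge_layers(*layers: dict) -> dict:
    """Deep-merge dicts left to right; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Merge nested sections key by key instead of replacing them.
                merged[key] = merge_layers(merged[key], value)
            else:
                merged[key] = value
    return merged


# Hypothetical keys, for illustration only.
defaults = {"verifier": {"model": "claude", "timeout_s": 300}, "recipe": "feature"}
user_global = {"verifier": {"model": "codex"}}
project = {"recipe": "bugfix"}

print(merge_layers(defaults, user_global, project))
# {'verifier': {'model': 'codex', 'timeout_s': 300}, 'recipe': 'bugfix'}
```

A deep merge lets a project override a single nested setting without restating the whole section, which is the usual reason for layering config this way.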
6-Feature Upgrade: cross-model fix, agent discipline, visual brainstorming, session continuity, parallel sessions, and Monitor watchdog.
| Capability | Description |
|---|---|
| OMX Cross-Model Fix | Replaced raw codex exec with MCP-disabled invocation. Eliminates the 100% timeout rate caused by MCP server loading. |
| Superpowers Discipline | Verification gates, debugging discipline, TDD checklists, and YAGNI principles embedded directly in agent prompts. All agents upgraded to Opus 4.6 (1M context). |
| Visual Brainstorming | Browser-based visual companion for Step 2: mockups, architecture diagrams, trade-off matrices via Superpowers server. |
| Session-End Prep | Step 12 auto-generates next-session-plan.md with completed items, follow-ups, and a one-line start prompt. |
| Parallel Sessions | SESSION_MODE enum (standalone/sub/consolidation) for safe concurrent pipelines with track splitting and consolidation prompts. |
| Monitor Watchdog | Haiku Monitor enhanced with pane sweep for orphaned processes and nudge escalation for stalled agents. |
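The three SESSION_MODE values listed above map naturally onto an enum. This is a sketch only: the `SessionMode` class, the `parse_session_mode` helper, and the choice of standalone as the default when the setting is unset are assumptions, not BMB's implementation.

```python
from enum import Enum
from typing import Optional


class SessionMode(Enum):
    STANDALONE = "standalone"
    SUB = "sub"
    CONSOLIDATION = "consolidation"


def parse_session_mode(raw: Optional[str]) -> SessionMode:
    """Map a raw setting to a mode; unset falls back to standalone (assumed default)."""
    if raw is None or raw.strip() == "":
        return SessionMode.STANDALONE
    return SessionMode(raw.strip().lower())  # raises ValueError on unknown values


print(parse_session_mode("sub"))  # SessionMode.SUB
print(parse_session_mode(None))   # SessionMode.STANDALONE
```

Failing loudly on an unknown value is deliberate here: a typo in the mode should stop a concurrent pipeline before it starts, not silently run it as standalone.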
Contributions are welcome. Please read the Contributing Guide before submitting a PR.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Run the test suite (bmb doctor && /BMB-setup)
- Commit your changes
- Open a Pull Request
MIT. Use it however you want.
Built with obstinate attention to correctness.
