freshcrate

Orchestra — how to orchestrate agents without making a mess

Opinionated operator guidance for multi-agent systems: what scales, what fails, and where to keep humans in the loop.

Patterns tracked: 6
Principles: 6
Anti-patterns named: 18
Freshcrate opinionated playbook
Add a real review gate before side effects (P0)
  • Separate proposer and reviewer roles.
  • Require spec compliance plus code-quality review.
  • Block deploy/write actions until review passes.
Require tool-grounded reads before action (P0)
  • Fetch current state before answering or mutating.
  • Prefer logs, DB rows, and file reads over prompt memory.
  • Save structured observations for downstream agents.
Introduce task contracts between agents (P0)
  • Each delegated task includes objective, allowed tools, success criteria, and max budget.
  • Outputs are structured: result, evidence, unresolved risks.
  • No worker gets implicit permission to mutate unrelated surfaces.
Codify human escalation boundaries (P1)
  • List destructive and regulated actions that require human approval.
  • Bundle evidence with each escalation.
  • Keep a visible queue of blocked tasks.
Centralize the artifact spine (P1)
  • One plan or issue doc per workstream.
  • Attach tests/logs/receipts to the work item.
  • Record reversals and operator overrides.
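The task-contract item in the playbook above (objective, allowed tools, success criteria, max budget, and a structured output) could be sketched roughly as follows. This is a minimal illustration, not a prescribed schema: the names `TaskContract`, `TaskResult`, and `check_tool_call` are hypothetical, and the budget is assumed to be a count of tool calls.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaskContract:
    """What a supervisor hands to a worker (hypothetical schema)."""
    objective: str
    allowed_tools: frozenset
    success_criteria: tuple
    max_tool_calls: int  # assumption: budget is measured in tool calls


@dataclass
class TaskResult:
    """Structured worker output: result, evidence, unresolved risks."""
    result: str
    evidence: list = field(default_factory=list)
    unresolved_risks: list = field(default_factory=list)


def check_tool_call(contract: TaskContract, tool: str, calls_so_far: int) -> None:
    """Raise if a worker steps outside its contract."""
    if tool not in contract.allowed_tools:
        raise PermissionError(f"tool {tool!r} is not allowed by this contract")
    if calls_so_far >= contract.max_tool_calls:
        raise RuntimeError("tool-call budget exhausted")


# Placeholder values for illustration only.
contract = TaskContract(
    objective="Summarize failing tests in one module",
    allowed_tools=frozenset({"read_file", "run_tests"}),
    success_criteria=("every failing test is listed with its error",),
    max_tool_calls=5,
)
check_tool_call(contract, "read_file", calls_so_far=0)  # within contract, no error
```

Making the contract immutable (`frozen=True`) is one way to keep a worker from widening its own permissions mid-task.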
Best-practice patterns
Human escalation thresholds (production)
Define exactly when the orchestra stops and asks a human: production writes, secrets, payments, legal, or ambiguous user intent.

Why it works: Strong orchestration is not full autonomy — it is clean escalation at the right boundary.

Tags: safety, ops, human-in-the-loop
Do this
  • Codify escalation triggers instead of relying on agent intuition.
  • Expose pending approvals in one queue.
  • Capture the full evidence bundle that caused escalation.
Avoid this
  • Human approval for every trivial step.
  • No human review for destructive actions.
  • Escalation with no context, logs, or diff attached.
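Codified escalation triggers with a single approval queue might look like the sketch below. The category strings, `ApprovalQueue`, and `needs_human` are hypothetical names chosen for illustration; a real system would derive the trigger list from its own policy.

```python
from dataclasses import dataclass, field

# Assumption: action categories are plain strings chosen by the operator.
ESCALATION_TRIGGERS = {"production_write", "secrets", "payments", "legal"}


@dataclass
class Escalation:
    action: str
    category: str
    evidence: dict  # logs, diffs, or reads that led to the request


@dataclass
class ApprovalQueue:
    """Single visible queue of tasks blocked on a human."""
    pending: list = field(default_factory=list)

    def submit(self, escalation: Escalation) -> None:
        # Reject escalations that arrive without context.
        if not escalation.evidence:
            raise ValueError("escalation requires an evidence bundle")
        self.pending.append(escalation)


def needs_human(category: str, intent_is_ambiguous: bool = False) -> bool:
    """Codified trigger check, instead of agent intuition."""
    return intent_is_ambiguous or category in ESCALATION_TRIGGERS


queue = ApprovalQueue()
if needs_human("payments"):
    queue.submit(Escalation(
        action="issue refund",  # placeholder action
        category="payments",
        evidence={"diff": "<diff>", "log_excerpt": "<log lines>"},
    ))
```

Trivial categories fall through `needs_human` without blocking, which keeps the queue limited to the boundaries that matter.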
Review-gated execution lane (production)
Separate generation from approval: one agent proposes changes, another checks spec/security, then the executor applies.

Why it works: It catches shallow reasoning, over-broad edits, and unsafe side effects before they hit prod.

Tags: review, safety, deployment
Do this
  • Use at least one explicit review gate for schema changes, auth, billing, or deploys.
  • Review against both product spec and code quality — not just tests passing.
  • Keep reviewer prompts adversarial: ask what could break, leak, or drift.
Avoid this
  • Same agent writes and rubber-stamps its own work.
  • Review happening only after merge.
  • Treating green CI as the only approval signal.
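The proposer, reviewer, and executor split can be sketched as a gate that refuses to run the executor unless an independent review passes. All names here (`Proposal`, `Review`, `apply_if_approved`) are hypothetical, and the two boolean checks stand in for whatever spec and quality review a team actually runs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Proposal:
    author: str
    diff: str


@dataclass
class Review:
    reviewer: str
    spec_ok: bool
    quality_ok: bool
    notes: str = ""


def apply_if_approved(
    proposal: Proposal,
    review: Review,
    executor: Callable[[str], None],
) -> bool:
    """Run the executor only when an independent review passes both checks."""
    if review.reviewer == proposal.author:
        raise ValueError("author cannot review their own proposal")
    if not (review.spec_ok and review.quality_ok):
        return False  # blocked: no side effect happens
    executor(proposal.diff)
    return True


applied = []  # stand-in for a real deploy or write step
proposal = Proposal(author="proposer-agent", diff="<schema change>")
review = Review(
    reviewer="reviewer-agent",
    spec_ok=True,
    quality_ok=False,
    notes="migration lacks a rollback step",
)
was_applied = apply_if_approved(proposal, review, applied.append)
```

Because the gate sits in front of the executor, a failed review leaves `applied` empty instead of being discovered after merge.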
Shared artifact spine (team)
Coordinate through explicit artifacts: plans, issue specs, receipts, test outputs, and decision logs.

Why it works: Artifacts survive context windows and prevent hidden assumptions between agents.

Tags: memory, handoff, coordination
Do this
  • Use one canonical task doc per workstream.
  • Store acceptance criteria next to the artifact, not only in chat.
  • Log decisions and reversals so later agents know why a path changed.
Avoid this
  • Coordination purely through chat memory.
  • Multiple diverging TODO lists.
  • Undocumented manual fixes by human operators.
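One possible shape for a canonical work item is a single record that carries acceptance criteria, attachments, and a decision log together. The `WorkItem` schema and its field names are assumptions made for this sketch, and all values are placeholders.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class WorkItem:
    """One canonical record per workstream (hypothetical schema)."""
    workstream: str
    acceptance_criteria: list
    attachments: list = field(default_factory=list)  # test outputs, logs, receipts
    decisions: list = field(default_factory=list)    # decisions, reversals, overrides

    def log_decision(self, actor: str, decision: str, reason: str) -> None:
        self.decisions.append(
            {"actor": actor, "decision": decision, "reason": reason}
        )

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


item = WorkItem(
    workstream="example-workstream",
    acceptance_criteria=["failed jobs are retried at most three times"],
)
item.attachments.append("test-output.txt")
item.log_decision("agent:planner", "use exponential backoff", "limits load on the provider")
# Human overrides go in the same log, so later agents see why the path changed.
item.log_decision("human:operator", "reverted to fixed delay", "manual fix during an incident")
print(item.to_json())
```

Serializing to JSON is only one option; the point is that the record outlives any single agent's context window.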
Small-batch delegation (prototype)
Start with 2–3 concurrent agents on independent slices, then scale only after measuring merge pain and review load.

Why it works: Parallelism helps only when synthesis cost stays lower than the work you save.

Tags: delegation, throughput, cost
Do this
  • Split by file boundary or concern boundary, not by vague themes.
  • Cap parallelism until you can measure collision rate.
  • Always reserve one lane for validation and synthesis.
Avoid this
  • Spawning ten agents into the same surface area.
  • Parallel agents editing the same auth/config files.
  • Assuming more agents always means more speed.
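Collision rate can be measured in several ways. One simple assumption, used in this sketch, is the fraction of agent pairs whose edited file sets overlap. The function name, the cap constant, and the file paths are illustrative only.

```python
from itertools import combinations

MAX_PARALLEL_AGENTS = 3  # assumption: starting cap, per the 2-3 agent guidance


def collision_rate(touched: dict) -> float:
    """Fraction of agent pairs whose edited files overlap.

    `touched` maps an agent name to the set of file paths it edited.
    """
    pairs = list(combinations(touched.values(), 2))
    if not pairs:
        return 0.0
    collisions = sum(1 for a, b in pairs if a & b)
    return collisions / len(pairs)


# Placeholder data: agent-a and agent-c both edited the same file.
touched = {
    "agent-a": {"api/routes.py"},
    "agent-b": {"ui/form.tsx"},
    "agent-c": {"api/routes.py", "config/auth.yaml"},
}
rate = collision_rate(touched)  # one of the three pairs overlaps
```

A team might raise `MAX_PARALLEL_AGENTS` only while this rate stays near zero across several batches.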
Supervisor → worker graph (production)
Use one planner/supervisor to break work into bounded sub-tasks and route them to narrow workers.

Why it works: You keep strategy centralized while shrinking the context and permissions each worker needs.

Tags: delegation, supervision, routing
Do this
  • Make workers single-purpose: code, research, QA, or deployment — not everything at once.
  • Pass explicit task contracts with success criteria, budget, and allowed tools.
  • Require the supervisor to synthesize worker outputs before taking side-effecting actions.
Avoid this
  • Letting every agent talk to every other agent freely.
  • Giving all workers the full repo and full prompt history by default.
  • No review gate before write or deploy actions.
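A supervisor that routes bounded tasks to single-purpose workers and synthesizes their outputs before anything side-effecting happens could be sketched like this. The worker functions are stubs, and the `WORKERS` registry and `supervise` function are hypothetical names; real workers would call a model or a tool.

```python
from typing import Callable


# Hypothetical single-purpose workers; each returns a structured result.
def research_worker(task: str) -> dict:
    return {"role": "research", "task": task, "result": "notes", "risks": []}


def qa_worker(task: str) -> dict:
    return {"role": "qa", "task": task, "result": "checks run", "risks": ["flaky test"]}


WORKERS: dict = {
    "research": research_worker,
    "qa": qa_worker,
}


def supervise(plan: list) -> dict:
    """Route each (role, task) pair to its worker, then synthesize the outputs."""
    outputs = []
    for role, task in plan:
        worker: Callable[[str], dict] = WORKERS[role]  # KeyError if role is unknown
        outputs.append(worker(task))
    open_risks = [risk for out in outputs for risk in out["risks"]]
    # Side-effecting steps stay blocked while any worker reports an open risk.
    return {"outputs": outputs, "open_risks": open_risks, "may_proceed": not open_risks}


summary = supervise([
    ("research", "find the retry policy"),
    ("qa", "run smoke tests"),
])
```

Workers never call each other here; all routing goes through `supervise`, which keeps the communication graph a simple star.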
Tool-first grounding (team)
Make agents inspect live state before deciding: files, logs, DB rows, process state, metrics.

Why it works: Most orchestration failures come from agents acting on stale assumptions instead of current system state.

Tags: observability, grounding, tooling
Do this
  • Require a live read before any irreversible action.
  • Prefer deterministic tools over memory for versions, counts, and current configs.
  • Persist structured outputs so downstream agents inherit facts instead of prose guesses.
Avoid this
  • Agents answering from memory for current facts.
  • Long prompt chains with no system-state refresh.
  • Passing screenshots or summaries when raw logs are available.
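The "live read before any irreversible action" rule can be sketched as a guard that re-reads state, compares it with what the agent assumed, and aborts on a mismatch. `guarded_action` and `Observation` are hypothetical names, and the in-memory `state` dictionary stands in for a real database or config read.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Observation:
    """Structured fact captured from a live read, for downstream agents."""
    source: str
    value: str
    observed_at: float  # seconds since the epoch


def guarded_action(
    read_state: Callable[[], str],
    expected: str,
    action: Callable[[], None],
    source: str,
) -> Observation:
    """Re-read live state immediately before acting; abort on a mismatch."""
    obs = Observation(source=source, value=read_state(), observed_at=time.time())
    if obs.value != expected:
        raise RuntimeError(
            f"stale assumption: expected {expected!r}, observed {obs.value!r}"
        )
    action()
    return obs


state = {"schema_version": "41"}  # stand-in for a real database or config read
done = []
obs = guarded_action(
    read_state=lambda: state["schema_version"],
    expected="41",
    action=lambda: done.append("migrated"),
    source="db.schema_version",
)
```

Returning the `Observation` lets later agents inherit the recorded fact and its timestamp instead of a prose summary.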