freshcrate orchestra — orchestration patterns for the agent ecosystem

Orchestra — patterns for coordinating the agent ecosystem

Practical guidance for multi-agent systems across delegation, supervision, review gates, artifact spines, and human-in-the-loop control.

Freshcrate opinionated playbook

Add a real review gate before side effects (P0)

Separate proposer and reviewer roles.
Require spec compliance plus code-quality review.
Block deploy/write actions until review passes.

Require tool-grounded reads before action (P0)

Fetch current state before answering or mutating.
Prefer logs, DB rows, and file reads over prompt memory.
Save structured observations for downstream agents.

Introduce task contracts between agents (P0)

Each delegated task includes objective, allowed tools, success criteria, and max budget.
Outputs are structured: result, evidence, unresolved risks.
No worker gets implicit permission to mutate unrelated surfaces.
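
A task contract of this kind could be sketched as a pair of small data types plus a permission check. This is only an illustration: the names `TaskContract`, `TaskOutput`, `authorize_tool`, and `max_budget_usd` are hypothetical, and tracking the budget in dollars is an assumption rather than something the playbook prescribes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaskContract:
    """What a supervisor hands to a worker (field names are illustrative)."""
    objective: str
    allowed_tools: frozenset[str]
    success_criteria: tuple[str, ...]
    max_budget_usd: float  # assumption: budget is tracked in dollars


@dataclass
class TaskOutput:
    """Structured worker output: result, evidence, unresolved risks."""
    result: str
    evidence: list[str] = field(default_factory=list)
    unresolved_risks: list[str] = field(default_factory=list)


def authorize_tool(contract: TaskContract, tool: str) -> None:
    """Raise unless the contract explicitly lists the tool (no implicit permissions)."""
    if tool not in contract.allowed_tools:
        raise PermissionError(
            f"tool {tool!r} is outside the contract for: {contract.objective}"
        )


contract = TaskContract(
    objective="Summarize failed jobs from the last 24h",
    allowed_tools=frozenset({"read_logs", "query_db"}),
    success_criteria=("every failed job listed with its error",),
    max_budget_usd=0.50,
)
authorize_tool(contract, "read_logs")  # listed in the contract, so no exception

output = TaskOutput(
    result="3 failed jobs found",
    evidence=["read_logs: job-17 exited with code 1"],
    unresolved_risks=["logs older than 24h were not checked"],
)
```

Freezing the contract keeps a worker from widening its own permissions after the supervisor has issued it.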

Codify human escalation boundaries (P1)

List destructive and regulated actions that require human approval.
Bundle evidence with each escalation.
Keep a visible queue of blocked tasks.

Centralize the artifact spine (P1)

One plan or issue doc per workstream.
Attach tests/logs/receipts to the work item.
Record reversals and operator overrides.

Best-practice patterns

Human escalation thresholds (production)

Define exactly when the orchestra stops and asks a human: production writes, secrets, payments, legal, or ambiguous user intent.

Why it works: Strong orchestration is not full autonomy — it is clean escalation at the right boundary.

safety, ops, human-in-the-loop

Do this

Codify escalation triggers instead of relying on agent intuition.
Expose pending approvals in one queue.
Capture the full evidence bundle that caused escalation.

Avoid this

Human approval for every trivial step.
No human review for destructive actions.
Escalation with no context, logs, or diff attached.
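
One way to codify the triggers above is a fixed set of categories checked before every action, with a single queue for anything that needs a human. The sketch below is illustrative: `ESCALATION_TRIGGERS`, `ApprovalQueue`, and the category labels are hypothetical names, and the evidence bundle is modeled as a plain dictionary.

```python
from dataclasses import dataclass, field

# Trigger categories come from the pattern description; the string labels are illustrative.
ESCALATION_TRIGGERS = {"production_write", "secrets", "payments", "legal", "ambiguous_intent"}


@dataclass
class Escalation:
    action: str
    triggers: list[str]
    evidence: dict[str, str]  # for example logs and the proposed diff


@dataclass
class ApprovalQueue:
    """Single visible queue of actions waiting for a human decision."""
    pending: list[Escalation] = field(default_factory=list)

    def submit(self, action: str, categories: list[str], evidence: dict[str, str]) -> bool:
        """Return True if the action may proceed; otherwise queue it and return False."""
        hits = sorted(set(categories) & ESCALATION_TRIGGERS)
        if not hits:
            return True  # trivial step: no human approval needed
        if not evidence:
            raise ValueError("escalation requires an evidence bundle")
        self.pending.append(Escalation(action, hits, dict(evidence)))
        return False


queue = ApprovalQueue()
can_format = queue.submit("reformat README", ["docs"], {})
can_rotate = queue.submit(
    "rotate API key",
    ["secrets", "production_write"],
    {"log": "key age exceeds policy", "diff": "replace key id in vault config"},
)
```

Here `can_format` is `True` and `can_rotate` is `False`, with the key rotation left in `queue.pending` together with its evidence.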

Review-gated execution lane (production)

Separate generation from approval: one agent proposes changes, another checks spec/security, then the executor applies.

Why it works: It catches shallow reasoning, over-broad edits, and unsafe side effects before they hit prod.

review, safety, deployment

Do this

Use at least one explicit review gate for schema changes, auth, billing, or deploys.
Review against both product spec and code quality — not just tests passing.
Keep reviewer prompts adversarial: ask what could break, leak, or drift.

Avoid this

Same agent writes and rubber-stamps its own work.
Review happening only after merge.
Treating green CI as the only approval signal.
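
The gate described above can be sketched as a function that refuses self-review and treats green CI as necessary but not sufficient. All names here (`Proposal`, `Review`, `apply_if_approved`) are hypothetical, and the executor is any callable that performs the side effect.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Proposal:
    author: str
    diff: str


@dataclass
class Review:
    reviewer: str
    spec_ok: bool      # matches the product spec
    quality_ok: bool   # code quality and security concerns addressed
    notes: str = ""


def apply_if_approved(
    proposal: Proposal,
    review: Review,
    ci_green: bool,
    executor: Callable[[str], None],
) -> bool:
    """Run the executor only when an independent review and CI both pass."""
    if review.reviewer == proposal.author:
        raise PermissionError("proposer cannot review its own change")
    if not (review.spec_ok and review.quality_ok and ci_green):
        return False  # blocked: nothing is written or deployed
    executor(proposal.diff)
    return True


applied: list[str] = []
change = Proposal(author="agent-a", diff="ALTER TABLE users ADD COLUMN plan TEXT")
passed = apply_if_approved(
    change, Review("agent-b", spec_ok=True, quality_ok=True), True, applied.append
)
blocked = apply_if_approved(
    change, Review("agent-b", spec_ok=True, quality_ok=False), True, applied.append
)
```

In this run `passed` is `True`, `blocked` is `False`, and `applied` holds the diff exactly once, because the second call stops at the quality check even though CI is green.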

Shared artifact spine (team)

Coordinate through explicit artifacts: plans, issue specs, receipts, test outputs, and decision logs.

Why it works: Artifacts survive context windows and prevent hidden assumptions between agents.

memory, handoff, coordination

Do this

Use one canonical task doc per workstream.
Store acceptance criteria next to the artifact, not only in chat.
Log decisions and reversals so later agents know why a path changed.

Avoid this

Coordination purely through chat memory.
Multiple diverging TODO lists.
Undocumented manual fixes by human operators.
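
As a rough illustration, a canonical task doc can be a single record that keeps acceptance criteria, attachments, and an append-only decision log together. The `Workstream` and `Decision` types and the three entry kinds below are assumptions made for this sketch, not a prescribed schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

ENTRY_KINDS = {"decision", "reversal", "override"}  # illustrative labels


@dataclass
class Decision:
    actor: str    # agent name or human operator
    kind: str     # one of ENTRY_KINDS
    summary: str
    at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())


@dataclass
class Workstream:
    """One canonical doc per workstream; everything else links back to it."""
    title: str
    acceptance_criteria: list[str]
    attachments: list[str] = field(default_factory=list)  # tests, logs, receipts
    decisions: list[Decision] = field(default_factory=list)

    def log(self, actor: str, kind: str, summary: str) -> None:
        """Append an entry; entries are never edited or removed."""
        if kind not in ENTRY_KINDS:
            raise ValueError(f"unknown entry kind: {kind!r}")
        self.decisions.append(Decision(actor, kind, summary))

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


ws = Workstream(
    title="Migrate billing webhooks",
    acceptance_criteria=["all webhook tests pass", "no duplicate charges in replay"],
)
ws.attachments.append("logs/replay-run-1.txt")
ws.log("planner", "decision", "use idempotency keys on every handler")
ws.log("operator:sam", "override", "manually re-queued 2 stuck events")
snapshot = json.loads(ws.to_json())
```

Recording the operator's manual fix as an `override` entry is what lets a later agent see why the state differs from what the plan alone would predict.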

Small-batch delegation (prototype)

Start with 2–3 concurrent agents on independent slices, then scale only after measuring merge pain and review load.

Why it works: Parallelism helps only when synthesis cost stays lower than the work you save.

delegation, throughput, cost

Do this

Split by file boundary or concern boundary, not by vague themes.
Cap parallelism until you can measure collision rate.
Always reserve one lane for validation and synthesis.

Avoid this

Spawning ten agents into the same surface area.
Parallel agents editing the same auth/config files.
Assuming more agents always means more speed.
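
A minimal sketch of capped, collision-checked delegation, assuming each slice is described by the set of files it is allowed to touch. `MAX_PARALLEL`, `find_collisions`, and `run_slices` are hypothetical names, and the worker here is a stand-in for a real agent call.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
from typing import Callable

MAX_PARALLEL = 3  # the pattern suggests starting with 2-3 concurrent agents


def find_collisions(slices: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Return every pair of slices that claims at least one common file."""
    return [
        (a, b, slices[a] & slices[b])
        for a, b in combinations(sorted(slices), 2)
        if slices[a] & slices[b]
    ]


def run_slices(
    slices: dict[str, set[str]],
    worker: Callable[[str, set[str]], object],
) -> dict[str, object]:
    """Run one worker per slice, refusing to start if any two slices overlap."""
    collisions = find_collisions(slices)
    if collisions:
        raise ValueError(f"overlapping slices: {collisions}")
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
        futures = {name: pool.submit(worker, name, files) for name, files in slices.items()}
        # Results are gathered in one place so a separate lane can validate and synthesize.
        return {name: future.result() for name, future in futures.items()}


def count_files(name: str, files: set[str]) -> int:
    """Stand-in worker: a real one would edit the files in its slice."""
    return len(files)


results = run_slices(
    {"api": {"api/routes.py", "api/schema.py"}, "docs": {"README.md"}},
    count_files,
)
overlap = find_collisions({"a": {"auth/config.py"}, "b": {"auth/config.py", "b.py"}})
```

The length of the list returned by `find_collisions` across planned batches is one simple way to measure collision rate before raising `MAX_PARALLEL`.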

Supervisor → worker graph (production)

Use one planner/supervisor to break work into bounded sub-tasks and route them to narrow workers.

Why it works: You keep strategy centralized while shrinking the context and permissions each worker needs.

delegation, supervision, routing

Do this

Make workers single-purpose: code, research, QA, or deployment — not everything at once.
Pass explicit task contracts with success criteria, budget, and allowed tools.
Require the supervisor to synthesize worker outputs before taking side-effecting actions.

Avoid this

Letting every agent talk to every other agent freely.
Giving all workers the full repo and full prompt history by default.
No review gate before write or deploy actions.
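
The routing shape above can be sketched as a registry of single-purpose workers that only the supervisor may call. The worker functions and the `supervise` helper are hypothetical stand-ins; a real worker would call a model or a tool instead of formatting a string.

```python
from typing import Callable


# Hypothetical single-purpose workers. Each sees only its own task text.
def research_worker(task: str) -> dict:
    return {"role": "research", "task": task, "result": f"notes on {task}"}


def qa_worker(task: str) -> dict:
    return {"role": "qa", "task": task, "result": f"checks for {task}"}


WORKERS: dict[str, Callable[[str], dict]] = {
    "research": research_worker,
    "qa": qa_worker,
}


def supervise(plan: list[tuple[str, str]]) -> dict:
    """Route each (role, task) pair to its worker, then synthesize the outputs."""
    outputs = []
    for role, task in plan:
        if role not in WORKERS:
            raise KeyError(f"no worker registered for role {role!r}")
        outputs.append(WORKERS[role](task))  # workers never call each other
    # Synthesis happens here, in the supervisor, before any side-effecting step.
    return {
        "outputs": outputs,
        "ready_for_review": all(o["result"] for o in outputs),
    }


summary = supervise([
    ("research", "rate-limit options"),
    ("qa", "rate-limit regression suite"),
])
```

Because workers are reachable only through `WORKERS`, adding a new capability means registering a new narrow worker rather than widening an existing one.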

Tool-first grounding (team)

Make agents inspect live state before deciding: files, logs, DB rows, process state, metrics.

Why it works: Most orchestration failures come from agents acting on stale assumptions instead of current system state.

observability, grounding, tooling

Do this

Require a live read before any irreversible action.
Prefer deterministic tools over memory for versions, counts, and current configs.
Persist structured outputs so downstream agents inherit facts instead of prose guesses.

Avoid this

Agents answering from memory for current facts.
Long prompt chains with no system-state refresh.
Passing screenshots or summaries when raw logs are available.
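
A live read before an irreversible action can be sketched with a file as the system under change: the agent records a structured observation, and the write goes through only if the file still matches it. `read_state` and `write_if_unchanged` are hypothetical helpers, and the check is not atomic, so a real system would also need locking or a compare-and-set primitive from its datastore.

```python
import hashlib
import tempfile
from pathlib import Path


def read_state(path: Path) -> dict:
    """Live read: a structured observation of the file as it is right now."""
    data = path.read_bytes()
    return {
        "path": str(path),
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
    }


def write_if_unchanged(path: Path, new_text: str, observed: dict) -> bool:
    """Write only if the file still matches the observation the decision was based on."""
    current = read_state(path)  # refresh state immediately before acting
    if current["sha256"] != observed["sha256"]:
        return False  # state drifted since the agent looked: re-read and re-plan
    path.write_text(new_text)
    return True


with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "app.conf"
    cfg.write_text("workers=2\n")
    seen = read_state(cfg)  # this dict is what downstream agents would inherit
    first = write_if_unchanged(cfg, "workers=4\n", seen)
    second = write_if_unchanged(cfg, "workers=8\n", seen)  # `seen` is now stale
    final_text = cfg.read_text()
```

The first write succeeds and the second is refused, because the observation in `seen` no longer describes the file after the first write.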