
gossipcat-ai


Description

Multi-agent code review mesh: orchestrates AI agents from multiple providers to review code in parallel, cross-review each other's findings, and build accuracy profiles over time. Agents that catch real bugs get picked more often. Agents that hallucinate get deprioritized. MCP server for Claude Code, Cursor, and other IDEs.

README

Gossipcat

weightless in-context RL for code review: agents that learn from grounded signals, no weights touched.

npm version · weekly downloads · MIT License · Node 22+ · GitHub stars · minified bundle size · last commit · tests

Install · First Run · Daily Use · Dashboard · Troubleshooting · Config · For AI Agents


What is Gossipcat?

Gossipcat is an MCP server that orchestrates multiple AI agents to review your code in parallel. Agents independently review, then cross-review each other's findings. Agreements are confirmed. Hallucinations are caught and penalized. Over time, each agent builds an accuracy profile; the system learns who to trust for what.

It's weightless in-context reinforcement learning

Most RL pipelines update model weights. Gossipcat doesn't touch weights; it learns by updating the prompt layer.

Every finding an agent produces must cite a real file:line. Peers verify those citations against actual source code. Verified findings (and caught hallucinations) become grounded reward signals: no judge model, no subjective grade, just mechanical checks against ground truth. Those signals update per-agent competency scores, which steer future dispatch. When an agent keeps failing in a category, a targeted skill file is auto-generated from its own failure history and injected into future prompts.
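
The signal-to-score step above can be sketched as a simple moving-average update. This is a minimal illustration under an assumed EMA-style model; the names `Signal`, `updateScore`, and `ALPHA` are hypothetical, not gossipcat's actual scoring code:

```typescript
// Hedged sketch (not gossipcat's actual code): one way grounded signals
// could fold into a per-agent competency score.
type Signal = { category: string; kind: "confirmed" | "hallucination" };

const ALPHA = 0.2; // EMA smoothing factor: recent signals weigh more

// Score lives in [0, 1]; confirmations pull it up, hallucinations pull it down.
function updateScore(prev: number, signal: Signal): number {
  const reward = signal.kind === "confirmed" ? 1 : 0;
  return (1 - ALPHA) * prev + ALPHA * reward;
}

let score = 0.5; // neutral prior for a brand-new agent
score = updateScore(score, { category: "concurrency", kind: "confirmed" });     // 0.60
score = updateScore(score, { category: "concurrency", kind: "hallucination" }); // 0.48
```

The resulting score is exactly the kind of per-category number the dispatch layer can weight on.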

flowchart LR
    A([agent review]) -->|cites file:line| B([peer cross-review])
    B -->|verifies against code| C{verdict}
    C -->|confirmed| D[reward signal]
    C -->|hallucination| E[penalty signal]
    D --> F[competency score]
    E --> F
    F -->|steer dispatch| G([next agent pick])
    E -->|≥3 in category| H[auto-generate skill]
    H -->|inject into prompt| A
    G --> A
    style A fill:#0ea5e9,stroke:#0369a1,color:#fff
    style H fill:#f59e0b,stroke:#b45309,color:#fff
    style D fill:#10b981,stroke:#047857,color:#fff
    style E fill:#ef4444,stroke:#b91c1c,color:#fff

The "policy update" is a markdown file under .gossip/agents/<id>/skills/. No fine-tuning, no RLHF infrastructure, no labelling pipeline. The reward signal is grounded in source code rather than a judge model, which is the piece that makes the loop trustworthy enough to automate. When agents disagree, we check the code โ€” not another LLM's opinion.


The single-reviewer failure mode: a solo AI reviewer ships hallucinated bugs as critical findings 5–10% of the time. Gossipcat's cross-review drops that to under 1%. That delta is what the whole system exists to produce.


Why multi-agent?

| Without gossipcat | With gossipcat |
| --- | --- |
| One AI reviews your code and hallucinates a finding you waste 20 minutes on | Multiple agents cross-check each other; hallucinations get caught before you see them |
| Every agent gets the same tasks regardless of track record | Dispatch weights route tasks to the agent with the best accuracy in that category |
| An agent keeps making the same class of mistake | Skill files are auto-generated from failure data and injected into future prompts |
| You don't know which agent to trust | Accuracy, uniqueness, and reliability scores are tracked per agent, per category |

Gossipcat is right for you if

  • You want multiple AI models catching different classes of bugs
  • You don't trust a single agent to catch everything
  • You want agents to cross-check each other's findings before you act on them
  • You want to know which agents are actually accurate vs. hallucinating
  • You want agents that get better over time based on their track record

Features

Consensus Review

3+ agents review independently, then cross-review each other. Findings tagged as CONFIRMED, DISPUTED, or UNIQUE.

Adaptive Dispatch

Agent accuracy is tracked per-category. Dispatch weights adjust automatically; the best agent for the job gets picked.
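
One way to picture "dispatch weights" is weight-proportional random selection. This is an illustrative sketch only; `Agent` and `pickAgent` are hypothetical names, not gossipcat's API:

```typescript
// Illustrative sketch: weight-proportional dispatch. Agents with higher
// accuracy in a category are picked proportionally more often.
type Agent = { id: string; weight: number };

function pickAgent(agents: Agent[], rand: () => number = Math.random): Agent {
  const total = agents.reduce((sum, a) => sum + a.weight, 0);
  let r = rand() * total;
  for (const a of agents) {
    r -= a.weight; // walk the cumulative distribution
    if (r <= 0) return a;
  }
  return agents[agents.length - 1]; // guard against float rounding
}

const team: Agent[] = [
  { id: "sonnet-reviewer", weight: 0.9 }, // strong record in this category
  { id: "gemini-reviewer", weight: 0.3 }, // weaker record
];
// sonnet-reviewer gets picked ~75% of the time (0.9 / 1.2)
```

The weaker agent still gets occasional dispatches, which is what lets it recover if its accuracy improves.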

Skill Development

When an agent keeps failing in a category, targeted skills are generated from failure data and injected into future prompts. Effectiveness is measured with a z-test on post-bind signals: passed, failed, or inconclusive.
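
A standard way to run that check is a two-proportion z-test comparing accuracy before and after the skill binds. This is a hedged sketch of that statistical test; `zTest` and the example numbers are illustrative, not gossipcat's internals:

```typescript
// Two-proportion z-test: did accuracy improve after the skill bound?
function zTest(hitsPre: number, nPre: number, hitsPost: number, nPost: number): number {
  const p1 = hitsPre / nPre;
  const p2 = hitsPost / nPost;
  const pooled = (hitsPre + hitsPost) / (nPre + nPost);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / nPre + 1 / nPost));
  return (p2 - p1) / se; // positive z means post-bind accuracy improved
}

// Example: 12/30 findings correct before the skill bound, 24/30 after.
const z = zTest(12, 30, 24, 30);
// |z| > 1.96 is significant at the 5% level; in between is inconclusive.
const verdict = z > 1.96 ? "passed" : z < -1.96 ? "failed" : "inconclusive";
```

With those numbers z ≈ 3.16, so the skill would be judged effective.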

Multi-Provider

Mix Anthropic, Google, OpenAI, and OpenClaw agents in one team. Each brings different strengths. Native agents need no API key. 🦞 Lobster friendly.

Live Dashboard

Real-time view of tasks, consensus reports, agent scores, and activity feed. Terminal Amber theme. WebSocket updates.

Agent Memory

Per-agent cognitive memory persists across sessions. Agents remember past findings, patterns, and project context.

Works with: Claude Code
Not yet: Cursor, Windsurf, VS Code

Provider gateways: OpenClaw, Ollama, OpenAI-compatible

How it works

The Mermaid diagram above shows the loop end-to-end. Here's the per-step definition:

| Step | What happens |
| --- | --- |
| Dispatch | Tasks routed to agents based on dispatch weights (accuracy history per category) |
| Parallel review | Agents work independently, each producing findings with confidence scores |
| Cross-review | Each agent reviews peers' findings: agree, disagree, unverified, or new finding |
| Consensus | Findings deduplicated and tagged: CONFIRMED, DISPUTED, UNVERIFIED, UNIQUE |
| Signals | You verify findings against code and record accuracy signals |
| Skill development | Agents with repeated failures get targeted skill files injected into future prompts |
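
The consensus step can be sketched as a small tagging function over peer verdicts. This is a simplified, assumed version of the rules; the real consensus logic may differ (e.g. in how UNIQUE is distinguished from UNVERIFIED):

```typescript
// Assumed, simplified tagging for one finding given its peers' verdicts.
type Verdict = "agree" | "disagree" | "unverified";

function tagFinding(peerVerdicts: Verdict[]): string {
  if (peerVerdicts.length === 0) return "UNIQUE"; // assumption: no peer saw it
  if (peerVerdicts.includes("disagree")) return "DISPUTED";
  if (peerVerdicts.includes("agree")) return "CONFIRMED";
  return "UNVERIFIED"; // peers looked but could neither confirm nor refute
}
```

For example, one agreeing peer yields CONFIRMED, while any disagreement forces DISPUTED so a human (or the orchestrator) arbitrates against the code.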

Two types of agents

| | Native | Relay |
| --- | --- | --- |
| Runs as | Claude Code subagent (Agent() tool) | WebSocket worker on relay server |
| Providers | Anthropic (Claude) | Google (Gemini), OpenAI, any provider |
| API key | None (uses your Claude Code subscription) | Required per provider |
| Defined in | .claude/agents/*.md | .gossip/config.json |
| Consensus | Yes | Yes |
| Memory & Skills | Yes | Yes |

Both types participate equally in consensus, cross-review, and skill development. Native subagents get skill files injected into their system prompts and can call gossip_remember for memory recall. Relay workers call the equivalent memory_query tool and get file_read + file_grep during cross-review, so their verification capabilities match the natives'.


Quickstart

Requirements: Node.js 22+ and Claude Code.

One-liner

npm install -g gossipcat && claude mcp add gossipcat -s user -- gossipcat

Restart Claude Code. Then in any project, ask:

"Set up a gossipcat team for this project"

Manual MCP config (if claude mcp add doesn't work for your setup)

Add to ~/.claude/mcp_settings.json:

{
  "mcpServers": {
    "gossipcat": {
      "command": "gossipcat"
    }
  }
}

Or project-local in .mcp.json:

{
  "mcpServers": {
    "gossipcat": {
      "command": "npx",
      "args": ["gossipcat"]
    }
  }
}

Claude Code will call gossip_setup() to scaffold .gossip/config.json and your agent team. First-run bootstrap also writes the dispatch rules and tool catalog so Claude Code knows how to use gossipcat โ€” no manual config needed.

Gossipcat is on npm and GitHub Releases; both carry the same bundle. npm install -g gossipcat pulls from the registry and is the shortest path; the GitHub release URL is useful when you want to pin to a specific tarball (see Alternative install paths below). Either way, npm drops a gossipcat binary on your PATH.

What the install ships

| | What you get |
| --- | --- |
| MCP server | Bundled binary at dist-mcp/mcp-server.js, wired as the gossipcat command on PATH |
| Dashboard | Prebuilt static assets in dist-dashboard/; launches automatically on a dynamic port (ask Claude Code "what's my gossipcat dashboard URL?"). Override with GOSSIPCAT_PORT=24420 if you want a stable port. |
| Default skills + rules + archetypes | 16 bundled skill templates, operational rules, and project archetypes copied into the install |
| Postinstall wizard | Writes .mcp.json with correct absolute paths for your machine |

Alternative install paths

Pin to a specific npm version:

npm install -g gossipcat@0.4.14

Pin to a specific GitHub release tarball (version-locked, bypasses npm registry):

npm install -g https://github.com/gossipcat-ai/gossipcat-ai/releases/download/v0.4.14/gossipcat-0.4.14.tgz

Project-local install (each project gets its own gossipcat):

cd your-project
npm install --save-dev gossipcat

The postinstall writes .mcp.json to your project root. Open Claude Code in that directory and gossipcat connects automatically โ€” no claude mcp add needed.

From source (contributors):

git clone https://github.com/gossipcat-ai/gossipcat-ai.git
cd gossipcat-ai
npm install
npm run build:mcp
claude mcp add gossipcat -s user -- node "$PWD/dist-mcp/mcp-server.js"

Upgrading

Re-run the install; npm will fetch the latest version and replace the installed binary:

npm install -g gossipcat@latest

Or in-session, ask Claude Code: "Check for gossipcat updates". The gossip_update tool fetches the latest release notes and applies the upgrade with your confirmation.

3. API keys

Add env vars for the providers you want to use. Pass them with -e when registering, or set them in your shell environment.

| Provider | Env var | Notes |
| --- | --- | --- |
| Native (Claude Code) | (none) | Dispatches through your active Claude Code subscription. No key needed. |
| Anthropic API | ANTHROPIC_API_KEY | Direct API access if you don't want to go through Claude Code. |
| Google Gemini | GOOGLE_API_KEY | Gemini Pro / Flash relay agents. |
| OpenAI | OPENAI_API_KEY (+ optional OPENAI_BASE_URL) | GPT-4 / GPT-4o relay agents. OPENAI_BASE_URL lets you point at OpenAI-compatible gateways (Azure, Together, Groq, etc.). |
| OpenClaw | (none; local gateway) | OpenAI-compatible, defaults to http://127.0.0.1:18789/v1. No API key; auth is handled by your local OpenClaw daemon. |
| Ollama (local) | (none) | Runs locally via http://localhost:11434. No key. Pull your model first with ollama pull llama3.1:8b. |

Examples: registering gossipcat with each provider

Native only (zero API keys; everything runs through Claude Code):

claude mcp add gossipcat -s user -- gossipcat

Then in session ask for a team built from sonnet-reviewer / haiku-researcher / opus-implementer. Native agents dispatch through Agent() and relay back. Good zero-config starting point.

Anthropic API (direct, bypasses Claude Code):

claude mcp add gossipcat -s user \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  -- gossipcat

Use this if you want relay agents running Claude models without going through the Claude Code subscription path, e.g. for parallelism beyond Claude Code's concurrency cap, or for running long background reviews while you keep working.

Google Gemini:

claude mcp add gossipcat -s user \
  -e GOOGLE_API_KEY=AIza... \
  -- gossipcat

Enables gemini-reviewer, gemini-tester, gemini-implementer on the relay. Watch the quota: gossipcat has a built-in 429 watcher that falls back to native agents when Gemini is cooling down.

OpenAI (and OpenAI-compatible gateways):

claude mcp add gossipcat -s user \
  -e OPENAI_API_KEY=sk-... \
  -- gossipcat

For Azure / Together / Groq / OpenRouter, add OPENAI_BASE_URL:

claude mcp add gossipcat -s user \
  -e OPENAI_API_KEY=your-key \
  -e OPENAI_BASE_URL=https://api.groq.com/openai/v1 \
  -- gossipcat

OpenClaw (local gateway):

# Start the OpenClaw daemon first (see openclaw docs), default port 18789
claude mcp add gossipcat -s user -- gossipcat

No env vars. Configure an agent with provider: "openclaw" in .gossip/config.json and gossipcat talks to the local gateway automatically. Override the port with base_url in the agent config if your daemon runs elsewhere.

Ollama (fully local, no API):

# Pull a model once
ollama pull llama3.1:8b
# Then register gossipcat
claude mcp add gossipcat -s user -- gossipcat

Configure the agent with provider: "local" and model: "llama3.1:8b" in .gossip/config.json. Good for airgapped dev, offline work, and burning-down-test-debt sessions where you don't want to spend API credits.
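For reference, the Ollama agent entry described above might look like this in .gossip/config.json. The provider, model, and base_url fields are named in this README; the surrounding shape (the agents array, id, role) is an illustrative assumption:

```json
{
  "agents": [
    {
      "id": "local-reviewer",
      "role": "reviewer",
      "provider": "local",
      "model": "llama3.1:8b",
      "base_url": "http://localhost:11434"
    }
  ]
}
```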

Mixed setup (a common production shape: cheap Gemini reviewers + heavy Anthropic implementers):

claude mcp add gossipcat -s user \
  -e GOOGLE_API_KEY=AIza... \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  -- gossipcat

Then set up a team with gemini-reviewer + haiku-researcher (native) + opus-implementer (native) + sonnet-reviewer (native). Gossipcat dispatches by category strength from the signal pipeline.

Keys are stored persistently and cross-platform:

  • macOS: OS Keychain
  • Linux: Secret Service (secret-tool)
  • Windows / other: AES-256-GCM encrypted file
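
For the encrypted-file path, AES-256-GCM looks roughly like this with Node's built-in crypto module. A minimal sketch of the cipher mode named above, not gossipcat's actual key-store code:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// AES-256-GCM: authenticated encryption, so tampering is detected on decrypt.
function encrypt(plaintext: string, key: Buffer) {
  const iv = randomBytes(12); // 96-bit nonce, the GCM standard size
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, data, tag: cipher.getAuthTag() }; // tag authenticates the ciphertext
}

function decrypt(box: { iv: Buffer; data: Buffer; tag: Buffer }, key: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, box.iv);
  decipher.setAuthTag(box.tag); // final() throws if the tag doesn't match
  return Buffer.concat([decipher.update(box.data), decipher.final()]).toString("utf8");
}

const key = randomBytes(32); // 256-bit key
const box = encrypt("sk-ant-example-key", key);
```

The auth tag is the point: a flipped byte in the stored file makes decryption throw instead of silently returning garbage.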

4. Initialize your team

Start a Claude Code session in any project and ask Claude to set up your team:

"Set up a gossipcat team with a Gemini reviewer and a Sonnet implementer"

Claude Code calls gossip_setup() to create your .gossip/config.json and agent definitions. You choose the providers, models, and roles; gossipcat adapts to your setup.

Available presets: reviewer, implementer, tester, researcher, debugger, architect, security, designer, planner, devops, documenter


First Run: 5 Minutes

The fastest path from "just installed" to "first useful review". If you skip this section you'll probably get stuck on the same things everyone else gets stuck on.

Step 1: Open Claude Code in any project

cd ~/your-project
claude

Gossipcat is registered globally now, so it boots automatically. You'll see it in the MCP server list.

Step 2: Bootstrap once

In Claude Code, just type:

Run gossip_status

This loads gossipcat's operating rules into the current session, creates .gossip/ in your project on first run, and prints the dashboard URL + auth key. Copy the key; you'll paste it into the dashboard once.

You'll see something like:

Status:
  Host: claude-code (native agents supported)
  Relay: running :49664
  Workers: 0
  Dashboard: http://localhost:49664/dashboard (key: c3208820f8f70605fd45fa90004a2a4b)
  Quota: google - OK

Open the dashboard URL in your browser, paste the key. You're now connected.

Step 3: Create your first team

Tell Claude what you're building:

"Set up a gossipcat team for this project โ€” it's a TypeScript Next.js app with a Postgres backend and Stripe payments."

Claude calls gossip_setup() and proposes a team. Typical proposal:

Proposed team:
  - sonnet-reviewer    (anthropic/claude-sonnet-4-6, native)   reviewer + security
  - gemini-reviewer    (google/gemini-2.5-pro, relay)          reviewer + types
  - haiku-researcher   (anthropic/claude-haiku-4-5, native)    researcher
  - opus-implementer   (anthropic/claude-opus-4-6, native)     implementer

Approve? (y/n)

Native agents (native: true) run through your existing Claude Code subscription; no API key needed. Relay agents need a key for their provider. If you don't have a Google API key, drop gemini-reviewer from the team for now and add it later.

Once you approve, gossipcat writes .gossip/config.json and the agents are live.

Step 4: Run your first review

In a project where you've made some changes:

"Do a consensus review of my recent changes"

What happens (typical timing):

| Phase | Time | What you see |
| --- | --- | --- |
| 1. Decompose | 1s | Claude picks agents and dispatches them in parallel |
| 2. Independent review | 30s–2min | Each agent reads your diff and reports findings |
| 3. Cross-review | 30s–1min | Each agent reviews the others' findings |
| 4. Consensus report | <1s | Findings tagged CONFIRMED / DISPUTED / UNVERIFIED / UNIQUE |
| 5. Verification | varies | Claude reads UNVERIFIED findings against the code, decides if they're real |
| 6. Signal recording | <1s | Accuracy signals saved per agent |

You get a report like:

Consensus round b81956b2-e0fa4ea4 (3 agents)

CONFIRMED (2):
  [critical] Race condition in tasks Map at server.ts:47 (sonnet + gemini)
  [high]     Missing auth on WebSocket upgrade at server.ts:112 (sonnet + gemini)

UNIQUE (1):
  [medium]   String concat in SQL query at queries.ts:88 (only sonnet caught this)

DISPUTED (1):
  [low]      "Memory leak in timer": haiku says yes, sonnet/gemini say no
             → verified, sonnet was right (not a leak: cleanup is in finally)

Final: 3 real bugs to fix, 1 false alarm caught by cross-review.

You only act on CONFIRMED + verified UNIQUE findings. The cross-review is the whole point: single-agent reviews ship hallucinated bugs as critical findings 5–10% of the time. Cross-review with verification drops that to under 1%.

Step 5: Watch the dashboard

The dashboard shows everything live: agents, scores, active tasks, consensus reports, signals. You can leave it open in a tab while you work; every gossipcat tool call pushes an update via WebSocket.

That's the basic loop. The rest of this README covers advanced workflows, troubleshooting, and how to interpret what you're seeing.


How to use it day-to-day

Concrete recipes for the most common workflows. Each one shows what to type, what you'll get back, and what to do with it.

Recipe 1: Review a diff before committing

Type:

"Review my staged changes"

What you'll get: A consensus report (1–3 minutes) with findings tagged CONFIRMED / UNIQUE / DISPUTED. Claude verifies UNVERIFIED findings against the code and tells you which are real.

What to do with it: Fix the CONFIRMED + verified-real findings. Ignore disputed-but-falsified findings. If a finding looks important but you disagree, ask Claude "verify finding f3 against the code yourself" and it'll re-check and either back you up or push back.

When NOT to use it: Tiny diffs (under 20 lines), where the overhead exceeds the value. Just eyeball them.


Recipe 2: Catch security issues before shipping a feature

Type:

"Security audit the payment handler at lib/stripe/webhook.ts"

What you'll get: Each security-skilled agent reviews from a different angle (OWASP, input validation, auth, secrets). Findings get cross-validated. Real vulns surface; theoretical ones get caught and dropped.

What to do with it: Fix critical/high findings before merge. Bookmark medium/low findings for the next pass.

Tip: Be specific about the file or module. "Security audit the codebase" is too broad and produces noisy results. "Security audit lib/stripe/webhook.ts" produces actionable findings.


Recipe 3: Understand a piece of code before changing it

Type:

"Research how the WebSocket connection lifecycle works in this project before I touch it"

What you'll get: A research agent (haiku-researcher by default: fast and cheap) reads the code, traces call paths, and writes a summary. The summary is saved to that agent's cognitive memory, so the next time you ask about the same area it remembers.

What to do with it: Use the summary to plan your change. The agent will reference it next time you ask anything related; no re-discovery cost.


Recipe 4: Verify your own assumption

Type:

"I think there's a race condition in the tasks Map at server.ts:47 โ€” check if I'm right"

What you'll get: Two agents independently check the specific claim and either confirm or push back. Author self-review is optimistic; this isn't.

What to do with it: If both agree with you, fix it. If they push back, read their reasoning before defending your hypothesis. They might be right.


Recipe 5: See which agents you can actually trust

Type:

"Show me agent scores"

What you'll get: A table of agents sorted by reliability with per-category accuracy and dispatch weights. Categories include trust_boundaries, injection_vectors, concurrency, error_handling, data_integrity, type_safety, etc.

What to do with it: If gemini-reviewer is sitting at 30% accuracy on concurrency, you know not to trust its concurrency findings without cross-review. If sonnet-reviewer is at 90% on trust_boundaries, you can ship its findings on auth/session bugs with high confidence.


Recipe 6: Improve an agent that keeps making the same mistake

Type:

"gemini-reviewer keeps hallucinating about concurrency โ€” develop a skill for it"

What you'll get: Gossipcat reads gemini-reviewer's failure data, generates a targeted skill file with concrete anti-patterns, and injects it into the agent's prompt for all future concurrency-related reviews. Effectiveness is measured statistically (z-test on post-bind signals), and it'll tell you whether the skill is actually working after ~30 dispatches.

What to do with it: Nothing; it's automatic. Just keep using the agent. Over time, the failure rate drops.


Recipe 7: Set up a team for a brand-new project

Type:

"Set up a gossipcat team for a TypeScript Cloudflare Workers project with Drizzle ORM and KV storage"

What you'll get: A proposed team with archetypes matched to your stack. Worker projects need different reviewers than long-running Node services; gossipcat picks accordingly.

What to do with it: Review the proposal, drop agents you can't run (missing API keys), approve.


Things to avoid

  • Don't ask for "review the whole codebase": too broad, and agents will pick whatever they find first. Scope to a file, module, or diff.
  • Don't approve findings without reading them: even after cross-review, ~5% of findings are genuinely wrong. The reasoning matters more than the verdict.
  • Don't ignore the dashboard: when something feels weird (slow dispatch, repeated failures, suspicious findings), the dashboard usually shows you why before you have to ask.
  • Don't run consensus mode for trivial questions: gossip_run with one agent is fine for "what does this function do?"-tier queries. Save consensus for changes that touch shared state, auth, persistence, or the dispatch pipeline itself.

Reading the dashboard

The dashboard at http://localhost:<port>/dashboard is the visual layer over everything gossipcat knows. Open it once with the auth key from gossip_status, leave the tab open while you work. Updates push live via WebSocket.

| Panel | What it shows | When to look at it |
| --- | --- | --- |
| Overview | Active agents, dispatch weights, recent finding counts | First thing in the morning, as a quick sanity check |
| Team | All agents sorted by reliability score, with category breakdowns | Picking which agent to trust for a tricky finding |
| Tasks | Live + historical task list with agent, duration, status | When something feels stuck: find it here first |
| Findings | Consensus reports paginated by round, with CONFIRMED/DISPUTED/UNVERIFIED breakdowns | Reviewing what got caught in a recent review |
| Agent detail | Per-agent memory entries, skills, score history, task history | Diagnosing why a specific agent keeps failing in a category |
| Signals | Raw signal feed (agreement / hallucination / unique_confirmed) | Auditing the scoring pipeline if scores look wrong |
| Logs | mcp.log content (boot, errors, warnings) | When the MCP server is misbehaving and you need raw evidence |

Auth keys rotate every session. A fresh key is generated each time gossipcat boots. If the dashboard says "unauthorized", run gossip_status again to get the new key.


Troubleshooting

"Dashboard says unauthorized / 401"

The auth key rotates every boot. Run gossip_status in Claude Code to get the current key, paste it into the dashboard login.

"Dashboard URL doesn't load at all"

Check ~/.gossip/mcp.log (or <your-project>/.gossip/mcp.log) for the boot log. Look for the [gossipcat] Dashboard: line; that's the actual port. If it's missing, the relay didn't start. Common causes:

  • A conflicting .gossip/relay.pid from a crashed previous boot: delete it and restart Claude Code
  • GOSSIPCAT_PORT set to a port already in use: unset the env var or pick a free port

"Boot says 'No .gossip/config.json found' and nothing happens"

This was a critical bug in v0.1.0, fixed in v0.1.1. Upgrade with the install one-liner above. v0.1.1+ boots in degraded mode (dashboard + relay only) so you can run gossip_setup from inside Claude Code.

"Agents keep returning empty findings"

Usually a model or quota problem. Check gossip_status; it shows per-provider quota (e.g. Quota: google - OK, or cooling down). If you're rate-limited, gossipcat will fall back to native agents automatically, but fallback agents may not be in your team. Either wait for the cooldown or add native agents to your team.

"The same hallucinated finding keeps coming back"

Record a hallucination_caught signal: ask Claude "record a hallucination_caught signal for finding f3 in the last consensus round; it claimed X but the code shows Y". After 3 such signals, the offending agent's score drops in that category and the orchestrator stops asking it questions in that area.

"I want to use my own model / provider"

Edit .gossip/config.json directly. Any OpenAI-compatible endpoint works via provider: "openai" + base_url. Local models work via Ollama (provider: "local"). See the Configuration section.

"An agent produced output but the consensus report is empty"

The strict <agent_finding> parser drops tags whose type isn't one of finding | suggestion | insight (see invariant #8 in docs/HANDBOOK.md). When that happens, the gossip_signals receipt surfaces the drop count and a finding_dropped_format pipeline signal is emitted. Check the consensus round's droppedFindingsByType field on the dashboard; it names the offending type. If you see &lt;agent_finding&gt; instead of raw <agent_finding>, a transport layer is entity-encoding the output; pass agent output verbatim to gossip_relay.
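
Both failure modes are easy to picture in code. This is a hedged sketch of the described behaviour, not the actual parser: the attribute syntax (type="...") and return shape are assumptions; only the allowed type list and the entity-encoding symptom come from the text above:

```typescript
// Assumed strict parser: unknown types are dropped and counted, never kept.
const ALLOWED = new Set(["finding", "suggestion", "insight"]);

function parseFindings(output: string) {
  const kept: string[] = [];
  const droppedByType: Record<string, number> = {};
  const re = /<agent_finding type="([^"]+)">([\s\S]*?)<\/agent_finding>/g;
  for (const [, type, body] of output.matchAll(re)) {
    if (ALLOWED.has(type)) kept.push(body.trim());
    else droppedByType[type] = (droppedByType[type] ?? 0) + 1; // named in the receipt
  }
  // Entity-encoded output never matches the tag regex at all, so flag it separately.
  const entityEncoded = output.includes("&lt;agent_finding");
  return { kept, droppedByType, entityEncoded };
}
```

An output containing a tag typed "bug" would parse to zero kept findings with droppedByType reporting the offending type, which matches the empty-report symptom.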

"Multiple Claude Code instances all want gossipcat"

Already supported as of v0.1.1: each instance gets its own dynamic port. If you want a stable port for one specific instance (e.g. for browser bookmarks), set GOSSIPCAT_PORT=24420 for that one project's environment.

"How do I uninstall?"

npm uninstall -g gossipcat
claude mcp remove gossipcat -s user
rm -rf ~/.gossip  # if you want to wipe global memory + signals
rm -rf <project>/.gossip  # if you want to wipe per-project state

Still stuck?

File an issue at https://github.com/gossipcat-ai/gossipcat-ai/issues. Include the contents of .gossip/mcp.log (last 100 lines) and the output of gossip_status. Or ask Claude in-session: "file a gossipcat bug report about <...>" and the gossip_bug_feedback tool packages it up automatically.


Under the hood

Claude Code translates your natural-language requests into gossipcat MCP tool calls automatically, so you don't need to type these. But if you want fine-grained control, they're documented here:

gossip_run(agent_id: "auto", task: "...")        → single-agent task
gossip_dispatch(mode: "consensus", tasks: [...]) → multi-agent review with cross-review
gossip_collect(consensus: true)                  → wait for results, run consensus
gossip_signals(action: "record", signals: [...]) → record accuracy after verification
gossip_scores()                                  → view agent performance
gossip_skills(action: "develop", ...)            → improve a struggling agent
gossip_status()                                  → system status + dashboard URL
gossip_setup(...)                                → create or update your team

The dispatch rules at .claude/rules/gossipcat.md (auto-generated on first boot) teach Claude Code when to pick which mode based on what your change touches. You can edit these rules to bias the dispatch.


For AI Agents

Currently supported orchestrator: Claude Code only. Cursor, Windsurf, and VS Code agents do not yet have native agent support and cannot run consensus cross-review. If you are reading this as a Claude Code instance with the gossipcat MCP server connected, this guide applies to you.

If you are a Claude Code instance reading this to understand how to use gossipcat, here is the complete operational guide.

You are the orchestrator. Gossipcat gives you tools to dispatch work to a team of specialized agents, collect their results, and record the accuracy signals that drive future dispatch.

Release History

VersionChangesUrgencyDate
v0.4.15## What's Changed * test(install): regression guards for tarball integrity + rules drift by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/186 * docs(claude-md): add '## Your Role' heading + restore drift anchor by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/187 * feat(orchestrator,cli): L2 โ€” signal-writer visibility (Symbol accessor + typed helpers + wiring) by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/190 * test(signal-pipeline):High4/20/2026
v0.4.14## What's Changed * feat(orchestrator): Layer 1 signal-pipeline parity test by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/173 * feat(orchestrator): Layer 3 signal-pipeline drift detector by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/174 * feat(security): gossip_remember + memory_query hardening by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/175 * fix(completion-signals): _system guard + parity length assert + redundant nullish bHigh4/19/2026
v0.4.13## What's Changed * fix(signal-pipeline): shared emitCompletionSignals helper + error-path + memoryQueryCalled threading by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/169 * chore(release): v0.4.13 by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/170 **Full Changelog**: https://github.com/gossipcat-ai/gossipcat-ai/compare/v0.4.12...v0.4.13High4/18/2026
v0.4.12## What's Changed * fix(sandbox): self-documenting deny + gossip_setup env-var tip by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/167 * chore(release): v0.4.12 by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/168 **Full Changelog**: https://github.com/gossipcat-ai/gossipcat-ai/compare/v0.4.11...v0.4.12High4/18/2026
v0.4.11## What's Changed * feat(observability): classifier expansion + gossip_watch + pipeline signals + rotation by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/163 * fix(sandbox): orchestrator env exemption + quoted-arg extraction (#162) by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/164 * docs(gossip_watch): surface in CLAUDE.md + gossip_status banner by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/165 * chore(release): v0.4.11 by @atabHigh4/18/2026
v0.4.10## What's Changed * feat(dashboard): auto-benching v2 badges by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/127 * fix(consensus): realpath-based containment (pre-#126 hardening) by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/128 * feat(consensus): user-worktree citation resolution (#126) by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/129 * feat(consensus): round-level retraction via gossip_signals consensus_id by @ataberk-xyz in hHigh4/17/2026
v0.4.9## Highlights **Load-bearing fix for all-native consensus teams** (#121 / PR #123). Before this release, when `gossip_dispatch(mode: "consensus")` ran with all native agents and 0 relay workers, Phase 2 auto-verifier returned empty text in ~0ms, synthesis proceeded with zero peer input, and every finding was tagged UNIQUE regardless of real overlap. No error surfaced. If your team is all native (Claude Code subagents with no Gemini/OpenAI workers), you should upgrade. ## Shipped in 0.4.9 ### High4/17/2026
v0.4.8## What's Changed * fix(sandbox+tools): L3 scope exclusion + git execFile ENOENT retry by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/109 * fix(tools): Tool Server cwd divergence โ€” worktree relative paths by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/110 * chore(release): v0.4.8 by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/111 **Full Changelog**: https://github.com/gossipcat-ai/gossipcat-ai/compare/v0.4.7...v0.4.8High4/16/2026
v0.4.7## [0.4.7] โ€” 2026-04-16 Layer 3 audit overhaul โ€” scoped-mode noise drops 99.9%, worktree-mode 82%. Shipped via #107 which stacks two commits: ### Fixed - **`find -prune` instead of `-not -path`** for L3 exclusions. `-not -path` only filters `find`'s output; the scan still descends into every excluded directory. On macOS that means `find` enters `~/Library/Application Support/{Safari,Photos,Group Containers}/` and hits TCC "Operation not permitted" every dispatch, producing a noisy `find partiHigh4/16/2026
## [0.4.6] — 2026-04-16

### Fixed
- **Layer 3 audit now excludes user-level app directories** (#105). Live-fire verification against v0.4.5 produced 44 "boundary escape" violations that were 100% OS-level churn — Chrome cookies, Spotify cache, NordVPN data, Claude Code's own `~/.claude/projects/*.jsonl` session logs — not a single agent action. These app directories are unreachable through the Tool Server sandbox or the Layer 2 PreToolUse hook, so false positives from them were pure noise drow…
## [0.4.5] — 2026-04-16

Tool Server union-of-roots — worktree-mode relay agents can now actually write inside their own worktrees.

### Fixed
- **Tool Server was worktree-blind on 6 `file_*` tools** (#103). `ToolServer.enforceWriteScope` correctly gated against the agent's assigned worktree root, but then `FileTools.fileWrite` called `Sandbox.validatePath`, which re-resolved against `projectRoot` only. Worktrees live under `os.tmpdir()/gossip-wt-*` — always outside `projectRoot` — so every absolu…
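The "union of roots" idea can be sketched in shell. This is a hedged illustration only — the function name and roots below are hypothetical, and the real check lives in the Tool Server's path validation, not in a script: a resolved write path is allowed if it sits under *any* permitted root, rather than under `projectRoot` alone.

```shell
# Hypothetical helper: permit a write iff the resolved target is contained in
# ANY allowed root (project root OR the agent's tmp worktree), instead of
# validating against a single root, which rejects every worktree path.
path_allowed() {
  target=$(realpath -m -- "$1")          # GNU realpath; -m tolerates a not-yet-created leaf
  shift
  for root in "$@"; do
    root=$(realpath -- "$root")
    case "$target" in
      "$root"/*) return 0 ;;             # contained in this root: allow
    esac
  done
  return 1                               # outside every root: block the write
}

proj=$(mktemp -d)                        # stand-in for projectRoot
wt=$(mktemp -d)                          # stand-in for a gossip-wt-* worktree
path_allowed "$wt/src/new.ts" "$proj" "$wt" && echo "worktree write allowed"
path_allowed "/etc/passwd"    "$proj" "$wt" || echo "escape blocked"
```

Resolving with `realpath` before the prefix check is what makes the containment test robust against `..` segments and symlinks, which is also the point of the "realpath-based containment" hardening mentioned in the v0.4.10 notes.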
## [0.4.4] — 2026-04-16

Point release closing two Layer 3 sandbox bugs surfaced by live-fire verification against the freshly released v0.4.3 (consensus task `56641e6e`, 2026-04-16).

### Fixed
- **Layer 3 `worktreePath` patch was a no-op** (#101). The F2 fix in #99 called `ctx.mainAgent.getTask(taskId)` after `collect()`, but `DispatchPipeline.collect()` deletes the task entry from its `this.tasks` Map before returning (default `consume: true`). Every relay worktree dispatch since #99 recorde…
## [0.4.3] — 2026-04-16

Headline: **worktree filesystem sandbox** — the multi-layer defense for issue #90 ships end-to-end. Agents dispatched with `write_mode: "worktree"` are now soft-blocked from writing outside their isolated worktree by a PreToolUse hook (Layer 2), with a post-dispatch `find -newer` audit as a backstop (Layer 3) for escape paths the hook can't see. Plus a round of scoring corrections, dashboard polish, and relay resilience fixes driven by recent consensus rounds.

### Workt…
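The Layer 3 backstop described above can be approximated with a timestamp marker. This is an illustrative reconstruction under stated assumptions, not the shipped audit code:

```shell
# Sketch of a `find -newer` style post-dispatch audit: stamp a file before
# dispatch, then list anything in the protected tree modified after the
# stamp -- i.e. writes the PreToolUse hook (Layer 2) did not intercept.
protected=$(mktemp -d)                 # stand-in for the real project root
stamp=$(mktemp)                        # mtime marker taken before dispatch

sleep 1                                # ensure later writes get a newer mtime
touch "$protected/escaped.txt"         # simulate an out-of-worktree write

# Anything under the protected tree newer than the stamp is a violation.
find "$protected" -type f -newer "$stamp" -print
```

The appeal of this design is that it needs no cooperation from the agent: whatever escape path the hook misses, the filesystem mtime still records it. The flip side — OS background churn also updates mtimes — is exactly the false-positive noise the 0.4.6 and 0.4.7 entries above deal with.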
## v0.4.2 — 2026-04-14

**README-only republish.** v0.4.1 shipped to npm before the README normalization PRs (#67/#68) merged, so the [npm package page](https://www.npmjs.com/package/gossipcat) displayed the old install one-liner while the GitHub README showed the shorter `npm install -g gossipcat` form. This release syncs the two.

### No code changes

The bundle is byte-identical to 0.4.1 except for `README.md` and the `package.json` version bump. Upgrading from 0.4.1 is a no-op for behavior.

### Install

```bash
npm instal…
```
## v0.4.1 — 2026-04-14

A round of hardening driven by two internal consensus rounds that reviewed 0.4.0's own PRs and caught real security and correctness regressions the original review missed. Three stacked PRs (#63, #64, #65) close every HIGH/MEDIUM/LOW finding plus two silent-failure modes in already-merged 0.4.0 code.

### Security (merge-blockers caught in cross-review)
- The `gossip_plan` native utility now issues a `relay_token` (prevents fabricated-decomposition injection via `gossip_relay`).
- Re-entry path validat…
## [0.4.0] — 2026-04-14

Combines the unreleased 0.3.0 work (server-side cross-review, memory pre-fetch, scoring, dashboard polish) with three new streams: HTTP file bridge infrastructure, the consensus type-contract fix, and a long-standing test-bug cleanup.

> **Note:** v0.3.0 was published to npm on 2026-04-13 but never cut a matching GitHub release. Its changes are included here under 0.4.0 rather than retroactively tagged — the CHANGELOG entries below merge both cycles for a single coherent…
## v0.3.0 — 2026-04-14

v0.3.0 was published to npm on 2026-04-13 but never got a matching GitHub release at the time. It is tagged retroactively at commit 46d4615 to close the audit trail. The changelog for the 0.3.0 work (server-side cross-review, memory pre-fetch, scoring, dashboard polish) is folded into the [v0.4.0 release notes](https://github.com/gossipcat-ai/gossipcat-ai/releases/tag/v0.4.0) — see that entry for the full changelog. Users who installed gossipcat@0.3.0 from npm got these features; the 0.4.0 release…
## v0.2.0 — 2026-04-09

### What's Changed
* fix(ports): sticky per-project port files for relay + HTTP MCP by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/16
* spec(skills): emergent skill engine by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/17
* dashboard(team-hero): 2x2 grid sorted by last dispatch by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/18
* fix(dashboard): render categoryAccuracy, not unbounded categoryStrengths by @ataberk-xyz in https://gith…
## v0.1.2 — 2026-04-09

### What's Changed
* docs(readme): rewrite usage sections for clarity (First Run + Daily Use + Troubleshooting) by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/9
* fix(release): two-stage flow that respects branch protection by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/10
* chore(gitignore): exclude .claude/agents/ and .claude/settings.local.json by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/11
* fix(native): make Claude Code a f…
## v0.1.1 — 2026-04-08

### What's Changed
* feat(dashboard): phase 1 — 6-tab SPA with auth, real-time WebSocket, consensus history by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/1
* feat: gossip_verify_memory — on-demand staleness check for memory files by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/2
* fix(tests): clear 4 pre-existing test-debt suites (PR 1/3) by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/4
* fix(signals): record per-signal timestamps…


Similar Packages

- **ClawCode** — Persistent agents for Claude Code as a plugin, not a harness. Memory, personality, messaging across WhatsApp, Telegram, and Discord, plus a service mode for 24/7 runs. Imports from OpenClaw. (v1.4.13)
- **mcpick** — Claude Code extension manager: MCP servers, plugins (skills, hooks, agents), and marketplaces. (main@2026-04-18)
- **trace-mcp** — MCP server for Claude Code and Codex. One tool call replaces ~42 minutes of agent exploration. (v1.28.0)
- **@contentrain/skills** — AI agent skills for Contentrain: workflow procedures, framework integration guides. (0.4.0)
- **memtrace** — Code intelligence graph: MCP server + AI agent skills + visualization UI. (0.2.1)