Weightless in-context RL for code review: agents that learn from grounded signals, no weights touched.
Install · First Run · Daily Use · Dashboard · Troubleshooting · Config · For AI Agents
Gossipcat is an MCP server that orchestrates multiple AI agents to review your code in parallel. Agents independently review, then cross-review each other's findings. Agreements are confirmed. Hallucinations are caught and penalized. Over time, each agent builds an accuracy profile, and the system learns who to trust for what.
Most RL pipelines update model weights. Gossipcat doesn't touch weights; it learns by updating the prompt layer.
Every finding an agent produces must cite a real file:line. Peers verify those citations against actual source code. Verified findings (and caught hallucinations) become grounded reward signals: no judge model, no subjective grade, just mechanical checks against ground truth. Those signals update per-agent competency scores, which steer future dispatch. When an agent keeps failing in a category, a targeted skill file is auto-generated from its own failure history and injected into future prompts.
```mermaid
flowchart LR
    A([agent review]) -->|cites file:line| B([peer cross-review])
    B -->|verifies against code| C{verdict}
    C -->|confirmed| D[reward signal]
    C -->|hallucination| E[penalty signal]
    D --> F[competency score]
    E --> F
    F -->|steer dispatch| G([next agent pick])
    E -->|≥3 in category| H[auto-generate skill]
    H -->|inject into prompt| A
    G --> A
    style A fill:#0ea5e9,stroke:#0369a1,color:#fff
    style H fill:#f59e0b,stroke:#b45309,color:#fff
    style D fill:#10b981,stroke:#047857,color:#fff
    style E fill:#ef4444,stroke:#b91c1c,color:#fff
```
The "policy update" is a markdown file under .gossip/agents/<id>/skills/. No fine-tuning, no RLHF infrastructure, no labelling pipeline. The reward signal is grounded in source code rather than a judge model, which is the piece that makes the loop trustworthy enough to automate. When agents disagree, we check the code, not another LLM's opinion.
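To make "mechanical checks against ground truth" concrete, here is a minimal Python sketch of the cheapest possible citation check: does a cited file:line exist inside the repository at all? The function name `check_citation` and its exact rules are assumptions for illustration, not gossipcat's implementation; real peer verification also reads the cited code.

```python
from pathlib import Path


def check_citation(repo_root: str, citation: str) -> bool:
    """Return True when `citation` ("path:line") names a real line under repo_root.

    Hypothetical helper: it only proves the location exists, not that the
    finding about that location is correct.
    """
    path_part, sep, line_part = citation.rpartition(":")
    if not sep or not line_part.isdigit():
        return False  # not in file:line form
    line_no = int(line_part)

    root = Path(repo_root).resolve()
    target = (root / path_part).resolve()
    # Reject citations that escape the repository, e.g. "../secrets.txt:1".
    if root != target and root not in target.parents:
        return False
    if not target.is_file():
        return False

    with target.open(encoding="utf-8", errors="replace") as handle:
        line_count = sum(1 for _ in handle)
    return 1 <= line_no <= line_count
```

A citation that fails this check can be rejected without asking any model for an opinion, which is the property the loop relies on.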
The single-reviewer failure mode: a solo AI reviewer ships hallucinated bugs as critical findings 5 to 10% of the time. Gossipcat's cross-review drops that to under 1%. That delta is what the whole system exists to produce.
| Without gossipcat | With gossipcat |
|---|---|
| One AI reviews your code and hallucinates a finding you waste 20 minutes on | Multiple agents cross-check each other, so hallucinations get caught before you see them |
| Every agent gets the same tasks regardless of track record | Dispatch weights route tasks to the agent with the best accuracy in that category |
| An agent keeps making the same class of mistake | Skill files are auto-generated from failure data and injected into future prompts |
| You don't know which agent to trust | Accuracy, uniqueness, and reliability scores are tracked per agent, per category |
- You want multiple AI models catching different classes of bugs
- You don't trust a single agent to catch everything
- You want agents to cross-check each other's findings before you act on them
- You want to know which agents are actually accurate vs. hallucinating
- You want agents that get better over time based on their track record
| 3+ agents review independently, then cross-review each other. Findings tagged as CONFIRMED, DISPUTED, or UNIQUE. | Agent accuracy is tracked per-category. Dispatch weights adjust automatically, so the best agent for the job gets picked. | When an agent keeps failing in a category, targeted skills are generated from failure data and injected into future prompts. Effectiveness is measured with a z-test on post-bind signals: passed, failed, or inconclusive. |
| Mix Anthropic, Google, OpenAI, and OpenClaw agents in one team. Each brings different strengths. Native agents need no API key. 🦞 Lobster friendly. | Real-time view of tasks, consensus reports, agent scores, and activity feed. Terminal Amber theme. WebSocket updates. | Per-agent cognitive memory persists across sessions. Agents remember past findings, patterns, and project context. |
| Works with | Status |
|---|---|
| Claude Code | Supported |
| Cursor | Not yet |
| Windsurf | Not yet |
| VS Code | Not yet |

| Provider gateways |
## How it works

The Mermaid diagram above shows the loop end-to-end. Here's the per-step definition:

Both types participate equally in consensus, cross-review, and skill development. Native subagents get skill files injected into their system prompts and can call

Requirements: Node.js 22+ and Claude Code.

```bash
npm install -g gossipcat && claude mcp add gossipcat -s user -- gossipcat
```

Restart Claude Code. Then in any project, ask:
| What you get | |
|---|---|
| MCP server | Bundled binary at dist-mcp/mcp-server.js, wired as the gossipcat command on PATH |
| Dashboard | Prebuilt static assets in dist-dashboard/, launched automatically on a dynamic port (ask Claude Code "what's my gossipcat dashboard URL?"). Override with GOSSIPCAT_PORT=24420 if you want a stable port. |
| Default skills + rules + archetypes | 16 bundled skill templates, operational rules, and project archetypes copied into the install |
| Postinstall wizard | Writes .mcp.json with correct absolute paths for your machine |
Pin to a specific npm version:

```bash
npm install -g gossipcat@0.4.14
```

Pin to a specific GitHub release tarball (version-locked, bypasses npm registry):

```bash
npm install -g https://github.com/gossipcat-ai/gossipcat-ai/releases/download/v0.4.14/gossipcat-0.4.14.tgz
```

Project-local install (each project gets its own gossipcat):

```bash
cd your-project
npm install --save-dev gossipcat
```

The postinstall writes .mcp.json to your project root. Open Claude Code in that directory and gossipcat connects automatically; no claude mcp add needed.

From source (contributors):

```bash
git clone https://github.com/gossipcat-ai/gossipcat-ai.git
cd gossipcat-ai
npm install
npm run build:mcp
claude mcp add gossipcat -s user -- node "$PWD/dist-mcp/mcp-server.js"
```

Re-run the install; npm will fetch the latest version and replace the installed binary:

```bash
npm install -g gossipcat@latest
```

Or in-session, ask Claude Code: "Check for gossipcat updates". The gossip_update tool fetches the latest release notes and applies the upgrade with your confirmation.
Add env vars for the providers you want to use. Pass them with -e when registering, or set them in your shell environment.
| Provider | Env var | Notes |
|---|---|---|
| Native (Claude Code) | none | Dispatches through your active Claude Code subscription. No key needed. |
| Anthropic API | ANTHROPIC_API_KEY | Direct API access if you don't want to go through Claude Code. |
| Google Gemini | GOOGLE_API_KEY | Gemini Pro / Flash relay agents. |
| OpenAI | OPENAI_API_KEY (+ optional OPENAI_BASE_URL) | GPT-4 / GPT-4o relay agents. OPENAI_BASE_URL lets you point at OpenAI-compatible gateways (Azure, Together, Groq, etc.). |
| OpenClaw | none (local gateway) | OpenAI-compatible, defaults to http://127.0.0.1:18789/v1. No API key; auth is handled by your local OpenClaw daemon. |
| Ollama (local) | none | Runs locally via http://localhost:11434. No key. Pull your model first with ollama pull llama3.1:8b. |
Native only (zero API keys; everything runs through Claude Code):

```bash
claude mcp add gossipcat -s user -- gossipcat
```

Then in session ask for a team built from sonnet-reviewer / haiku-researcher / opus-implementer. Native agents dispatch through Agent() and relay back. Good zero-config starting point.

Anthropic API (direct, bypasses Claude Code):

```bash
claude mcp add gossipcat -s user \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  -- gossipcat
```

Use this if you want relay agents running Claude models without going through the Claude Code subscription path, e.g. for parallelism beyond Claude Code's concurrency cap, or for running long background reviews while you keep working.

Google Gemini:

```bash
claude mcp add gossipcat -s user \
  -e GOOGLE_API_KEY=AIza... \
  -- gossipcat
```

Enables gemini-reviewer, gemini-tester, gemini-implementer on the relay. Watch the quota: gossipcat has a built-in 429 watcher that falls back to native agents when Gemini is cooling down.

OpenAI (and OpenAI-compatible gateways):

```bash
claude mcp add gossipcat -s user \
  -e OPENAI_API_KEY=sk-... \
  -- gossipcat
```

For Azure / Together / Groq / OpenRouter, add OPENAI_BASE_URL:

```bash
claude mcp add gossipcat -s user \
  -e OPENAI_API_KEY=your-key \
  -e OPENAI_BASE_URL=https://api.groq.com/openai/v1 \
  -- gossipcat
```

OpenClaw (local gateway):

```bash
# Start the OpenClaw daemon first (see openclaw docs), default port 18789
claude mcp add gossipcat -s user -- gossipcat
```

No env vars. Configure an agent with provider: "openclaw" in .gossip/config.json and gossipcat talks to the local gateway automatically. Override the port with base_url in the agent config if your daemon runs elsewhere.

Ollama (fully local, no API):

```bash
# Pull a model once
ollama pull llama3.1:8b
# Then register gossipcat
claude mcp add gossipcat -s user -- gossipcat
```

Configure the agent with provider: "local" and model: "llama3.1:8b" in .gossip/config.json. Good for airgapped dev, offline work, and burning-down-test-debt sessions where you don't want to spend API credits.

Mixed setup (common production shape: Gemini cheap reviewers + Anthropic heavy implementers):

```bash
claude mcp add gossipcat -s user \
  -e GOOGLE_API_KEY=AIza... \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  -- gossipcat
```

Then set up a team with gemini-reviewer + haiku-researcher (native) + opus-implementer (native) + sonnet-reviewer (native). Gossipcat dispatches by category strength from the signal pipeline.
Keys are stored persistently and cross-platform:
- macOS: OS Keychain
- Linux: Secret Service (secret-tool)
- Windows / other: AES-256-GCM encrypted file
Start a Claude Code session in any project and ask Claude to set up your team:
"Set up a gossipcat team with a Gemini reviewer and a Sonnet implementer"
Claude Code calls gossip_setup() to create your .gossip/config.json and agent definitions. You choose the providers, models, and roles; gossipcat adapts to your setup.
Available presets: reviewer, implementer, tester, researcher, debugger, architect, security, designer, planner, devops, documenter
The fastest path from "just installed" to "first useful review". If you skip this section you'll probably get stuck on the same things everyone else gets stuck on.
```bash
cd ~/your-project
claude
```

Gossipcat is registered globally now, so it boots automatically. You'll see it in the MCP server list.
In Claude Code, just type:
Run gossip_status
This loads gossipcat's operating rules into the current session, creates .gossip/ in your project on first run, and prints the dashboard URL + auth key. Copy the key; you'll paste it into the dashboard once.
You'll see something like:
```
Status:
  Host: claude-code (native agents supported)
  Relay: running :49664
  Workers: 0
  Dashboard: http://localhost:49664/dashboard (key: c3208820f8f70605fd45fa90004a2a4b)
  Quota: google - OK
```
Open the dashboard URL in your browser, paste the key. You're now connected.
Tell Claude what you're building:
"Set up a gossipcat team for this project. It's a TypeScript Next.js app with a Postgres backend and Stripe payments."
Claude calls gossip_setup() and proposes a team. Typical proposal:
Proposed team:
- sonnet-reviewer (anthropic/claude-sonnet-4-6, native) reviewer + security
- gemini-reviewer (google/gemini-2.5-pro, relay) reviewer + types
- haiku-researcher (anthropic/claude-haiku-4-5, native) researcher
- opus-implementer (anthropic/claude-opus-4-6, native) implementer
Approve? (y/n)
Native agents (native: true) run through your existing Claude Code subscription, so no API key is needed. Relay agents need a key for their provider. If you don't have a Google API key, drop gemini-reviewer from the team for now and add it later.
Once you approve, gossipcat writes .gossip/config.json and the agents are live.
In a project where you've made some changes:
"Do a consensus review of my recent changes"
What happens (typical timing):
| Phase | Time | What you see |
|---|---|---|
| 1. Decompose | 1s | Claude picks agents and dispatches them in parallel |
| 2. Independent review | 30s-2min | Each agent reads your diff and reports findings |
| 3. Cross-review | 30s-1min | Each agent reviews the others' findings |
| 4. Consensus report | <1s | Findings tagged CONFIRMED / DISPUTED / UNVERIFIED / UNIQUE |
| 5. Verification | varies | Claude reads UNVERIFIED findings against the code, decides if they're real |
| 6. Signal recording | <1s | Accuracy signals saved per agent |
You get a report like:
```
Consensus round b81956b2-e0fa4ea4 - 3 agents
CONFIRMED (2):
  [critical] Race condition in tasks Map at server.ts:47 - sonnet + gemini
  [high] Missing auth on WebSocket upgrade at server.ts:112 - sonnet + gemini
UNIQUE (1):
  [medium] String concat in SQL query at queries.ts:88 - only sonnet caught this
DISPUTED (1):
  [low] "Memory leak in timer" - haiku says yes, sonnet/gemini say no
    → verified, sonnet was right (not a leak; cleanup is in finally)
Final: 3 real bugs to fix, 1 false alarm caught by cross-review.
```
You only act on CONFIRMED + verified UNIQUE findings. The cross-review is the whole point: single-agent reviews ship hallucinated bugs as critical findings 5 to 10% of the time. Cross-review with verification drops that to under 1%.
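One way to picture how peer verdicts turn into the four tags is the small Python sketch below. The verdict strings ("agree", "disagree", "unverified") and the precedence order are assumptions made for illustration; this README does not spell out gossipcat's actual tagging rules.

```python
from typing import Iterable


def tag_finding(peer_verdicts: Iterable[str]) -> str:
    """Map the peer verdicts on one finding to a consensus tag.

    Assumed rules (hypothetical, not taken from gossipcat's source):
      - any "disagree"                  -> DISPUTED
      - otherwise any "agree"           -> CONFIRMED
      - peers looked but none decided   -> UNVERIFIED
      - no peer verdicts at all         -> UNIQUE
    """
    verdicts = list(peer_verdicts)
    if "disagree" in verdicts:
        return "DISPUTED"
    if "agree" in verdicts:
        return "CONFIRMED"
    if verdicts:
        return "UNVERIFIED"
    return "UNIQUE"


if __name__ == "__main__":
    print(tag_finding(["agree", "agree"]))     # CONFIRMED
    print(tag_finding(["agree", "disagree"]))  # DISPUTED
```

Under these assumptions a single dissenting peer is enough to send a finding to verification, which matches the spirit of checking the code whenever agents disagree.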
The dashboard shows everything live: agents, scores, active tasks, consensus reports, signals. You can leave it open in a tab while you work; every gossipcat tool call pushes an update via WebSocket.
That's the basic loop. The rest of this README covers advanced workflows, troubleshooting, and how to interpret what you're seeing.
Concrete recipes for the most common workflows. Each one shows what to type, what you'll get back, and what to do with it.
Type:
"Review my staged changes"
What you'll get: A consensus report (1 to 3 minutes) with findings tagged CONFIRMED / UNIQUE / DISPUTED. Claude verifies UNVERIFIED findings against the code and tells you which are real.
What to do with it: Fix the CONFIRMED + verified-real findings. Ignore disputed-but-falsified findings. If a finding looks important but you disagree, ask Claude "verify finding f3 against the code yourself". It'll re-check and either back you up or push back.
When NOT to use it: Tiny diffs (under 20 lines), where overhead exceeds value. Just eyeball them.
Type:
"Security audit the payment handler at lib/stripe/webhook.ts"
What you'll get: Each security-skilled agent reviews from a different angle (OWASP, input validation, auth, secrets). Findings get cross-validated. Real vulns surface; theoretical ones get caught and dropped.
What to do with it: Fix critical/high findings before merge. Bookmark medium/low findings for the next pass.
Tip: Be specific about the file or module. "Security audit the codebase" is too broad and produces noisy results. "Security audit lib/stripe/webhook.ts" produces actionable findings.
Type:
"Research how the WebSocket connection lifecycle works in this project before I touch it"
What you'll get: A research agent (haiku-researcher by default, fast and cheap) reads the code, traces call paths, and writes a summary. The summary is saved to that agent's cognitive memory so the next time you ask about the same area it remembers.
What to do with it: Use the summary to plan your change. The agent will reference it next time you ask anything related, with no re-discovery cost.
Type:
"I think there's a race condition in the tasks Map at server.ts:47. Check if I'm right"
What you'll get: Two agents independently check the specific claim and either confirm or push back. Author self-review is optimistic; this isn't.
What to do with it: If both agree with you, fix it. If they push back, read their reasoning before defending your hypothesis. They might be right.
Type:
"Show me agent scores"
What you'll get: A table of agents sorted by reliability with per-category accuracy and dispatch weights. Categories include trust_boundaries, injection_vectors, concurrency, error_handling, data_integrity, type_safety, etc.
What to do with it: If gemini-reviewer is sitting at 30% accuracy on concurrency, you know not to trust its concurrency findings without cross-review. If sonnet-reviewer is at 90% on trust_boundaries, you can ship its findings on auth/session bugs with high confidence.
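The routing idea behind those per-category numbers can be sketched in a few lines of Python. This is a greedy simplification under stated assumptions: the `pick_agent` name, the score layout, and the neutral 0.5 prior for agents with no history are all hypothetical, and the real dispatcher works with weights rather than a plain maximum.

```python
def pick_agent(accuracy: dict[str, dict[str, float]], category: str,
               default: float = 0.5) -> str:
    """Return the agent with the highest recorded accuracy in `category`.

    `accuracy` maps agent id -> {category: accuracy in [0, 1]}.
    Agents with no history in the category get `default` (an assumption).
    Ties go to the alphabetically first agent id.
    """
    if not accuracy:
        raise ValueError("no agents to choose from")
    return max(sorted(accuracy),
               key=lambda agent: accuracy[agent].get(category, default))


if __name__ == "__main__":
    scores = {
        "gemini-reviewer": {"concurrency": 0.30},
        "sonnet-reviewer": {"concurrency": 0.80, "trust_boundaries": 0.90},
    }
    print(pick_agent(scores, "concurrency"))  # sonnet-reviewer
```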
Type:
"gemini-reviewer keeps hallucinating about concurrency. Develop a skill for it"
What you'll get: Gossipcat reads gemini-reviewer's failure data, generates a targeted skill file with concrete anti-patterns, and injects it into the agent's prompt for all future concurrency-related reviews. Effectiveness is measured statistically (z-test on post-bind signals), and it'll tell you if the skill is actually working after ~30 dispatches.
What to do with it: Nothing; it's automatic. Just keep using the agent. Over time, the failure rate drops.
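For intuition about what a z-test on post-bind signals could look like, here is a minimal Python sketch using a pooled two-proportion z-test on failure rates before and after the skill was bound. The choice of test variant, the 1.96 critical value, and the function name are assumptions; only the three verdicts and the rough 30-dispatch threshold come from this README.

```python
import math


def skill_verdict(pre_fail: int, pre_total: int,
                  post_fail: int, post_total: int,
                  z_crit: float = 1.96, min_post: int = 30) -> str:
    """Compare failure rates before and after a skill bind.

    Returns "passed" when the post-bind failure rate is significantly lower,
    "failed" when it is significantly higher, otherwise "inconclusive".
    """
    if pre_total == 0 or post_total < min_post:
        return "inconclusive"  # not enough post-bind data yet
    p_pre = pre_fail / pre_total
    p_post = post_fail / post_total
    pooled = (pre_fail + post_fail) / (pre_total + post_total)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / pre_total + 1 / post_total))
    if std_err == 0:
        return "inconclusive"  # no variation at all (all failures or none)
    z = (p_pre - p_post) / std_err
    if z >= z_crit:
        return "passed"
    if z <= -z_crit:
        return "failed"
    return "inconclusive"
```

With 20 failures in 40 reviews before the skill and 5 in 40 after, this sketch returns "passed"; with 18 in 40 after, the drop is too small to call and it returns "inconclusive".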
Type:
"Set up a gossipcat team for a TypeScript Cloudflare Workers project with Drizzle ORM and KV storage"
What you'll get: A proposed team with archetypes matched to your stack. Worker projects need different reviewers than long-running Node services, and gossipcat picks accordingly.
What to do with it: Review the proposal, drop agents you can't run (missing API keys), approve.
- Don't ask for "review the whole codebase": it's too broad, and agents will pick whatever they find first. Scope to a file, module, or diff.
- Don't approve findings without reading them. Even after cross-review, ~5% of findings are genuinely wrong. The reasoning matters more than the verdict.
- Don't ignore the dashboard. When something feels weird (slow dispatch, repeated failures, suspicious findings), the dashboard usually shows you why before you have to ask.
- Don't run consensus mode for trivial questions. gossip_run with one agent is fine for "what does this function do?"-tier queries. Save consensus for changes that touch shared state, auth, persistence, or the dispatch pipeline itself.
The dashboard at http://localhost:<port>/dashboard is the visual layer over everything gossipcat knows. Open it once with the auth key from gossip_status, leave the tab open while you work. Updates push live via WebSocket.
| Panel | What it shows | When to look at it |
|---|---|---|
| Overview | Active agents, dispatch weights, recent finding counts | First thing in the morning, as a quick sanity check |
| Team | All agents sorted by reliability score, with category breakdowns | Picking which agent to trust for a tricky finding |
| Tasks | Live + historical task list with agent, duration, status | When something feels stuck, find it here first |
| Findings | Consensus reports paginated by round, with CONFIRMED/DISPUTED/UNVERIFIED breakdowns | Reviewing what got caught in a recent review |
| Agent detail | Per-agent memory entries, skills, score history, task history | Diagnosing why a specific agent keeps failing in a category |
| Signals | Raw signal feed (agreement / hallucination / unique_confirmed) | Auditing the scoring pipeline if scores look wrong |
| Logs | mcp.log content (boot, errors, warnings) | When the MCP server is misbehaving and you need raw evidence |
Auth keys rotate every session. A fresh key is generated each time gossipcat boots. If the dashboard says "unauthorized", run gossip_status again to get the new key.
The auth key rotates every boot. Run gossip_status in Claude Code to get the current key, paste it into the dashboard login.
Check ~/.gossip/mcp.log (or <your-project>/.gossip/mcp.log) for the boot log. Look for the [gossipcat] line that contains "Dashboard:"; that's the actual port. If it's missing, the relay didn't start. Common causes:
- Conflicting .gossip/relay.pid from a crashed previous boot: delete it and restart Claude Code
- GOSSIPCAT_PORT set to a port already in use: unset the env var or pick a free port
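If you would rather script that log check, the sketch below scans log text for the Dashboard line and pulls out the port. The log line shape is an assumption: it presumes the line carries a URL like the one gossip_status prints (http://localhost:49664/dashboard), and the helper name is hypothetical.

```python
import re
from typing import Optional

# Assumed shape: "[gossipcat] ... Dashboard: http://localhost:<port>/dashboard"
DASHBOARD_LINE = re.compile(r"\[gossipcat\].*Dashboard:\s*(\S+)")
PORT_IN_URL = re.compile(r":(\d{2,5})(?:/|$)")


def find_dashboard_port(log_text: str) -> Optional[int]:
    """Return the port from the last Dashboard line in the log, or None."""
    port = None
    for line in log_text.splitlines():
        match = DASHBOARD_LINE.search(line)
        if not match:
            continue
        port_match = PORT_IN_URL.search(match.group(1))
        if port_match:
            port = int(port_match.group(1))  # later boots override earlier ones
    return port
```

A result of None corresponds to the "relay didn't start" case above. To run it against a real log, pass in the file contents, for example `find_dashboard_port(open(path, encoding="utf-8").read())`.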
This was a critical bug in v0.1.0, fixed in v0.1.1. Upgrade with the install one-liner above. v0.1.1+ boots in degraded mode (dashboard + relay only) so you can run gossip_setup from inside Claude Code.
Usually a model or quota problem. Check gossip_status: it shows Quota: google - OK (or cooling down) per provider. If you're rate-limited, gossipcat will fall back to native agents automatically, but fallback agents may not be in your team. Either wait for the cooldown or add native agents to your team.
Record a hallucination_caught signal: ask Claude "record a hallucination_caught signal for finding f3 in the last consensus round: it claimed X but the code shows Y". After 3 such signals, the offending agent's score drops in that category and the orchestrator stops asking it questions in that area.
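The "after 3 such signals" rule is easy to picture as a counter per agent and category. The Python sketch below is illustrative only: the record fields (`agent`, `category`, `type`) and the function name are hypothetical, and only the `hallucination_caught` signal name and the threshold of 3 come from this README.

```python
from collections import Counter
from typing import Iterable


def flagged_pairs(signals: Iterable[dict], threshold: int = 3) -> list[tuple[str, str]]:
    """Return (agent, category) pairs with at least `threshold` caught hallucinations."""
    counts = Counter(
        (signal["agent"], signal["category"])
        for signal in signals
        if signal["type"] == "hallucination_caught"
    )
    return sorted(pair for pair, seen in counts.items() if seen >= threshold)


if __name__ == "__main__":
    history = [
        {"agent": "gemini-reviewer", "category": "concurrency", "type": "hallucination_caught"},
        {"agent": "gemini-reviewer", "category": "concurrency", "type": "hallucination_caught"},
        {"agent": "gemini-reviewer", "category": "concurrency", "type": "agreement"},
        {"agent": "gemini-reviewer", "category": "concurrency", "type": "hallucination_caught"},
    ]
    print(flagged_pairs(history))  # [('gemini-reviewer', 'concurrency')]
```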
Edit .gossip/config.json directly. Any OpenAI-compatible endpoint works via provider: "openai" + base_url. Local models work via Ollama (provider: "local"). See the Configuration section.
The strict <agent_finding> parser drops tags whose type isn't one of finding | suggestion | insight (see invariant #8 in docs/HANDBOOK.md). When that happens, the gossip_signals receipt surfaces the drop count and a finding_dropped_format pipeline signal is emitted. Check the consensus round's droppedFindingsByType field on the dashboard; it names the offending type. If you see `&lt;agent_finding&gt;` instead of raw `<agent_finding>`, a transport layer is entity-encoding the output; pass agent output verbatim to gossip_relay.
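A rough Python sketch of that behaviour follows, to show why an unexpected type or an entity-encoded tag makes findings vanish. The tag syntax (`type="..."` as the first attribute) and both function names are assumptions; the allowed types and the idea of a per-type drop count come from the paragraph above.

```python
import re
from collections import Counter

ALLOWED_TYPES = {"finding", "suggestion", "insight"}
# Assumed tag shape: <agent_finding type="...">body</agent_finding>
TAG = re.compile(r'<agent_finding\s+type="([^"]*)"[^>]*>(.*?)</agent_finding>', re.DOTALL)


def parse_findings(output: str):
    """Return (kept, dropped_by_type) for one agent's raw output."""
    kept = []
    dropped_by_type = Counter()
    for kind, body in TAG.findall(output):
        if kind in ALLOWED_TYPES:
            kept.append((kind, body.strip()))
        else:
            dropped_by_type[kind] += 1  # analogous to droppedFindingsByType
    return kept, dict(dropped_by_type)


def looks_entity_encoded(output: str) -> bool:
    """True when tags only appear in their HTML-escaped form."""
    return "&lt;agent_finding" in output and "<agent_finding" not in output
```

An entity-encoded payload yields no kept findings and no drops at all, because the pattern never matches; that is why the verbatim pass-through matters.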
Already supported as of v0.1.1: each instance gets its own dynamic port. If you want a stable port for one specific instance (e.g. for browser bookmarks), set GOSSIPCAT_PORT=24420 for that one project's environment.
```bash
npm uninstall -g gossipcat
claude mcp remove gossipcat -s user
rm -rf ~/.gossip            # if you want to wipe global memory + signals
rm -rf <project>/.gossip    # if you want to wipe per-project state
```

File an issue at https://github.com/gossipcat-ai/gossipcat-ai/issues. Include the contents of .gossip/mcp.log (last 100 lines) and the output of gossip_status. Or ask Claude in-session: "file a gossipcat bug report about <...>". The gossip_bug_feedback tool packages it up automatically.
Claude Code translates your natural-language requests into gossipcat MCP tool calls automatically, so you don't need to type these, but if you want fine-grained control they're documented here:

```
gossip_run(agent_id: "auto", task: "...")         → single-agent task
gossip_dispatch(mode: "consensus", tasks: [...])  → multi-agent review with cross-review
gossip_collect(consensus: true)                   → wait for results, run consensus
gossip_signals(action: "record", signals: [...])  → record accuracy after verification
gossip_scores()                                   → view agent performance
gossip_skills(action: "develop", ...)             → improve a struggling agent
gossip_status()                                   → system status + dashboard URL
gossip_setup(...)                                 → create or update your team
```
The dispatch rules at .claude/rules/gossipcat.md (auto-generated on first boot) teach Claude Code when to pick which mode based on what your change touches. You can edit these rules to bias the dispatch.
Currently supported orchestrator: Claude Code only. Cursor, Windsurf, and VS Code agents do not yet have native agent support and cannot run consensus cross-review. If you are reading this as a Claude Code instance with the gossipcat MCP server connected, this guide applies to you.
If you are a Claude Code instance reading this to understand how to use gossipcat, here is the complete operational guide.
You are the orchestrator. Gossipcat gives you tools to dispatch work to a team of specialized agents, collect their results,
Release History
| Version | Changes | Urgency | Date |
|---|---|---|---|
| v0.4.15 | ## What's Changed * test(install): regression guards for tarball integrity + rules drift by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/186 * docs(claude-md): add '## Your Role' heading + restore drift anchor by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/187 * feat(orchestrator,cli): L2 - signal-writer visibility (Symbol accessor + typed helpers + wiring) by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/190 * test(signal-pipeline): | High | 4/20/2026 |
| v0.4.14 | ## What's Changed * feat(orchestrator): Layer 1 signal-pipeline parity test by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/173 * feat(orchestrator): Layer 3 signal-pipeline drift detector by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/174 * feat(security): gossip_remember + memory_query hardening by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/175 * fix(completion-signals): _system guard + parity length assert + redundant nullish b | High | 4/19/2026 |
| v0.4.13 | ## What's Changed * fix(signal-pipeline): shared emitCompletionSignals helper + error-path + memoryQueryCalled threading by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/169 * chore(release): v0.4.13 by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/170 **Full Changelog**: https://github.com/gossipcat-ai/gossipcat-ai/compare/v0.4.12...v0.4.13 | High | 4/18/2026 |
| v0.4.12 | ## What's Changed * fix(sandbox): self-documenting deny + gossip_setup env-var tip by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/167 * chore(release): v0.4.12 by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/168 **Full Changelog**: https://github.com/gossipcat-ai/gossipcat-ai/compare/v0.4.11...v0.4.12 | High | 4/18/2026 |
| v0.4.11 | ## What's Changed * feat(observability): classifier expansion + gossip_watch + pipeline signals + rotation by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/163 * fix(sandbox): orchestrator env exemption + quoted-arg extraction (#162) by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/164 * docs(gossip_watch): surface in CLAUDE.md + gossip_status banner by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/165 * chore(release): v0.4.11 by @atab | High | 4/18/2026 |
| v0.4.10 | ## What's Changed * feat(dashboard): auto-benching v2 badges by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/127 * fix(consensus): realpath-based containment (pre-#126 hardening) by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/128 * feat(consensus): user-worktree citation resolution (#126) by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/129 * feat(consensus): round-level retraction via gossip_signals consensus_id by @ataberk-xyz in h | High | 4/17/2026 |
| v0.4.9 | ## Highlights **Load-bearing fix for all-native consensus teams** (#121 / PR #123). Before this release, when `gossip_dispatch(mode: "consensus")` ran with all native agents and 0 relay workers, Phase 2 auto-verifier returned empty text in ~0ms, synthesis proceeded with zero peer input, and every finding was tagged UNIQUE regardless of real overlap. No error surfaced. If your team is all native (Claude Code subagents with no Gemini/OpenAI workers), you should upgrade. ## Shipped in 0.4.9 ### | High | 4/17/2026 |
| v0.4.8 | ## What's Changed * fix(sandbox+tools): L3 scope exclusion + git execFile ENOENT retry by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/109 * fix(tools): Tool Server cwd divergence - worktree relative paths by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/110 * chore(release): v0.4.8 by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/111 **Full Changelog**: https://github.com/gossipcat-ai/gossipcat-ai/compare/v0.4.7...v0.4.8 | High | 4/16/2026 |
| v0.4.7 | ## [0.4.7] - 2026-04-16 Layer 3 audit overhaul: scoped-mode noise drops 99.9%, worktree-mode 82%. Shipped via #107 which stacks two commits: ### Fixed - **`find -prune` instead of `-not -path`** for L3 exclusions. `-not -path` only filters `find`'s output; the scan still descends into every excluded directory. On macOS that means `find` enters `~/Library/Application Support/{Safari,Photos,Group Containers}/` and hits TCC "Operation not permitted" every dispatch, producing a noisy `find parti | High | 4/16/2026 |
| v0.4.6 | ## [0.4.6] - 2026-04-16 ### Fixed - **Layer 3 audit now excludes user-level app directories** (#105). Live-fire verification against v0.4.5 produced 44 "boundary escape" violations that were 100% OS-level churn (Chrome cookies, Spotify cache, NordVPN data, Claude Code's own `~/.claude/projects/*.jsonl` session logs), not a single agent action. These app directories are unreachable through the Tool Server sandbox or the Layer 2 PreToolUse hook, so false positives from them were pure noise drow | High | 4/16/2026 |
| v0.4.5 | ## [0.4.5] - 2026-04-16 Tool Server union-of-roots: worktree-mode relay agents can now actually write inside their own worktrees. ### Fixed - **Tool Server was worktree-blind on 6 file_* tools** (#103). `ToolServer.enforceWriteScope` correctly gated against the agent's assigned worktree root, but then `FileTools.fileWrite` called `Sandbox.validatePath` which re-resolved against `projectRoot` only. Worktrees live under `os.tmpdir()/gossip-wt-*` (always outside `projectRoot`), so every absolu | High | 4/16/2026 |
| v0.4.4 | ## [0.4.4] - 2026-04-16 Point release closing two Layer 3 sandbox bugs surfaced by live-fire verification against the freshly-released v0.4.3 (consensus task `56641e6e`, 2026-04-16). ### Fixed - **Layer 3 `worktreePath` patch was a no-op** (#101). The F2 fix in #99 called `ctx.mainAgent.getTask(taskId)` after `collect()`, but `DispatchPipeline.collect()` deletes the task entry from its `this.tasks` Map before returning (default `consume: true`). Every relay worktree dispatch since #99 recorde | High | 4/16/2026 |
| v0.4.3 | ## [0.4.3] - 2026-04-16 Headline: **worktree filesystem sandbox**: the multi-layer defense for issue #90 ships end-to-end. Agents dispatched with `write_mode: "worktree"` are now soft-blocked from writing outside their isolated worktree by a PreToolUse hook (Layer 2), with a post-dispatch `find -newer` audit as a backstop (Layer 3) for escape paths the hook can't see. Plus a round of scoring corrections, dashboard polish, and relay resilience fixes driven by recent consensus rounds. ### Workt | High | 4/16/2026 |
| v0.4.2 | **README-only republish.** v0.4.1 shipped to npm before the README normalization PRs (#67/#68) merged, so the [npm package page](https://www.npmjs.com/package/gossipcat) displayed the old install one-liner while the GitHub README showed the shorter `npm install -g gossipcat` form. This release syncs the two. ## No code changes Bundle is byte-identical to 0.4.1 except for `README.md` and the `package.json` version bump. Upgrading from 0.4.1 is a no-op for behavior. ## Install ```bash npm instal | Medium | 4/14/2026 |
| v0.4.1 | A round of hardening driven by two internal consensus rounds that reviewed 0.4.0's own PRs and caught real security + correctness regressions the original review missed. Three stacked PRs (#63, #64, #65) close every HIGH/MEDIUM/LOW finding plus two silent-failure modes in already-merged 0.4.0 code. ## Security (merge-blockers caught in cross-review) - `gossip_plan` native utility now issues a `relay_token` (prevents fabricated-decomposition injection via `gossip_relay`). - Re-entry path validat | Medium | 4/14/2026 |
| v0.4.0 | ## [0.4.0] - 2026-04-14 Combines the unreleased 0.3.0 work (server-side cross-review, memory pre-fetch, scoring, dashboard polish) with three new streams: HTTP file bridge infrastructure, the consensus type-contract fix, and a long-standing test bug cleanup. > **Note:** v0.3.0 was published to npm on 2026-04-13 but never cut a matching GitHub release. Its changes are included here under 0.4.0 rather than retroactively tagged; the CHANGELOG entries below merge both cycles for a single coherent | Medium | 4/14/2026 |
| v0.3.0 | v0.3.0 was published to npm on 2026-04-13 but never got a matching GitHub release at the time. Tagging retroactively at commit 46d4615 to close the audit trail. The change log for the 0.3.0 work (server-side cross-review, memory pre-fetch, scoring, dashboard polish) is folded into the [v0.4.0 release notes](https://github.com/gossipcat-ai/gossipcat-ai/releases/tag/v0.4.0); see that entry for the full changelog. Users who installed gossipcat@0.3.0 from npm got these features; the 0.4.0 release | Medium | 4/14/2026 |
| v0.2.0 | ## What's Changed * fix(ports): sticky per-project port files for relay + HTTP MCP by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/16 * spec(skills): emergent skill engine by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/17 * dashboard(team-hero): 2x2 grid sorted by last dispatch by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/18 * fix(dashboard): render categoryAccuracy, not unbounded categoryStrengths by @ataberk-xyz in https://gith | Medium | 4/9/2026 |
| v0.1.2 | ## What's Changed * docs(readme): rewrite usage sections for clarity (First Run + Daily Use + Troubleshooting) by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/9 * fix(release): two-stage flow that respects branch protection by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/10 * chore(gitignore): exclude .claude/agents/ and .claude/settings.local.json by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/11 * fix(native): make Claude Code a f | Medium | 4/9/2026 |
| v0.1.1 | ## What's Changed * feat(dashboard): phase 1 - 6-tab SPA with auth, real-time WebSocket, consensus history by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/1 * feat: gossip_verify_memory - on-demand staleness check for memory files by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/2 * fix(tests): clear 4 pre-existing test-debt suites (PR 1/3) by @ataberk-xyz in https://github.com/gossipcat-ai/gossipcat-ai/pull/4 * fix(signals): record per-signal timestamps | Medium | 4/8/2026 |

