
agentops

The operational layer for coding agents. Memory, validation, and feedback loops that compound between sessions.


Install

# Claude Code (recommended): marketplace + plugin install
claude plugin marketplace add boshu2/agentops
claude plugin install agentops@agentops-marketplace

# Codex CLI (v0.115.0+ native hooks by default)
curl -fsSL https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install-codex.sh | bash

# OpenCode
curl -fsSL https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install-opencode.sh | bash

# Other Skills-compatible agents (example: Cursor)
npx skills@latest add boshu2/agentops --cursor -g

Then type /quickstart in your agent chat.

For Codex, that installer stages the native plugin, installs ~/.codex/hooks.json, archives stale raw mirrors when found, and makes native hooks the default path. Restart Codex after install.

| Concern | Answer |
| --- | --- |
| What it touches | Installs skills globally, writes knowledge artifacts to .agents/, registers Claude hooks in .claude/settings.json when requested, and for Codex writes the native plugin cache plus ~/.codex/hooks.json |
| Source code changes | None. AgentOps does not modify your source code during install |
| Network behavior | Install and update paths fetch from GitHub. Repo artifacts stay local unless you choose external tools, browsing, or remote model runtimes |
| Permission surface | Skills may run shell commands and read or write repo files as part of agent work, so install it where you want an agent to operate |
| Reversible | Remove the installed skill directories, delete .agents/, and remove hook entries from .claude/settings.json |

Nothing modifies your source code.
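
To see what an install left behind, you can check the paths named in the table above. This is a minimal sketch, not part of AgentOps: `agentops_footprint` is a hypothetical helper name, and it only tests the three paths the table mentions (the installed skill directories are left out because their location is not stated here).

```shell
# Report which AgentOps-related paths from the table above exist.
# Usage: agentops_footprint REPO_DIR [HOME_DIR]
# (hypothetical helper; HOME_DIR defaults to $HOME)
agentops_footprint() {
  repo_dir=$1
  home_dir=${2:-$HOME}
  for path in \
    "$repo_dir/.agents" \
    "$repo_dir/.claude/settings.json" \
    "$home_dir/.codex/hooks.json"
  do
    if [ -e "$path" ]; then
      echo "present: $path"
    else
      echo "absent:  $path"
    fi
  done
}

# Check the current repository.
agentops_footprint .
```

Running it before and after an uninstall is a quick way to confirm the reversal steps took effect.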

Install `ao` CLI - optional, unlocks the full repo-native layer

Skills work standalone. The ao CLI adds bookkeeping automation, retrieval and injection, maturity scoring, goals, and terminal-native flows.

brew tap boshu2/agentops https://github.com/boshu2/homebrew-agentops
brew install agentops
which ao
ao version

Or install via release binaries or build from source.

Other install notes - Linux, OpenCode, configuration

On Linux, install system bubblewrap so Codex uses it directly:

sudo apt-get install -y bubblewrap

OpenCode details: .opencode/INSTALL.md

All configuration is optional. Full reference: docs/ENV-VARS.md

Troubleshooting: docs/troubleshooting.md


See It Work

One command - validate a PR:

> /council validate this PR

[council] 3 judges spawned (independent, no anchoring)
[judge-1] PASS - token bucket implementation correct
[judge-2] WARN - rate limiting missing on /login endpoint
[judge-3] PASS - Redis integration follows middleware pattern
Consensus: WARN - add rate limiting to /login before shipping
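
How the council turns individual verdicts into a consensus is not specified here. One rule that is consistent with the transcript above is "the most severe verdict wins". The sketch below assumes that rule and a PASS < WARN < FAIL ordering; `council_consensus` and the FAIL verdict are illustrative names, not AgentOps APIs.

```shell
# Hypothetical aggregation rule: the most severe verdict wins.
# Severity order is an assumption: PASS < WARN < FAIL.
council_consensus() {
  worst=PASS
  for verdict in "$@"; do
    case $verdict in
      FAIL) worst=FAIL ;;
      WARN) [ "$worst" = FAIL ] || worst=WARN ;;
      PASS) ;;
      *) echo "unknown verdict: $verdict" >&2; return 2 ;;
    esac
  done
  echo "$worst"
}

council_consensus PASS WARN PASS   # prints WARN
```

Under this rule a single dissenting judge is enough to surface a warning, which matches the idea of independent judges with no anchoring.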

Full pipeline - research through post-mortem:

> /rpi "add retry backoff to rate limiter"

[research]    Found 3 prior learnings on rate limiting (injected)
[plan]        2 issues, 1 wave → epic ag-0058
[pre-mortem]  Council validates plan → PASS (knew about Redis choice)
[crank]       Parallel agents: Wave 1 ██ 2/2
[vibe]        Council validates code → PASS
[post-mortem] 2 new learnings → .agents/
[flywheel]    Next: /rpi "add circuit breaker to external API calls"

The endgame - define goals, walk away, come back to a better codebase:

> /evolve

[evolve] GOALS.md: 18 gates loaded, score 77.0% (14/18 passing)

[cycle-1]     Worst: wiring-closure (weight 6) + 3 more
              /rpi "Fix failing goals" → score 93.3% (25/28) ✓

              ── the agent naturally organizes into phases ──

[cycle-2-35]  Coverage blitz: 17 packages from ~85% → ~97% avg
[cycle-38-59] Benchmarks added to all 15 internal packages
[cycle-60-95] Complexity annihilation: zero functions >= 8
[cycle-96-116] Modernization: sentinel errors, exhaustive switches

[teardown]    203 files changed, 20K+ lines, 116 cycles
              All tests pass. Go vet clean. Avg coverage 97%.
              /post-mortem → 33 learnings extracted

That ran overnight on this repo. Regression gates auto-reverted anything that broke a passing goal.
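
The regression gate mentioned above can be pictured as a before/after comparison of goal status. This is a sketch under stated assumptions, not the AgentOps implementation: the `regression_gate` name and the one-line-per-goal file format (`<goal> <pass|fail>`) are hypothetical.

```shell
# Each input file holds one "<goal> <pass|fail>" pair per line (hypothetical format).
# Prints "revert" if any goal that passed before the change now fails, else "keep".
# Usage: regression_gate BEFORE_FILE AFTER_FILE
regression_gate() {
  before=$1
  after=$2
  awk '
    NR == FNR { was[$1] = $2; next }                 # first file: remember prior status
    was[$1] == "pass" && $2 == "fail" { broken = 1 } # second file: detect regressions
    END { print (broken ? "revert" : "keep") }
  ' "$before" "$after"
}
```

A goal that was already failing and still fails does not trigger a revert in this sketch; only a pass-to-fail transition does.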

More examples - swarm, continuity, and intent-based entry points

Parallelize anything with /swarm:

> /swarm "research auth patterns, brainstorm rate limiting improvements"

[swarm] 3 agents spawned - each gets fresh context
[agent-1] /research auth - found JWT + session patterns, 2 prior learnings
[agent-2] /research rate-limiting - found token bucket, middleware pattern
[agent-3] /brainstorm improvements - 4 approaches ranked
[swarm] Complete - artifacts in .agents/

Session continuity across compaction or restart:

> /handoff
[handoff] Saved: 3 open issues, current branch, next action
         Continuation prompt written to .agents/handoffs/

--- next session ---

> /recover
[recover] Found in-progress epic ag-0058 (2/5 issues closed)
          Branch: feature/rate-limiter
          Next: /implement ag-0058.3
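
Because handoffs are plain files under .agents/handoffs/, the idea can be illustrated by hand. This is only a sketch: `write_handoff`, the file name pattern, and the two-field note layout are hypothetical and do not describe what /handoff actually writes.

```shell
# Write a minimal handoff note under <repo>/.agents/handoffs/ and print its path.
# Usage: write_handoff REPO_DIR BRANCH NEXT_ACTION
# (hypothetical helper and file layout)
write_handoff() {
  repo_dir=$1
  branch=$2
  next_action=$3
  handoff_dir="$repo_dir/.agents/handoffs"
  mkdir -p "$handoff_dir"
  note="$handoff_dir/handoff-$(date -u +%Y%m%dT%H%M%SZ).md"   # hypothetical name
  {
    echo "Branch: $branch"
    echo "Next: $next_action"
  } > "$note"
  echo "$note"
}
```

For example, `write_handoff . feature/rate-limiter "/implement ag-0058.3"` would record the same branch and next action shown in the transcript above.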

| Intent | Commands | What happens |
| --- | --- | --- |
| Review before shipping | /council validate this PR | One command, actionable feedback |
| Understand before changing | /research → /plan → /council validate | Surface prior context, scope the work, then validate the approach |
| Ship one change end to end | /rpi "add user auth" | Run discovery through post-mortem in one flow |
| Parallelize or compound improvements | /swarm + /evolve | Fan out work and keep improving the repo over time |

Start Here

A few commands, zero methodology. Pick an entry point and go:

/council validate this PR          # Multi-model code review - immediate value
/research "how does auth work"     # Explore the codebase and surface prior bookkeeping
/pre-mortem "add retry backoff"    # Pressure-test the plan before you build
/implement "fix the login bug"     # Run one scoped task end to end

When you want bigger flows:

/plan → /crank                     # Decompose into issues, then parallel-execute
/validation                        # Review finished work and extract learnings
/rpi "add retry backoff"           # Full pipeline: discovery → build → validation → bookkeeping
/evolve                            # Fitness-scored improvement loop

If you want the explicit operator surface instead of individual primitives:

ao factory start --goal "fix auth startup"
/rpi "fix auth startup"           # or: ao rpi phased "fix auth startup"
ao codex stop

That path keeps briefing, runtime startup, delivery, and session closeout on one surface.

Full catalog: docs/SKILLS.md · Unsure which skill to run? Skill Router


What AgentOps Gives You

AgentOps gives your coding agent four things it does not have by default:

  1. Bookkeeping: sessions do not just leave behind chat history; AgentOps captures learnings, findings, and reusable context, then resurfaces them through .agents/, retrieval, and the flywheel.
  2. Validation: /pre-mortem, /vibe, and /council validate plans and code before they ship, and record what worked, what failed, and why.
  3. Primitives: individually invocable skills, hooks, and CLI surfaces you can pull from for almost any interaction.
  4. Flows: named compositions of those primitives for discovery, implementation, validation, and knowledge extraction that you can run separately, compose together, or automate end to end.

In session 1, your agent spends 2 hours debugging a timeout bug. In session 15, a new agent finds the answer in 10 seconds because the lesson was captured, validated, and surfaced back into the next cycle.

Primitives compose into flows, flows generate bookkeeping, validation shapes what gets promoted, and together they feed the flywheel so the repo compounds knowledge instead of resetting every session.

Under the hood, AgentOps acts as a context compiler: raw session signal becomes reusable knowledge, compiled prevention, and better next work.

flowchart LR
    P[Primitives<br/>skills, hooks, ao CLI] --> F[Flows<br/>discovery, implementation,<br/>validation, knowledge extraction]
    F --> B[Bookkeeping<br/>learnings, findings,<br/>reusable context]
    F --> V[Validation<br/>what worked,<br/>what failed, and why]
    B --> FW[(Flywheel<br/>capture -> retrieve -> promote)]
    V --> FW
    FW --> N[Next session<br/>better context,<br/>stronger gates, faster work]
    N --> F

Local and auditable: .agents/ is plain text you can grep, diff, review in PRs, and open in Obsidian. Stale insights decay. Useful ones promote.
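
Since .agents/ is plain text, ordinary tools are enough to audit it. The helper below is a small sketch (the `find_learnings` name is made up for illustration) that lists every file under .agents/ mentioning a term; it makes no assumption about how AgentOps lays out files inside that directory.

```shell
# List files under an .agents/ directory that mention a term
# (case-insensitive, term treated as a literal string).
# Usage: find_learnings TERM [AGENTS_DIR]
find_learnings() {
  term=$1
  agents_dir=${2:-.agents}
  grep -rilF -- "$term" "$agents_dir" | sort
}
```

The same directory works with version control: `git diff -- .agents/` shows what a session changed, and `git log --stat -- .agents/` shows how the knowledge store grew over time.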


Skills

Every skill works alone. Primitives are the single skills, hooks, and CLI surfaces. Flows are the named compositions built from them.

| Skill | What it does |
| --- | --- |
| /council | Independent judges debate, surface disagreement, and converge. The core validation primitive |
| /research | Discovery primitive: explores the codebase and produces structured findings with prior bookkeeping surfaced at the right time |
| /implement | Single-task flow: research, plan, build, validate, learn |
| /rpi | Full pipeline flow: discovery → implementation → validation → bookkeeping |
| /vibe | Code quality review: complexity + council + domain checklists |
| /evolve | Measure goals, fix the worst gap, regression-gate everything, repeat overnight |

Full catalog - validation, flows, bookkeeping, and supporting skills

Validation: /council · /vibe · /pre-mortem · /post-mortem

Flows: /research · /plan · /implement · /crank · /swarm · /rpi · /evolve

Bookkeeping: /retro · /forge · /flywheel · /compile

Session: /handoff · /recover · /status · /trace · /provenance

Product: /product · /goals · /release · /readme · /doc

Utility: /brainstorm · /bug-hunt · /complexity · /scaffold · /push

Full reference: docs/SKILLS.md

Cross-runtime orchestration - mix Claude, Codex, and OpenCode

AgentOps orchestrates across runtimes. Claude can lead a team of Codex workers. Codex judges can review Claude's output.

| Backend | How it works | Best for |
| --- | --- | --- |
| Native teams | TeamCreate + SendMessage | Tight coordination, debate |
| Codex sub-agents | /codex-team | Cross-vendor validation |
| Background tasks | Task(run_in_background=true) | Fallback when no team APIs are available |

How It Works - phases, flywheel, and architecture

Phases

| Phase | Primary skills | What you get |
| --- | --- | --- |
| Discovery | /brainstorm → /research → /plan → /pre-mortem | Surfaces prior context, scopes the work, and pressure-tests the plan before build |
| Implementation | /crank → /swarm → /implement | Executes scoped work through composable primitives and wave-based coordination |
| Validation + bookkeeping | /validation → /vibe → /post-mortem → /retro → /forge | Captures what worked, what failed, and what should feed the next cycle |

/rpi orchestrates all three phases. /evolve keeps running /rpi against GOALS.md so the worst fitness gap gets addressed next.
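
The "worst fitness gap" selection can be sketched as a weighted score over gates. Everything here is an assumption made for illustration: the `fitness_report` name, the `<goal> <weight> <pass|fail>` input format, and the scoring formula (passing weight divided by total weight). The real GOALS.md format and scoring may differ.

```shell
# Input file: one "<goal> <weight> <pass|fail>" line per gate (hypothetical format).
# Output: weighted score as a percentage, then the heaviest failing gate.
# Usage: fitness_report GATES_FILE
fitness_report() {
  LC_ALL=C awk '
    { total += $2 }
    $3 == "pass" { passed += $2 }
    $3 == "fail" && $2 > worst_weight { worst_weight = $2; worst = $1 }
    END {
      if (total > 0) printf "score %.1f%%\n", 100 * passed / total
      if (worst != "") printf "worst %s (weight %d)\n", worst, worst_weight
    }
  ' "$1"
}
```

With gates `wiring-closure 6 fail`, `unit-tests 3 pass`, and `lint 1 pass` (the last two names are invented), this prints `score 40.0%` and `worst wiring-closure (weight 6)`, so the heaviest failing gate is the one to address next.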

The explicit operator surface around that pipeline is:

  • ao factory start for briefing-first startup
  • /rpi or ao rpi phased for delivery
  • ao codex stop for explicit session closeout

How bookkeeping compounds

.agents/ is the repo-native bookkeeping layer for what your agents learned, stored as plain files.

┌──────────────────────────────────────────────────────────────────────────┐
│   Traditional Cache          .agents/ Knowledge Store                    │
│  ┌────────────────────┐    ┌──────────────────────────────────────────┐  │
│  │ Stores results     │    │ Stores extracted lessons                 │  │
│  │ Hit = skip compute │    │ Hit = skip the 2-hour debugging          │  │
│  │ Flat key-value     │    │ Hierarchical: learning → pattern → rule  │  │
│  │ Static after write │    │ Promotes through tiers over time         │  │
│  │ One consumer       │    │ Any agent, any runtime, any session      │  │
│  └────────────────────┘    └──────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────────┘

> /research "retry backoff strategies"

[lookup] 3 prior learnings found (freshness-weighted):
  - Token bucket with Redis (established, high confidence)
  - Rate limit at middleware layer, not per-handler (pattern)
  - /login endpoint was missing rate limiting (decision)
[research] Found prior art in your codebase + retrieved context
           Recommends: exponential backoff with jitter, reuse existing Redis client

In repeated use, the compounding effect is that the environment gets smarter while the model stays the same.
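
One way to picture "freshness-weighted" retrieval is exponential decay by age. The sketch below is an assumption, not the AgentOps scoring function: it uses score = confidence × 0.5^(age_days / half_life), and the `rank_by_freshness` name, the input format, and the 30-day default half-life are all hypothetical.

```shell
# Input file: "<age_days> <confidence> <title...>" per line (hypothetical format).
# Prints "<score><TAB><title>" with the highest score first.
# Usage: rank_by_freshness LEARNINGS_FILE [HALF_LIFE_DAYS]
rank_by_freshness() {
  half_life=${2:-30}
  LC_ALL=C awk -v h="$half_life" '
    {
      score = $2 * exp(-log(2) * $1 / h)   # halve the weight every h days
      title = $0
      sub(/^[^ ]+ +[^ ]+ +/, "", title)    # drop the two numeric columns
      printf "%.3f\t%s\n", score, title
    }
  ' "$1" | LC_ALL=C sort -k1,1nr
}
```

With this rule a fresh, high-confidence learning outranks an equally confident one that is three half-lives old, which is the behavior "stale insights decay" describes.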

Deep dive

| Topic | Where |
| --- | --- |
| Five pillars, operational invariants | Architecture |
| Brownian Ratchet, context windowing | How It Works |
| Injection philosophy, freshness decay, MemRL | The Science |
| Context lifecycle, three-tier injection | Context Lifecycle |
| Philosophy and observations | Philosophy |

Built on: Ralph Wiggum · Multiclaude · beads · CASS · MemRL


The ao CLI

The ao CLI adds repo-native bookkeeping automation, retrieval, decay, maturity scoring, and terminal-native flows that run without an active chat session.

ao seed                                    # Plant AgentOps in any repo
ao rpi loop --supervisor --max-cycles 1    # Canonical autonomous cycle
ao rpi phased --from=implementation ag-058 # Resume a specific phased run
ao search "query"                          # Search session history and repo-local bookkeeping
ao lookup --query "topic"                  # Retrieve curated learnings, patterns, and findings
ao context assemble                        # Build a task briefing
ao memory sync                             # Sync session history into MEMORY.md bookkeeping notes
ao metrics health                          # Flywheel health dashboard
ao demo                                    # Interactive demo

Full reference: CLI Commands


How AgentOps Fits With Other Tools

| Tool | What it does well | What AgentOps adds |
| --- | --- | --- |
| GSD | Clean subagent spawning, fights context rot | Cross-session bookkeeping: carries reusable knowledge between sessions |
| Compound Engineer | Knowledge compounding, structured loop | Multi-model councils and validation gates |

Detailed comparisons →


Contributing

See docs/CONTRIBUTING.md. Agent contributors should also read AGENTS.md and use bd for issue tracking.

FAQ

docs/FAQ.md

License

Apache-2.0 · Docs · CLI Reference

Release History

| Version | Changes | Urgency | Date |
| --- | --- | --- | --- |
| v2.37.2 | `brew update && brew upgrade agentops` · `bash <(curl -fsSL https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install.sh)` · [checksums](https://github.com/boshu2/agentops/releases/download/v2.37.2/checksums.txt) · [verify provenance](https://docs.github.com/en/actions/security-for-github-actions/using-artifact-attestations/using-artifact-attestations-to-establish-provenance-for-builds). Highlights: This hotfix hardens AgentOps' validation and execution surfaces across hooks... | High | 4/16/2026 |
| v2.37.1 | `brew update && brew upgrade agentops` · `bash <(curl -fsSL https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install.sh)` · [checksums](https://github.com/boshu2/agentops/releases/download/v2.37.1/checksums.txt) · [verify provenance](https://docs.github.com/en/actions/security-for-github-actions/using-artifact-attestations/using-artifact-attestations-to-establish-provenance-for-builds). Highlights: Dream now leaves behind actionable morning work instead of just a short over... | High | 4/15/2026 |
| v2.37.0 | `brew update && brew upgrade agentops` · `bash <(curl -fsSL https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install.sh)` · [checksums](https://github.com/boshu2/agentops/releases/download/v2.37.0/checksums.txt) · [verify provenance](https://docs.github.com/en/actions/security-for-github-actions/using-artifact-attestations/using-artifact-attestations-to-establish-provenance-for-builds). Highlights: This release pushes AgentOps further toward a repo-native knowledge workspac... | High | 4/14/2026 |
| v2.36.0 | `brew update && brew upgrade agentops` · `bash <(curl -fsSL https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install.sh)` · [checksums](https://github.com/boshu2/agentops/releases/download/v2.36.0/checksums.txt) · [verify provenance](https://docs.github.com/en/actions/security-for-github-actions/using-artifact-attestations/using-artifact-attestations-to-establish-provenance-for-builds). Highlights: This release turns Dream from a concept into a usable operator surface. Agen... | High | 4/10/2026 |
| v2.35.0 | `brew update && brew upgrade agentops` · `bash <(curl -fsSL https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install.sh)` · [checksums](https://github.com/boshu2/agentops/releases/download/v2.35.0/checksums.txt) · [verify provenance](https://docs.github.com/en/actions/security-for-github-actions/using-artifact-attestations/using-artifact-attestations-to-establish-provenance-for-builds). Added: **Codex native hooks** - AgentOps hooks now install natively into Codex CLI v... | High | 4/7/2026 |

Similar Packages

| Package | Description | Version |
| --- | --- | --- |
| kelos | Kelos - The Kubernetes-native framework for orchestrating autonomous AI coding agents. | v0.30.0 |
| ai-agents-skills | 🧠 Enhance AI agents with a collection of skills for improved coding assistance, enabling efficient and production-ready solutions. | master@2026-04-21 |
| altk-evolve | Self improving agents through iterations | v1.0.10 |
| context-mode | Context window optimization for AI coding agents. Sandboxes tool output, 98% reduction. 12 platforms | v1.0.89 |
| meerkat | Meerkat - A modular, high-performance agent harness built in Rust. | v0.5.2 |