codebot-ai

Safe, local-first autonomous coding agent. Policy-governed, audit-trailed, sandboxed. Works with any LLM.

Why this rank: release freshness, healthy release cadence, strong adoption

README

CodeBot AI

Open-source autonomous coding agent with a cryptographic audit trail.

For work you want to delegate, not just assist with, and verify after the fact.

What CodeBot is

CodeBot runs coding tasks end-to-end. Point it at a GitHub issue, a problem statement, or a spec: it reads the repo, makes the changes, runs the tests, and opens a PR. Every tool call it makes (every file it touches, every command it runs, every URL it fetches) is recorded in a SHA-256 hash-chained audit log. Tamper with the log and the chain breaks, so you know.

It runs against the LLM you pick (local Ollama / LM Studio / vLLM, or any of eight cloud providers) through your API key, on your endpoint. Zero telemetry by default. MIT. Air-gapped if you want.

What CodeBot is NOT

CodeBot is not an AI-powered editor. Cursor, Zed, and VS Code with Copilot already own that category. If you want Tab-completion and inline suggestions while you type, one of those is a better fit; CodeBot won't try to compete.

CodeBot is for the class of work that starts with "hey agent, go do this while I'm not watching" and ends with someone (maybe you, maybe your auditor) needing to know exactly what got done.

Who it's for

  • Security-conscious engineering teams that can't send code to third-party AI services but still want agent-level automation.
  • Regulated industries (fintech, healthcare, gov-adjacent) that need an auditable paper trail for every AI action.
  • Solo builders and small teams running AI on long-running tasks who need to verify results later.
  • Anyone who wants their AI agent to run on Ollama rather than send code to an API they don't control.

Quick Start

npm install -g codebot-ai
codebot --setup                    # auto-detects local LLMs + cloud keys
codebot "refactor auth to use JWT" # run a task
codebot --dashboard                # web UI at localhost:3120
codebot --solve https://github.com/you/repo/issues/42  # issue → tested PR

Hero workflow: --solve

Point CodeBot at a GitHub issue and walk away:

codebot --solve https://github.com/you/repo/issues/42

An 8-phase pipeline runs autonomously:

  1. Parse: extract requirements from the issue
  2. Clone: shallow-clone the target repo
  3. Analyze: map the codebase, locate relevant files
  4. Install: detect package manager, install deps
  5. Fix: apply code changes guided by the issue
  6. Test: run the suite, iterate until green
  7. Self-review: audit the diff for regressions
  8. PR: open a pull request with the audit trail attached

Every phase writes to the hash-chained log. If the agent does anything unexpected, you can prove it after the fact.
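The phase ordering above can be sketched as a small sequential runner. This is a minimal illustration, not CodeBot's implementation: the phase names come from the list, while `PhaseHandler`, `PhaseRecord`, and `runPipeline` are hypothetical names, the handlers are synchronous stubs, and the "iterate until green" retry inside the Test phase is left out for brevity.

```typescript
// Hypothetical sketch: only the phase names come from the README.
type Phase = "parse" | "clone" | "analyze" | "install" | "fix" | "test" | "self-review" | "pr";

const PHASES: readonly Phase[] = [
  "parse", "clone", "analyze", "install", "fix", "test", "self-review", "pr",
];

interface PhaseRecord {
  phase: Phase;
  ok: boolean;
  detail: string;
}

// A handler does the work for one phase and reports success or failure.
type PhaseHandler = (phase: Phase) => { ok: boolean; detail: string };

// Run phases in order, record every outcome, and stop at the first failure.
function runPipeline(handler: PhaseHandler): PhaseRecord[] {
  const log: PhaseRecord[] = [];
  for (const phase of PHASES) {
    const { ok, detail } = handler(phase);
    log.push({ phase, ok, detail });
    if (!ok) break; // later phases depend on earlier ones
  }
  return log;
}

// Example: a stub handler that fails at the test phase.
const records = runPipeline((phase) =>
  phase === "test" ? { ok: false, detail: "suite red" } : { ok: true, detail: "done" },
);
console.log(records.map((r) => `${r.phase}:${r.ok ? "ok" : "failed"}`).join(" "));
// parse:ok clone:ok analyze:ok install:ok fix:ok test:failed
```

Recording the failed phase before stopping keeps the log complete even when the run does not reach the PR step.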

Second workflow: --vault (research assistant over your notes)

Point CodeBot at a folder of markdown notes and ask questions:

codebot --vault ~/Documents/my-notes "what did I capture about Q3 strategy?"

CodeBot reads your notes, synthesizes an answer, and cites the files it actually consulted. Read-only by default: it won't edit or create anything. No network calls unless you opt in. Every file it opens goes into the same hash-chained audit log, so you can prove exactly which notes the AI touched.

# Interactive mode: open a session over the vault and ask follow-ups
codebot --vault ~/Documents/my-notes

# Allow CodeBot to create or edit notes when you ask it to
codebot --vault ~/Documents/my-notes --vault-writable

# Allow outbound web_fetch / http_client when you want it to look something up
codebot --vault ~/Documents/my-notes --vault-allow-network

Works with any markdown folder: Obsidian vaults, plain ~/notes, dumped Evernote exports. .obsidian/, .git/, and node_modules/ are automatically skipped.
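As a rough illustration of the skip rule, the sketch below walks a folder and collects markdown files without descending into the three directories named above. The function name `listMarkdownFiles` and the `.md` extension filter are assumptions made for this example; CodeBot's actual traversal may differ.

```typescript
import { mkdtempSync, mkdirSync, readdirSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Directory names the README says are skipped.
const SKIPPED_DIRS = new Set([".obsidian", ".git", "node_modules"]);

// Recursively collect markdown files, never descending into skipped directories.
// Assumption: a "markdown note" is any file ending in .md.
function listMarkdownFiles(root: string): string[] {
  const found: string[] = [];
  for (const entry of readdirSync(root, { withFileTypes: true })) {
    const full = join(root, entry.name);
    if (entry.isDirectory()) {
      if (!SKIPPED_DIRS.has(entry.name)) found.push(...listMarkdownFiles(full));
    } else if (entry.isFile() && entry.name.toLowerCase().endsWith(".md")) {
      found.push(full);
    }
  }
  return found.sort();
}

// Demo on a throwaway vault in the OS temp directory.
const vault = mkdtempSync(join(tmpdir(), "vault-"));
mkdirSync(join(vault, ".obsidian"));
mkdirSync(join(vault, "projects"));
writeFileSync(join(vault, "inbox.md"), "# Inbox");
writeFileSync(join(vault, "projects", "q3.md"), "# Q3");
writeFileSync(join(vault, ".obsidian", "workspace.md"), "ignored");

const notes = listMarkdownFiles(vault);
console.log(notes.length); // 2
```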

How CodeBot differs

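| Feature                             | Cursor / Copilot | Aider   | Devin   | CodeBot    |
|-------------------------------------|------------------|---------|---------|------------|
| Autonomous issue-to-PR              | No               | Partial | Yes     | Yes        |
| Cryptographic audit trail           | No               | No      | No      | Yes        |
| Local LLM supported                 | No               | Yes     | No      | Yes        |
| Policy + risk-scoring layer         | No               | No      | Partial | Yes        |
| SARIF export for CI                 | No               | No      | No      | Yes        |
| MIT-licensed / open source          | No               | Yes     | No      | Yes        |
| Runs fully offline (with local LLM) | No               | Yes     | No      | Yes        |
| Price                               | $20/mo           | Free    | $500/mo | Free / MIT |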

Three pillars

1. Autonomous, not interactive

CodeBot takes a task and finishes it. No inline suggestions, no "accept completion." You hand it a goal; it runs the loop (read → plan → edit → test → review) until done or explicitly stopped. Iteration budget, timeout, and max-cost are all configurable.
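To make the three limits concrete, here is a minimal sketch of a loop that stops when the task finishes or when any budget is exhausted. The `Budget` field names, `StopReason` values, and `runWithBudget` are hypothetical and are not CodeBot's configuration keys; the step function is a stand-in for one read/plan/edit/test/review cycle.

```typescript
// Hypothetical limits; field names are illustrative, not CodeBot's config keys.
interface Budget {
  maxIterations: number;
  timeoutMs: number;
  maxCostUsd: number;
}

interface StepResult {
  done: boolean;
  costUsd: number;
}

type StopReason = "done" | "iteration-budget" | "timeout" | "cost-budget";

// Run one step at a time and stop as soon as the task finishes or any limit is hit.
function runWithBudget(
  step: (iteration: number) => StepResult,
  budget: Budget,
  now: () => number = Date.now,
): { reason: StopReason; iterations: number; spentUsd: number } {
  const startedAt = now();
  let spentUsd = 0;
  for (let i = 0; i < budget.maxIterations; i++) {
    if (now() - startedAt >= budget.timeoutMs) {
      return { reason: "timeout", iterations: i, spentUsd };
    }
    const result = step(i);
    spentUsd += result.costUsd;
    if (result.done) return { reason: "done", iterations: i + 1, spentUsd };
    if (spentUsd >= budget.maxCostUsd) {
      return { reason: "cost-budget", iterations: i + 1, spentUsd };
    }
  }
  return { reason: "iteration-budget", iterations: budget.maxIterations, spentUsd };
}

// Example: a stub task that finishes on its third step.
const outcome = runWithBudget(
  (i) => ({ done: i === 2, costUsd: 0.25 }),
  { maxIterations: 50, timeoutMs: 60_000, maxCostUsd: 5 },
);
console.log(outcome.reason, outcome.iterations); // done 3
```

Passing the clock in as a parameter keeps the timeout branch testable without waiting in real time.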

2. Cryptographic audit trail

Every tool call is logged as an append-only entry containing prevHash + content, hashed with SHA-256. Tampering breaks the chain. Entries include the tool name, arguments, return value size, timestamp, session ID, and 7-factor risk score. Export to SARIF 2.1.0 for CI integration.

Run codebot audit verify <session-id> any time to re-hash and prove the log hasn't been modified.
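The hash-chain idea can be shown in a few lines. This is a sketch under stated assumptions, not CodeBot's log format: the `AuditEntry` shape is reduced to three fields, the all-zero `GENESIS` value and the tool names in the demo are invented for illustration, and hashing a plain `prevHash + content` concatenation is the simplest reading of the description above.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape; the real log records more fields.
interface AuditEntry {
  prevHash: string;
  content: string; // serialized tool call
  hash: string; // SHA-256 over prevHash + content
}

const GENESIS = "0".repeat(64); // assumed starting value for the first entry

function sha256(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Append a new entry whose hash commits to everything before it.
function append(log: AuditEntry[], content: string): AuditEntry[] {
  const prevHash = log.length > 0 ? log[log.length - 1].hash : GENESIS;
  return [...log, { prevHash, content, hash: sha256(prevHash + content) }];
}

// Re-hash every entry; return the index of the first broken link, or -1 if intact.
function verify(log: AuditEntry[]): number {
  let expectedPrev = GENESIS;
  for (let i = 0; i < log.length; i++) {
    const entry = log[i];
    const recomputed = sha256(entry.prevHash + entry.content);
    if (entry.prevHash !== expectedPrev || entry.hash !== recomputed) return i;
    expectedPrev = entry.hash;
  }
  return -1;
}

// Demo with two made-up tool calls.
let auditLog: AuditEntry[] = [];
auditLog = append(auditLog, JSON.stringify({ tool: "read_file", args: { path: "src/auth.ts" } }));
auditLog = append(auditLog, JSON.stringify({ tool: "execute", args: { command: "npm test" } }));
console.log(verify(auditLog)); // -1 (intact)

// Editing an entry without re-hashing it breaks the chain at that entry.
const tampered = auditLog.map((e, i) =>
  i === 0 ? { ...e, content: e.content.replace("auth", "main") } : e,
);
console.log(verify(tampered)); // 0
```

Because each hash covers the previous hash, rewriting one entry consistently would require rewriting every later entry too, which is what makes after-the-fact edits detectable.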

3. Runs where your code can't leave

Eight providers: Ollama / LM Studio / vLLM (fully local, offline-capable) and Anthropic / OpenAI / Google / DeepSeek / Groq / Mistral / xAI (cloud, your keys). No CodeBot-hosted relay. No telemetry unless you opt in (the heartbeat ping is off by default and won't turn itself on). Works on an air-gapped network with a local LLM.

Real benchmark

SWE-bench Verified, 50-task slice, Docker-scored: 17 tasks resolved unattended (34.0% of attempted tasks, 51.5% of submitted patches). Mid-tier open-source range, reproducible, harness in bench/swe/. Full report.
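The two percentages can be checked with simple arithmetic. The 17 resolved and 50 attempted figures come from the text; the count of 33 submitted patches is not stated anywhere in the source and is inferred here only because it is the whole number for which 17 resolved rounds to 51.5%.

```typescript
// 17 resolved and 50 attempted come from the text; 33 submitted is inferred, not stated.
const resolved = 17;
const attempted = 50;
const submittedInferred = 33;

const pct = (part: number, whole: number): string =>
  ((100 * part) / whole).toFixed(1) + "%";

console.log(pct(resolved, attempted)); // 34.0%
console.log(pct(resolved, submittedInferred)); // 51.5%
```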

This is a ceiling number, not a growth number. What it proves is that the agent loop genuinely works end-to-end, not just in demos.

Architecture

User → Agent Loop → Policy Enforcer → Risk Scorer → CORD Safety Engine → Tool Executor
            ↓              ↓               ↓                ↓                  ↓
       8 providers   Denied paths      7 factors     Constitutional        36 tools
      (local+cloud) Writable scope   (0-100 score)    rules + VIGIL      (code, shell,
                                                                         browser, git...)
            ↓
      Hash-chained audit log (SARIF export) ─────→ every call, always
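The "Denied paths" and "Writable scope" labels under the Policy Enforcer suggest a path gate in front of every write. A minimal sketch of such a gate follows; the `PathPolicy` shape and the `canWrite` and `isInside` helpers are hypothetical, and the check is purely lexical (it does not resolve symlinks, which a production policy layer would need to consider).

```typescript
import { isAbsolute, relative, resolve, sep } from "node:path";

// Hypothetical policy shape; field names are illustrative.
interface PathPolicy {
  writableRoot: string; // writes must stay under this directory
  deniedPaths: string[]; // never touched, even inside the writable root
}

// True when `target` is `dir` itself or lives somewhere beneath it.
function isInside(dir: string, target: string): boolean {
  const rel = relative(resolve(dir), resolve(target));
  if (rel === "") return true;
  return rel !== ".." && !rel.startsWith(".." + sep) && !isAbsolute(rel);
}

// Deny rules win over the writable scope.
function canWrite(policy: PathPolicy, target: string): boolean {
  if (policy.deniedPaths.some((denied) => isInside(denied, target))) return false;
  return isInside(policy.writableRoot, target);
}

const policy: PathPolicy = {
  writableRoot: "/work/repo",
  deniedPaths: ["/work/repo/.env", "/work/repo/.git"],
};

console.log(canWrite(policy, "/work/repo/src/auth.ts")); // true
console.log(canWrite(policy, "/work/repo/.git/config")); // false
console.log(canWrite(policy, "/work/repo/../secrets.txt")); // false
```

Resolving both paths before comparing them is what stops `..` segments from escaping the writable root.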

Extend

import { Agent, OpenAIProvider } from 'codebot-ai';

// The API key is read from the environment rather than hard-coded.
const agent = new Agent({
  provider: new OpenAIProvider({
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-5.4',
  }),
  model: 'gpt-5.4',
  autoApprove: true,
});

// agent.run() yields events as the task progresses; print text events as they arrive.
for await (const event of agent.run('list all TypeScript files and count them')) {
  if (event.type === 'text') process.stdout.write(event.text || '');
}

Custom tools via .codebot/plugins/ · MCP servers via .codebot/mcp.json · VS Code extension · GitHub Action

The honest limits

  • Not a Cursor replacement. No tab-completion, no inline suggestions, no in-editor UX.
  • Autonomous ≠ perfect. SWE-bench Verified pass rate is 34% unattended. Humans still need to review PRs.
  • Local LLM quality is LLM-dependent. A 7B model won't solve what gpt-5.4 solves. You pick the tradeoff.
  • Policy enforcement is safety, not a guarantee. CORD + risk scoring reduce the blast radius of agent mistakes; they don't eliminate them. Use git, use branches, use CI.

Docs · Changelog · Security · Compliance · Contributing

MIT · Ascendral

Release History

v2.10.0 (urgency: Medium, date: 3/22/2026)

## What's New in v2.10.0
### --solve: Autonomous Issue-to-PR Pipeline
Point CodeBot at a GitHub issue, get back a reviewed PR with full audit trail.
```
codebot --solve https://github.com/you/repo/issues/42
```
8 phases: Parse → Clone → Analyze → Install → Fix → Test → Self-review → PR
### Electron Desktop App
- Signed, notarized, Gatekeeper-approved (macOS arm64)
- 113MB DMG (down from 343MB)
- Auto-restart on crash with exponential backoff
- Network drop recovery with reconnecting overlay
- …

v2.9.0 (urgency: Low, date: 3/19/2026)

## CodeBot AI v2.9.0
Autonomous coding agent with governed execution.
### Highlights
- **--solve command**: Point at a GitHub issue → get a PR with full audit trail
- **Constitutional safety (CORD)**: Every action goes through a safety layer
- **32 tools**: File ops, git, web, browser, docker, SSH, and more
- **Self-review**: Agent reviews its own diff before committing
- **Audit trail**: Every phase logged, every decision traceable
- **Electron desktop app**: Signed, notarized, Gatekeeper-app…

v2.8.0 (urgency: Low, date: 3/15/2026)

## What's New
### Dashboard Models Panel
- VRAM detection card with real-time usage bar
- Quantization advisor: select a model size, get recommendations based on available VRAM
- Local model browser showing installed Ollama models
### CodeAGI Continuous Mode
- Auto-run mission cycles on a configurable timer (30s to 5m intervals)
- SSE streaming for real-time phase updates
- Error-based auto-stop after 2+ consecutive failures
### Documentation Overhaul
- Added "Who This Is For" section target…

v2.5.2 (urgency: Low, date: 3/5/2026)

## What's New
### Command Center (Dashboard)
- **Terminal**: execute shell commands with live streaming output, command history
- **Quick Actions**: 8 one-click buttons (Git Status, Run Tests, Git Log, Git Diff, Health Check, List Tools, List Files, NPM Outdated)
- **Chat**: interactive AI chat with agent (`codebot --dashboard`)
- **Tool Runner**: select any tool, fill parameters, execute with result display
- **Standalone mode**: Terminal + Quick Actions work without agent connection
### …

v1.4.3 (urgency: Low, date: 2/28/2026)

## The Problem
During multi-tool interactions, CodeBot's message history could get corrupted, causing OpenAI to permanently reject all requests with:
```
Invalid parameter: messages with role 'tool' must be a response to a preceeding message with 'tool_calls'
```
## Root Cause
The context compaction code iterated message-by-message backward and could split `assistant + tool_response` groups in half. Example:
```
[assistant with tool_calls: call_1, call_2] ← DROPPED (budget exceeded)
[tool:…
```

v1.4.2 (urgency: Low, date: 2/28/2026)

## What's Fixed
**Message history corruption causing OpenAI 400 errors**: During multi-tool-call interactions, the message history could become corrupted (orphaned tool messages, duplicates), causing OpenAI to reject all subsequent requests with:
```
Invalid parameter: messages with role 'tool' must be a response to a preceeding message with 'tool_calls'
```
The circuit breaker (from v1.4.1) would stop the loop after 3 attempts, but the session was permanently broken.
## The Fix
Enhanced …

v1.4.1 (urgency: Low, date: 2/28/2026)

## Bug Fix
**Fixed:** Agent was looping 50 times on non-retryable errors (missing API key, auth failure, billing issues) instead of stopping immediately.
### What changed
- **Fatal error detection**: New `isFatalError()` recognizes permanent failures (missing API key, 401/403, billing, model not found) and stops the agent loop immediately
- **Circuit breaker**: If the same non-fatal error repeats 3 times consecutively, the agent stops instead of burning through all 50 iterations
- **Early A…

v1.4.0 (urgency: Low, date: 2/28/2026)

## 15 New Tools (13 → 28 total)
CodeBot now has parity with top-tier coding agents. All tools are zero-dependency, using only Node.js built-ins.
### Tier 1: Intelligence
| Tool | What It Does |
|------|-------------|
| `git` | status, diff, log, commit, branch, checkout, stash, push, pull, merge, blame, tag |
| `code_analysis` | Symbol extraction, find references, imports, file outline |
| `multi_search` | Fuzzy search across filenames + content + symbols with ranking |
| `task_planner` | Hie…

v1.3.0 (urgency: Low, date: 2/27/2026)

## What's New
**9 stability fixes** that eliminate silent crashes during continuous operation.
### Error Recovery
- **Automatic retry**: 429 rate limits, 5xx server errors, and network failures (ECONNRESET, ETIMEDOUT) now retry with exponential backoff + jitter, respecting Retry-After headers
- **Stream crash recovery**: if the LLM connection drops mid-response, the agent loop retries on the next iteration instead of dying
- **Compaction fallback**: if LLM-powered context summarization fail…
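The v1.3.0 notes mention retrying with exponential backoff plus jitter while respecting Retry-After headers. A minimal sketch of how such a delay could be computed is shown below. The helper names, the 500 ms base, the 30 s cap, and the choice of "full jitter" are all assumptions for illustration rather than CodeBot's actual values, and only the numeric (seconds) form of Retry-After is handled.

```typescript
// Hypothetical helper: how long to wait before retry number `attempt` (0-based).
function retryDelayMs(
  attempt: number,
  retryAfterHeader: string | null,
  random: () => number = Math.random,
  baseMs = 500,
  capMs = 30_000,
): number {
  // A numeric Retry-After header (in seconds) takes priority over our own schedule.
  if (retryAfterHeader !== null && retryAfterHeader.trim() !== "") {
    const seconds = Number(retryAfterHeader);
    if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;
  }
  // Exponential growth, capped, with full jitter: pick uniformly in [0, ceiling).
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Failures named as retryable in the release notes: 429, 5xx, and two network error codes.
function isRetryable(status: number | null, errorCode?: string): boolean {
  if (status === 429) return true;
  if (status !== null && status >= 500 && status <= 599) return true;
  return errorCode === "ECONNRESET" || errorCode === "ETIMEDOUT";
}

console.log(retryDelayMs(3, null, () => 0.5)); // 2000
console.log(retryDelayMs(3, "7")); // 7000
console.log(isRetryable(401)); // false
```

Jitter spreads retries from many clients over time, so they do not all hit a recovering server at the same instant.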

Similar Packages

  • skales (v11.1.6): Your local AI Desktop Agent for Windows, macOS & Linux. Agent Skills (SKILL.md), autonomous coding (Codework), multi-agent teams, desktop automation, 15+ AI providers, Desktop Buddy. No Docker, no ter…
  • kotef (0.0.0): AI dev that actually gets things done
  • opencode-telegram-bot (v0.21.1): OpenCode mobile client via Telegram: run and monitor AI coding tasks from your phone while everything runs locally on your machine. Scheduled tasks support. Can be used as lightweight OpenClaw alterna…
  • mainframe (v0.20.0): AI-native development environment for orchestrating agents
  • GenericAgent (v0.1.0): Self-evolving agent: grows skill tree from 3.3K-line seed, achieving full system control with 6x less token consumption

More from Ascendral

  • KlomboAGI: Autonomous cognition runtime (persistent memory, world model, planner-verifier-critic loop, LLM-powered reasoning). Python.

More in AI Agents

  • hermes-agent: The agent that grows with you
  • awesome-copilot: Community-contributed instructions, agents, skills, and configurations to help you make the most of GitHub Copilot.
  • CopilotKit: The Frontend Stack for Agents & Generative UI. React + Angular. Makers of the AG-UI Protocol
  • e2b: E2B SDK that give agents cloud environments