
mini-agent

The AI agent that sees before it acts — perception-driven, file-based, pluggable personal AI agent framework


License: MIT · TypeScript · In Production


Most agent frameworks are goal-driven: give it a task, get steps back. mini-agent is perception-driven — it observes your environment continuously, then decides whether to act. Goal-driven agents fail when the goal is wrong. Perception-driven agents adapt to what's actually happening.

Shell scripts define what the agent can see. Claude decides what to do. No database, no embeddings — just Markdown files + shell scripts + Claude CLI.


Quick Start

Prerequisites: Node.js 20+ and Claude CLI (npm install -g @anthropic-ai/claude-code)

# Install (pnpm auto-installed if needed)
curl -fsSL https://raw.githubusercontent.com/miles990/mini-agent/main/install.sh | bash

# Interactive chat — auto-creates agent-compose.yaml on first run
mini-agent

# Run autonomously in background
mini-agent up -d        # Start the OODA loop
mini-agent status       # What is it doing?
mini-agent logs -f      # Watch it think

What a Cycle Looks Like

── Perceive ─────────────────────────────────
  <workspace> 2 files changed: src/auth.ts, src/api.ts </workspace>
  <docker> container "redis" unhealthy (OOM) </docker>

── Decide ───────────────────────────────────
  Redis OOM is blocking the API. Fix infrastructure first.

── Act ──────────────────────────────────────
  Restarted redis with --maxmemory 256mb. API responding.
  Notified via Telegram: "Redis was OOM, restarted with memory limit."

Each cycle: perceive → decide → act. No human prompt needed.
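The perceive step can be sketched in a few lines of shell. This is a hedged illustration, not mini-agent's actual code: the plugin name and tag convention follow the README, and the stub plugin below is invented for the demo.

```shell
# Illustrative sketch of one perceive step: each plugin's stdout is wrapped
# in a tag named after the script, forming the context handed to the model.
mkdir -p plugins
printf '%s\n' '#!/bin/bash' 'echo "2 files changed: src/auth.ts, src/api.ts"' \
  > plugins/workspace.sh
chmod +x plugins/workspace.sh

perceive() {
  for p in plugins/*.sh; do
    name=$(basename "$p" .sh)
    printf '<%s> %s </%s>\n' "$name" "$(bash "$p")" "$name"
  done
}

context=$(perceive)
echo "$context"
# Decide + act would pass the context to the Claude CLI, e.g.:
# claude -p "Perception: $context. Decide whether to act."
```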

What Makes It Different

| | Platform agents | Goal-driven (AutoGPT) | mini-agent |
| --- | --- | --- | --- |
| Core idea | Agents on a platform | Goal in, steps out | See first, then act |
| Identity | Platform-assigned | None | SOUL.md — personality, growth |
| Memory | Platform DB | Vector DB | Markdown files (human-readable) |
| Perception | Platform APIs | Minimal | Shell scripts — anything is a sense |
| Security | Sandbox | Varies | Transparency > isolation |
| Complexity | Heavy | 181K lines (AutoGPT) | ~29K lines of TypeScript |

How It Works

Four building blocks:

  • Perception — Shell scripts that output environment state. Anything scriptable becomes a sense
  • Skills — Markdown files injected into the prompt. Domain knowledge as instructions
  • Memory — Markdown + JSON Lines. Hot → warm → cold tiers. FTS5 full-text search, no vector DB
  • Identity — SOUL.md defines personality, interests, evolving worldview. Not just a task executor
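On disk, a memory write could look like the sketch below. The file name `memory/hot.jsonl` and the record fields are assumptions for illustration, not the framework's actual schema.

```shell
# Hypothetical memory append: one JSON Lines record per event, newest in the
# "hot" tier. Plain files stay grep-able; no vector DB is involved.
mkdir -p memory
printf '%s\n' \
  '{"ts":"2026-03-10T12:00:00Z","tier":"hot","note":"redis OOM, restarted"}' \
  >> memory/hot.jsonl
wc -l < memory/hot.jsonl
```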

Perception Plugins

Any executable that writes to stdout becomes a sense:

#!/bin/bash
# plugins/my-sensor.sh — output becomes <my-sensor>...</my-sensor> in context
echo "Status: $(systemctl is-active myservice)"
echo "Queue: $(wc -l < /tmp/queue.txt) items"
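Because a sense is just an executable, you can sanity-check it in isolation before registering it. The stub below is a portable stand-in for the `systemctl` example above, since that command may not exist on your machine; swap in your real script.

```shell
# Sanity-check a plugin on its own: it should exit 0 and print something.
mkdir -p plugins
printf '%s\n' '#!/bin/bash' 'echo "Status: active"' 'echo "Queue: 0 items"' \
  > plugins/my-sensor.sh
chmod +x plugins/my-sensor.sh

out=$(./plugins/my-sensor.sh) && [ -n "$out" ] && echo "sense OK"
```

A plugin that exits non-zero or prints nothing gives the agent a blind spot, so this check is worth running before the next step.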

Register it in agent-compose.yaml:

perception:
  custom:
    - name: my-sensor
      script: ./plugins/my-sensor.sh

34 plugins included out of the box: workspace changes, Docker health, Chrome tabs, Telegram inbox, mobile GPS, GitHub issues/PRs, and more.

Skills

Write domain knowledge in Markdown. The agent follows it as instructions:

skills:
  - ./skills/docker-ops.md      # Container troubleshooting
  - ./skills/web-research.md    # Three-layer web access
  - ./skills/debug-helper.md    # Systematic debugging

25 skills included.
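A skill file is plain Markdown. A minimal hand-rolled example might look like the following; the content is illustrative, not one of the bundled skills.

```shell
# Write a hypothetical skill file; the agent follows it as instructions.
mkdir -p skills
cat > skills/docker-ops.md <<'EOF'
# Docker Ops

When a container is unhealthy:
1. Check `docker logs --tail 50 <name>` for the cause.
2. If it was OOM-killed, restart with an explicit memory limit.
3. Report what changed and why.
EOF
head -1 skills/docker-ops.md
```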

Configuration

One YAML file defines your agent:

# agent-compose.yaml
agents:
  assistant:
    name: My Assistant
    port: 3001
    persona: A helpful personal AI assistant
    loop:
      enabled: true
      interval: "5m"
    cron:
      - schedule: "*/30 * * * *"
        task: Check for pending tasks
    perception:
      custom:
        - name: docker
          script: ./plugins/docker-status.sh
    skills:
      - ./skills/docker-ops.md

Features

  • Organic Parallelism — Multi-lane architecture inspired by slime mold: main cycle + foreground lane + 6 background tentacles
  • System 1 Triage — Optional mushi companion uses a small model (~800ms) to filter noise before expensive LLM calls — saves ~40% token cost
  • Telegram — Bidirectional messaging with notifications and smart batching
  • Mobile PWA — Phone sensors (GPS, accelerometer, camera) as perception inputs
  • Web Access — Multi-layer extraction: Readability → trafilatura → VLM vision fallback
  • Team Chat Room — Multi-party discussion with persistent history and threading
  • MCP Server — 14 tools for Claude Code integration
  • CI/CD — Auto-commit → auto-push → GitHub Actions → deploy
  • Modes — calm (loop off) / reserved (loop on, notifications off) / autonomous (everything on)
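The System 1 triage idea, only pay for the big model when perception actually changed, can be illustrated with a cheap change-detection gate. This is a stand-in of my own, not the real mushi companion, which uses a small model rather than a checksum.

```shell
# Stand-in for System 1 triage: hash the perception context and skip the
# expensive call when nothing changed since the last cycle.
gate() {
  h=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
  if [ "$h" = "$(cat .last-context 2>/dev/null)" ]; then
    echo "skip"            # nothing new, no LLM call
  else
    echo "$h" > .last-context
    echo "run"             # changed: run the full cycle
    # claude -p "..."      # the expensive call happens only here
  fi
}
rm -f .last-context
gate "<docker> redis healthy </docker>"     # run
gate "<docker> redis healthy </docker>"     # skip
gate "<docker> redis unhealthy </docker>"   # run
```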

Requirements

  • Node.js 20+
  • Claude CLI (npm install -g @anthropic-ai/claude-code)
  • Chrome (optional, for web access via CDP)

Philosophy

"There is no such thing as an empty environment."

A personal AI agent shares your context — your browser sessions, your conversations, your files. Isolating it means isolating yourself. mini-agent chooses transparency over isolation: every action has an audit trail (behavior logs + git history + File=Truth).

The agent's world is defined by its perception plugins — its Umwelt. Add a plugin, expand what it can see. What it sees shapes what it does.


License

MIT

Release History

v0.1.0 (3/10/2026, urgency: Medium)

## What is mini-agent?

A perception-driven AI agent framework. Most agent frameworks are goal-driven — give it a task, get steps back. mini-agent **observes your environment first**, then decides whether and how to act.

Shell scripts define what the agent can see. Claude decides what to do. No database, no embeddings — just Markdown files + shell scripts + Claude CLI.

## Key Features

- **Perception-First Architecture** — OODA loop (Observe → Orient → Decide → Act). Shell plugins define sense…


Similar Packages

  • night-watch-cli (master@2026-04-20): AI agent that implements your specs, opens PRs, and reviews code overnight. Queue GitHub issues or PRDs, wake up to pull requests.
  • acolyte (v0.19.0): A terminal-first AI coding agent. Open-source, observable, and built for developer control.
  • ag-claw (v0.0.1): Modular AI agent framework with 59 pluggable features, 8+ messaging channels, and production-grade security. TypeScript-first. MIT license. Self-hosted, no subscriptions.
  • agent-brain (v0.1.2): Agent ReAct framework with cognitive planning engine — five-phase cognitive cycle with nested ReAct loops, dynamic skill acquisition, and interactive user input.
  • flow-next (flow-next-v0.29.4): Plan-first AI workflow plugin for Claude Code, OpenAI Codex, and Factory Droid. Zero-dep task tracking, worker subagents, Ralph autonomous mode, cross-model reviews.