Wee-Orchestrator

🍀 Self-hosted multi-agent AI orchestrator — chat with Claude, Gemini & Copilot CLI from Telegram, WebEx, or browser. 5 runtimes, 17+ models, task scheduling, skill plugins.

🍀 Wee-Orchestrator

One platform. Every AI. Any channel.

Python 3.10+ License: MIT

Wee-Orchestrator is a unified AI agent platform that lets you chat with any AI CLI runtime — GitHub Copilot, Claude Code, OpenCode, Google Gemini, or OpenAI Codex — from Telegram, WebEx, or a beautiful browser-based Web UI. Switch models, agents, and runtimes on the fly with slash commands. Schedule recurring AI tasks. Send files and images. All from one place.

[Figure: Wee-Orchestrator Architecture]

✨ Why Wee-Orchestrator?

Problem                               Wee-Orchestrator Solution
Juggling multiple AI tools and CLIs   One unified interface across 5 runtimes and 17+ models
AI is stuck in the terminal           Chat from anywhere: Telegram, WebEx, or the Web UI
No memory between sessions            Persistent sessions with full conversation history
Can't automate AI tasks               Built-in task scheduler with cron-like scheduling
One-size-fits-all agents              Multi-agent architecture: switch agents per task
Complex setup                         Zero-config bot creation with the Starter Kit

📸 Screenshots

[Screenshots: Chat Interface, Task Scheduler, Secure Pairing Login, Architecture Overview]

🚀 Key Features

  • 🔀 5 AI Runtimes — GitHub Copilot CLI, Claude Code, OpenCode, Google Gemini, OpenAI Codex
  • 💬 3 Channels — Telegram bot, WebEx bot (via RabbitMQ), glassmorphism Web UI with SSE streaming
  • 🤖 Multi-Agent — Define specialized agents in agents.json, switch with /agent; hot-reload on change (no restart needed)
  • 🔄 Live Model Switching — Change models mid-conversation with /model
  • 📅 Task Scheduler — Schedule recurring AI jobs with natural language (every day at 9am)
  • 📁 File & Image Support — Upload, download, and inline images across all channels
  • 🎤 Audio Transcription — Voice messages auto-transcribed via Whisper (OpenAI or local)
  • 🔐 Secure Auth — Pairing-code login, per-user ACLs, agent/model pinning, yolo/restricted modes
  • 📜 Session History — Full conversation persistence with search and resume
  • ⚡ Background Tasks — Delegate long-running work to background agents with in-thread status updates
  • 🔔 In-Thread Notifications — Real-time task lifecycle updates (queued → running → complete) in your conversation
  • 📋 Dual-Source TODOs — Sync TODOs between GitHub Issues (primary) and flat files (fallback) with auto-deduplication
  • 🔧 Expandable Tool Calls — View tool invocations with collapsible output panels in WebUI; markdown rendering, error highlighting, silent mode support
  • 💰 Token Usage Tracking — Real-time tracking of prompt/completion tokens and cost estimation across all runtimes; live stats displayed in WebUI footer
  • 🔌 Extensible Skills — Plugin architecture for adding capabilities (Cisco Meraki, Home Assistant, etc.)
  • ⚙️ Slash Command Registry — Pure-server commands that bypass the LLM for reduced latency; auto-registers with Telegram BotFather for autocomplete; built-in /secret command for secure credential management

🏗️ Architecture

  Telegram ──► TelegramConnector ──┐
                                   │
  WebEx ─────► WebEXConnector ─────┼──► SessionManager ──► AI CLI Runtimes
                                   │       │                (Copilot, Claude,
  Browser ───► FastAPI /api/v1 ────┘       │                 OpenCode, Gemini,
                                           │                 Codex)
                                    TaskScheduler

Each inbound message flows through a channel connector, into the shared SessionManager (which handles slash commands, session state, and agent routing), and out to the selected AI CLI runtime as a subprocess. Responses stream back in real time.

For the full component diagram, sequence diagrams, and deployment topology, see ARCHITECTURE.md.
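
The message flow described above can be sketched roughly as follows. This is an illustration only, not the project's code: `Session`, `route_message`, and `ECHO_RUNTIME` are hypothetical names, the stand-in runtime is a Python one-liner rather than a real AI CLI, and the sketch returns the whole response at once instead of streaming it.

```python
import subprocess
import sys
from dataclasses import dataclass, field


@dataclass
class Session:
    """Per-conversation state kept by a (hypothetical) session manager."""
    agent: str = "orchestrator"
    runtime: str = "copilot"
    history: list = field(default_factory=list)


def route_message(session: Session, text: str, runtime_cmd: list) -> str:
    """Handle slash commands locally; send everything else to a subprocess."""
    if text.startswith("/agent "):
        # Slash commands change session state and never reach the runtime.
        session.agent = text.split(maxsplit=1)[1].strip()
        return f"agent set to {session.agent}"
    # Any other message goes to the selected runtime as a child process.
    result = subprocess.run(
        runtime_cmd + [text], capture_output=True, text=True, timeout=60
    )
    session.history.append((text, result.stdout))
    return result.stdout


# Stand-in for a real AI CLI: a Python one-liner that echoes its argument.
ECHO_RUNTIME = [sys.executable, "-c", "import sys; print('echo:', sys.argv[1])"]
```

Calling `route_message(Session(), "hello", ECHO_RUNTIME)` runs the stand-in process and returns its output; a `/agent devops` message only updates the session.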


📋 Overview

Wee-Orchestrator provides a flexible framework to:

  • Chat with AI agents from Telegram, WebEx, or the browser-based Web UI
  • Call AI CLIs (Copilot, OpenCode, Claude Code, Gemini, Codex) from N8N workflows
  • Maintain session affinity across multiple conversation turns
  • Switch between different agent repositories dynamically
  • Configure agents via JSON config files instead of hardcoding
  • Support multiple AI models and runtimes
  • Schedule recurring AI tasks with the built-in Task Scheduler
  • Execute bash commands directly with ! prefix
  • Send and receive files and images over Telegram and WebEx
  • Enforce per-user agent pinning, model pinning, and yolo/restricted mode ACLs

For release history and feature documentation see CHANGELOG.md and RELEASE_NOTES.md.

⚡ Quick Start

# 1. Clone the repo
git clone https://github.com/leprachuan/Wee-Orchestrator.git
cd Wee-Orchestrator

# 2. Install dependencies
pip install -r requirements.txt

# 3. Configure your environment
cp .env.example .env    # Edit with your API keys and bot tokens

# 4. Define your agents
vi agents.json           # Add your agent definitions

# 5. Start the API server
python3 agent_manager.py --api

# 6. (Optional) Start channel connectors
python3 telegram_connector.py   # Telegram bot
python3 webex_connector.py      # WebEx bot

Then open http://localhost:8000/ui in your browser and pair via Telegram or WebEx.

🚀 Want to create your own bot? Use the Wee-Orchestrator Starter Kit to scaffold one in minutes.


💬 Slash Commands

Command                                       Description
/agent <name>                                 Switch to a different agent
/model <model>                                Change AI model mid-conversation
/runtime <runtime>                            Switch AI runtime (copilot, claude, claude-agent-sdk, gemini, opencode, copilot-sdk, codex, devin)
/timeout <seconds>                            Adjust execution timeout
/status                                       Check running task status
/cancel                                       Cancel the current running task
/schedule list                                List all scheduled jobs
/schedule add <name> | <schedule> | <task>    Create a scheduled job
/help                                         Show all available commands
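
As a rough illustration of the pipe-delimited `/schedule add` syntax, the sketch below splits a command into its three fields. `ScheduledJob` and `parse_schedule_add` are hypothetical names, and the schedule text is kept as a plain string; how Wee-Orchestrator actually interprets phrases such as "every day at 9am" is not shown here.

```python
from typing import NamedTuple


class ScheduledJob(NamedTuple):
    name: str
    schedule: str
    task: str


def parse_schedule_add(command: str) -> ScheduledJob:
    """Parse '/schedule add <name> | <schedule> | <task>' into three fields."""
    prefix = "/schedule add "
    if not command.startswith(prefix):
        raise ValueError("expected a '/schedule add' command")
    # Split on the first two pipes only, so the task text may contain '|'.
    parts = [p.strip() for p in command[len(prefix):].split("|", 2)]
    if len(parts) != 3 or not all(parts):
        raise ValueError("usage: /schedule add <name> | <schedule> | <task>")
    return ScheduledJob(*parts)
```

For example, `parse_schedule_add("/schedule add morning-report | every day at 9am | Summarize overnight alerts")` yields a job named `morning-report`.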

Bot Setup Guide

Wee-Orchestrator enables you to create custom bots — specialized AI agents with their own configuration, knowledge base, and capabilities. Each bot is a self-contained repository that can be integrated with Wee-Orchestrator.

🚀 New here? Use the Wee-Orchestrator Starter Kit to scaffold a new bot in minutes — includes AGENTS.md, skill management with security scanning, memory structure, and setup scripts.

What is a Bot?

A bot is a Git repository containing:

  1. Core Configuration — An AGENTS.md file defining agent behavior, preferences, and runtime configurations
  2. Knowledge Base — A memory/ directory using the PARA methodology (Projects, Areas, Resources, Archive) for organizing operational knowledge
  3. Focus Areas — Organized folders for specific domains (e.g., email_triage/, smart_home/, infrastructure/)
  4. Skills Integration — References to specialized skills from pot-o-skills or custom skills
  5. Documentation — README, guides, and workflow documentation

Example Bot Structure

my-bot/
├── README.md                  # Bot overview & usage
├── AGENTS.md                  # Agent behavior & configuration
├── .env                       # Credentials (git-ignored)
├── .gitignore                 # Protect secrets
│
├── memory/                    # Knowledge base (PARA methodology)
│   ├── projects/              # Active multi-step initiatives
│   ├── areas/                 # Ongoing responsibility areas
│   ├── resources/             # Reference material & best practices
│   └── archive/               # Completed/deprecated items
│
├── skills/                    # Custom skill implementations
│   ├── custom-skill-1/
│   └── custom-skill-2/
│
└── domain-folders/            # Domain-specific organization
    ├── email/                 # Email processing
    ├── home-automation/       # Smart home tasks
    └── infrastructure/        # Infrastructure management

Key Components

AGENTS.md

Defines the bot's behavior, preferences, and runtime configuration:

  • Agent name, purpose, and timezone
  • Preferred models and runtimes (Claude, Copilot, Gemini)
  • Tool permissions and access control
  • Sub-agent delegation rules
  • Skill definitions and repository locations
  • Security and credential management

Example excerpt:

---
name: my-bot
runtime: copilot
model: gpt-5-sonnet
timezone: EST/EDT
---

## Behavior

- Preferred AI runtime: Claude > Copilot > Gemini
- Task routing: Delegate to specialized sub-agents for domain expertise
- Notification channel: Telegram

Memory Structure (PARA)

Organize knowledge for long-term retention and reuse:

  • Projects/ — Active multi-step work (e.g., home-automation-setup.md)
  • Areas/ — Ongoing responsibilities (e.g., orchestration.md, security.md)
  • Resources/ — Reference material (e.g., best-practices.md, api-docs.md)
  • Archive/ — Completed or deprecated knowledge

Skills

Skills extend your bot's capabilities by providing pre-built integrations with external APIs and services. Skills should be sourced from reputable, official repositories to minimize security risks.

Recommended Skill Sources

  1. pot-o-skills — Community skills for cloud networking and security

    • Repository: https://github.com/leprachuan/pot-o-skills
    • Skills: Cisco Meraki, Cisco Security Cloud Control, and more
    • Status: Public, open-source, actively maintained
    • Usage: Clone and link into your bot's skills/ directory
  2. Anthropic Official Skills — Official skills from Anthropic

    • Repository: https://github.com/anthropics/skills
    • Status: Official, production-ready
    • Security: Vetted and maintained by Anthropic team
    • Best for: Claude AI integration, code generation, analysis
  3. Custom Skills — Implement your own domain-specific skills

    • Location: ./skills/ directory in your bot repository
    • Documentation: Must include SKILL.md, README, and examples
    • Security: You control the code and updates

⚠️ Skills Security Guidelines

Skills have full access to your system — they can execute commands, read files, and call APIs. Follow these practices:

  • ✅ Only use official skills from original software/service authors

    • Example: Use Cisco's official Meraki skill, not community forks
    • Example: Use Anthropic's official skills, not third-party versions
  • ✅ Validate before installation

    • Review the source code in the skill repository
    • Check for hardcoded credentials or suspicious patterns
    • Verify the repository is actively maintained
    • Look for security issues reported in GitHub Issues
  • ✅ Use trusted repositories

    • Official repos (Anthropic, GitHub, etc.)
    • Long-standing community projects with active maintainers
    • Projects with security policies and issue tracking
    • Avoid random GitHub repos without documentation or maintenance
  • ⚠️ Audit custom skills carefully

    • Never trust a skill without reviewing its code first
    • Check for unintended API calls or data exfiltration
    • Validate input sanitization
    • Ensure credentials are handled safely
  • ✅ Keep skills updated

    • Periodically review and update to latest versions
    • Subscribe to security advisories from skill repositories
    • Remove unused skills to reduce attack surface

Using Skills in Your Bot

# Link public skills from pot-o-skills (verified, open-source)
ln -s /opt/pot-o-skills/cisco-meraki ./skills/
ln -s /opt/pot-o-skills/cisco-security-cloud-control ./skills/

# Link Anthropic official skills (verified, official)
ln -s /opt/anthropic-skills/code-analysis ./skills/
ln -s /opt/anthropic-skills/file-operations ./skills/

# Or implement custom skills in skills/ directory
mkdir skills/my-custom-skill

Discovering Skills

  • pot-o-skills: https://github.com/leprachuan/pot-o-skills

    cd /opt && git clone https://github.com/leprachuan/pot-o-skills.git
  • Anthropic Skills: https://github.com/anthropics/skills

    cd /opt && git clone https://github.com/anthropics/skills.git anthropic-skills
  • Custom Community Skills: Search GitHub for topic:agent-skills with verification:

    • ✅ Active maintenance (recent commits)
    • ✅ Clear documentation
    • ✅ Security policy file
    • ✅ Public issue tracking

Domain Folders

Organize bot work by area of focus:

  • Keep related scripts, templates, and documentation together
  • Example: email/ for email processing, home/ for automation tasks
  • Each folder can have its own README with domain-specific guidance

Getting Started

💡 Recommended: Fork the Wee-Orchestrator Starter Kit instead of starting from scratch — it includes everything below pre-configured with best practices, security scanning, and setup scripts.

  1. Create your bot repository:

    mkdir my-bot && cd my-bot
    git init
    git remote add origin https://github.com/username/my-bot.git
  2. Add AGENTS.md: Copy and customize the AGENTS.md template from Wee-Orchestrator with your bot's preferences

  3. Create memory directory:

    mkdir -p memory/{projects,areas,resources,archive}
    echo "# Knowledge Base" > memory/INDEX.md
  4. Add .env and .gitignore:

    cp /opt/n8n-copilot-shim-dev/.env.example .env
    echo ".env" >> .gitignore
    echo "*.key" >> .gitignore
    echo "secrets.json" >> .gitignore
  5. Link or implement skills:

    mkdir skills
    ln -s /opt/pot-o-skills/cisco-meraki skills/cisco-meraki
  6. Register with Wee-Orchestrator: Update Wee-Orchestrator's agents.json to include your bot:

    {
      "agents": [
        {
          "name": "my-bot",
          "path": "/opt/my-bot",
          "enabled": true
        }
      ]
    }

Best Practices

  • Secrets First: Store all credentials in .env (git-ignored), never commit secrets
  • Document Decisions: Use memory/areas/ to record architectural decisions and conventions
  • Skill Reuse: Leverage pot-o-skills before building custom skills
  • Domain Organization: Group related work into focused folders for maintainability
  • README Clarity: Each folder should have clear purpose and examples

Requirements

This project requires one or more of the following AI CLI tools to be installed:

Claude Code CLI

Prerequisites:

  • Node.js 18+ (for npm installation) OR native binary support
  • Anthropic API key for authentication

Installation:

Native binary (recommended):

curl -fsSL https://claude.ai/install.sh | bash

Or via npm:

npm install -g @anthropic-ai/claude-code

Supported Systems: macOS 10.15+, Linux (Ubuntu 20.04+/Debian 10+, Alpine), Windows 10+ (via WSL)

Reference: Claude Code Quickstart Documentation

GitHub Copilot CLI

Prerequisites:

  • Node.js 22 or higher
  • Active GitHub Copilot subscription (Pro, Pro+, Business, or Enterprise plan)
  • GitHub account for authentication

Installation:

npm install -g @github/copilot
copilot  # Launch and authenticate

For authentication, use the /login command or set the GH_TOKEN environment variable to a fine-grained PAT.

Supported Systems: macOS, Linux, Windows (via WSL)

Reference: GitHub Copilot CLI Installation Guide

OpenCode CLI

Prerequisites:

  • Node.js or compatible runtime

Installation (Recommended):

curl -fsSL https://opencode.ai/install | bash

Or via npm:

npm i -g opencode-ai@latest

Alternative package managers:

  • Homebrew: brew install opencode
  • Scoop (Windows): scoop bucket add extras && scoop install extras/opencode
  • Arch Linux: paru -S opencode-bin

Supported Systems: Windows, macOS, Linux

Reference: OpenCode Documentation

Google Gemini CLI

Prerequisites:

  • Python 3.7 or higher
  • Google Cloud account with Gemini API access
  • Google API key for authentication

Installation:

pip install google-generativeai
# Or using the CLI wrapper
pip install gemini-cli

Authentication:

Set your API key as an environment variable:

export GOOGLE_API_KEY='your-api-key-here'

Or configure it in your shell profile:

echo 'export GOOGLE_API_KEY="your-api-key-here"' >> ~/.bashrc
source ~/.bashrc

Supported Systems: Windows, macOS, Linux

Reference: Google Gemini API Documentation

Tool Permissions & Access Control

All AI runtimes in this system are configured with full tool access to enable read, write, and execute operations without approval prompts. This provides maximum automation capabilities.

Permission Configuration by Runtime

GitHub Copilot CLI

  • Flags Used: --allow-all-tools --allow-all-paths
  • Enables:
    • All MCP tools and shell commands without approval
    • Read/write/execute permissions for all files and directories
  • Security Note: Gives Copilot the same permissions as your user account

Claude Code CLI

  • Flags Used: --permission-mode bypassPermissions
  • Enables:
    • Auto-approve all file edits, writes, and reads
    • Execute shell commands without approval
    • Access web/network tools without prompts
  • Also Known As: YOLO mode or dontAsk mode

OpenCode CLI

  • Configuration: Uses opencode.json file for permission settings
  • Required Setup:
    1. Copy the example config: cp opencode.example.json opencode.json
    2. Place opencode.json in your agent directories or project root
  • Permissions Enabled:
    • edit: allow
    • write: allow
    • bash: allow
    • read: allow
    • webfetch: allow
  • Reference: OpenCode Permissions Documentation

Google Gemini CLI

  • Flags Used: --yolo
  • Enables:
    • Read/write file operations without confirmation
    • Shell command execution without approval
    • All built-in tools with unrestricted access
  • Built-in Tools: read_file, write_file, run_shell_command

OpenAI Codex CLI

  • Flags Used: --dangerously-bypass-approvals-and-sandbox
  • Enables:
    • Disables all approval prompts
    • Removes sandbox restrictions (full file system access)
    • Allows all shell commands and tools without confirmation
  • Security Note: Only use in trusted, controlled environments
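
The per-runtime flags listed above could be collected in one table and prepended when a command line is built. This is a minimal sketch: the flags are copied from this section, but `FULL_ACCESS_FLAGS`, `build_command`, and the executable name passed in are illustrative assumptions, and the way each CLI receives its prompt is left to the caller.

```python
# Flags copied from the runtime sections above. OpenCode has no entry with
# flags because its permissions come from opencode.json instead.
FULL_ACCESS_FLAGS: dict[str, list[str]] = {
    "copilot": ["--allow-all-tools", "--allow-all-paths"],
    "claude": ["--permission-mode", "bypassPermissions"],
    "opencode": [],
    "gemini": ["--yolo"],
    "codex": ["--dangerously-bypass-approvals-and-sandbox"],
}


def build_command(runtime: str, executable: str, extra_args: list[str]) -> list[str]:
    """Return [executable, <full-access flags>, *extra_args] for a runtime."""
    if runtime not in FULL_ACCESS_FLAGS:
        raise ValueError(f"unknown runtime: {runtime}")
    return [executable, *FULL_ACCESS_FLAGS[runtime], *extra_args]
```

Keeping the flags in a single mapping makes it easy to audit exactly which runtimes run without approval prompts.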

Claude Agent SDK (Python)

  • Package: claude-agent-sdk>=0.1.0 (install via pip install claude-agent-sdk)
  • Enables:
    • In-process async execution (no subprocess spawn)
    • Structured error types (CLINotFoundError, CLIConnectionError, ProcessError)
    • Native permission_mode field instead of CLI flags
    • Session continuity via ResultMessage.session_id capture
  • Permission Modes:
    • elevated → bypassPermissions (full access, no prompts)
    • sandboxed → plan (read-only + approval for writes)
    • restricted → default (standard safety checks)
  • Streaming: Real-time text chunks pushed to WebUI SSE consumers via _StreamBuffer
  • Tool Calls: ToolUseBlock/ToolResultBlock detection emits standardized tool_call events
  • Usage: /runtime set claude-agent-sdk
  • Issues: #77, #87, #91, #94
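
The permission-mode mapping above is small enough to express as a lookup table. The sketch below only translates level names to the mode strings listed in this section; it does not import or call the SDK, and `to_sdk_permission_mode` is a hypothetical helper name.

```python
# Wee-Orchestrator level -> permission mode string, as listed above.
PERMISSION_MODE_MAP = {
    "elevated": "bypassPermissions",  # full access, no prompts
    "sandboxed": "plan",              # read-only + approval for writes
    "restricted": "default",          # standard safety checks
}


def to_sdk_permission_mode(level: str) -> str:
    """Translate a level name into the matching permission mode string."""
    try:
        return PERMISSION_MODE_MAP[level]
    except KeyError:
        raise ValueError(f"unknown permission level: {level!r}") from None
```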

GitHub Copilot SDK (Python)

  • Package: github-copilot-sdk>=0.1.0 (install via pip install github-copilot-sdk)
  • Enables:
    • In-process async execution via CopilotClient
    • Real-time streaming via ASSISTANT_STREAMING_DELTA/ASSISTANT_MESSAGE_DELTA events
    • Tool call tracking via TOOL_EXECUTION_START/COMPLETE and COMMAND_EXECUTE events
    • Session resumption and structured error handling
  • Usage: /runtime set copilot-sdk
  • Issues: #76, #87, #91

Wee Native Runtime

  • Also Known As: wee — OpenAI-compatible API backend runtime
  • Description: Connects to any OpenAI-compatible API endpoint (Ollama, OpenRouter, LM Studio, etc.) without depending on external CLI tools like GitHub Copilot CLI, Claude Code, or OpenCode.
  • Supported Backends:
    • Ollama at http://192.168.1.101:11434/v1 — local, free (Kubuntu)
    • OpenRouter at https://openrouter.ai/api/v1 — cloud fallback, 100+ models
    • LM Studio at http://localhost:1234/v1 — local alternative
  • Model Format: Uses provider/model_name prefix syntax for auto-resolving API base URL and API key:
    • ollama/gemma4:e4b — Ollama on Kubuntu (default)
    • openrouter/meta-llama/llama-4-scout — OpenRouter cloud
    • lmstudio/qwen2.5-7b — LM Studio local
  • Configuration Example:
    {
      "runtime": "wee",
      "model": "ollama/gemma4:e4b"
    }
  • Environment Variables:
    • WEE_API_BASE — Override API base URL (e.g., http://192.168.1.101:11434/v1)
    • WEE_API_KEY — API key for authenticated endpoints (OpenRouter, etc.)
    • WEE_DEFAULT_MODEL — Default model when model not specified in config
  • Features:
    • In-process execution using OpenAI Python SDK
    • Real-time SSE streaming to WebUI
    • Provider presets auto-resolve API base URLs and API keys
    • Graceful error handling with informative messages
    • Background task subprocess execution via wee_runtime.py
  • Implementation: run_wee_native() in agent_manager.py; wee_runtime.py standalone CLI for background tasks
  • Usage: /runtime set wee
  • Features & Improvements:
    • OpenRouter integration: Full UI support for cloud-based models with 300s cached discovery & keyring-based API key management (Issue #119)
    • Model grouping in UI: Ollama and OpenRouter models displayed in separate dropdown optgroups
    • Dynamic OpenRouter model discovery: Live catalog fetch from OpenRouter API with per-provider grouping (Issue #157)
  • Bug Fixes:
    • Wrong Ollama port corrected: 11436 → 11434 (Issue #105)
    • httpx.Timeout(connect=15s) and max_retries=0 added to OpenAI client for fast-fail on bad endpoints (Issue #105)
    • Model resolution fixed: get_models_for_runtime('wee') returns flat strings; get_model_from_name() strips provider prefix (ollama/) and prefers exact/shortest match (Issue #105)
    • OpenRouter 401 auth fixed: OPENROUTER_API_KEY env var + keyring resolution replaces silent 'ollama' fallback; raises clear error when no key found (Issue #153)
  • Issues: #88, #105, #119, #153, #157
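
To make the `provider/model_name` prefix syntax concrete, here is a hedged sketch of how a model ID might be resolved to an API base URL. The base URLs and the `WEE_API_BASE` override come from this section; `PROVIDER_PRESETS` and `resolve_model` are hypothetical names, API-key lookup is omitted, and the real `run_wee_native()` may resolve things differently.

```python
import os

# Base URLs as listed under "Supported Backends" above.
PROVIDER_PRESETS = {
    "ollama": "http://192.168.1.101:11434/v1",
    "openrouter": "https://openrouter.ai/api/v1",
    "lmstudio": "http://localhost:1234/v1",
}


def resolve_model(model_id: str, env=None) -> tuple[str, str]:
    """Split 'provider/model_name' and return (api_base, model_name)."""
    env = os.environ if env is None else env
    # Split on the first '/' only: OpenRouter model names contain slashes.
    provider, sep, model_name = model_id.partition("/")
    if not sep or not model_name or provider not in PROVIDER_PRESETS:
        raise ValueError(f"expected provider/model_name, got {model_id!r}")
    # Assumption: WEE_API_BASE, when set, wins over the provider preset.
    api_base = env.get("WEE_API_BASE") or PROVIDER_PRESETS[provider]
    return api_base, model_name
```

With an empty environment, `resolve_model("openrouter/meta-llama/llama-4-scout", env={})` returns the OpenRouter base URL and the model name `meta-llama/llama-4-scout`.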

Wee CLI (wee_cli.py)

  • Also Known As: wee — standalone terminal AI assistant

  • Description: A user-facing command-line tool for the Wee ecosystem. Similar in style to GitHub Copilot CLI, Claude Code CLI, and Codex CLI. Supports single-shot prompts, interactive REPL, stdin piping, and tool calling via any OpenAI-compatible backend.

  • Supported Backends: Same as Wee Native Runtime (Ollama, OpenRouter, LM Studio)

  • Quick Start:

    # Single-shot
    python3 wee_cli.py "What is the capital of France?"
    # Interactive REPL
    python3 wee_cli.py --interactive
    # Pipe from stdin
    echo "summarize this" | python3 wee_cli.py --model ollama/qwen3:8b
  • Key Flags:

    Flag            Short   Default               Description
    --model         -m      ollama/qwen3:8b       Model ID with provider prefix
    --permission    -p      restricted            Tool execution level: restricted / auto / elevated
    --output        -o      text                  Output format: text / json / markdown
    --tools         -t      off                   Enable tool calling (bash, python)
    --interactive   -i      off                   Enter interactive REPL mode
    --system        -s      none                  System prompt override
    --temperature   -T      none                  Sampling temperature
    --timeout       (none)  120s                  Request timeout
    --api-key       -k      env/keyring           API key override (prefer env var)
    --api-base      -b      auto                  Custom API base URL
    --config        (none)  ~/.wee/config.json    Config file path
  • Permission Levels:

    • restricted (default) — tool calls blocked; safe for untrusted input
    • auto — tool calls confirmed per invocation; suitable for interactive use
    • elevated — tool calls unrestricted; use in trusted automation
  • Output Formats:

    • text (default) — plain streamed output
    • json — full response as a JSON object {"response": "...", "model": "..."}
    • markdown — rich-rendered markdown via rich library (falls back to plain text)
  • Config File (~/.wee/config.json):

    {
      "model": "ollama/qwen3:8b",
      "system_prompt": "You are a helpful assistant",
      "tools": false,
      "permission": "restricted",
      "output_format": "text"
    }
  • Environment Variables:

    • WEE_MODEL — Default model (overridden by --model)
    • WEE_API_KEY — API key (prefer over --api-key to avoid exposure in ps aux)
    • WEE_API_BASE — API base URL override
  • Implementation: wee_cli.py (re-uses core from wee_runtime.py)

  • Issues: #158
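
The three permission levels can be read as a simple gate in front of every tool call. The sketch below shows one way such a gate might look; `may_run_tool` and the `confirm` callback are hypothetical, and `wee_cli.py` may implement the check differently.

```python
from typing import Callable


def may_run_tool(permission: str, tool_call: str,
                 confirm: Callable[[str], bool]) -> bool:
    """Decide whether a tool call may run under the given permission level."""
    if permission == "restricted":
        return False               # tool calls blocked
    if permission == "auto":
        return confirm(tool_call)  # ask once per invocation
    if permission == "elevated":
        return True                # unrestricted
    raise ValueError(f"unknown permission level: {permission!r}")
```

In an interactive session, `confirm` could wrap `input()`; in tests it can be a lambda that always answers yes or no.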

Security Considerations

⚠️ Warning: These configurations grant AI agents extensive system access:

  • Full file system access: Can read, modify, or delete any file your user can access
  • Command execution: Can run any shell command with your user privileges
  • No safety prompts: All operations execute automatically without confirmation

Best Practices:

  1. Use in controlled environments: Development containers, VMs, or sandboxed systems
  2. Regular backups: Maintain backups of critical files and directories
  3. Code review: Review AI-generated changes before committing to production
  4. Limit agent scope: Configure agents to work in specific project directories
  5. Monitor activity: Review session logs and agent outputs regularly

Recommended Use Cases:

  • ✅ Development and testing environments
  • ✅ Automated CI/CD pipelines in isolated containers
  • ✅ Personal projects with version control
  • ❌ Production systems without review
  • ❌ Shared systems with sensitive data
  • ❌ Public or untrusted environments

Configuration

Agent Configuration

The system loads agents from agents.json or a custom config file. Each agent represents a repository context where the AI CLI will operate.

Config Format:

{
  "agents": [
    {
      "name": "devops",
      "description": "DevOps and infrastructure management",
      "path": "/path/to/MyHomeDevops"
    },
    {
      "name": "projects",
      "description": "Software development projects",
      "path": "/path/to/projects"
    }
  ]
}

Configuration Fields:

  • name (required): Short identifier for the agent (used in /agent set commands)
  • description (required): Brief human-readable description of the agent
  • path (required): Full path to the repository or project directory
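
A loader for this format might check the three required fields before accepting an agent. This is a sketch under stated assumptions: `load_agents` and `REQUIRED_FIELDS` are hypothetical names, the function takes the JSON text directly so it can be tried without a file, and the project's own loader may be more lenient.

```python
import json

REQUIRED_FIELDS = ("name", "description", "path")


def load_agents(config_text: str) -> dict[str, dict]:
    """Parse agents.json content and index the agents by name."""
    data = json.loads(config_text)
    agents: dict[str, dict] = {}
    for entry in data.get("agents", []):
        # Treat a missing or empty required field as a configuration error.
        missing = [f for f in REQUIRED_FIELDS if not entry.get(f)]
        if missing:
            raise ValueError(f"agent entry is missing: {', '.join(missing)}")
        agents[entry["name"]] = entry
    return agents
```

To load from disk, pass `pathlib.Path("agents.json").read_text()` as the argument.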

Environment Configuration

⚠️ API_HOST Security Warning: Never set API_HOST=0.0.0.0. Doing so exposes the server on every network interface, including your LAN and any public NIC. Always bind to specific trusted interfaces (e.g. 127.0.0.1,<tailscale-ip>). See Network Binding & Secure Access.

The default agent, model, and runtime can be customized via environment variables. This is useful for:

  • Different users having different defaults
  • Docker container configuration
  • CI/CD pipeline customization
  • Development vs. production setups

Available Environment Variables:

# Default agent for new sessions
COPILOT_DEFAULT_AGENT=orchestrator        # Default: orchestrator

# Default model for new sessions  
COPILOT_DEFAULT_MODEL=gpt-5-mini          # Default: gpt-5-mini

# Default runtime for new sessions
COPILOT_DEFAULT_RUNTIME=copilot           # Default: copilot

Usage Examples:

# Set orchestrator as default
export COPILOT_DEFAULT_AGENT=orchestrator
export COPILOT_DEFAULT_RUNTIME=copilot

# Or set family agent with Claude runtime
export COPILOT_DEFAULT_AGENT=family
export COPILOT_DEFAULT_MODEL=claude-sonnet
export COPILOT_DEFAULT_RUNTIME=claude

# Run the agent
python3 agent_manager.py "Your prompt" "session_id"

Docker Example:

ENV COPILOT_DEFAULT_AGENT=orchestrator
ENV COPILOT_DEFAULT_MODEL=gpt-5-mini
ENV COPILOT_DEFAULT_RUNTIME=copilot

Reference Configuration:

Copy .env.example to .env and customize:

cp .env.example .env
# Edit .env with your defaults

When environment variables are not set, the system uses these hardcoded defaults:

  • Agent: orchestrator
  • Model: gpt-5-mini
  • Runtime: copilot
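
The fallback behaviour can be pictured as a lookup with defaults. The variable names and default values below are the ones documented above; `resolve_defaults` is a hypothetical helper, not necessarily how `agent_manager.py` reads them.

```python
import os

# Documented fallbacks used when the environment variables are not set.
HARDCODED_DEFAULTS = {
    "COPILOT_DEFAULT_AGENT": "orchestrator",
    "COPILOT_DEFAULT_MODEL": "gpt-5-mini",
    "COPILOT_DEFAULT_RUNTIME": "copilot",
}


def resolve_defaults(env=None) -> dict[str, str]:
    """Return the agent, model, and runtime for a new session."""
    env = os.environ if env is None else env
    return {
        "agent": env.get("COPILOT_DEFAULT_AGENT", HARDCODED_DEFAULTS["COPILOT_DEFAULT_AGENT"]),
        "model": env.get("COPILOT_DEFAULT_MODEL", HARDCODED_DEFAULTS["COPILOT_DEFAULT_MODEL"]),
        "runtime": env.get("COPILOT_DEFAULT_RUNTIME", HARDCODED_DEFAULTS["COPILOT_DEFAULT_RUNTIME"]),
    }
```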

Setup

  1. Copy the agent manager script:

    cp agent_manager.py /usr/local/bin/agent-manager
    chmod +x /usr/local/bin/agent-manager
  2. Configure your agents:

    • Copy agents.example.json to agents.json
    • Edit agents.json with your actual repository paths
    • Place agents.json in the same directory as the script or current working directory
  3. Optional: Specify config location via environment variable

    export AGENTS_CONFIG=/path/to/custom/agents.json

Usage

Command Line

The agent manager supports both positional arguments (for backwards compatibility) and named options for more flexibility.

Basic Usage (Positional Arguments)

python agent_manager.py "<prompt>" [session_id] [config_file]

Arguments:

  • prompt: The prompt/command to send to the AI CLI
  • session_id (optional): N8N session identifier for tracking conversations (default: "default")
  • config_file (optional): Path to agents.json config file

Examples:

# Basic usage
python agent_manager.py "List all files in the current directory"

# With session ID
python agent_manager.py "Continue debugging the issue" "session-123"

# With custom config file
python agent_manager.py "Deploy the app" "session-456" "/etc/agents.json"

Advanced Usage (Named Arguments)

python agent_manager.py [options] "<prompt>" [session_id]

Options:

Agent Options:

  • --agent NAME - Set the agent to use (e.g., devops, family, projects)
  • --list-agents - List all available agents and exit

Model Options:

  • --model NAME - Set the model to use (e.g., gpt-5, sonnet, gemini-1.5-pro)
  • --list-models - List all available models for current runtime and exit

Runtime Options:

  • --runtime NAME - Set the runtime to use (choices: copilot, opencode, claude, claude-agent-sdk, gemini, copilot-sdk, codex, devin)
  • --list-runtimes - List all available runtimes and exit

Configuration:

  • --config FILE or -c FILE - Path to agents.json configuration file

Examples:

# List available agents
python agent_manager.py --list-agents

# List available agents with custom config
python agent_manager.py --list-agents --config my-agents.json

# List available runtimes
python agent_manager.py --list-runtimes

# List available models
python agent_manager.py --list-models

# Set agent via CLI
python agent_manager.py --agent devops "Check server status"

# Set runtime and model via CLI
python agent_manager.py --runtime gemini --model gemini-1.5-pro "Analyze this code"

# Combine multiple options
python agent_manager.py --agent family --runtime claude --model sonnet "Find recipes for dinner"

# Use custom configuration file
python agent_manager.py --config /etc/my-agents.json --agent projects "Review pull requests"

# All options together
python agent_manager.py --config my-agents.json --agent devops --runtime claude --model haiku "Deploy to production" "session-123"

Getting Help:

python agent_manager.py --help
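
For readers who want to see how the options above fit together, the following `argparse` sketch accepts the same names. It is an approximation built from this section, not the parser shipped in `agent_manager.py`; the legacy third positional `config_file` and the `--list-*` handlers are left out for brevity.

```python
import argparse

RUNTIMES = ["copilot", "opencode", "claude", "claude-agent-sdk",
            "gemini", "copilot-sdk", "codex", "devin"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented command-line options."""
    parser = argparse.ArgumentParser(prog="agent_manager.py")
    parser.add_argument("prompt", nargs="?", help="prompt to send to the AI CLI")
    parser.add_argument("session_id", nargs="?", default="default")
    parser.add_argument("--agent", metavar="NAME")
    parser.add_argument("--model", metavar="NAME")
    parser.add_argument("--runtime", metavar="NAME", choices=RUNTIMES)
    parser.add_argument("-c", "--config", metavar="FILE")
    parser.add_argument("--list-agents", action="store_true")
    return parser
```

Parsing `["--agent", "devops", "--runtime", "claude", "Deploy to production", "session-123"]` fills in `agent`, `runtime`, `prompt`, and `session_id`; omitting the session ID falls back to `"default"`.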

Slash Commands

Interact with the agent manager using slash commands:

Bash Commands

!<command>                 # Execute bash command directly (e.g., !pwd, !ls -la)

Examples:

!pwd                       # Show current working directory
!echo "Hello World"        # Echo a message
!ls -lh                    # List files with details
!date                      # Show current date/time
!git status                # Run git commands
!python3 --version         # Check installed versions

Features:

  • Commands execute directly without hitting any AI runtime
  • 10-second timeout for safety
  • Runs in current working directory
  • Supports pipes, redirects, and command chaining (&&, ||, |)
  • Returns stdout/stderr output

Runtime Management

/runtime list              # Show available runtimes (e.g., copilot, opencode, claude, gemini)
/runtime set <runtime>     # Switch runtime (e.g., /runtime set gemini)
/runtime current           # Show current runtime

Model Management

/model list                # Show available models for current runtime
/model set "<model>"       # Switch model (e.g., /model set "claude-opus-4.5")
/model current             # Show current model

Agent Management

/agent list                # Show all available agents with descriptions
/agent set "<agent>"       # Switch to an agent (e.g., /agent set "projects")
/agent current             # Show current agent and its context

Session Management

/session reset             # Reset the current session (starts fresh next message)
/help                      # Show all available commands

Query Management

/status                    # Check status of running query for this session
/cancel                    # Cancel running query for this session

Query Tracking: When a query is executing, the agent manager tracks its process ID (PID), runtime, agent, and output. Use /status to check if a query is running and see recent output, or /cancel to terminate a long-running query.

Secrets Management

/secret set <name>         # Create/update a secret (value read from stdin)
/secret get <name>         # Retrieve a secret value
/secret list               # List all secret names (values redacted)
/secret delete <name>      # Remove a secret

Features:

  • Secrets stored securely via secret_tool.py (never exposed in shell history or LLM context)
  • Name validation: alphanumeric, dots, hyphens, underscores only (^[A-Za-z0-9._-]+$)
  • stdin-based input prevents secrets from appearing in command history
  • Pre-LLM dispatch — secrets never touch the AI model
  • Supported on all channels (Telegram, WebEx, Web UI)

Examples:

echo "my-db-password" | /secret set db_password
/secret get db_password    # Returns: my-db-password
/secret list               # Returns: db_password, api_key, github_token
/secret delete db_password

Programmatic Secret Access in AI Agents (wee_executor)

AI agents running in privileged modes can retrieve secrets programmatically via the get_secret() capability in wee_executor.py (F024).

When to use:

  • AI agents need secure access to credentials (API keys, database passwords) during task execution
  • Secrets must never be logged or exposed to LLM context
  • Only available in interactive and sync modes; blocked in background and api modes for security

Requirements:

  1. Elevation flag: Task must run with WEE_ELEVATED=true in the session environment
  2. Name validation: Secret names must match ^[A-Za-z0-9._-]+$ (alphanumeric, dot, hyphen, underscore)
  3. Mode restriction: Only callable from interactive or sync mode sessions

Capability signature:

# Called within an AI agent's context
get_secret(
    name: str,           # Secret name (e.g., "GITHUB_TOKEN")
    backend: str = "keyring"  # Storage backend: "keyring" or "file"
) -> Dict
# Returns: {status, name, backend, value} on success
#          {error, code} on failure (e.g., ELEVATION_REQUIRED, INVALID_NAME)

Agent Context Injection: When an agent runs with WEE_ELEVATED=true, agent_manager.py automatically injects get_secret() documentation and usage examples into the agent's context. The agent can then call get_secret() to retrieve secrets needed for the task.

Security:

  • 🔐 Elevation requirement: Prevents accidental secret access from untrusted agents
  • đŸ›Ąī¸ Name validation: Blocks path traversal attempts (e.g., ../etc/passwd rejected)
  • đŸšĢ Mode filtering: Only in interactive/sync modes; disabled in background/api for API call safety
  • 📋 Audit logging: All calls logged with name + backend; secret values never logged for compliance
  • âąī¸ Rate limiting: 50 requests/minute per session to prevent brute-force attacks

How it works:

  1. AI agent calls get_secret(name="GITHUB_TOKEN", backend="keyring")
  2. wee_executor.py validates the name and checks WEE_ELEVATED=true
  3. Subprocess delegates to secret_tool.py to retrieve the secret value
  4. Secret is returned to the agent but never written to logs
  5. Agent can use the secret for its task (e.g., authenticate to GitHub API)
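
The checks in step 2 (plus the mode restriction from the requirements) might look something like the sketch below. It is illustrative only: `check_secret_request` is a hypothetical name, the order of the checks is an assumption, and `MODE_NOT_ALLOWED` is an invented error code. `ELEVATION_REQUIRED` and `INVALID_NAME` are the codes mentioned above.

```python
import os
import re
from typing import Dict, Mapping, Optional

SECRET_NAME_RE = re.compile(r"[A-Za-z0-9._-]+")  # applied with fullmatch()
ALLOWED_MODES = {"interactive", "sync"}


def check_secret_request(
    name: str, mode: str, env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Validate a get_secret() call before any backend lookup (sketch)."""
    env = os.environ if env is None else env
    if mode not in ALLOWED_MODES:
        # Hypothetical code: the source does not name the mode error
        return {"error": "mode not permitted", "code": "MODE_NOT_ALLOWED"}
    if env.get("WEE_ELEVATED") != "true":
        return {"error": "elevation required", "code": "ELEVATION_REQUIRED"}
    if not SECRET_NAME_RE.fullmatch(name):
        # Rejects slashes, so "../etc/passwd" never reaches the backend
        return {"error": "invalid secret name", "code": "INVALID_NAME"}
    return {"status": "ok", "name": name}
```

Using `fullmatch()` rather than `match()` with `$` avoids accepting a name that ends in a newline.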

Example agent usage (conceptual):

Agent (with WEE_ELEVATED=true):
  "I need to push code to GitHub. Let me get my credentials."
  
  get_secret(name="GITHUB_TOKEN", backend="keyring")
  → {status: "success", name: "GITHUB_TOKEN", value: "ghp_...", backend: "keyring"}
  
  # Now the agent has the token and can authenticate API calls

Available backends:

  • keyring (default): System keyring (GNOME Keyring, Macos Keychain, etc.)
  • file: Encrypted JSON store (requires cryptography + python-keyring)

See docs/secret-tool.md for CLI and storage backend details.

N8N Integration

Use in an N8N workflow:

Basic N8N Integration (Positional Arguments)

// Execute the agent manager from N8N
const { execFile } = require('child_process');
const prompt = "Your prompt here";
const sessionId = "n8n_session_123";
const configFile = "/path/to/agents.json";

// execFile passes each argument separately, so quotes in the prompt cannot break the command
execFile('python', ['agent_manager.py', prompt, sessionId, configFile],
  (error, stdout, stderr) => {
    if (error) {
      console.error(error, stderr);
      return;
    }
    console.log(stdout);
  }
);

Advanced N8N Integration (Named Arguments)

// Execute with specific agent, runtime, and model
const { execFile } = require('child_process');
const agent = "devops";
const runtime = "claude";
const model = "sonnet";
const prompt = "Check production status";
const sessionId = "n8n_session_123";

// Arguments are passed as an array, so no shell quoting is needed
const args = ['agent_manager.py', '--agent', agent, '--runtime', runtime, '--model', model, prompt, sessionId];

execFile('python', args, (error, stdout, stderr) => {
  if (error) {
    console.error(error, stderr);
    return;
  }
  console.log(stdout);
});

List Agents from N8N

// Get available agents dynamically
const { execFile } = require('child_process');
const configFile = "/path/to/agents.json";

execFile('python', ['agent_manager.py', '--list-agents', '--config', configFile],
  (error, stdout, stderr) => {
    if (error) {
      console.error(error, stderr);
      return;
    }
    // Parse stdout to get agent list
    console.log(stdout);
  }
);

Session Management

Sessions are automatically tracked and stored in:

  • Copilot: ~/.copilot/n8n-session-map.json
  • OpenCode: ~/.opencode/n8n-session-map.json
  • Claude: ~/.claude/ (debug directory)
  • Gemini: ~/.gemini/sessions/

Each N8N session ID is mapped to:

  • A unique backend session ID (for resuming AI CLI sessions)
  • Current runtime (copilot/opencode/claude/claude-agent-sdk/gemini/copilot-sdk)
  • Current model
  • Current agent

Session data persists across requests, allowing multi-turn conversations.

Query Tracking

Running queries are tracked in ~/.copilot/running-queries.json with:

  • PID: Process ID for the running query
  • Runtime: Which AI runtime is executing the query
  • Agent: Which agent context is being used
  • Start Time: When the query started
  • Last Output: Recent output snippet (last 500 characters)

This enables the /status and /cancel commands to monitor and control long-running queries.
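
As a rough sketch of what one tracking entry and a liveness check could look like: the key names and both helper names below are assumptions (the source only lists the tracked fields), and the PID check is POSIX-only.

```python
import os
import time
from typing import Any, Dict

LAST_OUTPUT_CHARS = 500  # size of the "Last Output" snippet described above


def make_query_record(pid: int, runtime: str, agent: str, output: str) -> Dict[str, Any]:
    """Build one tracking entry; the key names here are assumptions."""
    return {
        "pid": pid,
        "runtime": runtime,
        "agent": agent,
        "start_time": time.time(),
        "last_output": output[-LAST_OUTPUT_CHARS:],  # keep only the tail
    }


def is_query_running(pid: int) -> bool:
    """Return True if a process with this PID still exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

A /status handler could combine the two: look up the record for the session, then report `last_output` only if the PID is still alive.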

Default Behavior

When creating a new session:

  • Runtime: copilot (use /runtime set to change)
  • Model: gpt-5-mini (Copilot) / opencode/gpt-5-nano (OpenCode) / haiku (Claude) / gemini-1.5-flash (Gemini)
  • Agent: devops (or first available agent from config)

Background Task Agent Isolation (#75)

When a background task is created without an explicit agent field, the system resolves the agent via get_default_agent() — never from an existing session. This prevents session agent leakage where a task dispatched from a specialized agent session (e.g., devops) would silently run under that agent instead of the system default.

Safe inherited fields (copied from existing same-identity sessions):

  • runtime — inherits the session's active runtime
  • model — inherits the session's active model
  • notification_preference — inherits notification routing preference

Never inherited from sessions:

  • agent — always resolved from the request body or system default

This guarantee is enforced in _compute_bg_task_defaults() via an explicit SAFE_FIELDS whitelist. Agent must be explicitly provided in the request body to override the default:

{
  "prompt": "Deploy the app",
  "agent": "devops"
}
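
The whitelist idea can be sketched as below. This is an illustrative stand-in, not the body of `_compute_bg_task_defaults()`: the function signature and dictionary shapes are assumptions. Only the three safe fields and the rule that the agent never comes from a session are taken from the text.

```python
from typing import Any, Dict, Optional

# Fields that may be copied from an existing same-identity session
SAFE_FIELDS = ("runtime", "model", "notification_preference")


def compute_bg_task_defaults(
    request: Dict[str, Any],
    session: Optional[Dict[str, Any]],
    default_agent: str,
) -> Dict[str, Any]:
    """Resolve background-task settings (hypothetical sketch)."""
    task: Dict[str, Any] = {"prompt": request["prompt"]}
    for field in SAFE_FIELDS:
        if field in request:
            task[field] = request[field]  # explicit values win
        elif session and field in session:
            task[field] = session[field]  # safe to inherit
    # The agent is never read from the session: request body or default only
    task["agent"] = request.get("agent") or default_agent
    return task
```

With this shape, a task dispatched from a `devops` session without an `agent` field still resolves to the system default.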

Advanced Features

Dynamic Agent Loading

Instead of hardcoding agent paths, the system:

  1. Looks for agents.json in the current directory
  2. Falls back to the script directory if not found
  3. Supports custom config paths via argument
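
A minimal sketch of that lookup order, assuming an explicit custom path takes precedence over both directories (the source lists the three locations but not their priority). The helper name `resolve_agents_config` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def resolve_agents_config(
    script_dir: Path,
    custom_path: Optional[str] = None,
    cwd: Optional[Path] = None,
) -> Path:
    """Pick the agents.json to load (hypothetical helper)."""
    if custom_path:
        return Path(custom_path)  # explicit --config style argument
    local = (cwd or Path.cwd()) / "agents.json"
    if local.is_file():
        return local  # found in the current directory
    return script_dir / "agents.json"  # fall back to the script directory
```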

Session Resumption

  • The system automatically detects and resumes existing sessions
  • If a session is lost or corrupted, it starts a fresh session automatically
  • Use /session reset to explicitly clear session state

Model Resolution

The system intelligently matches model names:

  • Exact matches (case-insensitive)
  • Substring/suffix matching
  • Latest version preference for ambiguous matches

Metadata Stripping

Automatically removes CLI metadata from output:

  • Thinking tags (<think>...</think>)
  • Token usage statistics
  • Session headers and banners

Session Memory Injection

Memory context is automatically injected at session creation time for all code paths:

  • When: Memory is injected once per session in build_agent_context_prompt() when the session is first created
  • What: MEMORY.md (persistent facts) and daily notes (today/yesterday timestamps) from memories/daily/
  • Scope: All session types — background tasks, interactive sessions, queued jobs, and promoted sessions
  • Single Injection: The memory_injected flag ensures context is prepended exactly once per session, preventing duplication
  • Sub-Task Handling: Sub-tasks created from within a background task (via origin_session_id) automatically skip re-injection
  • Fail-Silent: If memory files are missing, tasks continue without context (no errors)
  • No Wrapper Block: Memory sections are injected raw without [MEMORY CONTEXT] wrapper markers for cleaner output

This unified approach ensures all agents have access to relevant context without fragile prompt-based injection or code-path-specific handling.

Testing

A comprehensive test suite is included to ensure code quality and prevent regressions when making changes.

Running Tests

Stateless Query Endpoint

| Method | Path | Description |
|---|---|---|
| POST | /api/v1/query | One-shot stateless query endpoint |

POST /api/v1/query — Execute a single query without session management

A lightweight, ephemeral-session endpoint for programmatic AI queries. Perfect for CI/CD pipelines, scripts, and integrations that don't need persistent session state.

Request body (JSON):

{
  "prompt": "What is 2 + 2?",
  "runtime": "copilot",
  "model": "claude-haiku-4.5",
  "agent": "orchestrator",
  "timeout": 60
}

Response (200 OK):

{
  "response": "2 + 2 = 4",
  "runtime": "copilot",
  "model": "claude-haiku-4.5",
  "elapsed_ms": 2150
}

Parameters:

  • prompt (string, required) — The query or command to send to the AI runtime
  • runtime (string, required) — AI runtime: copilot, opencode, claude, gemini, or codex
  • model (string, required) — Model name or alias (e.g., claude-haiku-4.5, gpt-5-mini)
  • agent (string, optional) — Agent context to use (default: orchestrator)
  • timeout (integer, optional) — Query timeout in seconds (default: 60)

Error responses:

  • 400 Bad Request — Missing required fields (prompt, runtime, model) or invalid JSON
  • 401 Unauthorized — Missing or invalid Bearer token
  • 404 Not Found — Unknown runtime, model, or agent
  • 429 Too Many Requests — Rate limit exceeded (30 requests per minute per IP)
  • 504 Gateway Timeout — Query exceeded specified timeout

Features:

  • Stateless — No session created; ephemeral context cleaned up automatically after response
  • Rate Limited — 30 requests/minute per IP address (sliding window)
  • Full Control — Choose runtime, model, and agent per request
  • Security — Requires API authentication; executes with the authority of the calling user/token

Security:

  • Requires API authentication (Bearer token or shared-key validation)
  • Runs with the authority of the calling API user (rate-limited by IP)
  • Ephemeral sessions are not persisted or visible in session history
  • Input validation prevents agent/model traversal attacks

Memory Promotion

| Method | Path | Description |
|---|---|---|
| POST | /api/v1/memory/promote | Promote memory for a single agent (or orchestrator) |
| POST | /api/v1/memory/promote-all | Promote memory across all agents in agents.json |

POST /api/v1/memory/promote — Trigger memory promotion for a single agent

Consolidates daily notes (/memories/daily/*.md) into the agent's MEMORY.md using LLM analysis. Durable facts are elevated, duplicates removed, and the knowledge base refreshed.

Request body (JSON). The agent field is optional; if omitted, orchestrator memory is promoted:

{
  "agent": "devops"
}

Response (200 OK):

{
  "status": "ok",
  "agent": "devops",
  "agent_path": "/opt/MyHomeDevops",
  "stdout": "Promoted 8 facts from 3 daily notes...",
  "stderr": "",
  "returncode": 0
}

Error responses:

  • 401 Unauthorized — Missing or invalid Bearer token
  • 404 Not Found — Unknown agent name
  • 503 Service Unavailable — Memory promoter script not found
  • 504 Gateway Timeout — Promotion exceeded 120-second timeout
  • 500 Internal Server Error — Subprocess error or other failure

POST /api/v1/memory/promote-all — Trigger memory promotion for ALL agents

Iterates through every agent in agents.json (including orchestrator) and runs memory promotion for each. Handles partial failures gracefully — continues promotion for other agents if one fails.

Request body: Empty or omitted

Response (200 OK):

{
  "status": "ok",
  "total": 4,
  "succeeded": 4,
  "failed": 0,
  "results": [
    {
      "agent": "orchestrator",
      "agent_path": "/opt/memories",
      "status": "ok",
      "returncode": 0,
      "stdout": "..."
    },
    {
      "agent": "devops",
      "agent_path": "/opt/MyHomeDevops",
      "status": "ok",
      "returncode": 0,
      "stdout": "..."
    }
  ]
}

Security:

  • Both endpoints require API authentication (Bearer token or shared-key validation)
  • Memory promotion is read-only for daily notes, write-only to MEMORY.md
  • Agent path resolved from agents.json; prevents directory traversal

Helper Script: For scheduling memory promotion via the task scheduler or cron:

bash scripts/promote_all_agents_memory.sh

PATCH /api/v1/sessions/{id}/settings — Update session settings

Modify session-level settings like verbose mode (tool call visibility). Settings are persisted and returned in subsequent session queries.

Request body (JSON). Set silent_mode to false to show tool call lines, or true to hide them:

{
  "silent_mode": false
}

Response (200 OK):

{
  "id": "sess_abc123",
  "silent_mode": false,
  "created_at": "2026-04-03T20:00:00Z",
  "updated_at": "2026-04-03T21:05:42Z"
}

Error responses:

  • 401 Unauthorized — Missing or invalid Bearer token
  • 404 Not Found — Session does not exist
  • 422 Unprocessable Entity — Invalid value (e.g., non-boolean for silent_mode)

Features:

  • Whitelist-based field filtering — only recognized fields are accepted (currently: silent_mode)
  • WebUI toggle button in header reflects and controls this setting
  • Tool call lines (.tc-line) hidden when silent_mode=true, shown when false
  • Does not affect logging or session history — only visual display

Security:

  • Requires API authentication (Bearer token)
  • Per-session settings — each user session has independent configuration

🔧 Tool Call Visualization

Issue #115: Inline Expandable Tool Call Blocks

The WebUI now displays tool invocations with inline expandable blocks in the streaming panel. Each tool call shows a disclosure triangle (â–ļ); clicking expands a scrollable output pane with the full tool result, markdown formatting preserved.

Features:

  • ✅ Expandable blocks — Click â–ļ to expand/collapse tool output
  • ✅ Markdown rendering — Tool results support markdown (code blocks, lists, tables)
  • ✅ Error highlighting — Failed tool calls shown in red
  • ✅ Dark/light themes — CSS automatically adapts to UI theme
  • ✅ Silent mode integration — Tool blocks hidden when silent_mode=true
  • ✅ All runtimes supported — Works with copilot-sdk, claude-sdk, claude, and gemini

UI Behavior:

  • Tool started: Shows block with "Running ⌛" spinner
  • Tool completed: Output filled in, user can expand to view result
  • Tool error: Red highlight, error message displayed
  • Silent mode on: Blocks completely hidden from view

CSS Classes:

  • .tc-block — Container for tool call block
  • .tc-toggle — Expand/collapse button (â–ļ)
  • .tc-output — Scrollable output pane
  • .tc-error — Error state styling
  • .tc-expanded — Expanded state

Related Issues:

  • #115 — Inline Expandable Tool Call Blocks (QA Approved)
  • #87 — Streaming + Tool Call support for copilot-sdk and claude-sdk

💰 Token Usage Tracking & Cost Estimation

Issue #128: Token Usage Tracking + Cost Estimation + WebUI Footer

The WebUI now displays real-time token usage statistics in the footer after each message. It tracks cumulative prompt and completion tokens across all runtimes, calculates costs from per-model pricing, and displays a live usage summary.

Features:

  • ✅ Real-time tracking — Token counts updated after each message
  • ✅ Multi-runtime support — Tracks tokens across copilot-sdk, claude-sdk, openrouter, wee (Ollama/OpenRouter/LM Studio)
  • ✅ Cost estimation — Calculates costs based on current model pricing
  • ✅ Accuracy — Âą1% margin within expected pricing for all supported models
  • ✅ Session-level aggregation — Cumulative counts show total tokens and estimated costs for entire session
  • ✅ Per-message tracking — Individual message metadata includes token counts and partial costs
  • ✅ WebUI footer display — Live stats accessible without API calls (cached locally)

Displayed Metrics:

  • Prompt tokens: Total tokens in all input messages
  • Completion tokens: Total tokens in all model responses
  • Total tokens: Sum of prompt + completion tokens
  • Estimated cost: Calculated from per-model pricing (e.g., $0.15 per 1M input tokens)
  • Model pricing: Retrieved from token_calculator.py (based on published pricing)

Footer Display Format:

💰 Tokens: 1,234 prompt + 567 completion = 1,801 total | Est. cost: $0.023 | Model: claude-3.5-sonnet

Token Calculation Logic:

  1. Each runtime reports token usage after completing a message
  2. Tokens summed by type (prompt vs completion)
  3. Cost calculated: (prompt_tokens * model_input_price + completion_tokens * model_output_price) / 1_000_000
  4. Metadata stored in session history for audit/replay purposes
  5. Wee runtime strips internal __WEE_META__ before counting to avoid inflating token estimates
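
The cost formula in step 3 and the footer format shown earlier can be worked through in a few lines. This is a sketch rather than the contents of token_calculator.py: the `ModelPrice` type and both function names are assumptions, and any prices passed in are placeholders expressed in USD per 1M tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    input_per_million: float   # USD per 1M prompt tokens
    output_per_million: float  # USD per 1M completion tokens


def estimate_cost(prompt_tokens: int, completion_tokens: int, price: ModelPrice) -> float:
    """Apply the formula from step 3 above."""
    return (
        prompt_tokens * price.input_per_million
        + completion_tokens * price.output_per_million
    ) / 1_000_000


def format_footer(prompt_tokens: int, completion_tokens: int, cost: float, model: str) -> str:
    """Render a footer line in the format shown above."""
    total = prompt_tokens + completion_tokens
    return (
        f"💰 Tokens: {prompt_tokens:,} prompt + {completion_tokens:,} completion"
        f" = {total:,} total | Est. cost: ${cost:.3f} | Model: {model}"
    )
```

For example, at a placeholder price of $0.15 per 1M input tokens, one million prompt tokens and no completion tokens come to $0.15.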

Supported Models:

  • Claude (claude-sdk): claude-3.5-sonnet, claude-3-opus, claude-3-haiku
  • Copilot (copilot-sdk): GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo
  • OpenRouter: 200+ models with live pricing via OpenRouter API
  • Wee (Ollama): Ollama local models (token count via token_calculator.py estimate)
  • Wee (OpenRouter): Same as OpenRouter routing
  • Wee (LM Studio): LM Studio models (token estimate via calculator)

Related Issues:

  • #128 — Token Usage Tracking + Cost Estimation + WebUI Footer (QA Approved)
  • #91 — Background task permissions (Token tracking uses elevated permissions)

POST /api/v1/query — Stateless one-shot query endpoint

Execute a prompt without managing sessions. The endpoint creates an ephemeral session internally, runs the prompt, returns the result, and cleans up automatically. Ideal for evaluators, CI checks, and fire-and-forget queries.

| Method | Path | Description |
|---|---|---|
| POST | /api/v1/query | Execute a one-shot query; no session state retained |

Request body (JSON):

{
  "prompt": "What is 2 + 2?",
  "runtime": "copilot",
  "model": "claude-haiku-4.5",
  "agent": "orchestrator",
  "timeout": 120
}

| Field | Type | Required | Description |
|---|---|---|---|
| prompt | string | ✅ | Query text (max 10,000 characters) |
| runtime | string | No | Runtime to use: copilot, claude (default: copilot) |
| model | string | No | Model name; defaults to runtime's configured default |
| agent | string | No | Agent name from agents.json; defaults to orchestrator |
| timeout | integer | No | Execution timeout in seconds (default: 120) |

Response (200 OK — successful execution):

{
  "result": "4",
  "runtime": "copilot",
  "model": "claude-haiku-4.5",
  "agent": "orchestrator",
  "elapsed": 1.42
}

Error Detection (#67): When the runtime response contains a known error pattern, the endpoint returns the appropriate HTTP error status instead of 200 with error text:

| HTTP Status | Error Code | Triggers |
|---|---|---|
| 422 | model_not_found | ProviderModelNotFoundError, model not found, unknown model |
| 429 | rate_limit_exceeded | RateLimitError, rate limit, too many requests |
| 403 | permission_denied | PermissionDeniedError, permission denied, access denied |
| 401 | authentication_failed | AuthenticationError, invalid api key, authentication failed |
| 503 | service_unavailable | ServiceUnavailableError, service unavailable, temporarily unavailable |

Error response body (JSON):

{
  "detail": {
    "error": "model_not_found",
    "message": "ProviderModelNotFoundError: gemma4-26b not found (truncated to 500 chars)",
    "runtime": "opencode",
    "model": "gemma4-26b"
  }
}
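
The pattern table can be read as a simple lookup, sketched below. The trigger strings and status codes come from the table; case-insensitive substring matching, the top-to-bottom match order, and the function names are assumptions.

```python
from typing import Any, Dict, Optional, Tuple

# (HTTP status, error code, lowercase trigger substrings) from the table above
ERROR_PATTERNS = [
    (422, "model_not_found", ("providermodelnotfounderror", "model not found", "unknown model")),
    (429, "rate_limit_exceeded", ("ratelimiterror", "rate limit", "too many requests")),
    (403, "permission_denied", ("permissiondeniederror", "permission denied", "access denied")),
    (401, "authentication_failed", ("authenticationerror", "invalid api key", "authentication failed")),
    (503, "service_unavailable", ("serviceunavailableerror", "service unavailable", "temporarily unavailable")),
]


def classify_runtime_error(output: str) -> Optional[Tuple[int, str]]:
    """Return (status, code) for the first matching pattern, else None."""
    lowered = output.lower()
    for status, code, triggers in ERROR_PATTERNS:
        if any(trigger in lowered for trigger in triggers):
            return status, code
    return None


def error_detail(code: str, output: str, runtime: str, model: str) -> Dict[str, Any]:
    """Build the error body, truncating the message to 500 characters."""
    return {"detail": {"error": code, "message": output[:500], "runtime": runtime, "model": model}}
```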

Code Generation Improvements (#68): Additional handling for empty/null responses and connection errors:

| HTTP Status | Error Code | Triggers |
|---|---|---|
| 502 | empty_response | Null, empty, or whitespace-only runtime output |
| 502 | connection_refused | ECONNREFUSED — Model server not running (e.g., local Ollama/OpenCode instance down) |
| 502 | connection_reset | ECONNRESET or socket hang up — Server closed connection unexpectedly |
| 504 | connection_timeout | ETIMEDOUT — Model server slow or hung |

Additional processing (#68):

  • ANSI Stripping: ANSI escape codes (color, formatting) stripped from runtime output before error detection — prevents formatting codes from interfering with pattern matching in code generation scenarios
  • Example empty response:
    {
      "detail": {
        "error": "empty_response",
        "message": "Runtime returned empty/null output",
        "runtime": "opencode",
        "model": "gemma4-26b"
      }
    }
  • Example connection error:
    {
      "detail": {
        "error": "connection_refused",
        "message": "Error: connect ECONNREFUSED 127.0.0.1:5000",
        "runtime": "opencode",
        "code": "ECONNREFUSED"
      }
    }
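
A rough sketch of the #68 handling: strip ANSI sequences first, then check for empty output and connection errors. The status codes and error codes follow the table above; the regex, the check order, and the function name are assumptions.

```python
import re
from typing import Optional, Tuple

# Matches ANSI CSI sequences such as "\x1b[31m" (color) or "\x1b[0m" (reset)
ANSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")

CONNECTION_ERRORS = [
    ("ECONNREFUSED", 502, "connection_refused"),
    ("ECONNRESET", 502, "connection_reset"),
    ("socket hang up", 502, "connection_reset"),
    ("ETIMEDOUT", 504, "connection_timeout"),
]


def classify_transport_failure(raw_output: Optional[str]) -> Optional[Tuple[int, str]]:
    """Return (status, code) for empty output or connection errors, else None."""
    text = ANSI_RE.sub("", raw_output or "")  # strip formatting before matching
    if not text.strip():
        return 502, "empty_response"
    for needle, status, code in CONNECTION_ERRORS:
        if needle in text:
            return status, code
    return None
```

Stripping first matters: output consisting only of color codes would otherwise look non-empty.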

Other error responses:

  • 401 Unauthorized — Missing or invalid Bearer token
  • 422 Unprocessable Entity — prompt missing, exceeds 10,000 chars, or invalid field type
  • 429 Too Many Requests — Rate limit exceeded (30 requests/minute per IP)
  • 503 Service Unavailable — Session execution failed (non-error-pattern failure)

Security:

  • Requires API authentication (Bearer token)
  • Prompt validated to 10,000 character maximum
  • Rate-limited to 30 requests/minute per IP address
  • Ephemeral sessions cleaned up after execution regardless of success or failure

Example (curl):

curl -s -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_TOKEN" \
  -d '{"prompt": "What is 2 + 2?", "runtime": "copilot", "model": "claude-haiku-4.5"}'
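
The same call can be made from Python with only the standard library. This is a sketch: the base URL and token are placeholders, and both helper names are hypothetical. The path, headers, and body fields mirror the curl example.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_query_request(
    base_url: str,
    token: str,
    prompt: str,
    runtime: str = "copilot",
    model: Optional[str] = None,
) -> urllib.request.Request:
    """Prepare a POST /api/v1/query request without sending it."""
    payload: Dict[str, Any] = {"prompt": prompt, "runtime": runtime}
    if model:
        payload["model"] = model
    return urllib.request.Request(
        f"{base_url}/api/v1/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_query(request: urllib.request.Request, timeout: float = 130.0) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Non-2xx responses raise `urllib.error.HTTPError`, so a caller that wants the error body described above would catch that exception and read it from there.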

POST /api/v1/history/sessions/{session_id}/generate-title — LLM title generation

Force (re)generate a descriptive title for a session using an LLM or smart heuristic fallback. Useful when you want an immediate title refresh outside of the auto-trigger cycle.

| Method | Path | Description |
|---|---|---|
| POST | /api/v1/history/sessions/{session_id}/generate-title | Generate or refresh an LLM title for the specified session |

Response (200 OK — title generated):

{
  "session_id": "abc123",
  "title": "Kubernetes cluster health check",
  "source": "llm"
}

| Field | Type | Description |
|---|---|---|
| session_id | string | The session whose title was updated |
| title | string | The generated title (max 120 chars) |
| source | string | "llm" (Ollama or Anthropic) or "heuristic" (no LLM used) |

Error responses:

  • 401 Unauthorized — Missing or invalid Bearer token
  • 404 Not Found — Session does not exist or belongs to a different user
  • 400 Bad Request — Session has no messages (nothing to summarize)
  • 500 Internal Server Error — All title generation methods failed

Title generation cascade:

  1. Ollama (local, free) — POST {TITLE_GEN_OLLAMA_URL}/api/generate with model TITLE_GEN_MODEL
  2. Anthropic API — claude-haiku-4.5 when ANTHROPIC_API_KEY is set and Ollama is unavailable
  3. Smart heuristic — Extracts the first substantive user message, strips markdown/code/URLs, and truncates at a word boundary to 60 chars
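
The heuristic fallback in step 3 might be approximated as follows. Several details are assumptions: what counts as a "substantive" message (here, at least three characters and not a slash command), the exact markdown characters removed, the "Untitled session" fallback, and the function name.

```python
import re
from typing import List

FENCE = "`" * 3  # code-fence marker, built indirectly to keep this example tidy
FENCED_CODE_RE = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)


def heuristic_title(user_messages: List[str], max_len: int = 60) -> str:
    """Derive a short title from the first substantive user message (sketch)."""
    for message in user_messages:
        text = FENCED_CODE_RE.sub(" ", message)      # fenced code blocks
        text = re.sub(r"`[^`]*`", " ", text)         # inline code
        text = re.sub(r"https?://\S+", " ", text)    # URLs
        text = re.sub(r"[#*>\[\]()]", " ", text)     # markdown punctuation
        text = " ".join(text.split())                # collapse whitespace
        if len(text) < 3 or text.startswith("/"):
            continue  # assumed definition of "not substantive"
        if len(text) <= max_len:
            return text
        head = text[: max_len + 1]
        if " " in head:
            return head.rsplit(" ", 1)[0].rstrip()   # cut at a word boundary
        return text[:max_len]                        # one very long word
    return "Untitled session"  # hypothetical fallback
```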

Auto-generation behavior (background):

  • _maybe_auto_generate_title() is called non-blocking after every session response
  • First LLM title generated at â‰Ĩ 2 messages
  • Title refreshed every TITLE_REFRESH_INTERVAL messages (default: 10) if source is "llm"
  • User-set titles (title_source == "user") are never overwritten

Configuration (env vars):

| Variable | Default | Description |
|---|---|---|
| TITLE_GEN_OLLAMA_URL | http://192.168.1.101:11434 | Ollama API base URL |
| TITLE_GEN_MODEL | granite3.3-tuned | Ollama model for title generation |
| TITLE_REFRESH_INTERVAL | 10 | Messages between auto-refresh cycles |

# Force regenerate a title
curl -s -X POST http://localhost:8000/api/v1/history/sessions/abc123/generate-title \
  -H "Authorization: Bearer $API_TOKEN"

Quick Start

# Run all tests
./run_tests.sh

# Or using Python directly
python3 -m unittest discover -s tests -p "test_*.py" -v

Test Options

# Run with verbose output
./run_tests.sh -v

# Run specific test class
./run_tests.sh -t tests.test_agent_manager.TestSlashCommands

# Generate coverage report
./run_tests.sh -c

Test Coverage

The test suite includes 209 tests across multiple test files:

Orchestrator Core Tests

tests/test_agent_manager.py (62 tests) — core orchestrator functionality:

  • Session Management (5 tests) - Creating, resuming, and persisting sessions
  • Agent Configuration (4 tests) - Loading and managing agent configurations
  • Slash Commands (9 tests) - All interactive commands (/help, /runtime, /model, /agent, /session)
  • Query Tracking (8 tests) - Process tracking for /status and /cancel commands
  • Model Resolution (5 tests) - Converting model names/aliases to full IDs
  • Metadata Stripping (4 tests) - Cleaning CLI output from different runtimes
  • Agent Switching (3 tests) - Changing agents and session context
  • Session Existence (2 tests) - Checking session state file existence

tests/test_new_features.py (79 tests) — WebUI and scheduler features:

  • Auth / pairing flow — pairing code generation, session token validation
  • History Manager — per-user session history CRUD
  • File upload / download — upload endpoint, file serving, cleanup
  • Scheduler endpoints — create, list, get, update, delete, pause, resume, results, logs
  • Image search — DuckDuckGo image search integration
  • Rate limiting — per-IP sliding window

Wee Native Runtime Tests

tests/test_wee_runtime_agentic.py (68 tests) — wee_runtime.py agentic capabilities:

  • Model Resolution (12 tests) - Ollama/OpenRouter prefix stripping, preset resolution, cross-provider parametrization
  • Tool Definitions (6 tests) - Schema validation, tool registration, JSON schema correctness
  • Tool Execution (11 tests) - Bash/Python execution, error handling, output capture, timeouts
  • SSH Sanitization (5 tests) - Word-boundary validation, injection prevention (Issue #111)
  • CLI Argument Parsing (3 tests) - Flag handling, defaults, priority resolution
  • Tool-Calling Loop (4 tests) - Single/multi-round mocked flows, max rounds enforcement
  • Permission Levels (5 tests) - Restricted/auto/elevated access control
  • Streaming Output (2 tests) - Empty response handling, newline termination
  • Error Handling (4 tests) - API failures, malformed arguments, invalid API base, timeouts
  • Performance Baselines (2 tests) - Import time <1s, model resolution <100ms
  • Ollama Integration (7 tests) - Live connection, single/multi-turn chat, tool calling
  • OpenRouter Integration (7 tests) - Live connection, API key verification, tool calling

Test Results

All tests pass with minimal external dependencies:

Orchestrator: 141 tests, 0.185s
Wee Runtime: 61 passed, 7 skipped (OpenRouter key), 0 failures
Total: 202+ passed

Tests use mocking to isolate orchestrator functionality and avoid:

  • Executing real CLI commands (Copilot, OpenCode, Claude)
  • Modifying user's home directory
  • Making real API calls to runtime providers

Wee runtime tests support both mocked tool-calling loops and optional live integration with Ollama and OpenRouter.

Adding Tests

When adding new features to agent_manager.py:

  1. Add corresponding test cases to tests/test_agent_manager.py
  2. Run the full test suite to ensure no regressions
  3. Aim for high coverage of new functionality

For detailed testing documentation, see tests/README.md.

Web UI

Wee-Orchestrator ships a browser-based chat interface served at /ui by the API server.

Features

  • 🍀 Glassmorphism design — frosted-glass panels, animated background blobs, responsive layout
  • đŸ’Ŧ Chat panel — markdown rendering, syntax highlighting, image display (no overflow), clickable meta pills
  • ⚡ Streaming responses — AI output streams to the browser in real-time via SSE; a blinking cursor shows progress and the bubble is replaced with fully-rendered markdown when complete
  • âąī¸ Response generation timing — each assistant message displays how long it took to generate (format: "âąī¸ Generated in X.Xs"), helping you understand performance across different runtimes
  • 👤 @username display — shows @handle instead of raw numeric IDs in message headers
  • 🔍 Typeahead — /command highlighting and autocomplete in the input box
  • 📸 File uploads — drag-and-drop or click to attach images and files to messages
  • đŸ–ŧī¸ Auto image search — AI can trigger DuckDuckGo image searches; results are served inline
  • 📅 Scheduler panel — switch between Chat and Scheduler from the sidebar navigation (hidden when SCHEDULER_ENABLED=false)
    • Job list with status badges (active / paused / disabled)
    • Detail drawer with full job configuration
    • Create / edit form with agent, runtime, model, and mode (yolo / restricted) selectors
    • Daemon status badge showing scheduler health
    • Toast notifications for CRUD operations
  • 🔐 Pairing auth — 6-digit one-time code sent via Telegram or WebEx; no passwords

Accessing the UI

http://<host>:<port>/ui

Default port is set by API_PORT in .env (default 8000).

🔒 See Network Binding & Secure Access below for guidance on restricting which interfaces the server listens on.

Network Binding & Secure Access

âš ī¸ WARNING: Do NOT bind to 0.0.0.0

Binding to 0.0.0.0 exposes the API and Web UI on every network interface, including your LAN and any public-facing NIC. This server lets connected AI agents execute arbitrary shell commands and access the full file system. A malicious actor on your LAN or the internet could take over your machine.

Always restrict API_HOST to trusted interfaces only.

Recommended: Tailscale + Localhost

Set API_HOST in .env to a comma-separated list of the interfaces you want to bind (the server spawns a listener for each):

# ✅ GOOD — localhost and Tailscale only
API_HOST=127.0.0.1,100.x.x.x   # replace with your Tailscale IPv4 (tailscale ip -4)
API_PORT=8001

# ❌ BAD — exposes to entire LAN/internet
# API_HOST=0.0.0.0

After changing .env, restart the API service:

sudo systemctl restart agent-manager-api-dev.service
# Verify — should show ONLY 127.0.0.1 and Tailscale IP:
ss -tlnp | grep 8001

Accessing the Dev Environment Remotely

Option 1 – Tailscale (Recommended)

  1. Install Tailscale: https://tailscale.com/download
  2. Join the same Tailscale network (get invite key from admin)
  3. Access directly via Tailscale IP:
    http://100.x.x.x:8001/ui
    

Option 2 – SSH SOCKS Proxy

# Start SOCKS proxy (-f backgrounds it, -N means no command)
ssh -fN -D 1080 user@your-host

# Browser: configure SOCKS5 proxy  127.0.0.1:1080  (proxy DNS enabled)
# Then open: http://127.0.0.1:8001/ui

Firefox: Settings → Network Settings → Manual proxy → SOCKS Host 127.0.0.1 Port 1080 SOCKS v5 → ✓ Proxy DNS

Chrome/Edge:

google-chrome --proxy-server="socks5://127.0.0.1:1080"

Option 3 – SSH Port Forwarding (single port)

ssh -N -L 8001:127.0.0.1:8001 user@your-host
# Then open: http://localhost:8001/ui

Full details: docs/dev-access.md


Streaming (SSE)

Chat responses from the Web UI use POST /api/v1/sessions/{id}/stream instead of the blocking execute endpoint. The browser receives Server-Sent Events:

Event Payload Description
start {} Streaming bubble created in the UI
chunk {"text": "â€Ļ"} Raw stdout line from the AI CLI as it arrives
done {"response":"â€Ļ","runtime":"â€Ļ","model":"â€Ļ"} Final stripped response; bubble replaced with rendered markdown
error {"message":"â€Ļ"} On failure

Keepalive comments (: keepalive) are sent every second to prevent proxy/browser timeouts. Slash commands and bash commands (!) skip the chunk loop and emit start → done immediately. All other channels (Telegram, WebEx, N8N) use the original blocking endpoint — streaming is WebUI-only.
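
For clients other than the bundled Web UI, the event table above can be consumed with a small parser. This is a minimal sketch that assumes the server uses the standard SSE event: and data: fields with JSON payloads; parse_sse is a hypothetical helper, not part of the project.

```python
import json


def parse_sse(lines):
    """Yield (event, payload) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):  # comment line, e.g. ": keepalive"
            continue
        if line == "":  # a blank line terminates one event
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip(" "))


# Hand-written sample stream, not captured from a real server.
sample = [
    "event: start", "data: {}", "",
    ": keepalive",
    "event: chunk", 'data: {"text": "hello"}', "",
    "event: done",
    'data: {"response": "hello", "runtime": "copilot", "model": "gpt-5-mini"}', "",
]
events = list(parse_sse(sample))
```

Keepalive comments are dropped, so events holds exactly the start, chunk, and done entries in order.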

Task Scheduler

The built-in task scheduler (task_scheduler.py) runs AI jobs on a schedule without human interaction.

Feature flag: The scheduler can be fully disabled by setting SCHEDULER_ENABLED=false in .env. This removes all /api/v1/scheduler/* API endpoints and hides the Scheduler tab in the Web UI. See Feature Flags below.

Features

  • 📅 Natural-language schedules — in 10 minutes, every 2 hours, every day at 9am
  • 🔄 Recurring or one-shot jobs
  • 🤖 Per-job AI config — choose agent, runtime, model, and mode independently for each job
  • 🔔 Creator-targeted notifications — results sent back to the Telegram or WebEx user who created the job
  • 🔒 Per-user ACL — only allowed users (configured via SCHEDULER_ALLOWED_TELEGRAM / SCHEDULER_ALLOWED_WEBEX env vars) can create/manage jobs
  • â¸ī¸ Pause / Resume — temporarily disable jobs without deleting them
  • 📋 Results history — last N results stored per job, viewable via API or Web UI

Clock Drift Handling

The scheduler is resilient to system clock adjustments (NTP corrections, manual time changes, etc.). Five complementary mechanisms ensure consistent job execution:

  • Drift Detection — Compares wall-clock vs monotonic time each cycle. Logs warnings when drift exceeds 30 seconds with direction and magnitude.
  • Per-Job Monotonic Cooldown — Records monotonic time of last execution for each job. Prevents double-execution when a backward clock jump reschedules a job into an already-executed time slot.
  • Stale Job Recalculation — Recurring jobs more than 1 hour overdue get their next run advanced to the next future slot instead of executing stale runs. One-time jobs are never recalculated.
  • Drift-Aware Readiness Check — Applies all three guards before execution. Logs info when executing catchup runs.
  • Wall-Clock Debt Compensation (#71) — Tracks accumulated backward drift as a running debt. In each readiness check, compensated_now = now + debt expands the current-time window so jobs skipped during a backward jump are recovered automatically. Debt drains as the clock moves forward; capped at 600 seconds to prevent runaway compensation.
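
The drift detection and debt mechanisms can be sketched as follows. This is an illustration under stated assumptions, not the code in task_scheduler.py: the DriftTracker class, the 1-second jitter tolerance, and the rule that debt drains by elapsed monotonic time are all hypothetical. Only the 30-second warning threshold and the 600-second cap come from the text.

```python
from dataclasses import dataclass

DRIFT_WARN_SECONDS = 30.0  # warning threshold from the text
DEBT_CAP_SECONDS = 600.0   # compensation cap from the text
JITTER_SECONDS = 1.0       # assumption: ignore sub-second noise


@dataclass
class DriftTracker:
    last_wall: float
    last_mono: float
    debt: float = 0.0

    def observe(self, wall: float, mono: float) -> float:
        """Record one scheduler cycle and return the drift seen in it."""
        elapsed = mono - self.last_mono
        drift = (wall - self.last_wall) - elapsed  # negative = clock went back
        self.last_wall, self.last_mono = wall, mono
        if drift < -JITTER_SECONDS:
            self.debt = min(self.debt - drift, DEBT_CAP_SECONDS)
        else:
            # Assumption: debt drains by the real time that has passed.
            self.debt = max(0.0, self.debt - elapsed)
        return drift

    def compensated_now(self, wall: float) -> float:
        """Current time widened by the outstanding backward-drift debt."""
        return wall + self.debt


def should_warn(drift: float) -> bool:
    return abs(drift) > DRIFT_WARN_SECONDS
```

In real use the two inputs would come from time.time() and time.monotonic(); they are passed in here so the behaviour can be exercised with fixed numbers.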

Bottom line: If your system experiences a clock adjustment, the scheduler will:

  • Skip any jobs that have already been executed (monotonic cooldown)
  • Advance any recurring jobs that would be stale (1+ hour old)
  • Recover jobs missed during a backward clock jump (wall-clock debt compensation, up to 10 min)
  • Continue executing new jobs normally
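
The cooldown and stale-job rules in this summary can be expressed as one readiness check. The sketch below is an assumption-laden illustration: the Job fields, the 60-second cooldown value, and the is_ready function are hypothetical, and only the one-hour stale threshold is taken from the text.

```python
from dataclasses import dataclass
from typing import Optional

STALE_AFTER_SECONDS = 3600.0  # "more than 1 hour overdue" from the text


@dataclass
class Job:
    next_run: float                       # wall-clock time of the next run
    interval: Optional[float] = None      # seconds between runs; None = one-shot
    last_run_mono: Optional[float] = None
    cooldown: float = 60.0                # hypothetical minimum gap between runs


def is_ready(job: Job, compensated_now: float, mono: float) -> bool:
    """Return True if the job should run in this scheduler cycle."""
    # Monotonic cooldown blocks double execution after a backward clock jump.
    if job.last_run_mono is not None and mono - job.last_run_mono < job.cooldown:
        return False
    overdue = compensated_now - job.next_run
    # Stale recurring jobs skip ahead to the next future slot instead of running.
    if job.interval is not None and overdue > STALE_AFTER_SECONDS:
        missed = int(overdue // job.interval) + 1
        job.next_run += missed * job.interval
        return False
    return overdue >= 0
```

One-shot jobs have no interval, so they are never recalculated and still run when overdue.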

Drift Diagnostics: Call executor.get_drift_diagnostics() to inspect current compensation state:

{
    "wall_clock_debt_seconds": 15.3,     # accumulated backward drift (0 = inactive)
    "drift_compensation_active": True,   # True when debt > 0
    "drift_recovered_jobs": 4,           # total jobs recovered via compensation
    "recent_drift_events": [...],        # last 10 drift events (direction + magnitude)
    "compensation_cap_seconds": 600      # max compensation window
}

REST API Endpoints

Method Path Description
GET /api/v1/scheduler/status Daemon health / doctor report
GET /api/v1/scheduler/jobs List all jobs
POST /api/v1/scheduler/jobs Create a new job
GET /api/v1/scheduler/jobs/{id} Get job details
PUT /api/v1/scheduler/jobs/{id} Update a job
DELETE /api/v1/scheduler/jobs/{id} Delete a job
POST /api/v1/scheduler/jobs/{id}/pause Pause a job
POST /api/v1/scheduler/jobs/{id}/resume Resume a paused job
GET /api/v1/scheduler/jobs/{id}/results Retrieve execution results
GET /api/v1/scheduler/jobs/{id}/logs Retrieve execution logs
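
The endpoints in the table share one URL prefix and the same Bearer header, so a small request builder covers all of them. This sketch uses only the standard library; the helper name scheduler_request, the base URL, and the job ID job-123 are placeholders, not values from the project.

```python
import json
import urllib.request


def scheduler_request(base_url, token, method, path, body=None):
    """Build (but do not send) an authenticated scheduler API request."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        f"{base_url}/api/v1/scheduler{path}", data=data, method=method
    )
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


# Placeholder host, token, and job ID.
pause = scheduler_request(
    "http://127.0.0.1:8001", "<token>", "POST", "/jobs/job-123/pause"
)
```

Passing the result to urllib.request.urlopen would send it; building the request separately keeps the sketch free of network access.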

TODO Management

Method Path Description
GET /api/v1/todos Fetch TODOs from both GitHub Issues and flat files (deduplicated)
POST /api/v1/todos Create a new TODO in both GitHub Issues and flat file
POST /api/v1/todos/{title}/complete Complete/close a TODO in both sources

Dual-Source TODOs — Fetches from GitHub Issues (primary, labeled with todo) and flat files (fallback), automatically merged with deduplication by title. GitHub Issues take precedence on conflicts.
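
The merge rule can be pictured as a dictionary keyed by title, with GitHub entries written last so they win. This is a sketch of the idea only; merge_todos is a hypothetical name, and exact, case-sensitive title matching is an assumption.

```python
def merge_todos(github, flat):
    """Merge two TODO lists by title; GitHub entries win on conflict."""
    merged = {}
    for todo in flat:
        merged[todo["title"]] = todo
    for todo in github:  # applied second, so it overwrites flat-file entries
        merged[todo["title"]] = todo
    return list(merged.values())
```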

GET /api/v1/todos — Fetch all TODOs from GitHub Issues + flat files

Request parameters (query string):

?limit=50          # Number of TODOs to return (default: 100)
?offset=0          # Pagination offset (default: 0)
?source=all|github|flat    # Filter by source (default: all)

Response (200 OK):

{
  "todos": [
    {
      "id": "To1a2b3",
      "title": "Fix auth bug",
      "status": "open",
      "source": "github",
      "issue_number": 42,
      "labels": ["bug", "urgent"],
      "created_at": "2026-04-01T10:00:00Z"
    },
    {
      "id": "Ta4b5c6",
      "title": "Refactor database layer",
      "status": "open",
      "source": "flat",
      "created_at": "2026-04-02T14:30:00Z"
    }
  ],
  "total": 2,
  "offset": 0,
  "limit": 50
}

POST /api/v1/todos — Create a new TODO in both sources

Request body:

{
  "title": "Complete user auth flow",
  "due_date": "2026-04-15",
  "labels": ["backend", "security"],
  "details": "Implement JWT tokens and refresh logic"
}

Response (201 Created):

{
  "id": "To1a2b3",
  "title": "Complete user auth flow",
  "due_date": "2026-04-15",
  "labels": ["backend", "security"],
  "labels_stripped": [],
  "issue_number": 43,
  "source": "github+flat",
  "details": "Implement JWT tokens and refresh logic",
  "created_at": "2026-04-01T00:26:27Z"
}

Label Validation & Retry:

  • If provided labels don't exist in the GitHub repo, invalid labels are automatically stripped
  • Issue creation is retried without the invalid labels
  • labels_stripped field in response shows which labels were removed
  • If all labels are invalid, issue is created without the --label flag
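
The strip-and-retry behaviour amounts to partitioning the requested labels against the labels that exist in the repository. The helpers below are a hypothetical sketch: the real server may discover invalid labels from a failed call rather than from a known label list, and the comma-joined --label argument is an assumption about how the flag is built.

```python
def split_labels(requested, repo_labels):
    """Return (labels to send, labels_stripped) for the retry."""
    known = set(repo_labels)
    kept = [label for label in requested if label in known]
    stripped = [label for label in requested if label not in known]
    return kept, stripped


def label_args(labels):
    """Build CLI arguments; an empty list yields no --label flag at all."""
    return ["--label", ",".join(labels)] if labels else []
```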

Errors:

  • 400 Bad Request — Missing required title field or invalid JSON
  • 401 Unauthorized — Missing or invalid Bearer token
  • 409 Conflict — TODO with this title already exists (in either source)
  • 422 Unprocessable Entity — Path traversal detected (invalid characters in title)

POST /api/v1/todos/{title}/complete — Mark TODO as complete in both sources

Request: POST /api/v1/todos/Complete%20user%20auth%20flow/complete

Response (200 OK):

{
  "id": "To1a2b3",
  "title": "Complete user auth flow",
  "status": "closed",
  "github_issue_closed": 43,
  "flat_file_marked_done": true,
  "completed_at": "2026-04-05T15:45:00Z"
}

Errors:

  • 401 Unauthorized — Missing or invalid Bearer token
  • 404 Not Found — TODO with this title not found in either source
  • 500 Internal Server Error — Subprocess error closing GitHub Issue or updating flat file
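
Because the title travels in the URL path, it has to be percent-encoded by the client. A minimal sketch with the standard library follows; complete_path is a hypothetical helper name.

```python
from urllib.parse import quote


def complete_path(title: str) -> str:
    """Build the /complete path with the title percent-encoded."""
    return f"/api/v1/todos/{quote(title, safe='')}/complete"
```

For the title used above, complete_path("Complete user auth flow") returns /api/v1/todos/Complete%20user%20auth%20flow/complete.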

Security:

  • Path traversal protection: rejects /, \, .., and control characters in the title
  • Duplicate title detection prevents accidental overwrites
  • Authentication required: Bearer token or shared-key validation
  • Invalid label detection prevents API errors on malformed label names
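
The path traversal rule above can be approximated with a short predicate. This is a sketch, not the server's validator; is_safe_title is a hypothetical name, and treating an empty title as unsafe is an added assumption.

```python
def is_safe_title(title: str) -> bool:
    """Reject titles with a slash, a backslash, "..", or control characters."""
    if not title or "/" in title or "\\" in title or ".." in title:
        return False
    return not any(ord(ch) < 32 or ord(ch) == 127 for ch in title)
```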

Memory Promotion

Method Path Description
POST /api/v1/memory/promote Promote memory for a single agent (or orchestrator)
POST /api/v1/memory/promote-all Promote memory across all agents in agents.json

POST /api/v1/memory/promote — Trigger memory promotion for a single agent

Consolidates daily notes (/memories/daily/*.md) into the agent's MEMORY.md using LLM analysis. Durable facts are elevated, duplicates removed, and the knowledge base refreshed.

Request body (JSON). The agent field is optional; if omitted, orchestrator memory is promoted:

{
  "agent": "devops"
}

Response (200 OK):

{
  "status": "ok",
  "agent": "devops",
  "agent_path": "/opt/MyHomeDevops",
  "stdout": "Promoted 8 facts from 3 daily notes...",
  "stderr": "",
  "returncode": 0
}

Error responses:

  • 401 Unauthorized — Missing or invalid Bearer token
  • 404 Not Found — Unknown agent name
  • 503 Service Unavailable — Memory promoter script not found
  • 504 Gateway Timeout — Promotion exceeded 120-second timeout
  • 500 Internal Server Error — Subprocess error or other failure

POST /api/v1/memory/promote-all — Trigger memory promotion for ALL agents

Iterates through every agent in agents.json (including orchestrator) and runs memory promotion for each. Handles partial failures gracefully — continues promotion for other agents if one fails.

Request body: Empty or omitted

Response (200 OK):

{
  "status": "ok",
  "total": 4,
  "succeeded": 4,
  "failed": 0,
  "results": [
    {
      "agent": "orchestrator",
      "agent_path": "/opt/memories",
      "status": "ok",
      "returncode": 0,
      "stdout": "..."
    },
    {
      "agent": "devops",
      "agent_path": "/opt/MyHomeDevops",
      "status": "ok",
      "returncode": 0,
      "stdout": "..."
    }
  ]
}
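
Since promote-all continues past individual failures, callers may want to pull the failed agents out of the response. The sketch below assumes a failed entry carries a status other than "ok" or a non-zero returncode; the source only shows successful entries, so that shape is a guess, and failed_agents is a hypothetical helper.

```python
def failed_agents(response: dict) -> list[str]:
    """List agents whose promotion did not succeed in a promote-all response."""
    return [
        result["agent"]
        for result in response.get("results", [])
        if result.get("status") != "ok" or result.get("returncode", 0) != 0
    ]
```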

Security:

  • Both endpoints require API authentication (Bearer token or shared-key validation)
  • Memory promotion is read-only for daily notes, write-only to MEMORY.md
  • Agent path resolved from agents.json; prevents directory traversal

Helper Script: For scheduling memory promotion via the task scheduler or cron:

bash scripts/promote_all_agents_memory.sh

PATCH /api/v1/sessions/{id}/settings — Update session settings

Modify session-level settings such as silent mode (tool call visibility). Settings are persisted and returned in subsequent session queries.

Request body (JSON). Set silent_mode to false to show tool call lines, or true to hide them:

{
  "silent_mode": false
}

Response (200 OK):

{
  "id": "sess_abc123",
  "silent_mode": false,
  "created_at": "2026-04-03T20:00:00Z",
  "updated_at": "2026-04-03T21:05:42Z"
}

Error responses:

  • 401 Unauthorized — Missing or invalid Bearer token
  • 404 Not Found — Session does not exist
  • 422 Unprocessable Entity — Invalid value (e.g., non-boolean for silent_mode)

Features:

  • Whitelist-based field filtering — only recognized fields are accepted (currently: silent_mode)
  • WebUI toggle button in header reflects and controls this setting
  • Tool call lines (.tc-line) hidden when silent_mode=true, shown when false
  • Does not affect logging or session history — only visual display

Security:

  • Requires API authentication (Bearer token)
  • Per-session settings — each user session has independent configuration
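
Whitelist-based filtering with type checking can be sketched in a few lines. The names ALLOWED_SETTINGS and filter_settings are hypothetical, and silently dropping unknown fields (rather than rejecting the request) is an assumption; a ValueError here stands in for the 422 response.

```python
ALLOWED_SETTINGS = {"silent_mode": bool}  # the only field the text lists


def filter_settings(payload: dict) -> dict:
    """Keep whitelisted fields; raise ValueError on a wrong value type."""
    clean = {}
    for key, expected in ALLOWED_SETTINGS.items():
        if key in payload:
            if not isinstance(payload[key], expected):
                raise ValueError(f"{key} must be {expected.__name__}")
            clean[key] = payload[key]
    return clean
```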

Quick Start

# Create a daily summary job (via API)

curl -X POST http://localhost:8000/api/v1/scheduler/jobs \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Daily standup",
    "schedule": "every day at 9am",
    "agent": "devops",
    "runtime": "copilot",
    "model": "gpt-5-mini",
    "mode": "restricted",
    "task": "Summarise open pull requests and any failing CI jobs",
    "notify": true,
    "recurring": true
  }'

Data is stored in /opt/.task-scheduler/ (jobs.json, results/, logs/).

Feature Flags

Wee-Orchestrator exposes a public GET /api/v1/config endpoint that the Web UI reads at boot to determine which features to display. Backend routes for disabled features are never registered.

Variable Default Description
SCHEDULER_ENABLED true Enable/disable the Task Scheduler API and Web UI panel

Disabling the Scheduler

# In .env
SCHEDULER_ENABLED=false

Effects when false:

  • All /api/v1/scheduler/* endpoints return 404 (routes not registered)
  • The 📅 Scheduler tab is hidden from the Web UI sidebar before auth — it never appears
  • GET /api/v1/config returns {"scheduler_enabled": false} for the browser to act on

To re-enable, set SCHEDULER_ENABLED=true (or remove the variable) and restart the service.
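
The flag logic can be illustrated as follows. This is a sketch rather than the project's code: the helper names are hypothetical, and accepting 0, no, and off as false values is an assumption beyond the documented false.

```python
import os


def scheduler_enabled(env=os.environ) -> bool:
    """Read SCHEDULER_ENABLED; an unset variable means enabled (the default)."""
    value = env.get("SCHEDULER_ENABLED", "true").strip().lower()
    return value not in {"false", "0", "no", "off"}


def public_config(env=os.environ) -> dict:
    """Shape of the GET /api/v1/config payload described above."""
    return {"scheduler_enabled": scheduler_enabled(env)}
```

A server following this pattern would call scheduler_enabled() once at startup and register the /api/v1/scheduler/* routes only when it returns True.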

File Handling

Both the Telegram and WebEx connectors support sending and receiving files and images.

  • Receiving: files are downloaded to webex_downloads/ and injected into the agent context as a file path prompt
  • Sending: agents can produce local file paths that the connector uploads back to the user
  • Images: the Web UI serves AI-fetched images from /ai-media/ so the browser can render them inline

See WEBEX_FILE_HANDLING.md and FILE_MEDIA_HANDLING_SKILL.md for details.

Per-User Access Control

Agent & Model Pinning

Users can be locked to a specific agent, runtime, and model via pinned_users in the connector config:

"pinned_users": {
  "8193231291": {
    "agent": "family",
    "runtime": "copilot",
    "model": "gpt-5-mini"
  }
}

Pinned users cannot run /agent set — they receive a clear admin message. The pinned config is re-applied before every query, so even a session reset cannot bypass it.
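
Re-applying the pin before every query can be pictured as overlaying the pinned values on whatever the session currently holds. The apply_pin function and the session dictionary shape are assumptions for illustration.

```python
PINNED_USERS = {
    "8193231291": {"agent": "family", "runtime": "copilot", "model": "gpt-5-mini"},
}


def apply_pin(user_id: str, session: dict) -> dict:
    """Overwrite agent, runtime, and model with the pinned values, if any."""
    pin = PINNED_USERS.get(user_id)
    return {**session, **pin} if pin else session
```

Because the overlay runs on each query, any change a pinned user makes to the session is discarded the next time they send a message.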

Yolo Mode Restriction

By default all users may run /mode yolo. To restrict yolo access to a list of user IDs:

"yolo_allowed_users": ["8193231291", "9876543210"]

An empty list preserves the permissive default (all allowed).
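
The allow-list rule, including the permissive empty-list default, fits in one expression. The function name may_use_yolo is hypothetical.

```python
def may_use_yolo(user_id: str, yolo_allowed_users: list[str]) -> bool:
    """An empty allow-list means everyone may use yolo mode."""
    return not yolo_allowed_users or user_id in yolo_allowed_users
```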

File Structure

n8n-copilot-shim/
├── agent_manager.py           # Core: SessionManager, FastAPI app factory, all /api/v1/ endpoints
├── task_scheduler.py          # TaskScheduler class — schedule, pause, resume, results
├── telegram_connector.py      # Telegram bot long-polling connector
├── webex_connector.py         # WebEx webhook/RabbitMQ connector
├── agents.json                # Agent configuration (git-ignored)
├── agents.example.json        # Example configuration template
├── webui/
│   └── dist/                  # Built Web UI assets (index.html, app.js, app.css)
├── tests/
│   ├── test_agent_manager.py  # Core unit tests (62 tests)
│   └── test_new_features.py   # WebUI + scheduler feature tests (79 tests)
├── docs/plans/                # Planning docs
├── run_tests.sh               # Test runner
├── .testrc                    # Test configuration
├── .env.example               # Environment variable template
├── EXAMPLE_WORKFLOW.json      # N8N workflow example
├── ARCHITECTURE.md            # System architecture and Mermaid diagrams
├── RELEASE_NOTES.md           # Version history
└── README.md                  # This file

Architecture Summary

See ARCHITECTURE.md for full detail and Mermaid diagrams.

Key components:

Component Description
SessionManager Core AI execution engine — session state, slash commands, CLI dispatch, streaming queues
HistoryManager Per-user, per-channel chat history persistence
AuthManager Pairing-code auth, session token issuance, shared-key validation
RateLimiter Per-IP, per-endpoint sliding-window rate limiting
TaskScheduler Cron-like AI job scheduler embedded in the orchestrator (feature-flagged)
FastAPI app REST API (/api/v1/) + SSE streaming (/stream) + static Web UI mount (/ui)
TelegramConnector Long-polling Telegram bot → SessionManager bridge
WebEXConnector WebEx webhook / RabbitMQ → SessionManager bridge

Troubleshooting

Agents not loading

  • Check that agents.json exists in the script directory or current directory
  • Verify JSON syntax with python -m json.tool agents.json
  • Check file permissions

Session issues

  • Run /session reset to start fresh
  • Check session storage directories exist:
    • ~/.copilot/session-state/
    • ~/.local/share/opencode/storage/session/global/
    • ~/.claude/debug/
    • ~/.gemini/sessions/

Scheduler not running

  • Check SCHEDULER_JOBS_FILE path exists and is writable (/opt/.task-scheduler/jobs.json)
  • Verify the API server is running: sudo systemctl status agent-manager-api.service
  • Hit GET /api/v1/scheduler/status to see the daemon health report

CLI not found

  • Ensure copilot, opencode, claude, and gemini binaries are in PATH or at expected locations
  • Check /usr/bin/copilot, /usr/bin/claude, ~/.opencode/bin/opencode, and gemini in PATH

Web UI auth loop

  • Confirm the API server can reach your Telegram or WebEx bot to deliver the pairing code
  • Check PAIRING_CODE_TTL (default 300 s) — request a new code if it expired

Agent Orchestration

This project supports multi-agent orchestration with dynamic agent discovery.

Quick Agent Start

# List available agents
/agent list

# Switch to an agent
/agent set devops

# Execute in agent context
"Deploy the latest version"

# Resume agent session
"What's the status?"

# Switch to different agent
/agent set family

All agents are loaded dynamically from agents.json, enabling easy expansion and customization.

Telegram Connector

The Telegram connector bridges Telegram chat with your N8N Copilot Shim agents.

Features

  • đŸ’Ŧ Receive messages from Telegram users
  • 👤 User pairing by Telegram user ID
  • 🔐 User access control (whitelist/blacklist)
  • đŸŽ¯ Route to any configured agent
  • âš™ī¸ Per-user session management

Quick Start

# With environment variable
export TELEGRAM_BOT_TOKEN="your-token-here"
python telegram_connector.py

# Or with token argument
python telegram_connector.py --token "your-token-here"

Managing Users

# Allow specific user
python telegram_connector.py --token TOKEN --allow-user 123456789

# Deny user
python telegram_connector.py --token TOKEN --deny-user 123456789

# List allowed users
python telegram_connector.py --token TOKEN --list-users

See TELEGRAM_CONNECTOR.md for full documentation.

Contributing & Issue Tracking

GitHub Issues for Project Management

This project uses GitHub Issues as the single source of truth for all TODOs, feature requests, and bug reports.

Why GitHub Issues?

  • ✅ Centralized tracking across all sub-agents and features
  • ✅ Linked to code commits and pull requests
  • ✅ Searchable history of decisions and implementations
  • ✅ Clear ownership and assignment of work
  • ✅ Prioritization through labels and milestones

Issue Categories

We use labels to organize work:

Label Purpose Example
bug Bugs and defects "Message editing fails with 400 error"
feature New features "Add message reaction support"
enhancement Improvements to existing features "Improve error messages"
documentation Docs and guides "Add user guide for slash commands"
WebEX WebEX connector specific "Implement pinning in group rooms"
Telegram Telegram connector specific "Add Telegram reactions"
help wanted Open for contributions Any issue needing external help
blocked Blocked on external dependency "Waiting for WebEX API update"

Creating Issues

Before starting work, check for existing issues:

# View all open issues
gh issue list

# View WebEX-related issues
gh issue list --label WebEX

# View bugs
gh issue list --label bug

When NOT to Use TODO Comments

âš ī¸ Do NOT add TODO comments in code. Instead:

  1. Create a GitHub issue describing the work needed
  2. Reference the issue in commit messages: fix: resolve #42
  3. Assign ownership so it's tracked and visible
  4. Move to In Progress when you start work

Example:

# ❌ BAD - TODO in code
def pin_message(self, msg_id, room_id):
    # TODO: implement proper pinning when WebEX adds support
    pass

# ✅ GOOD - GitHub issue + clear code
def pin_message(self, msg_id, room_id):
    """Pin a message.

    Note: WebEX API doesn't support pinning in direct messages.
    See issue #42 for status on group room support.
    """
    pass

Outstanding Work

All outstanding work is tracked in GitHub Issues. Check the repository issues board to see:

  • In Progress - Work actively being done
  • Backlog - Planned but not started
  • Help Wanted - Open for contributions
  • Blocked - Waiting on dependencies

Start here: GitHub Issues

Release History

Version Changes Urgency Date
main@2026-05-08 Latest activity on main branch High 5/8/2026
v1.0.0 Latest release: v1.0.0 High 4/9/2026
