🍀 Wee-Orchestrator

One platform. Every AI. Any channel.

Wee-Orchestrator is a unified AI agent platform that lets you chat with any AI CLI runtime — GitHub Copilot, Claude Code, OpenCode, Google Gemini, or OpenAI Codex — from Telegram, WebEx, or a beautiful browser-based Web UI. Switch models, agents, and runtimes on the fly with slash commands. Schedule recurring AI tasks. Send files and images. All from one place.

✨ Why Wee-Orchestrator?

Problem	Wee-Orchestrator Solution
Juggling multiple AI tools and CLIs	One unified interface across 5 runtimes and 17+ models
AI is stuck in the terminal	Chat from anywhere — Telegram, WebEx, or the Web UI
No memory between sessions	Persistent sessions with full conversation history
Can't automate AI tasks	Built-in task scheduler with cron-like scheduling
One-size-fits-all agents	Multi-agent architecture — switch agents per task
Complex setup	Zero-config bot creation with the Starter Kit

📸 Screenshots

Chat Interface	Task Scheduler

Secure Pairing Login	Architecture Overview

🚀 Key Features

🔀 5 AI Runtimes — GitHub Copilot CLI, Claude Code, OpenCode, Google Gemini, OpenAI Codex
💬 3 Channels — Telegram bot, WebEx bot (via RabbitMQ), glassmorphism Web UI with SSE streaming
🤖 Multi-Agent — Define specialized agents in agents.json, switch with /agent; hot-reload on change (no restart needed)
🔄 Live Model Switching — Change models mid-conversation with /model
📅 Task Scheduler — Schedule recurring AI jobs with natural language (every day at 9am)
📁 File & Image Support — Upload, download, and inline images across all channels
🎤 Audio Transcription — Voice messages auto-transcribed via Whisper (OpenAI or local)
🔐 Secure Auth — Pairing-code login, per-user ACLs, agent/model pinning, yolo/restricted modes
📜 Session History — Full conversation persistence with search and resume
⚡ Background Tasks — Delegate long-running work to background agents with in-thread status updates
🔔 In-Thread Notifications — Real-time task lifecycle updates (queued → running → complete) in your conversation
📋 Dual-Source TODOs — Sync TODOs between GitHub Issues (primary) and flat files (fallback) with auto-deduplication
🔧 Expandable Tool Calls — View tool invocations with collapsible output panels in WebUI; markdown rendering, error highlighting, silent mode support
💰 Token Usage Tracking — Real-time tracking of prompt/completion tokens and cost estimation across all runtimes; live stats displayed in WebUI footer
🔌 Extensible Skills — Plugin architecture for adding capabilities (Cisco Meraki, Home Assistant, etc.)
**⚙️ Slash Command Registry — Pure-server commands that bypass the LLM for reduced latency; auto-registers with Telegram BotFather for autocomplete; built-in /secret command for secure credential management

🏗️ Architecture

  Telegram ──► TelegramConnector ──┐
                                   │
  WebEx ─────► WebEXConnector ─────┼──► SessionManager ──► AI CLI Runtimes
                                   │       │                (Copilot, Claude,
  Browser ───► FastAPI /api/v1 ────┘       │                 OpenCode, Gemini,
                                           │                 Codex)
                                    TaskScheduler

Each inbound message flows through a channel connector, into the shared SessionManager (which handles slash commands, session state, and agent routing), and out to the selected AI CLI runtime as a subprocess. Responses stream back in real time.

For the full component diagram, sequence diagrams, and deployment topology, see ARCHITECTURE.md.

📋 Overview

Wee-Orchestrator provides a flexible framework to:

Chat with AI agents from Telegram, WebEx, or the browser-based Web UI
Call AI CLIs (Copilot, OpenCode, Claude Code, Gemini, Codex) from N8N workflows
Maintain session affinity across multiple conversation turns
Switch between different agent repositories dynamically
Configure agents via JSON config files instead of hardcoding
Support multiple AI models and runtimes
Schedule recurring AI tasks with the built-in Task Scheduler
Execute bash commands directly with ! prefix
Send and receive files and images over Telegram and WebEx
Enforce per-user agent pinning, model pinning, and yolo/restricted mode ACLs

For release history and feature documentation see CHANGELOG.md and RELEASE_NOTES.md.

⚡ Quick Start

# 1. Clone the repo
git clone https://github.com/leprachuan/Wee-Orchestrator.git
cd Wee-Orchestrator

# 2. Install dependencies
pip install -r requirements.txt

# 3. Configure your environment
cp .env.example .env    # Edit with your API keys and bot tokens

# 4. Define your agents
vi agents.json           # Add your agent definitions

# 5. Start the API server
python3 agent_manager.py --api

# 6. (Optional) Start channel connectors
python3 telegram_connector.py   # Telegram bot
python3 webex_connector.py      # WebEx bot

Then open http://localhost:8000/ui in your browser and pair via Telegram or WebEx.

🚀 Want to create your own bot? Use the Wee-Orchestrator Starter Kit to scaffold one in minutes.

💬 Slash Commands

Command	Description
`/agent <name>`	Switch to a different agent
`/model <model>`	Change AI model mid-conversation
`/runtime <runtime>`	Switch AI runtime (copilot, claude, claude-agent-sdk, gemini, opencode, copilot-sdk, codex, devin)
`/timeout <seconds>`	Adjust execution timeout
`/status`	Check running task status
`/cancel`	Cancel the current running task
`/schedule list`	List all scheduled jobs
`/schedule add <name> \| <schedule> \| <task>`	Create a scheduled job
`/help`	Show all available commands

Bot Setup Guide

Wee-Orchestrator enables you to create custom bots — specialized AI agents with their own configuration, knowledge base, and capabilities. Each bot is a self-contained repository that can be integrated with Wee-Orchestrator.

🚀 New here? Use the Wee-Orchestrator Starter Kit to scaffold a new bot in minutes — includes AGENTS.md, skill management with security scanning, memory structure, and setup scripts.

What is a Bot?

A bot is a Git repository containing:

Core Configuration — An AGENTS.md file defining agent behavior, preferences, and runtime configurations
Knowledge Base — A memory/ directory using the PARA methodology (Projects, Areas, Resources, Archive) for organizing operational knowledge
Focus Areas — Organized folders for specific domains (e.g., email_triage/, smart_home/, infrastructure/)
Skills Integration — References to specialized skills from pot-o-skills or custom skills
Documentation — README, guides, and workflow documentation

Example Bot Structure

my-bot/
├── README.md                  # Bot overview & usage
├── AGENTS.md                  # Agent behavior & configuration
├── .env                       # Credentials (git-ignored)
├── .gitignore                 # Protect secrets
│
├── memory/                    # Knowledge base (PARA methodology)
│   ├── projects/              # Active multi-step initiatives
│   ├── areas/                 # Ongoing responsibility areas
│   ├── resources/             # Reference material & best practices
│   └── archive/               # Completed/deprecated items
│
├── skills/                    # Custom skill implementations
│   ├── custom-skill-1/
│   └── custom-skill-2/
│
└── domain-folders/            # Domain-specific organization
    ├── email/                 # Email processing
    ├── home-automation/       # Smart home tasks
    └── infrastructure/        # Infrastructure management

Key Components

AGENTS.md

Defines the bot's behavior, preferences, and runtime configuration:

Agent name, purpose, and timezone
Preferred models and runtimes (Claude, Copilot, Gemini)
Tool permissions and access control
Sub-agent delegation rules
Skill definitions and repository locations
Security and credential management

Example excerpt:

---
name: my-bot
runtime: copilot
model: gpt-5-sonnet
timezone: EST/EDT
---

## Behavior

- Preferred AI runtime: Claude > Copilot > Gemini
- Task routing: Delegate to specialized sub-agents for domain expertise
- Notification channel: Telegram

Memory Structure (PARA)

Organize knowledge for long-term retention and reuse:

Projects/ — Active multi-step work (e.g., home-automation-setup.md)
Areas/ — Ongoing responsibilities (e.g., orchestration.md, security.md)
Resources/ — Reference material (e.g., best-practices.md, api-docs.md)
Archive/ — Completed or deprecated knowledge

Skills

Skills extend your bot's capabilities by providing pre-built integrations with external APIs and services. Skills should be sourced from reputable, official repositories to minimize security risks.

Recommended Skill Sources

pot-o-skills — Community skills for cloud networking and security
- Repository: https://github.com/leprachuan/pot-o-skills
- Skills: Cisco Meraki, Cisco Security Cloud Control, and more
- Status: Public, open-source, actively maintained
- Usage: Clone and link into your bot's skills/ directory
Anthropic Official Skills — Official skills from Anthropic
- Repository: https://github.com/anthropics/skills
- Status: Official, production-ready
- Security: Vetted and maintained by Anthropic team
- Best for: Claude AI integration, code generation, analysis
Custom Skills — Implement your own domain-specific skills
- Location: ./skills/ directory in your bot repository
- Documentation: Must include SKILL.md, README, and examples
- Security: You control the code and updates

⚠️ Skills Security Guidelines

Skills have full access to your system — they can execute commands, read files, and call APIs. Follow these practices:

✅ Only use official skills from original software/service authors
- Example: Use Cisco's official Meraki skill, not community forks
- Example: Use Anthropic's official skills, not third-party versions
✅ Validate before installation
- Review the source code in the skill repository
- Check for hardcoded credentials or suspicious patterns
- Verify the repository is actively maintained
- Look for security issues reported in GitHub Issues
✅ Use trusted repositories
- Official repos (Anthropic, GitHub, etc.)
- Long-standing community projects with active maintainers
- Projects with security policies and issue tracking
- Avoid random GitHub repos without documentation or maintenance
⚠️ Audit custom skills carefully
- Never trust a skill without reviewing its code first
- Check for unintended API calls or data exfiltration
- Validate input sanitization
- Ensure credentials are handled safely
✅ Keep skills updated
- Periodically review and update to latest versions
- Subscribe to security advisories from skill repositories
- Remove unused skills to reduce attack surface

Using Skills in Your Bot

# Link public skills from pot-o-skills (verified, open-source)
ln -s /opt/pot-o-skills/cisco-meraki ./skills/
ln -s /opt/pot-o-skills/cisco-security-cloud-control ./skills/

# Link Anthropic official skills (verified, official)
ln -s /opt/anthropic-skills/code-analysis ./skills/
ln -s /opt/anthropic-skills/file-operations ./skills/

# Or implement custom skills in skills/ directory
mkdir skills/my-custom-skill

Discovering Skills

pot-o-skills: https://github.com/leprachuan/pot-o-skills

cd /opt && git clone https://github.com/leprachuan/pot-o-skills.git

Anthropic Skills: https://github.com/anthropics/skills

cd /opt && git clone https://github.com/anthropics/skills.git

Custom Community Skills: Search GitHub for topic:agent-skills with verification:
- ✅ Active maintenance (recent commits)
- ✅ Clear documentation
- ✅ Security policy file
- ✅ Public issue tracking

Domain Folders

Organize bot work by area of focus:

Keep related scripts, templates, and documentation together
Example: email/ for email processing, home/ for automation tasks
Each folder can have its own README with domain-specific guidance

Getting Started

💡 Recommended: Fork the Wee-Orchestrator Starter Kit instead of starting from scratch — it includes everything below pre-configured with best practices, security scanning, and setup scripts.

Create your bot repository:

mkdir my-bot && cd my-bot
git init
git remote add origin https://github.com/username/my-bot.git

Add AGENTS.md: Copy and customize the AGENTS.md template from Wee-Orchestrator with your bot's preferences

Create memory directory:

mkdir -p memory/{projects,areas,resources,archive}
echo "# Knowledge Base" > memory/INDEX.md

Add .env and .gitignore:

cp /opt/n8n-copilot-shim-dev/.env.example .env
echo ".env" >> .gitignore
echo "*.key" >> .gitignore
echo "secrets.json" >> .gitignore

Link or implement skills:

mkdir skills
ln -s /opt/pot-o-skills skills/cisco-meraki

Register with Wee-Orchestrator: Update Wee-Orchestrator's agents.json to include your bot:

{
  "agents": [
    {
      "name": "my-bot",
      "path": "/opt/my-bot",
      "enabled": true
    }
  ]
}

Best Practices

Secrets First: Store all credentials in .env (git-ignored), never commit secrets
Document Decisions: Use memory/areas/ to record architectural decisions and conventions
Skill Reuse: Leverage pot-o-skills before building custom skills
Domain Organization: Group related work into focused folders for maintainability
README Clarity: Each folder should have clear purpose and examples

Resources

Wee-Orchestrator: https://github.com/leprachuan/Wee-Orchestrator
pot-o-skills: https://github.com/leprachuan/pot-o-skills (Cisco Meraki, SCC, and more)
AGENTS.md Template: See ./AGENTS.md for full configuration reference

Requirements

This project requires one or more of the following AI CLI tools to be installed:

Claude Code CLI

Prerequisites:

Node.js 18+ (for npm installation) OR native binary support
Anthropic API key for authentication

Installation:

Native binary (recommended):

curl -fsSL https://claude.ai/install.sh | bash

Or via npm:

npm install -g @anthropic-ai/claude-code

Supported Systems: macOS 10.15+, Linux (Ubuntu 20.04+/Debian 10+, Alpine), Windows 10+ (via WSL)

Reference: Claude Code Quickstart Documentation

GitHub Copilot CLI

Prerequisites:

Node.js 22 or higher
Active GitHub Copilot subscription (Pro, Pro+, Business, or Enterprise plan)
GitHub account for authentication

Installation:

npm install -g @github/copilot
copilot  # Launch and authenticate

For authentication, use the /login command or set GH_TOKEN environment variable with a fine-grained PAT.

Supported Systems: macOS, Linux, Windows (via WSL)

Reference: GitHub Copilot CLI Installation Guide

OpenCode CLI

Prerequisites:

Node.js or compatible runtime

Installation (Recommended):

curl -fsSL https://opencode.ai/install | bash

Or via npm:

npm i -g opencode-ai@latest

Alternative package managers:

Homebrew: brew install opencode
Scoop (Windows): scoop bucket add extras && scoop install extras/opencode
Arch Linux: paru -S opencode-bin

Supported Systems: Windows, macOS, Linux

Reference: OpenCode Documentation

Google Gemini CLI

Prerequisites:

Python 3.7 or higher
Google Cloud account with Gemini API access
Google API key for authentication

Installation:

pip install google-generativeai
# Or using the CLI wrapper
pip install gemini-cli

Authentication:

Set your API key as an environment variable:

export GOOGLE_API_KEY='your-api-key-here'

Or configure it in your shell profile:

echo 'export GOOGLE_API_KEY="your-api-key-here"' >> ~/.bashrc
source ~/.bashrc

Supported Systems: Windows, macOS, Linux

Reference: Google Gemini API Documentation

Tool Permissions & Access Control

All AI runtimes in this system are configured with full tool access to enable read, write, and execute operations without approval prompts. This provides maximum automation capabilities.

Permission Configuration by Runtime

GitHub Copilot CLI

Flags Used: --allow-all-tools --allow-all-paths
Enables:
- All MCP tools and shell commands without approval
- Read/write/execute permissions for all files and directories
Security Note: Gives Copilot the same permissions as your user account

Claude Code CLI

Flags Used: --permission-mode bypassPermissions
Enables:
- Auto-approve all file edits, writes, and reads
- Execute shell commands without approval
- Access web/network tools without prompts
Also Known As: YOLO mode or dontAsk mode

OpenCode CLI

Configuration: Uses opencode.json file for permission settings
Required Setup:
1. Copy the example config: cp opencode.example.json opencode.json
2. Place opencode.json in your agent directories or project root
Permissions Enabled:
- edit: allow
- write: allow
- bash: allow
- read: allow
- webfetch: allow
Reference: OpenCode Permissions Documentation

Google Gemini CLI

Flags Used: --yolo
Enables:
- Read/write file operations without confirmation
- Shell command execution without approval
- All built-in tools with unrestricted access
Built-in Tools: read_file, write_file, run_shell_command

OpenAI Codex CLI

Flags Used: --dangerously-bypass-approvals-and-sandbox
Enables:
- Disables all approval prompts
- Removes sandbox restrictions (full file system access)
- Allows all shell commands and tools without confirmation
Security Note: Only use in trusted, controlled environments

Claude Agent SDK (Python)

Package: claude-agent-sdk>=0.1.0 (install via pip install claude-agent-sdk)
Enables:
- In-process async execution (no subprocess spawn)
- Structured error types (CLINotFoundError, CLIConnectionError, ProcessError)
- Native permission_mode field instead of CLI flags
- Session continuity via ResultMessage.session_id capture
Permission Modes:
- elevated → bypassPermissions (full access, no prompts)
- sandboxed → plan (read-only + approval for writes)
- restricted → default (standard safety checks)
Streaming: Real-time text chunks pushed to WebUI SSE consumers via _StreamBuffer
Tool Calls: ToolUseBlock/ToolResultBlock detection emits standardized tool_call events
Usage: /runtime set claude-agent-sdk
Issues: #77, #87, #91, #94

GitHub Copilot SDK (Python)

Package: github-copilot-sdk>=0.1.0 (install via pip install github-copilot-sdk)
Enables:
- In-process async execution via CopilotClient
- Real-time streaming via ASSISTANT_STREAMING_DELTA/ASSISTANT_MESSAGE_DELTA events
- Tool call tracking via TOOL_EXECUTION_START/COMPLETE and COMMAND_EXECUTE events
- Session resumption and structured error handling
Usage: /runtime set copilot-sdk
Issues: #76, #87, #91

Wee Native Runtime

Also Known As: wee — OpenAI-compatible API backend runtime
Description: Connects to any OpenAI-compatible API endpoint (Ollama, OpenRouter, LM Studio, etc.) without depending on external CLI tools like GitHub Copilot CLI, Claude Code, or OpenCode.
Supported Backends:
- Ollama at http://192.168.1.101:11434/v1 — local, free (Kubuntu)
- OpenRouter at https://openrouter.ai/api/v1 — cloud fallback, 100+ models
- LM Studio at http://localhost:1234/v1 — local alternative
Model Format: Uses provider/model_name prefix syntax for auto-resolving API base URL and API key:
- ollama/gemma4:e4b — Ollama on Kubuntu (default)
- openrouter/meta-llama/llama-4-scout — OpenRouter cloud
- lmstudio/qwen2.5-7b — LM Studio local

Configuration Example:

{
  "runtime": "wee",
  "model": "ollama/gemma4:e4b"
}

Environment Variables:
- WEE_API_BASE — Override API base URL (e.g., http://192.168.1.101:11434/v1)
- WEE_API_KEY — API key for authenticated endpoints (OpenRouter, etc.)
- WEE_DEFAULT_MODEL — Default model when model not specified in config
Features:
- In-process execution using OpenAI Python SDK
- Real-time SSE streaming to WebUI
- Provider presets auto-resolve API base URLs and API keys
- Graceful error handling with informative messages
- Background task subprocess execution via wee_runtime.py
Implementation: run_wee_native() in agent_manager.py; wee_runtime.py standalone CLI for background tasks
Usage: /runtime set wee
Features & Improvements:
- OpenRouter integration: Full UI support for cloud-based models with 300s cached discovery & keyring-based API key management (Issue #119)
- Model grouping in UI: Ollama and OpenRouter models displayed in separate dropdown optgroups
- Dynamic OpenRouter model discovery: Live catalog fetch from OpenRouter API with per-provider grouping (Issue #157)
Bug Fixes:
- Wrong Ollama port corrected: 11436 → 11434 (Issue #105)
- httpx.Timeout(connect=15s) and max_retries=0 added to OpenAI client for fast-fail on bad endpoints (Issue #105)
- Model resolution fixed: get_models_for_runtime('wee') returns flat strings; get_model_from_name() strips provider prefix (ollama/) and prefers exact/shortest match (Issue #105)
Bug Fixes (continued):
- OpenRouter 401 auth fixed: OPENROUTER_API_KEY env var + keyring resolution replaces silent 'ollama' fallback; raises clear error when no key found (Issue #153)
Issues: #88, #105, #119, #153, #157

Wee CLI (`wee_cli.py`)

Also Known As: wee — standalone terminal AI assistant
Description: A user-facing command-line tool for the Wee ecosystem. Similar in style to GitHub Copilot CLI, Claude Code CLI, and Codex CLI. Supports single-shot prompts, interactive REPL, stdin piping, and tool calling via any OpenAI-compatible backend.
Supported Backends: Same as Wee Native Runtime (Ollama, OpenRouter, LM Studio)

Quick Start:

# Single-shot
python3 wee_cli.py "What is the capital of France?"
# Interactive REPL
python3 wee_cli.py --interactive
# Pipe from stdin
echo "summarize this" | python3 wee_cli.py --model ollama/qwen3:8b

Key Flags:

Flag	Short	Default	Description
`--model`	`-m`	`ollama/qwen3:8b`	Model ID with provider prefix
`--permission`	`-p`	`restricted`	Tool execution level: `restricted` / `auto` / `elevated`
`--output`	`-o`	`text`	Output format: `text` / `json` / `markdown`
`--tools`	`-t`	off	Enable tool calling (bash, python)
`--interactive`	`-i`	off	Enter interactive REPL mode
`--system`	`-s`	none	System prompt override
`--temperature`	`-T`	none	Sampling temperature
`--timeout`		120s	Request timeout
`--api-key`	`-k`	env/keyring	API key override (prefer env var)
`--api-base`	`-b`	auto	Custom API base URL
`--config`		`~/.wee/config.json`	Config file path

Permission Levels:
- restricted (default) — tool calls blocked; safe for untrusted input
- auto — tool calls confirmed per invocation; suitable for interactive use
- elevated — tool calls unrestricted; use in trusted automation
Output Formats:
- text (default) — plain streamed output
- json — full response as a JSON object {"response": "...", "model": "..."}
- markdown — rich-rendered markdown via rich library (falls back to plain text)

Config File (~/.wee/config.json):

{
  "model": "ollama/qwen3:8b",
  "system_prompt": "You are a helpful assistant",
  "tools": false,
  "permission": "restricted",
  "output_format": "text"
}

Environment Variables:
- WEE_MODEL — Default model (overridden by --model)
- WEE_API_KEY — API key (prefer over --api-key to avoid exposure in ps aux)
- WEE_API_BASE — API base URL override
Implementation: wee_cli.py (re-uses core from wee_runtime.py)
Issues: #158

Security Considerations

⚠️ Warning: These configurations grant AI agents extensive system access:

Full file system access: Can read, modify, or delete any file your user can access
Command execution: Can run any shell command with your user privileges
No safety prompts: All operations execute automatically without confirmation

Best Practices:

Use in controlled environments: Development containers, VMs, or sandboxed systems
Regular backups: Maintain backups of critical files and directories
Code review: Review AI-generated changes before committing to production
Limit agent scope: Configure agents to work in specific project directories
Monitor activity: Review session logs and agent outputs regularly

Recommended Use Cases:

✅ Development and testing environments
✅ Automated CI/CD pipelines in isolated containers
✅ Personal projects with version control
❌ Production systems without review
❌ Shared systems with sensitive data
❌ Public or untrusted environments

Configuration

Agent Configuration

The system loads agents from agents.json or a custom config file. Each agent represents a repository context where the AI CLI will operate.

Config Format:

{
  "agents": [
    {
      "name": "devops",
      "description": "DevOps and infrastructure management",
      "path": "/path/to/MyHomeDevops"
    },
    {
      "name": "projects",
      "description": "Software development projects",
      "path": "/path/to/projects"
    }
  ]
}

Configuration Fields:

name (required): Short identifier for the agent (used in /agent set commands)
description (required): Brief human-readable description of the agent
path (required): Full path to the repository or project directory

Environment Configuration

⚠️ API_HOST Security Warning Never set API_HOST=0.0.0.0 — this exposes the server on every network interface including your LAN and any public NIC. Always bind to specific trusted interfaces (e.g. 127.0.0.1,<tailscale-ip>). See Network Binding & Secure Access.

The default agent, model, and runtime can be customized via environment variables. This is useful for:

Different users having different defaults
Docker container configuration
CI/CD pipeline customization
Development vs. production setups

Available Environment Variables:

# Default agent for new sessions
COPILOT_DEFAULT_AGENT=orchestrator        # Default: orchestrator

# Default model for new sessions  
COPILOT_DEFAULT_MODEL=gpt-5-mini          # Default: gpt-5-mini

# Default runtime for new sessions
COPILOT_DEFAULT_RUNTIME=copilot           # Default: copilot

Usage Examples:

# Set orchestrator as default
export COPILOT_DEFAULT_AGENT=orchestrator
export COPILOT_DEFAULT_RUNTIME=copilot

# Or set family agent with Claude runtime
export COPILOT_DEFAULT_AGENT=family
export COPILOT_DEFAULT_MODEL=claude-sonnet
export COPILOT_DEFAULT_RUNTIME=claude

# Run the agent
python3 agent_manager.py "Your prompt" "session_id"

Docker Example:

ENV COPILOT_DEFAULT_AGENT=orchestrator
ENV COPILOT_DEFAULT_MODEL=gpt-5-mini
ENV COPILOT_DEFAULT_RUNTIME=copilot

Reference Configuration:

Copy .env.example to .env and customize:

cp .env.example .env
# Edit .env with your defaults

When environment variables are not set, the system uses these hardcoded defaults:

Agent: orchestrator
Model: gpt-5-mini
Runtime: copilot

Setup

Copy the agent manager script:

cp agent_manager.py /usr/local/bin/agent-manager
chmod +x /usr/local/bin/agent-manager

Configure your agents:
- Copy agents.example.json to agents.json
- Edit agents.json with your actual repository paths
- Place agents.json in the same directory as the script or current working directory
Optional: Specify config location via environment variable
```
export AGENTS_CONFIG=/path/to/custom/agents.json
```

Usage

Command Line

The agent manager supports both positional arguments (for backwards compatibility) and named options for more flexibility.

Basic Usage (Positional Arguments)

python agent_manager.py "<prompt>" [session_id] [config_file]

Arguments:

prompt: The prompt/command to send to the AI CLI
session_id (optional): N8N session identifier for tracking conversations (default: "default")
config_file (optional): Path to agents.json config file

Examples:

# Basic usage
python agent_manager.py "List all files in the current directory"

# With session ID
python agent_manager.py "Continue debugging the issue" "session-123"

# With custom config file
python agent_manager.py "Deploy the app" "session-456" "/etc/agents.json"

Advanced Usage (Named Arguments)

python agent_manager.py [options] "<prompt>" [session_id]

Options:

Agent Options:

--agent NAME - Set the agent to use (e.g., devops, family, projects)
--list-agents - List all available agents and exit

Model Options:

--model NAME - Set the model to use (e.g., gpt-5, sonnet, gemini-1.5-pro)
--list-models - List all available models for current runtime and exit

Runtime Options:

--runtime NAME - Set the runtime to use (choices: copilot, opencode, claude, claude-agent-sdk, gemini, copilot-sdk, codex, devin)
--list-runtimes - List all available runtimes and exit

Configuration:

--config FILE or -c FILE - Path to agents.json configuration file

Examples:

# List available agents
python agent_manager.py --list-agents

# List available agents with custom config
python agent_manager.py --list-agents --config my-agents.json

# List available runtimes
python agent_manager.py --list-runtimes

# List available models
python agent_manager.py --list-models

# Set agent via CLI
python agent_manager.py --agent devops "Check server status"

# Set runtime and model via CLI
python agent_manager.py --runtime gemini --model gemini-1.5-pro "Analyze this code"

# Combine multiple options
python agent_manager.py --agent family --runtime claude --model sonnet "Find recipes for dinner"

# Use custom configuration file
python agent_manager.py --config /etc/my-agents.json --agent projects "Review pull requests"

# All options together
python agent_manager.py --config my-agents.json --agent devops --runtime claude --model haiku "Deploy to production" "session-123"

Getting Help:

python agent_manager.py --help

Slash Commands

Interact with the agent manager using slash commands:

Bash Commands

!<command>                 # Execute bash command directly (e.g., !pwd, !ls -la)

Examples:

!pwd                       # Show current working directory
!echo "Hello World"        # Echo a message
!ls -lh                    # List files with details
!date                      # Show current date/time
!git status                # Run git commands
!python3 --version         # Check installed versions

Features:

Commands execute directly without hitting any AI runtime
10-second timeout for safety
Runs in current working directory
Supports pipes, redirects, and command chaining (&&, ||, |)
Returns stdout/stderr output

Runtime Management

/runtime list              # Show available runtimes (copilot, opencode, claude, gemini)
/runtime set <runtime>     # Switch runtime (e.g., /runtime set gemini)
/runtime current           # Show current runtime

Model Management

/model list                # Show available models for current runtime
/model set "<model>"       # Switch model (e.g., /model set "claude-opus-4.5")
/model current             # Show current model

Agent Management

/agent list                # Show all available agents with descriptions
/agent set "<agent>"       # Switch to an agent (e.g., /agent set "projects")
/agent current             # Show current agent and its context

Session Management

/session reset             # Reset the current session (starts fresh next message)
/help                      # Show all available commands

Query Management

/status                    # Check status of running query for this session
/cancel                    # Cancel running query for this session

Query Tracking: When a query is executing, the agent manager tracks its process ID (PID), runtime, agent, and output. Use /status to check if a query is running and see recent output, or /cancel to terminate a long-running query.

Secrets Management

/secret set <name>         # Create/update a secret (value read from stdin)
/secret get <name>         # Retrieve a secret value
/secret list               # List all secret names (values redacted)
/secret delete <name>      # Remove a secret

Features:

Secrets stored securely via secret_tool.py (never exposed in shell history or LLM context)
Name validation: alphanumeric, dots, hyphens, underscores only (^[A-Za-z0-9._-]+$)
stdin-based input prevents secrets from appearing in command history
Pre-LLM dispatch — secrets never touch the AI model
Supported on all channels (Telegram, WebEx, Web UI)

Examples:

echo "my-db-password" | /secret set db_password
/secret get db_password    # Returns: my-db-password
/secret list               # Returns: db_password, api_key, github_token
/secret delete db_password

Programmatic Secret Access in AI Agents (wee_executor)

AI agents running in privileged modes can retrieve secrets programmatically via the get_secret() capability in wee_executor.py (F024).

When to use:

AI agents need secure access to credentials (API keys, database passwords) during task execution
Secrets must never be logged or exposed to LLM context
Only available in interactive and sync modes; blocked in background and api modes for security

Requirements:

Elevation flag: Task must run with WEE_ELEVATED=true in the session environment
Name validation: Secret names must match ^[A-Za-z0-9._-]+$ (alphanumeric, dot, hyphen, underscore)
Mode restriction: Only callable from interactive or sync mode sessions

Capability signature:

# Called within an AI agent's context
get_secret(
    name: str,           # Secret name (e.g., "GITHUB_TOKEN")
    backend: str = "keyring"  # Storage backend: "keyring" or "file"
) -> Dict
# Returns: {status, name, backend, value} on success
#          {error, code} on failure (e.g., ELEVATION_REQUIRED, INVALID_NAME)

Agent Context Injection: When an agent runs with WEE_ELEVATED=true, agent_manager.py automatically injects get_secret() documentation and usage examples into the agent's context. The agent can then call get_secret() to retrieve secrets needed for the task.

Security:

🔐 Elevation requirement: Prevents accidental secret access from untrusted agents
🛡️ Name validation: Blocks path traversal attempts (e.g., ../etc/passwd rejected)
🚫 Mode filtering: Only in interactive/sync modes; disabled in background/api for API call safety
📋 Audit logging: All calls logged with name + backend; secret values never logged for compliance
⏱️ Rate limiting: 50 requests/minute per session to prevent brute-force attacks

How it works:

AI agent calls get_secret(name="GITHUB_TOKEN", backend="keyring")
wee_executor.py validates the name and checks WEE_ELEVATED=true
Subprocess delegates to secret_tool.py to retrieve the secret value
Secret is returned to the agent but never written to logs
Agent can use the secret for its task (e.g., authenticate to GitHub API)

Example agent usage (conceptual):

Agent (with WEE_ELEVATED=true):
  "I need to push code to GitHub. Let me get my credentials."
  
  get_secret(name="GITHUB_TOKEN", backend="keyring")
  → {status: "success", name: "GITHUB_TOKEN", value: "ghp_...", backend: "keyring"}
  
  # Now the agent has the token and can authenticate API calls

Available backends:

keyring (default): System keyring (GNOME Keyring, Macos Keychain, etc.)
file: Encrypted JSON store (requires cryptography + python-keyring)

See docs/secret-tool.md for CLI and storage backend details.

N8N Integration

Use in an N8N workflow:

Basic N8N Integration (Positional Arguments)

// Execute the agent manager from N8N
const { exec } = require('child_process');
const prompt = "Your prompt here";
const sessionId = "n8n_session_123";
const configFile = "/path/to/agents.json";

exec(`python agent_manager.py "${prompt}" "${sessionId}" "${configFile}"`,
  (error, stdout, stderr) => {
    if (error) console.error(error);
    console.log(stdout);
  }
);

Advanced N8N Integration (Named Arguments)

// Execute with specific agent, runtime, and model
const { exec } = require('child_process');
const agent = "devops";
const runtime = "claude";
const model = "sonnet";
const prompt = "Check production status";
const sessionId = "n8n_session_123";

const cmd = `python agent_manager.py --agent ${agent} --runtime ${runtime} --model ${model} "${prompt}" "${sessionId}"`;

exec(cmd, (error, stdout, stderr) => {
  if (error) console.error(error);
  console.log(stdout);
});

List Agents from N8N

// Get available agents dynamically
const { exec } = require('child_process');
const configFile = "/path/to/agents.json";

exec(`python agent_manager.py --list-agents --config ${configFile}`,
  (error, stdout, stderr) => {
    if (error) console.error(error);
    // Parse stdout to get agent list
    console.log(stdout);
  }
);

Session Management

Sessions are automatically tracked and stored in:

Copilot: ~/.copilot/n8n-session-map.json
OpenCode: ~/.opencode/n8n-session-map.json
Claude: ~/.claude/ (debug directory)
Gemini: ~/.gemini/sessions/

Each N8N session ID is mapped to:

A unique backend session ID (for resuming AI CLI sessions)
Current runtime (copilot/opencode/claude/claude-agent-sdk/gemini/copilot-sdk)
Current model
Current agent

Session data persists across requests, allowing multi-turn conversations.

Query Tracking

Running queries are tracked in ~/.copilot/running-queries.json with:

PID: Process ID for the running query
Runtime: Which AI runtime is executing the query
Agent: Which agent context is being used
Start Time: When the query started
Last Output: Recent output snippet (last 500 characters)

This enables the /status and /cancel commands to monitor and control long-running queries.

Default Behavior

When creating a new session:

Runtime: copilot (use /runtime set to change)
Model: gpt-5-mini (Copilot) / opencode/gpt-5-nano (OpenCode) / haiku (Claude) / gemini-1.5-flash (Gemini)
Agent: devops (or first available agent from config)

Background Task Agent Isolation (#75)

When a background task is created without an explicit agent field, the system resolves the agent via get_default_agent() — never from an existing session. This prevents session agent leakage where a task dispatched from a specialized agent session (e.g., devops) would silently run under that agent instead of the system default.

Safe inherited fields (copied from existing same-identity sessions):

runtime — inherits the session's active runtime
model — inherits the session's active model
notification_preference — inherits notification routing preference

Never inherited from sessions:

agent — always resolved from the request body or system default

This guarantee is enforced in _compute_bg_task_defaults() via an explicit SAFE_FIELDS whitelist. Agent must be explicitly provided in the request body to override the default:

{
  "prompt": "Deploy the app",
  "agent": "devops"
}

Advanced Features

Dynamic Agent Loading

Instead of hardcoding agent paths, the system:

Looks for agents.json in the current directory
Falls back to the script directory if not found
Supports custom config paths via argument

Session Resumption

The system automatically detects and resumes existing sessions
If a session is lost or corrupted, it starts a fresh session automatically
Use /session reset to explicitly clear session state

Model Resolution

The system intelligently matches model names:

Exact matches (case-insensitive)
Substring/suffix matching
Latest version preference for ambiguous matches

Metadata Stripping

Automatically removes CLI metadata from output:

Thinking tags (<think>...</think>)
Token usage statistics
Session headers and banners

Session Memory Injection

Memory context is automatically injected at session creation time for all code paths:

When: Memory is injected once per session in build_agent_context_prompt() when the session is first created
What: MEMORY.md (persistent facts) and daily notes (today/yesterday timestamps) from memories/daily/
Scope: All session types — background tasks, interactive sessions, queued jobs, and promoted sessions
Single Injection: The memory_injected flag ensures context is prepended exactly once per session, preventing duplication
Sub-Task Handling: Sub-tasks created from within a background task (via origin_session_id) automatically skip re-injection
Fail-Silent: If memory files are missing, tasks continue without context (no errors)
No Wrapper Block: Memory sections are injected raw without [MEMORY CONTEXT] wrapper markers for cleaner output

This unified approach ensures all agents have access to relevant context without fragile prompt-based injection or code-path-specific handling.

Testing

A comprehensive test suite is included to ensure code quality and prevent regressions when making changes.

Running Tests

Stateless Query Endpoint

Method	Path	Description
`POST`	`/api/v1/query`	One-shot stateless query endpoint

POST /api/v1/query — Execute a single query without session management

A lightweight, ephemeral-session endpoint for programmatic AI queries. Perfect for CI/CD pipelines, scripts, and integrations that don't need persistent session state.

Request body (JSON):

{
  "prompt": "What is 2 + 2?",
  "runtime": "copilot",
  "model": "claude-haiku-4.5",
  "agent": "orchestrator",
  "timeout": 60
}

Response (200 OK):

{
  "response": "2 + 2 = 4",
  "runtime": "copilot",
  "model": "claude-haiku-4.5",
  "elapsed_ms": 2150
}

Parameters:

prompt (string, required) — The query or command to send to the AI runtime
runtime (string, required) — AI runtime: copilot, opencode, claude, gemini, or codex
model (string, required) — Model name or alias (e.g., claude-haiku-4.5, gpt-5-mini)
agent (string, optional) — Agent context to use (default: orchestrator)
timeout (integer, optional) — Query timeout in seconds (default: 60)

Error responses:

400 Bad Request — Missing required fields (prompt, runtime, model) or invalid JSON
401 Unauthorized — Missing or invalid Bearer token
404 Not Found — Unknown runtime, model, or agent
429 Too Many Requests — Rate limit exceeded (30 requests per minute per IP)
504 Gateway Timeout — Query exceeded specified timeout

Features:

Stateless — No session created; ephemeral context cleaned up automatically after response
Rate Limited — 30 requests/minute per IP address (sliding window)
Full Control — Choose runtime, model, and agent per request
Security — Requires API authentication; executes with the authority of the calling user/token

Security:

Requires API authentication (Bearer token or shared-key validation)
Runs with the authority of the calling API user (rate-limited by IP)
Ephemeral sessions are not persisted or visible in session history
Input validation prevents agent/model traversal attacks

Memory Promotion

Method	Path	Description
`POST`	`/api/v1/memory/promote`	Promote memory for a single agent (or orchestrator)
`POST`	`/api/v1/memory/promote-all`	Promote memory across all agents in agents.json

POST /api/v1/memory/promote — Trigger memory promotion for a single agent

Consolidates daily notes (/memories/daily/*.md) into the agent's MEMORY.md using LLM analysis. Durable facts are elevated, duplicates removed, and the knowledge base refreshed.

Request body (JSON):

{
  "agent": "devops"  // Optional — if omitted, promotes orchestrator memory
}

Response (200 OK):

{
  "status": "ok",
  "agent": "devops",
  "agent_path": "/opt/MyHomeDevops",
  "stdout": "Promoted 8 facts from 3 daily notes...",
  "stderr": "",
  "returncode": 0
}

Error responses:

401 Unauthorized — Missing or invalid Bearer token
404 Not Found — Unknown agent name
503 Service Unavailable — Memory promoter script not found
504 Gateway Timeout — Promotion exceeded 120-second timeout
500 Internal Server Error — Subprocess error or other failure

POST /api/v1/memory/promote-all — Trigger memory promotion for ALL agents

Iterates through every agent in agents.json (including orchestrator) and runs memory promotion for each. Handles partial failures gracefully — continues promotion for other agents if one fails.

Request body: Empty or omitted

Response (200 OK):

{
  "status": "ok",
  "total": 4,
  "succeeded": 4,
  "failed": 0,
  "results": [
    {
      "agent": "orchestrator",
      "agent_path": "/opt/memories",
      "status": "ok",
      "returncode": 0,
      "stdout": "..."
    },
    {
      "agent": "devops",
      "agent_path": "/opt/MyHomeDevops",
      "status": "ok",
      "returncode": 0,
      "stdout": "..."
    }
  ]
}

Security:

Both endpoints require API authentication (Bearer token or shared-key validation)
Memory promotion is read-only for daily notes, write-only to MEMORY.md
Agent path resolved from agents.json; prevents directory traversal

Helper Script: For scheduling memory promotion via the task scheduler or cron:

bash scripts/promote_all_agents_memory.sh

PATCH /api/v1/sessions/{id}/settings — Update session settings

Modify session-level settings like verbose mode (tool call visibility). Settings are persisted and returned in subsequent session queries.

Request body (JSON):

{
  "silent_mode": false  // Show tool call lines; set to true to hide
}

Response (200 OK):

{
  "id": "sess_abc123",
  "silent_mode": false,
  "created_at": "2026-04-03T20:00:00Z",
  "updated_at": "2026-04-03T21:05:42Z"
}

Error responses:

401 Unauthorized — Missing or invalid Bearer token
404 Not Found — Session does not exist
422 Unprocessable Entity — Invalid value (e.g., non-boolean for silent_mode)

Features:

Whitelist-based field filtering — only recognized fields are accepted (currently: silent_mode)
WebUI toggle button in header reflects and controls this setting
Tool call lines (.tc-line) hidden when silent_mode=true, shown when false
Does not affect logging or session history — only visual display

Security:

Requires API authentication (Bearer token)
Per-session settings — each user session has independent configuration

🔧 Tool Call Visualization

Issue #115: Inline Expandable Tool Call Blocks

The WebUI now displays tool invocations with inline expandable blocks in the streaming panel. Each tool call shows a disclosure triangle (▶); clicking expands a scrollable output pane with the full tool result, markdown formatting preserved.

Features:

✅ Expandable blocks — Click ▶ to expand/collapse tool output
✅ Markdown rendering — Tool results support markdown (code blocks, lists, tables)
✅ Error highlighting — Failed tool calls shown in red
✅ Dark/light themes — CSS automatically adapts to UI theme
✅ Silent mode integration — Tool blocks hidden when silent_mode=true
✅ All runtimes supported — Works with copilot-sdk, claude-sdk, claude, and gemini

UI Behavior:

Tool started: Shows block with "Running ⌛" spinner
Tool completed: Output filled in, user can expand to view result
Tool error: Red highlight, error message displayed
Silent mode on: Blocks completely hidden from view

CSS Classes:

.tc-block — Container for tool call block
.tc-toggle — Expand/collapse button (▶)
.tc-output — Scrollable output pane
.tc-error — Error state styling
.tc-expanded — Expanded state

Related Issues:

#115 — Inline Expandable Tool Call Blocks (QA Approved)
#87 — Streaming + Tool Call support for copilot-sdk and claude-sdk

💰 Token Usage Tracking & Cost Estimation

Issue #128: Token Usage Tracking + Cost Estimation + WebUI Footer

The WebUI now displays real-time token usage statistics in the footer after each message. Tracks cumulative prompt and completion tokens across all runtimes, calculates costs based on per-model pricing, and displays live usage summary.

Features:

✅ Real-time tracking — Token counts updated after each message
✅ Multi-runtime support — Tracks tokens across copilot-sdk, claude-sdk, openrouter, wee (Ollama/OpenRouter/LM Studio)
✅ Cost estimation — Calculates costs based on current model pricing
✅ Accuracy — ±1% margin within expected pricing for all supported models
✅ Session-level aggregation — Cumulative counts show total tokens and estimated costs for entire session
✅ Per-message tracking — Individual message metadata includes token counts and partial costs
✅ WebUI footer display — Live stats accessible without API calls (cached locally)

Displayed Metrics:

Prompt tokens: Total tokens in all input messages
Completion tokens: Total tokens in all model responses
Total tokens: Sum of prompt + completion tokens
Estimated cost: Calculated from per-model pricing (e.g., $0.15 per 1M input tokens)
Model pricing: Retrieved from token_calculator.py (based on published pricing)

Footer Display Format:

💰 Tokens: 1,234 prompt + 567 completion = 1,801 total | Est. cost: $0.023 | Model: claude-3.5-sonnet

Token Calculation Logic:

Each runtime reports token usage after completing a message
Tokens summed by type (prompt vs completion)
Cost calculated: (prompt_tokens * model_input_price + completion_tokens * model_output_price) / 1_000_000
Metadata stored in session history for audit/replay purposes
Wee runtime strips internal __WEE_META__ before counting to avoid inflating token estimates

Supported Models:

Claude (claude-sdk): claude-3.5-sonnet, claude-3-opus, claude-3-haiku
Copilot (copilot-sdk): GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo
OpenRouter: 200+ models with live pricing via OpenRouter API
Wee (Ollama): Ollama local models (token count via token_calculator.py estimate)
Wee (OpenRouter): Same as OpenRouter routing
Wee (LM Studio): LM Studio models (token estimate via calculator)

Related Issues:

#128 — Token Usage Tracking + Cost Estimation + WebUI Footer (QA Approved)
#91 — Background task permissions (Token tracking uses elevated permissions)

POST /api/v1/query — Stateless one-shot query endpoint

Execute a prompt without managing sessions. The endpoint creates an ephemeral session internally, runs the prompt, returns the result, and cleans up automatically. Ideal for evaluators, CI checks, and fire-and-forget queries.

Method	Path	Description
`POST`	`/api/v1/query`	Execute a one-shot query; no session state retained

Request body (JSON):

{
  "prompt": "What is 2 + 2?",
  "runtime": "copilot",
  "model": "claude-haiku-4.5",
  "agent": "orchestrator",
  "timeout": 120
}

Field	Type	Required	Description
`prompt`	string	✅	Query text (max 10,000 characters)
`runtime`	string	No	Runtime to use: `copilot`, `claude` (default: `copilot`)
`model`	string	No	Model name; defaults to runtime's configured default
`agent`	string	No	Agent name from `agents.json`; defaults to `orchestrator`
`timeout`	integer	No	Execution timeout in seconds (default: 120)

Response (200 OK — successful execution):

{
  "result": "4",
  "runtime": "copilot",
  "model": "claude-haiku-4.5",
  "agent": "orchestrator",
  "elapsed": 1.42
}

Error Detection (#67): When the runtime response contains a known error pattern, the endpoint returns the appropriate HTTP error status instead of 200 with error text:

HTTP Status	Error Code	Triggers
`422`	`model_not_found`	`ProviderModelNotFoundError`, `model not found`, `unknown model`
`429`	`rate_limit_exceeded`	`RateLimitError`, `rate limit`, `too many requests`
`403`	`permission_denied`	`PermissionDeniedError`, `permission denied`, `access denied`
`401`	`authentication_failed`	`AuthenticationError`, `invalid api key`, `authentication failed`
`503`	`service_unavailable`	`ServiceUnavailableError`, `service unavailable`, `temporarily unavailable`

Error response body (JSON):

{
  "detail": {
    "error": "model_not_found",
    "message": "ProviderModelNotFoundError: gemma4-26b not found (truncated to 500 chars)",
    "runtime": "opencode",
    "model": "gemma4-26b"
  }
}

Code Generation Improvements (#68): Additional handling for empty/null responses and connection errors:

HTTP Status	Error Code	Triggers
`502`	`empty_response`	Null, empty, or whitespace-only runtime output
`502`	`connection_refused`	`ECONNREFUSED` — Model server not running (e.g., local Ollama/OpenCode instance down)
`502`	`connection_reset`	`ECONNRESET` or `socket hang up` — Server closed connection unexpectedly
`504`	`connection_timeout`	`ETIMEDOUT` — Model server slow or hung

Additional processing (#68):

ANSI Stripping: ANSI escape codes (color, formatting) stripped from runtime output before error detection — prevents formatting codes from interfering with pattern matching in code generation scenarios

Example empty response:

{
  "detail": {
    "error": "empty_response",
    "message": "Runtime returned empty/null output",
    "runtime": "opencode",
    "model": "gemma4-26b"
  }
}

Example connection error:

{
  "detail": {
    "error": "connection_refused",
    "message": "Error: connect ECONNREFUSED 127.0.0.1:5000",
    "runtime": "opencode",
    "code": "ECONNREFUSED"
  }
}

Other error responses: Other error responses:

401 Unauthorized — Missing or invalid Bearer token
422 Unprocessable Entity — prompt missing, exceeds 10,000 chars, or invalid field type
429 Too Many Requests — Rate limit exceeded (30 requests/minute per IP)
503 Service Unavailable — Session execution failed (non-error-pattern failure)

Security:

Requires API authentication (Bearer token)
Prompt validated to 10,000 character maximum
Rate-limited to 30 requests/minute per IP address
Ephemeral sessions cleaned up after execution regardless of success or failure

Example (curl):

curl -s -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_TOKEN" \
  -d '{"prompt": "What is 2 + 2?", "runtime": "copilot", "model": "claude-haiku-4.5"}'

POST /api/v1/history/sessions/{session_id}/generate-title — LLM title generation

Force (re)generate a descriptive title for a session using an LLM or smart heuristic fallback. Useful when you want an immediate title refresh outside of the auto-trigger cycle.

Method	Path	Description
`POST`	`/api/v1/history/sessions/{session_id}/generate-title`	Generate or refresh an LLM title for the specified session

Response (200 OK — title generated):

{
  "session_id": "abc123",
  "title": "Kubernetes cluster health check",
  "source": "llm"
}

Field	Type	Description
`session_id`	string	The session whose title was updated
`title`	string	The generated title (max 120 chars)
`source`	string	`"llm"` (Ollama or Anthropic) or `"heuristic"` (no LLM used)

Error responses:

401 Unauthorized — Missing or invalid Bearer token
404 Not Found — Session does not exist or belongs to a different user
400 Bad Request — Session has no messages (nothing to summarize)
500 Internal Server Error — All title generation methods failed

Title generation cascade:

Ollama (local, free) — POST {TITLE_GEN_OLLAMA_URL}/api/generate with model TITLE_GEN_MODEL
Anthropic API — claude-haiku-4.5 when ANTHROPIC_API_KEY is set and Ollama is unavailable
Smart heuristic — Extracts first substantive user message, strips markdown/code/URLs, word-boundary truncate to 60 chars

Auto-generation behavior (background):

_maybe_auto_generate_title() is called non-blocking after every session response
First LLM title generated at ≥ 2 messages
Title refreshed every TITLE_REFRESH_INTERVAL messages (default: 10) if source is "llm"
User-set titles (title_source == "user") are never overwritten

Configuration (env vars):

Variable	Default	Description
`TITLE_GEN_OLLAMA_URL`	`http://192.168.1.101:11434`	Ollama API base URL
`TITLE_GEN_MODEL`	`granite3.3-tuned`	Ollama model for title generation
`TITLE_REFRESH_INTERVAL`	`10`	Messages between auto-refresh cycles

# Force regenerate a title
curl -s -X POST http://localhost:8000/api/v1/history/sessions/abc123/generate-title \
  -H "Authorization: Bearer $API_TOKEN"

Quick Start

# Run all tests
./run_tests.sh

# Or using Python directly
python3 -m unittest discover -s tests -p "test_*.py" -v

Test Options

# Run with verbose output
./run_tests.sh -v

# Run specific test class
./run_tests.sh -t tests.test_agent_manager.TestSlashCommands

# Generate coverage report
./run_tests.sh -c

Test Coverage

The test suite includes 209 tests across multiple test files:

Orchestrator Core Tests

tests/test_agent_manager.py (62 tests) — core orchestrator functionality:

Session Management (5 tests) - Creating, resuming, and persisting sessions
Agent Configuration (4 tests) - Loading and managing agent configurations
Slash Commands (9 tests) - All interactive commands (/help, /runtime, /model, /agent, /session)
Query Tracking (8 tests) - Process tracking for /status and /cancel commands
Model Resolution (5 tests) - Converting model names/aliases to full IDs
Metadata Stripping (4 tests) - Cleaning CLI output from different runtimes
Agent Switching (3 tests) - Changing agents and session context
Session Existence (2 tests) - Checking session state file existence

tests/test_new_features.py (79 tests) — WebUI and scheduler features:

Auth / pairing flow — pairing code generation, session token validation
History Manager — per-user session history CRUD
File upload / download — upload endpoint, file serving, cleanup
Scheduler endpoints — create, list, get, update, delete, pause, resume, results, logs
Image search — DuckDuckGo image search integration
Rate limiting — per-IP sliding window

Wee Native Runtime Tests

tests/test_wee_runtime_agentic.py (68 tests) — wee_runtime.py agentic capabilities:

Model Resolution (12 tests) - Ollama/OpenRouter prefix stripping, preset resolution, cross-provider parametrization
Tool Definitions (6 tests) - Schema validation, tool registration, JSON schema correctness
Tool Execution (11 tests) - Bash/Python execution, error handling, output capture, timeouts
SSH Sanitization (5 tests) - Word-boundary validation, injection prevention (Issue #111)
CLI Argument Parsing (3 tests) - Flag handling, defaults, priority resolution
Tool-Calling Loop (4 tests) - Single/multi-round mocked flows, max rounds enforcement
Permission Levels (5 tests) - Restricted/auto/elevated access control
Streaming Output (2 tests) - Empty response handling, newline termination
Error Handling (4 tests) - API failures, malformed arguments, invalid API base, timeouts
Performance Baselines (2 tests) - Import time <1s, model resolution <100ms
Ollama Integration (7 tests) - Live connection, single/multi-turn chat, tool calling
OpenRouter Integration (7 tests) - Live connection, API key verification, tool calling

Test Results

All tests pass with minimal external dependencies:

Orchestrator: 141 tests, 0.185s
Wee Runtime: 61 passed, 7 skipped (OpenRouter key), 0 failures
Total: 202+ passed

Tests use mocking to isolate orchestrator functionality and avoid:

Executing real CLI commands (Copilot, OpenCode, Claude)
Modifying user's home directory
Making real API calls to runtime providers

Wee runtime tests support both mocked tool-calling loops and optional live integration with Ollama and OpenRouter.

Adding Tests

When adding new features to agent_manager.py:

Add corresponding test cases to tests/test_agent_manager.py
Run the full test suite to ensure no regressions
Aim for high coverage of new functionality

For detailed testing documentation, see tests/README.md.

Web UI

Wee-Orchestrator ships a browser-based chat interface served at /ui by the API server.

Features

🍀 Glassmorphism design — frosted-glass panels, animated background blobs, responsive layout
💬 Chat panel — markdown rendering, syntax highlighting, image display (no overflow), clickable meta pills
⚡ Streaming responses — AI output streams to the browser in real-time via SSE; a blinking cursor shows progress and the bubble is replaced with fully-rendered markdown when complete
⏱️ Response generation timing — each assistant message displays how long it took to generate (format: "⏱️ Generated in X.Xs"), helping you understand performance across different runtimes
👤 @username display — shows @handle instead of raw numeric IDs in message headers
🔍 Typeahead — /command highlighting and autocomplete in the input box
📸 File uploads — drag-and-drop or click to attach images and files to messages
🖼️ Auto image search — AI can trigger DuckDuckGo image searches; results are served inline
📅 Scheduler panel — switch between Chat and Scheduler from the sidebar navigation (hidden when SCHEDULER_ENABLED=false)
- Job list with status badges (active / paused / disabled)
- Detail drawer with full job configuration
- Create / edit form with agent, runtime, model, and mode (yolo / restricted) selectors
- Daemon status badge showing scheduler health
- Toast notifications for CRUD operations
🔐 Pairing auth — 6-digit one-time code sent via Telegram or WebEx; no passwords

Accessing the UI

http://<host>:<port>/ui

Default port is set by API_PORT in .env (default 8000).

🔒 See Network Binding & Secure Access below for guidance on restricting which interfaces the server listens on.

Network Binding & Secure Access

⚠️ WARNING: Do NOT bind to 0.0.0.0

Binding to 0.0.0.0 exposes the API and Web UI on every network interface — including your LAN and any public-facing NIC. This server grants executing arbitrary shell commands and full file-system access to connected AI agents. A malicious actor on your LAN or internet could take over your machine.

Always restrict API_HOST to trusted interfaces only.

Recommended: Tailscale + Localhost

Set API_HOST in .env to a comma-separated list of the interfaces you want to bind (the server spawns a listener for each):

# ✅ GOOD — localhost and Tailscale only
API_HOST=127.0.0.1,100.x.x.x   # replace with your Tailscale IPv4 (tailscale ip -4)
API_PORT=8001

# ❌ BAD — exposes to entire LAN/internet
# API_HOST=0.0.0.0

After changing .env, restart the API service:

sudo systemctl restart agent-manager-api-dev.service
# Verify — should show ONLY 127.0.0.1 and Tailscale IP:
ss -tlnp | grep 8001

Accessing the Dev Environment Remotely

Option 1 – Tailscale (Recommended)

Install Tailscale: https://tailscale.com/download
Join the same Tailscale network (get invite key from admin)
Access directly via Tailscale IP:
```
http://100.x.x.x:8001/ui
```

Option 2 – SSH SOCKS Proxy

# Start SOCKS proxy (-f backgrounds it, -N means no command)
ssh -fN -D 1080 user@your-host

# Browser: configure SOCKS5 proxy  127.0.0.1:1080  (proxy DNS enabled)
# Then open: http://127.0.0.1:8001/ui

Firefox: Settings → Network Settings → Manual proxy → SOCKS Host 127.0.0.1 Port 1080 SOCKS v5 → ✓ Proxy DNS

Chrome/Edge:

google-chrome --proxy-server="socks5://127.0.0.1:1080"

Option 3 – SSH Port Forwarding (single port)

ssh -N -L 8001:127.0.0.1:8001 user@your-host
# Then open: http://localhost:8001/ui

Full details: docs/dev-access.md

Streaming (SSE)

Chat responses from the Web UI use POST /api/v1/sessions/{id}/stream instead of the blocking execute endpoint. The browser receives Server-Sent Events:

Event	Payload	Description
`start`	`{}`	Streaming bubble created in the UI
`chunk`	`{"text": "…"}`	Raw stdout line from the AI CLI as it arrives
`done`	`{"response":"…","runtime":"…","model":"…"}`	Final stripped response; bubble replaced with rendered markdown
`error`	`{"message":"…"}`	On failure

Keepalive comments (: keepalive) are sent every second to prevent proxy/browser timeouts. Slash commands and bash commands (!) skip the chunk loop and emit start → done immediately. All other channels (Telegram, WebEx, N8N) use the original blocking endpoint — streaming is WebUI-only.

Task Scheduler

The built-in task scheduler (task_scheduler.py) runs AI jobs on a schedule without human interaction.

Feature flag: The scheduler can be fully disabled by setting SCHEDULER_ENABLED=false in .env. This removes all /api/v1/scheduler/* API endpoints and hides the Scheduler tab in the Web UI. See Feature Flags below.

Features

📅 Natural-language schedules — in 10 minutes, every 2 hours, every day at 9am
🔄 Recurring or one-shot jobs
🤖 Per-job AI config — choose agent, runtime, model, and mode independently for each job
🔔 Creator-targeted notifications — results sent back to the Telegram or WebEx user who created the job
🔒 Per-user ACL — only allowed users (configured via SCHEDULER_ALLOWED_TELEGRAM / SCHEDULER_ALLOWED_WEBEX env vars) can create/manage jobs
⏸️ Pause / Resume — temporarily disable jobs without deleting them
📋 Results history — last N results stored per job, viewable via API or Web UI

Clock Drift Handling

The scheduler is resilient to system clock adjustments (NTP corrections, manual time changes, etc.). Five complementary mechanisms ensure consistent job execution:

Drift Detection — Compares wall-clock vs monotonic time each cycle. Logs warnings when drift exceeds 30 seconds with direction and magnitude.
Per-Job Monotonic Cooldown — Records monotonic time of last execution for each job. Prevents double-execution when a backward clock jump reschedules a job into an already-executed time slot.
Stale Job Recalculation — Recurring jobs more than 1 hour overdue get their next run advanced to the next future slot instead of executing stale runs. One-time jobs are never recalculated.
Drift-Aware Readiness Check — Applies all three guards before execution. Logs info when executing catchup runs.
Wall-Clock Debt Compensation (#71) — Tracks accumulated backward drift as a running debt. In each readiness check, compensated_now = now + debt expands the current-time window so jobs skipped during a backward jump are recovered automatically. Debt drains as the clock moves forward; capped at 600 seconds to prevent runaway compensation.

Bottom line: If your system experiences a clock adjustment, the scheduler will:

Skip any jobs that have already been executed (monotonic cooldown)
Advance any recurring jobs that would be stale (1+ hour old)
Recover jobs missed during a backward clock jump (wall-clock debt compensation, up to 10 min)
Continue executing new jobs normally

Drift Diagnostics: Call executor.get_drift_diagnostics() to inspect current compensation state:

{
    "wall_clock_debt_seconds": 15.3,     # accumulated backward drift (0 = inactive)
    "drift_compensation_active": True,   # True when debt > 0
    "drift_recovered_jobs": 4,           # total jobs recovered via compensation
    "recent_drift_events": [...],        # last 10 drift events (direction + magnitude)
    "compensation_cap_seconds": 600      # max compensation window
}

REST API Endpoints

Method	Path	Description
`GET`	`/api/v1/scheduler/status`	Daemon health / doctor report
`GET`	`/api/v1/scheduler/jobs`	List all jobs
`POST`	`/api/v1/scheduler/jobs`	Create a new job
`GET`	`/api/v1/scheduler/jobs/{id}`	Get job details
`PUT`	`/api/v1/scheduler/jobs/{id}`	Update a job
`DELETE`	`/api/v1/scheduler/jobs/{id}`	Delete a job
`POST`	`/api/v1/scheduler/jobs/{id}/pause`	Pause a job
`POST`	`/api/v1/scheduler/jobs/{id}/resume`	Resume a paused job
`GET`	`/api/v1/scheduler/jobs/{id}/results`	Retrieve execution results
`GET`	`/api/v1/scheduler/jobs/{id}/logs`	Retrieve execution logs

TODO Management

Method	Path	Description
`GET`	`/api/v1/todos`	Fetch TODOs from both GitHub Issues and flat files (deduplicated)
`POST`	`/api/v1/todos`	Create a new TODO in both GitHub Issues and flat file
`POST`	`/api/v1/todos/{title}/complete`	Complete/close a TODO in both sources

Dual-Source TODOs — Fetches from GitHub Issues (primary, labeled with todo) and flat files (fallback), automatically merged with deduplication by title. GitHub Issues take precedence on conflicts.

GET /api/v1/todos — Fetch all TODOs from GitHub Issues + flat files

Request parameters (query string):

?limit=50          # Number of TODOs to return (default: 100)
?offset=0          # Pagination offset (default: 0)
?source=all|github|flat    # Filter by source (default: all)

Response (200 OK):

{
  "todos": [
    {
      "id": "To1a2b3",
      "title": "Fix auth bug",
      "status": "open",
      "source": "github",
      "issue_number": 42,
      "labels": ["bug", "urgent"],
      "created_at": "2026-04-01T10:00:00Z"
    },
    {
      "id": "Ta4b5c6",
      "title": "Refactor database layer",
      "status": "open",
      "source": "flat",
      "created_at": "2026-04-02T14:30:00Z"
    }
  ],
  "total": 2,
  "offset": 0,
  "limit": 50
}

POST /api/v1/todos — Create a new TODO in both sources

Request body:

{
  "title": "Complete user auth flow",
  "due_date": "2026-04-15",
  "labels": ["backend", "security"],
  "details": "Implement JWT tokens and refresh logic"
}

Response (201 Created):

{
  "id": "To1a2b3",
  "title": "Complete user auth flow",
  "due_date": "2026-04-15",
  "labels": ["backend", "security"],
  "labels_stripped": [],
  "issue_number": 43,
  "source": "github+flat",
  "details": "Implement JWT tokens and refresh logic",
  "created_at": "2026-04-01T00:26:27Z"
}

Label Validation & Retry:

If provided labels don't exist in the GitHub repo, invalid labels are automatically stripped
Issue creation is retried without the invalid labels
labels_stripped field in response shows which labels were removed
If all labels are invalid, issue is created without the --label flag

Errors:

400 Bad Request — Missing required title field or invalid JSON
401 Unauthorized — Missing or invalid Bearer token
409 Conflict — TODO with this title already exists (in either source)
422 Unprocessable Entity — Path traversal detected (invalid characters in title)

POST /api/v1/todos/{title}/complete — Mark TODO as complete in both sources

Request: POST /api/v1/todos/Complete%20user%20auth%20flow/complete

Response (200 OK):

{
  "id": "To1a2b3",
  "title": "Complete user auth flow",
  "status": "closed",
  "github_issue_closed": 43,
  "flat_file_marked_done": true,
  "completed_at": "2026-04-05T15:45:00Z"
}

Errors:

401 Unauthorized — Missing or invalid Bearer token
404 Not Found — TODO with this title not found in either source
500 Internal Server Error — Subprocess error closing GitHub Issue or updating flat file

Security:

Path traversal protection: rejects /, \, .., and control characters in the title
Duplicate title detection prevents accidental overwrites
Authentication required: Bearer token or shared-key validation
Invalid label detection prevents API errors on malformed label names

Memory Promotion

Method	Path	Description
`POST`	`/api/v1/memory/promote`	Promote memory for a single agent (or orchestrator)
`POST`	`/api/v1/memory/promote-all`	Promote memory across all agents in agents.json

POST /api/v1/memory/promote — Trigger memory promotion for a single agent

Consolidates daily notes (/memories/daily/*.md) into the agent's MEMORY.md using LLM analysis. Durable facts are elevated, duplicates removed, and the knowledge base refreshed.

Request body (JSON):

{
  "agent": "devops"  // Optional — if omitted, promotes orchestrator memory
}

Response (200 OK):

{
  "status": "ok",
  "agent": "devops",
  "agent_path": "/opt/MyHomeDevops",
  "stdout": "Promoted 8 facts from 3 daily notes...",
  "stderr": "",
  "returncode": 0
}

Error responses:

401 Unauthorized — Missing or invalid Bearer token
404 Not Found — Unknown agent name
503 Service Unavailable — Memory promoter script not found
504 Gateway Timeout — Promotion exceeded 120-second timeout
500 Internal Server Error — Subprocess error or other failure

POST /api/v1/memory/promote-all — Trigger memory promotion for ALL agents

Iterates through every agent in agents.json (including orchestrator) and runs memory promotion for each. Handles partial failures gracefully — continues promotion for other agents if one fails.

Request body: Empty or omitted

Response (200 OK):

{
  "status": "ok",
  "total": 4,
  "succeeded": 4,
  "failed": 0,
  "results": [
    {
      "agent": "orchestrator",
      "agent_path": "/opt/memories",
      "status": "ok",
      "returncode": 0,
      "stdout": "..."
    },
    {
      "agent": "devops",
      "agent_path": "/opt/MyHomeDevops",
      "status": "ok",
      "returncode": 0,
      "stdout": "..."
    }
  ]
}

Security:

Both endpoints require API authentication (Bearer token or shared-key validation)
Memory promotion is read-only for daily notes, write-only to MEMORY.md
Agent path resolved from agents.json; prevents directory traversal

Helper Script: For scheduling memory promotion via the task scheduler or cron:

bash scripts/promote_all_agents_memory.sh

PATCH /api/v1/sessions/{id}/settings — Update session settings

Modify session-level settings like verbose mode (tool call visibility). Settings are persisted and returned in subsequent session queries.

Request body (JSON):

{
  "silent_mode": false  // Show tool call lines; set to true to hide
}

Response (200 OK):

{
  "id": "sess_abc123",
  "silent_mode": false,
  "created_at": "2026-04-03T20:00:00Z",
  "updated_at": "2026-04-03T21:05:42Z"
}

Error responses:

401 Unauthorized — Missing or invalid Bearer token
404 Not Found — Session does not exist
422 Unprocessable Entity — Invalid value (e.g., non-boolean for silent_mode)

Features:

Whitelist-based field filtering — only recognized fields are accepted (currently: silent_mode)
WebUI toggle button in header reflects and controls this setting
Tool call lines (.tc-line) hidden when silent_mode=true, shown when false
Does not affect logging or session history — only visual display

Security:

Requires API authentication (Bearer token)
Per-session settings — each user session has independent configuration

Quick Start

# Create a daily summary job (via API)

curl -X POST http://localhost:8000/api/v1/scheduler/jobs \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Daily standup",
    "schedule": "every day at 9am",
    "agent": "devops",
    "runtime": "copilot",
    "model": "gpt-5-mini",
    "mode": "restricted",
    "task": "Summarise open pull requests and any failing CI jobs",
    "notify": true,
    "recurring": true
  }'

Data is stored in /opt/.task-scheduler/ (jobs.json, results/, logs/).

Feature Flags

Wee-Orchestrator exposes a public GET /api/v1/config endpoint that the Web UI reads at boot to determine which features to display. Backend routes for disabled features are never registered.

Variable	Default	Description
`SCHEDULER_ENABLED`	`true`	Enable/disable the Task Scheduler API and Web UI panel

Disabling the Scheduler

# In .env
SCHEDULER_ENABLED=false

Effects when false:

All /api/v1/scheduler/* endpoints return 404 (routes not registered)
The 📅 Scheduler tab is hidden from the Web UI sidebar before auth — it never appears
GET /api/v1/config returns {"scheduler_enabled": false} for the browser to act on

To re-enable, set SCHEDULER_ENABLED=true (or remove the variable) and restart the service.

File Handling

Both the Telegram and WebEx connectors support sending and receiving files and images.

Receiving: files are downloaded to webex_downloads/ and injected into the agent context as a file path prompt
Sending: agents can produce local file paths that the connector uploads back to the user
Images: the Web UI serves AI-fetched images from /ai-media/ so the browser can render them inline

See WEBEX_FILE_HANDLING.md and FILE_MEDIA_HANDLING_SKILL.md for details.

Per-User Access Control

Agent & Model Pinning

Users can be locked to a specific agent, runtime, and model via pinned_users in the connector config:

"pinned_users": {
  "8193231291": {
    "agent": "family",
    "runtime": "copilot",
    "model": "gpt-5-mini"
  }
}

Pinned users cannot run /agent set — they receive a clear admin message. The pinned config is re-applied before every query, so even a session reset cannot bypass it.

Yolo Mode Restriction

By default all users may run /mode yolo. To restrict yolo access to a list of user IDs:

"yolo_allowed_users": ["8193231291", "9876543210"]

An empty list preserves the permissive default (all allowed).

File Structure

n8n-copilot-shim/
├── agent_manager.py           # Core: SessionManager, FastAPI app factory, all /api/v1/ endpoints
├── task_scheduler.py          # TaskScheduler class — schedule, pause, resume, results
├── telegram_connector.py      # Telegram bot long-polling connector
├── webex_connector.py         # WebEx webhook/RabbitMQ connector
├── agents.json                # Agent configuration (git-ignored)
├── agents.example.json        # Example configuration template
├── webui/
│   └── dist/                  # Built Web UI assets (index.html, app.js, app.css)
├── tests/
│   ├── test_agent_manager.py  # Core unit tests (62 tests)
│   └── test_new_features.py   # WebUI + scheduler feature tests (79 tests)
├── docs/plans/                # Planning docs
├── run_tests.sh               # Test runner
├── .testrc                    # Test configuration
├── .env.example               # Environment variable template
├── EXAMPLE_WORKFLOW.json      # N8N workflow example
├── ARCHITECTURE.md            # System architecture and Mermaid diagrams
├── RELEASE_NOTES.md           # Version history
└── README.md                  # This file

Architecture Summary

See ARCHITECTURE.md for full detail and Mermaid diagrams.

Key components:

Component	Description
`SessionManager`	Core AI execution engine — session state, slash commands, CLI dispatch, streaming queues
`HistoryManager`	Per-user, per-channel chat history persistence
`AuthManager`	Pairing-code auth, session token issuance, shared-key validation
`RateLimiter`	Per-IP, per-endpoint sliding-window rate limiting
`TaskScheduler`	Cron-like AI job scheduler embedded in the orchestrator (feature-flagged)
FastAPI app	REST API (`/api/v1/`) + SSE streaming (`/stream`) + static Web UI mount (`/ui`)
`TelegramConnector`	Long-polling Telegram bot → `SessionManager` bridge
`WebEXConnector`	WebEx webhook / RabbitMQ → `SessionManager` bridge

Troubleshooting

Agents not loading

Check that agents.json exists in the script directory or current directory
Verify JSON syntax with python -m json.tool agents.json
Check file permissions

Session issues

Run /session reset to start fresh
Check session storage directories exist:
- ~/.copilot/session-state/
- ~/.local/share/opencode/storage/session/global/
- ~/.claude/debug/
- ~/.gemini/sessions/

Scheduler not running

Check SCHEDULER_JOBS_FILE path exists and is writable (/opt/.task-scheduler/jobs.json)
Verify the API server is running: sudo systemctl status agent-manager-api.service
Hit GET /api/v1/scheduler/status to see the daemon health report

CLI not found

Ensure copilot, opencode, claude, and gemini binaries are in PATH or at expected locations
Check /usr/bin/copilot, /usr/bin/claude, ~/.opencode/bin/opencode, and gemini in PATH

Web UI auth loop

Confirm the API server can reach your Telegram or WebEx bot to deliver the pairing code
Check PAIRING_CODE_TTL (default 300 s) — request a new code if it expired

Agent Orchestration

This project supports multi-agent orchestration with dynamic agent discovery. See the comprehensive agent documentation:

AGENTS.md - Agent orchestration overview and usage guide
SKILL_SUBAGENTS.md - Detailed subagent management and advanced patterns
ARCHITECTURE.md - Full system architecture with Mermaid diagrams
agents.json - Agent configuration file (controls available agents)

Quick Agent Start

# List available agents
/agent list

# Switch to an agent
/agent set devops

# Execute in agent context
"Deploy the latest version"

# Resume agent session
"What's the status?"

# Switch to different agent
/agent set family

All agents are loaded dynamically from agents.json, enabling easy expansion and customization.

Telegram Connector

The Telegram connector bridges Telegram chat with your N8N Copilot Shim agents.

Features

💬 Receive messages from Telegram users
👤 User pairing by Telegram user ID
🔐 User access control (whitelist/blacklist)
🎯 Route to any configured agent
⚙️ Per-user session management

Memory Promotion

Method	Path	Description
`POST`	`/api/v1/memory/promote`	Promote memory for a single agent (or orchestrator)
`POST`	`/api/v1/memory/promote-all`	Promote memory across all agents in agents.json

POST /api/v1/memory/promote — Trigger memory promotion for a single agent

Consolidates daily notes (/memories/daily/*.md) into the agent's MEMORY.md using LLM analysis. Durable facts are elevated, duplicates removed, and the knowledge base refreshed.

Request body (JSON):

{
  "agent": "devops"  // Optional — if omitted, promotes orchestrator memory
}

Response (200 OK):

{
  "status": "ok",
  "agent": "devops",
  "agent_path": "/opt/MyHomeDevops",
  "stdout": "Promoted 8 facts from 3 daily notes...",
  "stderr": "",
  "returncode": 0
}

Error responses:

401 Unauthorized — Missing or invalid Bearer token
404 Not Found — Unknown agent name
503 Service Unavailable — Memory promoter script not found
504 Gateway Timeout — Promotion exceeded 120-second timeout
500 Internal Server Error — Subprocess error or other failure

POST /api/v1/memory/promote-all — Trigger memory promotion for ALL agents

Iterates through every agent in agents.json (including orchestrator) and runs memory promotion for each. Handles partial failures gracefully — continues promotion for other agents if one fails.

Request body: Empty or omitted

Response (200 OK):

{
  "status": "ok",
  "total": 4,
  "succeeded": 4,
  "failed": 0,
  "results": [
    {
      "agent": "orchestrator",
      "agent_path": "/opt/memories",
      "status": "ok",
      "returncode": 0,
      "stdout": "..."
    },
    {
      "agent": "devops",
      "agent_path": "/opt/MyHomeDevops",
      "status": "ok",
      "returncode": 0,
      "stdout": "..."
    }
  ]
}

Security:

Both endpoints require API authentication (Bearer token or shared-key validation)
Memory promotion is read-only for daily notes, write-only to MEMORY.md
Agent path resolved from agents.json; prevents directory traversal

Helper Script: For scheduling memory promotion via the task scheduler or cron:

bash scripts/promote_all_agents_memory.sh

PATCH /api/v1/sessions/{id}/settings — Update session settings

Modify session-level settings like verbose mode (tool call visibility). Settings are persisted and returned in subsequent session queries.

Request body (JSON):

{
  "silent_mode": false  // Show tool call lines; set to true to hide
}

Response (200 OK):

{
  "id": "sess_abc123",
  "silent_mode": false,
  "created_at": "2026-04-03T20:00:00Z",
  "updated_at": "2026-04-03T21:05:42Z"
}

Error responses:

401 Unauthorized — Missing or invalid Bearer token
404 Not Found — Session does not exist
422 Unprocessable Entity — Invalid value (e.g., non-boolean for silent_mode)

Features:

Whitelist-based field filtering — only recognized fields are accepted (currently: silent_mode)
WebUI toggle button in header reflects and controls this setting
Tool call lines (.tc-line) hidden when silent_mode=true, shown when false
Does not affect logging or session history — only visual display

Security:

Requires API authentication (Bearer token)
Per-session settings — each user session has independent configuration

Quick Start

# With environment variable
export TELEGRAM_BOT_TOKEN="your-token-here"
python telegram_connector.py

# Or with token argument
python telegram_connector.py --token "your-token-here"

Managing Users

# Allow specific user
python telegram_connector.py --token TOKEN --allow-user 123456789

# Deny user
python telegram_connector.py --token TOKEN --deny-user 123456789

# List allowed users
python telegram_connector.py --token TOKEN --list-users

See TELEGRAM_CONNECTOR.md for full documentation.

Contributing & Issue Tracking

GitHub Issues for Project Management

This project uses GitHub Issues as the single source of truth for all TODOs, feature requests, and bug reports.

Why GitHub Issues?

✅ Centralized tracking across all sub-agents and features
✅ Linked to code commits and pull requests
✅ Searchable history of decisions and implementations
✅ Clear ownership and assignment of work
✅ Prioritization through labels and milestones

Issue Categories

We use labels to organize work:

Label	Purpose	Example
`bug`	Bugs and defects	"Message editing fails with 400 error"
`feature`	New features	"Add message reaction support"
`enhancement`	Improvements to existing features	"Improve error messages"
`documentation`	Docs and guides	"Add user guide for slash commands"
`WebEX`	WebEX connector specific	"Implement pinning in group rooms"
`Telegram`	Telegram connector specific	"Add Telegram reactions"
`help wanted`	Open for contributions	Any issue needing external help
`blocked`	Blocked on external dependency	"Waiting for WebEX API update"

Creating Issues

Before starting work, check for existing issues:

# View all open issues
gh issue list

# View WebEX-related issues
gh issue list --label WebEX

# View bugs
gh issue list --label bug

When NOT to Use TODO Comments

⚠️ Do NOT add TODO comments in code. Instead:

Create a GitHub issue describing the work needed
Reference the issue in commit messages: fix: resolve #42
Assign ownership so it's tracked and visible
Move to In Progress when you start work

Example:

# ❌ BAD - TODO in code
def pin_message(self, msg_id, room_id):
    # TODO: implement proper pinning when WebEX adds support
    pass

# ✅ GOOD - GitHub issue + clear code
def pin_message(self, msg_id, room_id):
    """Pin a message.

    Note: WebEX API doesn't support pinning in direct messages.
    See issue #42 for status on group room support.
    """
    pass

Outstanding Work

All outstanding work is tracked in GitHub Issues. Check the repository issues board to see:

In Progress - Work actively being done
Backlog - Planned but not started
Help Wanted - Open for contributions
Blocked - Waiting on dependencies

Start here: GitHub Issues

Version	Changes	Urgency	Date
main@2026-05-08	Latest activity on main branch	High	5/8/2026
v1.0.0	Latest release: v1.0.0	High	4/9/2026