altk-evolve

Self improving agents through iterations

agent-learning agent-memory agentic-workflows ai-agents claude-code claude-code-plugin claude-code-plugins codex model-context-protocol python

README

Evolve: On‑the‑job learning for AI agents

Blog posts: IBM announcement | Hugging Face blog

Coding agents repeat the same mistakes because they start fresh every session. Evolve gives agents memory — they learn from what worked and what didn't, so each session is better than the last.

Evolve is a system designed to help agents improve over time by learning from their trajectories. The Lite version is designed to effortlessly slot into existing agent assistants like Claude Code and Codex. It uses a combination of an MCP server for tool integration, vector storage for memory, and LLM-based conflict resolution to refine its knowledge base.

On the AppWorld benchmark, Evolve improved agent reliability by 8.9 points overall, with a 74% relative increase on hard multi-step tasks.

Important

⭐ Star the repo: it helps others discover it.

Quick Start (Lite)

IBM Bob →

Claude Code →

Codex →

Quick Start (Evolve MCP Server)

Installation

Prerequisites:

Python 3.12 or higher
uv (recommended) or pip

From Source

# Clone the repository and install dependencies
git clone https://github.com/agenttoolkit/altk-evolve.git
cd altk-evolve
uv venv --python=3.12 && source .venv/bin/activate
uv sync
# Build the UI
cd frontend/ui
npm ci && npm run build
cd ../..

From PyPI

pip install altk-evolve

Optional Backend Dependencies:

The default filesystem backend uses simple text matching and requires no additional dependencies. For semantic vector similarity search, install one of these backends:

For PostgreSQL with pgvector support (recommended for production):

uv sync --extra pgvector

For Milvus support (optimized for large-scale vector search):

uv sync --extra milvus

See the Backend Configuration Guide for detailed comparison and setup instructions.

Configuration

For direct OpenAI usage:

export OPENAI_API_KEY=sk-...

For LiteLLM proxy usage and model selection (including global fallback via EVOLVE_MODEL_NAME), see the configuration guide.

Running Services

Start the Web UI and MCP server

uv run evolve-mcp

The Web UI can be accessed from: http://127.0.0.1:8000/ui/

Starting only the Web UI and API

If you only want to access the Web UI and API (without the MCP server stdio blocking the terminal), you can run the FastAPI application directly using uvicorn:

uv run uvicorn altk_evolve.frontend.mcp.mcp_server:app --host 127.0.0.1 --port 8000

Then navigate to http://127.0.0.1:8000/ui/.

Starting only the MCP Server

If you're attaching Evolve to an MCP client that requires a direct command (like Claude Desktop):

uv run evolve-mcp

Or for SSE transport:

uv run evolve-mcp --transport sse --port 8201

Verify it's running:

npx @modelcontextprotocol/inspector@latest http://127.0.0.1:8201/sse --cli --method tools/list

Available tools:

get_entities(task: str, entity_type: str): Get relevant entities for a specific task, filtered by type (e.g., 'guideline', 'policy').
get_guidelines(task: str): Get relevant guidelines for a specific task (backward compatibility alias).
save_trajectory(trajectory_data: str, task_id: str | None): Save a conversation trajectory and generate new guidelines.
create_entity(content: str, entity_type: str, metadata: str | None, enable_conflict_resolution: bool): Create a single entity in the namespace.
delete_entity(entity_id: str): Delete a specific entity by its ID.
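
To make the tool signatures above more concrete, here is a minimal sketch of the request payload an MCP client might send to invoke one of them. It assumes the standard MCP JSON-RPC 2.0 `tools/call` method; the helper name `build_tool_call` and the example task text are hypothetical, and in practice an MCP client library handles framing and transport for you.

```python
import json
from typing import Any, Dict


def build_tool_call(name: str, arguments: Dict[str, Any], request_id: int = 1) -> str:
    """Serialize an MCP `tools/call` request (hypothetical helper, payload shape only)."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


# The task text is made up for illustration.
payload = build_tool_call("get_guidelines", {"task": "Rename a column in the users table"})
print(payload)
```

The same helper covers the other tools by changing `name` and `arguments`, for example `get_entities` with both `task` and `entity_type`.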

Filter Migration Note

Entity search filters reserve bare keys for top-level schema columns only: id, type, content, and created_at.

If you need to filter on JSON metadata, use the metadata.<key> form. For example, use filters={"type": "trajectory", "metadata.task_id": "123"} instead of filters={"type": "trajectory", "task_id": "123"}.

Existing integrations that stored custom fields in entity metadata should update filter writers to add the metadata. prefix for those keys.
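
For integrations with many filter writers, the migration described above could be applied mechanically. The following is a sketch only: `migrate_filters` is a hypothetical helper, not part of Evolve, and it assumes every bare key outside the four reserved columns was meant to address JSON metadata.

```python
from typing import Any, Dict

# Top-level schema columns that keep their bare keys, per the note above.
RESERVED_KEYS = {"id", "type", "content", "created_at"}
METADATA_PREFIX = "metadata."


def migrate_filters(filters: Dict[str, Any]) -> Dict[str, Any]:
    """Prefix non-reserved bare keys with `metadata.` (hypothetical helper)."""
    migrated: Dict[str, Any] = {}
    for key, value in filters.items():
        if key in RESERVED_KEYS or key.startswith(METADATA_PREFIX):
            # Already a top-level column or already in the new form.
            migrated[key] = value
        else:
            migrated[METADATA_PREFIX + key] = value
    return migrated


old_filters = {"type": "trajectory", "task_id": "123"}
print(migrate_filters(old_filters))
# prints {'type': 'trajectory', 'metadata.task_id': '123'}
```

Because keys that already carry the prefix pass through unchanged, running the helper twice gives the same result as running it once.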

Features

Proactive: Learns to recognize problems and their solutions, and generates guidelines that are applied automatically to new tasks.
Conflict Resolution: Updates existing guidelines when new information contradicts them.
On Command: Provides a set of tools for managing guidelines, either from within the agent or through a CLI.

Architecture

Evolve is built on a modular architecture that forms a feedback loop: it takes conversation traces (trajectories) from an agent, extracts key insights into a database, and feeds them back into the agent.

Lite Mode omits the Interaction layer; all activity is performed in-agent.
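
The feedback loop described above can be pictured with a toy example. Everything here is a stand-in rather than Evolve's implementation: `ToyGuidelineStore` is a hypothetical in-memory store using naive word overlap for retrieval, and `extract_guidelines` is a placeholder for the extraction step that simply keeps lines marked `LESSON:`.

```python
from typing import List


class ToyGuidelineStore:
    """In-memory stand-in for the guideline database (not Evolve's real backend)."""

    def __init__(self) -> None:
        self._guidelines: List[str] = []

    def save(self, guideline: str) -> None:
        # Skip exact duplicates; real conflict resolution is far richer.
        if guideline not in self._guidelines:
            self._guidelines.append(guideline)

    def relevant(self, task: str) -> List[str]:
        # Naive word-overlap matching, standing in for real retrieval.
        task_words = set(task.lower().split())
        return [g for g in self._guidelines if task_words & set(g.lower().split())]


def extract_guidelines(trajectory: List[str]) -> List[str]:
    """Placeholder extraction: keep the text of lines that start with 'LESSON:'."""
    prefix = "LESSON:"
    return [line[len(prefix):].strip() for line in trajectory if line.startswith(prefix)]


store = ToyGuidelineStore()
trajectory = ["ran the suite, 3 failures", "LESSON: run migrations before tests"]
for guideline in extract_guidelines(trajectory):
    store.save(guideline)

# A later session asks for guidance before starting a related task.
print(store.relevant("fix failing tests"))
# prints ['run migrations before tests']
```

The point of the sketch is the shape of the loop (trajectory in, guideline stored, guideline retrieved for the next task), not the matching or extraction logic.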

Tip Provenance

Evolve automatically tracks the origin of every guideline it generates or stores. Every tip entity contains metadata identifying its source:

creation_mode: Identifies how the tip was created (auto-phoenix via trace observability, auto-mcp via trajectory saving tools, or manual).
source_task_id: The ID of the original trace or task that inspired the tip, providing full auditability.

See the Low-Code Tracing Guide for more details.
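
As an illustration of how the provenance fields above might be used for auditing, the sketch below groups tips by the task that produced them. The tip dictionaries, their `content` and `metadata` layout, and the helper `group_tips_by_source` are assumptions made for this example; only the `creation_mode` values and the `source_task_id` field name come from the description above.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional

# Creation modes named in the provenance description.
VALID_CREATION_MODES = {"auto-phoenix", "auto-mcp", "manual"}


def group_tips_by_source(tips: List[Dict[str, Any]]) -> Dict[Optional[str], List[str]]:
    """Map each source_task_id to the tip contents it produced (hypothetical helper)."""
    grouped: Dict[Optional[str], List[str]] = defaultdict(list)
    for tip in tips:
        metadata = tip.get("metadata", {})
        mode = metadata.get("creation_mode")
        if mode not in VALID_CREATION_MODES:
            raise ValueError(f"unknown creation_mode: {mode!r}")
        # Tips without a source task (assumed possible for manual tips) land under None.
        grouped[metadata.get("source_task_id")].append(tip["content"])
    return dict(grouped)


# Example data is invented for illustration.
tips = [
    {"content": "Check auth token expiry first", "metadata": {"creation_mode": "auto-mcp", "source_task_id": "123"}},
    {"content": "Paginate list calls", "metadata": {"creation_mode": "auto-phoenix", "source_task_id": "123"}},
    {"content": "Prefer dry runs", "metadata": {"creation_mode": "manual"}},
]
print(group_tips_by_source(tips))
```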

Contributing, Community, and Feedback

Evolve is an active project, and real‑world usage helps guide its direction.

If you’re experimenting with Evolve or exploring on‑the‑job learning for agents, feel free to open an issue or discussion to share use cases, ideas, or feedback.

See the Contributing Guide to learn about our development process and how to submit changes, report bugs, or propose features.

Release History

Version	Changes	Urgency	Date
v1.1.5	## v1.1.5 (2026-07-24) ### Bug Fixes - backend: Reentrant-safe write-hook callbacks + write re-entrancy guard + filesystem state hygiene ([#287](https://github.com/AgentToolkit/altk-evolve/pull/287), [`1d2edc2`](https://github.com/AgentToolkit/altk-evolve/commit/1d2edc262feb418bd58ce176334f606d504ce233)) - ci: Add required patterns field to dependabot.yml security-patches group ([#273](https://github.com/AgentToolkit/altk-evolve/pull/273), [`d36f36f`](https://github.com/AgentToolkit/a	High	7/24/2026
v1.1.4	## v1.1.4 (2026-07-02) ### Bug Fixes - agent-wiki: Address CodeRabbit review on the split-down diff ([`3e26154`](https://github.com/AgentToolkit/altk-evolve/commit/3e261549a18ce5f30ed66f7eeb642e0c1d0e9cc8)) - agent-wiki: Address PR review findings ([`3e26154`](https://github.com/AgentToolkit/altk-evolve/commit/3e261549a18ce5f30ed66f7eeb642e0c1d0e9cc8)) - agent-wiki: Address review feedback from visahak ([`3e26154`](https://github.com/AgentToolkit/altk-evolve/commit/3e261549a18ce5	High	7/2/2026
v1.1.3	## v1.1.3 (2026-06-02) ### Bug Fixes - llm: Avoid Groq schema-tool failures during guideline generation ([#264](https://github.com/AgentToolkit/altk-evolve/pull/264), [`5b7bc7c`](https://github.com/AgentToolkit/altk-evolve/commit/5b7bc7cf10e7bea93cf7a73866595de284c21f56)) --- Detailed Changes: [v1.1.2...v1.1.3](https://github.com/AgentToolkit/altk-evolve/compare/v1.1.2...v1.1.3)	High	6/2/2026
v1.1.0	## v1.1.0 (2026-05-01) ### Bug Fixes - Always overwrite owner and visibility in save_entities.py ([#199](https://github.com/AgentToolkit/altk-evolve/pull/199), [`7af1eb1`](https://github.com/AgentToolkit/altk-evolve/commit/7af1eb1e461077e89da977cfcbc84fc30fba3be7)) - Improve publish, recall, subscribe, and sync handling ([#199](https://github.com/AgentToolkit/altk-evolve/pull/199), [`7af1eb1`](https://github.com/AgentToolkit/altk-evolve/commit/7af1eb1e461077e89da977cfcbc84fc30fba3be7)) - Ins	High	5/1/2026
v1.0.10	## v1.0.10 (2026-04-20) ### Bug Fixes - mcp: Align metadata filters and harden SSE teardown ([`a0bcc6d`](https://github.com/AgentToolkit/altk-evolve/commit/a0bcc6db5ac5fdb4808d6e11e451eb3156ba9596)) - postgres: Prevent ambiguous filter behavior across backends ([`a0bcc6d`](https://github.com/AgentToolkit/altk-evolve/commit/a0bcc6db5ac5fdb4808d6e11e451eb3156ba9596)) --- Detailed Changes: [v1.0.9...v1.0.10](https://github.com/AgentToolkit/altk-evolve/compare/v1.0.9...v1.0.10)	High	4/20/2026
v1.0.9	## v1.0.9 (2026-04-17) ### Bug Fixes - Publish install.sh as a versioned release artifact ([#195](https://github.com/AgentToolkit/altk-evolve/pull/195), [`0b055da`](https://github.com/AgentToolkit/altk-evolve/commit/0b055da765c03fa51348defb7630643f4a48c0f1)) ### Features - bob: Add save-trajectory skill to Bob evolve-lite ([#184](https://github.com/AgentToolkit/altk-evolve/pull/184), [`9ca94e5`](https://github.com/AgentToolkit/altk-evolve/commit/9ca94e5ced9d0ebca03552703e3a7fe2417aae5a))	High	4/17/2026
v1.0.8	## v1.0.8 (2026-04-09) --- Detailed Changes: [v1.0.7...v1.0.8](https://github.com/AgentToolkit/altk-evolve/compare/v1.0.7...v1.0.8)	High	4/9/2026
v1.0.6	## v1.0.6 (2026-04-03) ### Bug Fixes - Add optional implementation_steps to Tip model and prompt ([#124](https://github.com/AgentToolkit/altk-evolve/pull/124), [`d373e7e`](https://github.com/AgentToolkit/altk-evolve/commit/d373e7ebb00ca0b3438aa1017961fbeb9cb5d0d8)) - Clarify task status context in tip generation prompt ([#124](https://github.com/AgentToolkit/altk-evolve/pull/124), [`d373e7e`](https://github.com/AgentToolkit/altk-evolve/commit/d373e7ebb00ca0b3438aa1017961fbeb9cb5d0d8)) - Comp	High	4/3/2026
v1.0.5	## v1.0.5 (2026-03-12) --- Detailed Changes: [v1.0.4...v1.0.5](https://github.com/AgentToolkit/kaizen/compare/v1.0.4...v1.0.5)	Low	3/12/2026
v1.0.4	## v1.0.4 (2026-03-12) --- Detailed Changes: [v1.0.3...v1.0.4](https://github.com/AgentToolkit/kaizen/compare/v1.0.3...v1.0.4)	Low	3/12/2026
v1.0.3	## v1.0.3 (2026-03-12) ### Bug Fixes - save-trajectory: Address code review findings ([#89](https://github.com/AgentToolkit/kaizen/pull/89), [`6e6438b`](https://github.com/AgentToolkit/kaizen/commit/6e6438b285f562d15c9dc191b96a47a04d7d4e73)) - save-trajectory: Make log() best-effort so debug logging never crashes the script ([#89](https://github.com/AgentToolkit/kaizen/pull/89), [`6e6438b`](https://github.com/AgentToolkit/kaizen/commit/6e6438b285f562d15c9dc191b96a47a04d7d4e73)) - **s	Low	3/12/2026
v1.0.2	## v1.0.2 (2026-03-05) ### Bug Fixes - Include jinja prompt templates in package artifacts ([#85](https://github.com/AgentToolkit/kaizen/pull/85), [`0c29aba`](https://github.com/AgentToolkit/kaizen/commit/0c29abadae3b0f537761e42a32d6812a7efbe2c6)) --- Detailed Changes: [v1.0.1...v1.0.2](https://github.com/AgentToolkit/kaizen/compare/v1.0.1...v1.0.2)	Low	3/5/2026
v1.0.1	## v1.0.1 (2026-03-04) ### Bug Fixes - packaging: Include kaizen subpackages in distribution ([#84](https://github.com/AgentToolkit/kaizen/pull/84), [`1bac14c`](https://github.com/AgentToolkit/kaizen/commit/1bac14cdb08b85c46dd87b36368431c802367d1e)) --- Detailed Changes: [v1.0.0...v1.0.1](https://github.com/AgentToolkit/kaizen/compare/v1.0.0...v1.0.1)	Low	3/4/2026
v1.0.0	## v1.0.0 (2026-03-04) ### Bug Fixes - Add Pydantic validation error handling for LLM tip generation and validate trajectory data input. ([#56](https://github.com/AgentToolkit/kaizen/pull/56), [`d2a4d0a`](https://github.com/AgentToolkit/kaizen/commit/d2a4d0a8e107d1a6475c31b052c9afe77e5b4784)) - Address CodeRabbit review feedback (round 3) ([#60](https://github.com/AgentToolkit/kaizen/pull/60), [`8ea2051`](https://github.com/AgentToolkit/kaizen/commit/8ea20516b0509bd7b9e8a5c02d6943dcf0e58bd5))	Low	3/4/2026
v0.2.1	## v0.2.1 (2026-02-09) --- Detailed Changes: [v0.2.0...v0.2.1](https://github.com/AgentToolkit/kaizen/compare/v0.2.0...v0.2.1)	Low	2/9/2026
v0.2.0	## v0.2.0 (2026-02-09) --- Detailed Changes: [v0.1.0-rc.4...v0.2.0](https://github.com/AgentToolkit/kaizen/compare/v0.1.0-rc.4...v0.2.0)	Low	2/9/2026
v0.1.0-rc.4	## v0.1.0-rc.4 (2026-02-09) --- Detailed Changes: [v0.1.0-rc.3...v0.1.0-rc.4](https://github.com/AgentToolkit/kaizen/compare/v0.1.0-rc.3...v0.1.0-rc.4)	Low	2/9/2026
v0.1.0-rc.3	## v0.1.0-rc.3 (2026-02-09) --- Detailed Changes: [v0.1.0-rc.2...v0.1.0-rc.3](https://github.com/AgentToolkit/kaizen/compare/v0.1.0-rc.2...v0.1.0-rc.3)	Low	2/9/2026
v0.1.0-rc.2	## v0.1.0-rc.2 (2026-02-09) --- Detailed Changes: [v0.1.0-rc.1...v0.1.0-rc.2](https://github.com/AgentToolkit/kaizen/compare/v0.1.0-rc.1...v0.1.0-rc.2)	Low	2/9/2026
v0.1.0-rc.1	## v0.1.0-rc.1 (2026-02-09) - Initial Release	Low	2/9/2026
