Pydantic Deep Agents

The batteries-included deep agent harness for Python.
Terminal AI assistant out of the box — or build production agents with one function call.

Docs · PyPI · CLI · Framework · DeepResearch · Examples

What's New

2026-04-12 v0.3.8 — Stuck loop detection, context limit warnings for the model, expanded context file discovery (CLAUDE.md, .cursorrules, etc.), eviction & orphan repair migrated to capabilities hooks.
2026-04-11 v0.3.6 — One-command installer + self-update: curl -fsSL .../install.sh | bash installs everything automatically. New pydantic-deep update command. Startup update notifications with 24-hour PyPI cache.
2026-04-10 v0.3.5 — Headless runner (pydantic-deep run), Docker sandbox with named workspaces, browser automation via Playwright, Harbor adapter for Terminal Bench evaluation.

Full history: CHANGELOG.md

The Agent Harness

Pydantic Deep Agents is an agent harness — the complete infrastructure that wraps an LLM and makes it a functional autonomous agent. The model provides intelligence; the harness provides planning, tools, memory, sandboxed execution, and unlimited context.

🔧 Tool-calling	File read/write/edit, shell execution, glob, grep, web search, web fetch, browser automation — wired up and ready.
🧠 Persistent memory	MEMORY.md persists across sessions. Auto-injected into the system prompt. Each agent has isolated memory by default.
♾️ Unlimited context	Auto-summarization when approaching the token budget. LLM-based or zero-cost sliding window. Never hits a context wall.
🤝 Multi-agent / swarm	Spawn subagents for parallel workstreams. Shared TODO lists with claiming. Peer-to-peer message bus. Full team coordination.
🐳 Sandboxed execution	Docker sandbox with named workspaces. Installed packages persist between sessions. Project dir mounted at /workspace.
🗂️ Plan Mode	Dedicated planner subagent asks clarifying questions and structures the work before execution begins. Headless-compatible.
🔖 Checkpoints	Save conversation state at any point. Rewind to any checkpoint. Fork sessions to explore alternative approaches.
📚 Skills system	Domain-specific knowledge loaded on demand from SKILL.md files. Built-in skills: code-review, refactor, test-writer, git-workflow, and more.
🔌 MCP	Connect any Model Context Protocol server via pydantic-ai's native MCP capability.
⚡ Lifecycle hooks	Claude Code-style PRE_TOOL_USE / POST_TOOL_USE hooks. Shell commands or Python handlers. Audit logging, safety gates.
📐 Structured output	Type-safe Pydantic model responses via `output_type`. No JSON parsing. No `dict["key"]`. Full IDE autocomplete.
🔄 Stuck loop detection	Detects repeated identical tool calls, A-B-A-B alternating patterns, and no-op calls. Warns the model or stops the run.
⚠️ Context limit warnings	Model receives URGENT/CRITICAL warnings when approaching context limits (70%), well before auto-compression (90%).
💰 Cost tracking	Real-time token and USD cost tracking per run and cumulative. Hard budget limits with `BudgetExceededError`.
✨ Self-improving	`/improve` analyzes past sessions and proposes updates to MEMORY.md, SOUL.md, and AGENTS.md.
🏷️ 100% type-safe	Pyright strict + MyPy strict. 100% test coverage. Every public API is fully typed — safe to use in production.

Built natively on pydantic-ai — uses the Capabilities API directly, inherits all pydantic-ai streaming, multi-model support, and Pydantic validation automatically.

🖥️ CLI — Terminal AI Assistant

A Claude Code-style terminal AI assistant that works with any model and any provider.

Install (macOS & Linux)

curl -fsSL https://raw.githubusercontent.com/vstorm-co/pydantic-deep/main/install.sh | bash

No Python setup required — the script installs uv and the CLI automatically. Then:

export ANTHROPIC_API_KEY=sk-ant-...
pydantic-deep

Windows / manual: pip install "pydantic-deep[cli]" · Update: pydantic-deep update

Model & Provider Support

Works with any model that supports tool-calling:

Provider	Example models
Anthropic	`anthropic:claude-opus-4-6`, `claude-sonnet-4-6`
OpenAI	`openai:gpt-5.4`, `gpt-4.1`
OpenRouter	`openrouter:anthropic/claude-opus-4-6` (200+ models)
Google Gemini	`google-gla:gemini-2.5-pro`
Ollama (local)	`ollama:qwen3`, `ollama:llama3.3`
Any OpenAI-compatible	Custom base URL via env

Switch model anytime: pydantic-deep config set model openai:gpt-5.4 or /model in the TUI.

What you get in the TUI

	Feature
💬	Streaming chat with tool call visualization
📁	File read / write / edit, shell execution, glob, grep
🧠	Persistent memory and self-improvement across sessions
🗂️	Task planning, plan mode, and subagent delegation
♾️	Context compression for unlimited conversations
🔖	Checkpoints — save, rewind, and fork any session
🌐	Web search & fetch built-in
🖥️	Browser automation via Playwright (`--browser`)
🐳	Docker sandbox — sandboxed execution with named workspaces
💭	Extended thinking — `minimal` / `low` / `medium` / `high` / `xhigh`
💰	Real-time cost and token tracking per session
🛡️	Tool approval dialogs — approve, auto-approve, or deny per tool call
@	`@filename` file references · `!command` shell passthrough
✨	`/improve`, `/skills`, `/diff`, `/model`, `/theme`, `/compact`, and more

Usage

# Interactive TUI (default)
pydantic-deep
pydantic-deep tui --model openrouter:anthropic/claude-opus-4-6

# Headless deep agent — benchmarks, CI/CD, scripted automation
pydantic-deep run "Fix the failing test in test_auth.py"
pydantic-deep run --task-file task.md --json
pydantic-deep run "Refactor utils.py" --no-web-search --thinking false

# Docker sandbox — sandboxed execution, project dir mounted at /workspace
pydantic-deep tui --sandbox docker
pydantic-deep tui --workspace ml-env     # named workspace, packages persist

# Browser automation (requires pydantic-deep[browser])
pydantic-deep tui --browser
pydantic-deep run "Go to example.com and summarize the content" --browser

# Config & skills
pydantic-deep config set model anthropic:claude-sonnet-4-6
pydantic-deep skills list
pydantic-deep update                     # update to latest version

See CLI docs for the full reference.

🐍 Framework — Build Your Own Agent

pip install pydantic-deep

One function call gives you a production deep agent with planning, tool-calling, multi-agent delegation, persistent memory, unlimited context, and cost tracking. Everything is a toggle:

from pydantic_ai_backends import StateBackend
from pydantic_deep import create_deep_agent, create_default_deps

agent = create_deep_agent(
    model="anthropic:claude-sonnet-4-6",
    include_todo=True,          # Task planning with subtasks and dependencies
    include_subagents=True,     # Multi-agent swarm — delegate to subagents
    include_skills=True,        # Domain-specific skills from SKILL.md files
    include_memory=True,        # Persistent memory across sessions
    include_plan=True,          # Structured planning before execution
    include_teams=True,         # Agent teams with shared TODO lists + message bus
    web_search=True,            # Tool-calling: web search
    web_fetch=True,             # Tool-calling: web fetch
    thinking="high",            # Extended thinking / reasoning effort
    context_manager=True,       # Unlimited context via auto-summarization
    cost_tracking=True,         # Token/USD budget enforcement
    include_checkpoints=True,   # Save, rewind, and fork conversations
)

deps = create_default_deps(StateBackend())
result = await agent.run("Build a REST API for user auth", deps=deps)

Structured Output

Type-safe responses with Pydantic models — no JSON parsing, no dict["key"]:

from pydantic import BaseModel

class CodeReview(BaseModel):
    summary: str
    issues: list[str]
    score: int

agent = create_deep_agent(output_type=CodeReview)
result = await agent.run("Review the auth module", deps=deps)
print(result.output.score)  # fully typed

Multi-Agent Swarm

Spawn isolated subagents for parallel workstreams. Each subagent is a full deep agent with its own tool-calling, memory, and context:

agent = create_deep_agent(
    subagents=[
        {
            "name": "researcher",
            "description": "Researches topics using web search",
            "instructions": "Search the web, synthesize findings, cite sources.",
        },
        {
            "name": "code-reviewer",
            "description": "Reviews code for quality, security, and performance",
            "instructions": "Check for security issues, N+1 queries, missing tests...",
        },
    ],
)
# Main agent delegates: task(description="Review auth.py", subagent_type="code-reviewer")

Unlimited Context

Auto-summarization keeps long-running agents within the token budget:

from pydantic_deep import create_summarization_processor

processor = create_summarization_processor(
    trigger=("tokens", 100000),  # compress at 100k tokens
    keep=("messages", 20),       # keep last 20 messages verbatim
)
agent = create_deep_agent(history_processors=[processor])

Claude Code-Style Lifecycle Hooks

from pydantic_deep import Hook, HookEvent

agent = create_deep_agent(
    hooks=[
        Hook(
            event=HookEvent.PRE_TOOL_USE,
            command="echo 'Tool: $TOOL_NAME args: $TOOL_INPUT' >> /tmp/audit.log",
        ),
    ],
)

MCP Servers

from pydantic_ai.capabilities import MCP

agent = create_deep_agent(
    capabilities=[MCP(url="https://mcp.example.com/api")],
)

Context Files

Pydantic Deep Agents auto-discovers and injects project-specific context into every conversation:

File	Purpose	Who Sees It
`AGENTS.md`	Project conventions, architecture, instructions	Main agent + all subagents
`CLAUDE.md`	Claude Code project instructions	Main agent + all subagents
`SOUL.md`	Agent personality, style, communication preferences	Main agent only
`.cursorrules`	Cursor editor conventions	Main agent only
`.github/copilot-instructions.md`	GitHub Copilot instructions	Main agent only
`CONVENTIONS.md`	Project coding conventions	Main agent only
`CODING_GUIDELINES.md`	Coding guidelines	Main agent only
`MEMORY.md`	Persistent memory — read/write/update tools	Per-agent (isolated)

Compatible with Claude Code, Cursor, GitHub Copilot, and other agent frameworks. AGENTS.md follows the agents.md spec.

See the full API reference for all options.

🔬 DeepResearch — Reference App

A full-featured research deep agent with web UI — built entirely on Pydantic Deep Agents.

Plan Mode — planner asks clarifying questions	Multi-Agent Swarm — 5 subagents researching in parallel
Excalidraw Canvas — live diagrams synced with agent	File Browser — workspace files with inline preview

Web search (Tavily, Brave, Jina), sandboxed code execution, Excalidraw diagrams, plan mode, report export.

cd apps/deepresearch && uv sync && cp .env.example .env
uv run deepresearch    # → http://localhost:8080

See apps/deepresearch/README.md for full setup.

Architecture

Pydantic Deep Agents uses pydantic-ai's native Capabilities API for all cross-cutting concerns — hooks, memory, skills, context files, teams, and plan mode are all first-class pydantic-ai capabilities.

Capabilities

Capability	Package	What It Does
CostTracking	pydantic-ai-shields	Token/USD budget enforcement and real-time cost callbacks
ContextManagerCapability	summarization-pydantic-ai	Unlimited context via auto-summarization
LimitWarnerCapability	summarization-pydantic-ai	URGENT/CRITICAL warnings when context limits approach
StuckLoopDetection	pydantic-deep	Detects and breaks repetitive agent loops
EvictionCapability	pydantic-deep	Intercepts large tool outputs before they enter history
PatchToolCallsCapability	pydantic-deep	Fixes orphaned tool calls/results in history
HooksCapability	pydantic-deep	Claude Code-style PRE/POST_TOOL_USE lifecycle hooks
CheckpointMiddleware	pydantic-deep	Save, rewind, and fork conversation state
WebSearch / WebFetch	pydantic-ai built-in	Tool-calling: web search and URL fetching
SkillsCapability	pydantic-deep	Domain-specific skills from SKILL.md files
MemoryCapability	pydantic-deep	Persistent memory across sessions
TeamCapability	pydantic-deep	Multi-agent swarm — shared TODOs, message bus
PlanCapability	pydantic-deep	Structured planning before execution

Modular Packages

Every component is a standalone package — use only what you need:

Package	What It Does
pydantic-ai-backend	File storage, Docker sandbox, console toolset
pydantic-ai-todo	Task planning with subtasks and dependencies
subagents-pydantic-ai	Sync/async delegation, background tasks, cancellation
summarization-pydantic-ai	LLM summaries or zero-cost sliding window
pydantic-ai-shields	Cost tracking, input/output/tool blocking

                         Pydantic Deep Agents
+---------------------------------------------------------------------+
|                                                                     |
|   +----------+ +----------+ +----------+ +----------+ +---------+   |
|   | Planning | |Filesystem| | Subagents| |  Skills  | |  Teams  |   |
|   +----+-----+ +----+-----+ +----+-----+ +----+-----+ +----+----+   |
|        |            |            |            |            |        |
|        +------------+-----+------+------------+------------+        |
|                           |                                         |
|                           v                                         |
|  Summarization --> +------------------+ <-- Capabilities            |
|  Checkpointing --> |    Deep Agent    | <-- Hooks                   |
|  Cost Tracking --> |   (pydantic-ai)  | <-- Memory                  |
|  Loop Detect   --> |                  | <-- Limit Warner            |
|                    +--------+---------+                             |
|                             |                                       |
|           +-----------------+-----------------+                     |
|           v                 v                 v                     |
|    +------------+    +------------+    +------------+               |
|    |   State    |    |   Local    |    |   Docker   |               |
|    |  Backend   |    |  Backend   |    |  Sandbox   |               |
|    +------------+    +------------+    +------------+               |
|                                                                     |
+---------------------------------------------------------------------+

Full Feature List

Expand

Tool-Calling

ls, read_file, write_file, edit_file, glob, grep, execute — full filesystem access
Docker sandbox with named workspaces — sandboxed execution, packages persist between sessions
Web search (DuckDuckGo, Tavily, Brave) and web fetch
Browser automation via Playwright — navigate, click, type_text, screenshot, execute_js, and more

Deep Agent Architecture

Planning — Task tracking with subtasks, dependencies, and cycle detection
Subagents / Multi-agent swarm — Sync/async delegation, background task management, soft/hard cancellation
Agent Teams — Shared TODO lists with claiming and dependency tracking, peer-to-peer message bus
Plan Mode — Dedicated planner subagent for structured planning before execution
Persistent memory — MEMORY.md that persists across sessions, auto-injected into system prompt
Self-improving — /improve analyzes past sessions, proposes updates to context files

Context & Memory

Unlimited context — Auto-summarization when approaching token budget (LLM-based or sliding window)
Context limit warnings — Model receives URGENT/CRITICAL messages when approaching 70% context usage
Eviction capability — Intercepts large tool outputs via after_tool_execute before they enter history
Context files — Auto-discover and inject AGENTS.md, CLAUDE.md, SOUL.md, .cursorrules, copilot-instructions, CONVENTIONS.md, CODING_GUIDELINES.md
Checkpoints — Save state, rewind or fork conversations. In-memory and file-based stores. Per-run isolation via for_run()

Reliability

Stuck loop detection — Detects repeated identical calls, A-B-A-B alternating, and no-op patterns. Warns or stops the agent
Orphan repair — Fixes orphaned tool calls/results in conversation history before each model request
Context limit warnings — Injects URGENT/CRITICAL messages so the model knows to wrap up

Production Features

MCP — Connect any Model Context Protocol server
Lifecycle hooks — Claude Code-style PRE/POST_TOOL_USE. Shell commands or Python handlers
Structured output — Type-safe responses with Pydantic models via output_type
Cost tracking — Token/USD budgets with automatic enforcement and real-time callbacks
Streaming — Full streaming support for real-time responses
Image support — Multi-modal analysis with image inputs
Human-in-the-loop — Confirmation workflows for sensitive operations
Output styles — Built-in (concise, explanatory, formal, conversational) or custom

CLI

Interactive TUI (Textual) with streaming, tool visualization, session management
Headless runner (pydantic-deep run) for CI/CD, benchmarks, scripted automation
20+ slash commands: /improve, /compact, /diff, /model, /provider, /skills, /theme, and more
@filename file references, !command shell passthrough
Tool approval dialogs with auto-approve
Debug logging per session

Contributing

git clone https://github.com/vstorm-co/pydantic-deepagents.git
cd pydantic-deepagents
make install
make test   # 100% coverage required
make all    # lint + typecheck + test

Vstorm OSS Ecosystem

pydantic-deepagents is part of a broader open-source ecosystem for production AI agents:

Project

Description

full-stack-ai-agent-template

Zero to production AI app in 30 minutes. FastAPI + Next.js 15, 6 AI frameworks (incl. pydantic-deep), RAG pipeline, 75+ config options.

pydantic-ai-shields

Drop-in guardrails for Pydantic AI agents. 5 infra + 5 content shields.

pydantic-ai-subagents

Declarative multi-agent orchestration with token tracking.

pydantic-ai-summarization

Smart context compression for long-running agents.

pydantic-ai-backend

Sandboxed execution for AI agents. Docker + Daytona.

content-skills

Claude Code content studio — blog, social, slides, video, infographics — all brand-aware.

production-stack-skills

Claude Code skills for production-grade FastAPI, PostgreSQL, Docker, and observability.

Want the full stack? Use full-stack-ai-agent-template — it ships pydantic-deep integrated with FastAPI, Next.js, auth, WebSocket streaming, and RAG out of the box.

Browse all projects at oss.vstorm.co

Star History

License

MIT — see LICENSE

Need help shipping AI agents in production?

We're Vstorm — an Applied Agentic AI Engineering Consultancy
with 30+ production agent implementations. Pydantic Deep Agents is what we build them with.

Made with care by Vstorm

Release History

Version	Changes	Urgency	Date
0.3.24	## [0.3.24] - 2026-06-01 ### Fixed - Branch cost no longer drops off freshly mounted fork-tab chips (`apps/cli/widgets/fork_tabs.py`). `ForkTabsWidget.watch_statuses` mounts chips asynchronously (`await self.mount(...)`), so when the poll loop sets `statuses` then `branch_costs` in the same tick, `watch_branch_costs`'s single `call_after_refresh` pass could fire before a chip's mount completed and the `$x.xx` cost never landed — surfacing as a flaky full-suite test failure. Costs are n	High	6/1/2026
0.3.22	## [0.3.22] - 2026-05-24 ### Fixed - `AttributeError: 'LocalBackend' object has no attribute '_read_bytes'` at toolset `get_instructions()` time ([#118](https://github.com/vstorm-co/pydantic-deepagents/pull/118), independently authored by [@mcauthorn](https://github.com/mcauthorn) in [#119](https://github.com/vstorm-co/pydantic-deepagents/pull/119)). `pydantic-ai-backend 0.2.8` promoted the bytes-read entry point on `BackendProtocol` from private `_read_bytes` to public `read_bytes` an	High	5/24/2026
0.3.19	## [0.3.19] - 2026-05-14 ### Added - `PeriodicReminderCapability` — periodic task reminders for long agent runs ([#94](https://github.com/vstorm-co/pydantic-deepagents/pull/94)) — injects a "what are you supposed to be doing" reminder into the message history every N model-request turns to prevent agent drift on long, tool-heavy runs. Uses `before_model_request` and per-run state isolation via `for_run()`. - Four CLI modes via a new `/remind` command: `off`, `first` (zero-cost — re-	High	5/14/2026
0.3.18	## [0.3.18] - 2026-05-05 ### Fixed - `EvictionCapability` dropped `BinaryContent` (e.g. screenshots) from `ToolReturn` results — previously, any `ToolReturn(return_value=..., content=[..., BinaryContent(...)])` was collapsed into a plain string before the size check, so the multimodal `content` (images, audio, PDFs) was silently discarded along with a text eviction message. The capability now only measures and evicts `return_value`; the `content` list and `metadata` are always preserve	High	5/5/2026
0.3.17	## [0.3.17] - 2026-04-22 ### Added - `LiteparseToolset` — document parsing via [LiteParse](https://github.com/run-llama/liteparse) - New toolset at `pydantic_deep.toolsets.liteparse` - Tools: `parse_document` (text extraction) and `screenshot_document` (per-page images) - Reads files from any backend as bytes — works with `StateBackend`, `LocalBackend`, `DockerSandbox` - Optional OCR via built-in Tesseract or pluggable HTTP server (PaddleOCR, EasyOCR) - Lazy parser initi	High	4/22/2026
0.3.15	## [0.3.15] - 2026-04-17 ### Fixed - `PatchToolCallsCapability` caused `ValidationException: duplicate Ids` on Bedrock when tools raised `ModelRetry` — when a tool raised `ModelRetry`, pydantic-ai records the retry as a `RetryPromptPart` (carrying the original `tool_call_id`) on the following `ModelRequest`, not as a `ToolReturnPart`. The patch processor only scanned for `ToolReturnPart` when deciding whether a `ToolCallPart` was orphaned, so it injected a synthetic `ToolReturnPart` wi	High	4/17/2026
0.3.14	## [0.3.14] - 2026-04-16 ### Fixed - Subagents ignored parent `web_search`/`web_fetch` settings — the default subagent factory in `create_deep_agent` hardcoded `web_search=True` and `web_fetch=True`, overriding the parent agent's configuration. On Bedrock and Vertex Anthropic models this produced a 400 error (`web_fetch_20250910` not accepted), because the beta web tools are not supported there. The factory now propagates the parent agent's `web_search` and `web_fetch` flags to spawned	High	4/16/2026
0.3.13	## [0.3.13] - 2026-04-13 ### Fixed - User-provided tools lost metadata when passed via `tools=` parameter — tools registered through `create_deep_agent(tools=[...])` were previously added via `agent.tool(tool.function)` after construction, which hardcoded `takes_ctx=True` and discarded all `Tool`-level metadata (`name`, `description`, `prepare`, `max_retries`, `requires_approval`, `timeout`). Tools are now passed directly to the `Agent` constructor, preserving all metadata and correctl	High	4/13/2026
0.3.12	## [0.3.12] - 2026-04-13 ### Added - Bandit security scanner — [Bandit](https://bandit.readthedocs.io/) is now part of the development toolchain and CI pipeline. It runs on every commit via the new `security` job in GitHub Actions and is also available locally via `make security`. The scanner checks production code (`pydantic_deep/`) for common Python security vulnerabilities (CWE-listed issues). No medium- or high-severity findings block a merge. - GitHub Issue Templates — struct	Medium	4/13/2026
0.3.11	## [0.3.11] - 2026-04-13 ### Fixed - Browser opens on every message (`BrowserCapability`) — `async_playwright()` was entered eagerly at the start of `wrap_run`, spawning the Playwright Node.js driver process (which in turn opened a browser window) on every agent run — even when no browser tool was ever called. The Playwright context manager is now entered lazily inside the first-tool-call launcher, so runs that never use the browser incur zero Playwright overhead and no bro	Medium	4/13/2026
0.3.10	## [0.3.10] - 2026-04-12 ### Changed - Version re-release of 0.3.9 — 0.3.9 was published to PyPI and this release carries the same changes forward under a new version number. No functional differences from 0.3.9.	Medium	4/13/2026
0.3.9	## [0.3.9] - 2026-04-12 ### Added - Chromium auto-install (`BrowserCapability.auto_install`) — when the Chromium binary is missing, `BrowserCapability` now automatically runs `playwright install chromium` via the current Python interpreter before the first agent run. On success the launch is retried immediately; on failure the browser degrades gracefully (tools hidden, no instructions injected) without crashing the agent. Controlled via `auto_install: bool = True` on `Brows	Medium	4/12/2026
0.3.8	## [0.3.8] - 2026-04-12 ### Added - Automatic context limit warnings (`LimitWarnerCapability`) — the agent now receives URGENT/CRITICAL warnings injected as user messages when approaching the context window limit. Warnings start at 70% usage (well before auto-compression at 90%), giving the model time to wrap up or use `/compact`. Previously only the TUI status bar showed context usage — the model itself had no awareness of approaching limits. Enabled automatically when `co	Medium	4/12/2026
0.3.7	## [0.3.7] - 2026-04-11 ### Fixed - `web_search` not working for non-Anthropic and OpenRouter models — `duckduckgo` local fallback was not included in `cli` / `tui` extras, so `WebSearch` silently fell back to native-only mode. Models accessed through OpenRouter (or any provider without native web-search support) would report no `web_search` tool. `pydantic-ai-slim[duckduckgo]` is now bundled in both `cli` and `tui` extras	High	4/11/2026
0.3.6	## [0.3.6] - 2026-04-11 ### Added - One-command installer (`install.sh`) — macOS and Linux users can now install pydantic-deep without knowing Python or pip. A single curl command installs uv (if missing) and then the CLI: ```bash curl -fsSL https://raw.githubusercontent.com/vstorm-co/pydantic-deep/main/install.sh \| bash ``` The script auto-detects uv, falls back to installing it via `astral.sh/uv`, then runs `uv tool install "pydantic-deep[cli]"`. Verifies the instal	Medium	4/11/2026
0.3.5	## [0.3.5] - 2026-04-10 ### Added - Headless runner (`pydantic-deep run`) — new CLI command for non-interactive task execution. Designed for benchmarks (Terminal Bench), CI/CD pipelines, and scripted automation. All feature flags mirror the TUI and default from `.pydantic-deep/config.toml`. Supports `--task-file`, `--json`, `--max-turns`, `--timeout`, `--model`, `--working-dir`, `--web-search/--no-web-search`, `--web-fetch/--no-web-fetch`, `--thinking`, `--todo/--no-todo`,	Medium	4/11/2026
0.3.4	## [0.3.4] - 2026-04-09 ### Changed - Merged TUI into `apps/cli/` — removed old interactive/non-interactive CLI, TUI is now the default interface. Running `pydantic-deep` without a subcommand launches the TUI - Redesigned `/improve` pipeline — added `UserFactInsight` and `AgentLearningInsight` extraction categories; relaxed synthesis rules so user facts from a single session are accepted; MEMORY.md is now the primary target for personal facts and agent learnings - **Configurable	Medium	4/9/2026
0.3.3	## [0.3.3] - 2026-04-02 ### Changed - Default models changed: main agent `anthropic:claude-opus-4-6`, subagents `anthropic:claude-sonnet-4-6`, summarization `anthropic:claude-haiku-4-5-20251001` - Replaced `include_general_purpose_subagent` with `include_builtin_subagents` — adds a built-in "research" deep agent (filesystem + web + memory) instead of a plain pydantic-ai Agent - Subagents are now deep agents by default — all subagents (built-in and custom) are created via `create_deep	High	4/2/2026
0.3.2	## [0.3.2] - 2026-03-31 ### Added - `capabilities` parameter on `create_deep_agent()` for user-provided capabilities ([#55](https://github.com/vstorm-co/pydantic-deepagents/pull/55)) ### Fixed - Pre-existing mypy `unused-ignore` error in `spec.py`	Medium	3/31/2026
0.3.1	## [0.3.1] - 2026-03-31 ### Changed - Bump minimum `pydantic-ai-slim` to `>=1.74.0` - Toolset `get_instructions()` methods are now `async` and return `list[str] \| None` to match pydantic-ai 1.74.0's `AbstractToolset` signature - Removed manual `get_instructions()` calls from `create_deep_agent()` — pydantic-ai 1.74.0's `CombinedToolset` handles this automatically - Capability inner instruction callables are now `async` to properly `await` toolset `get_instructions()` ### Fixed - C	Medium	3/31/2026
0.3.0	## [0.3.0] - 2026-03-30 ### Breaking Changes - Full migration to pydantic-ai Capabilities API (requires `pydantic-ai>=1.71.0`) - Removed `pydantic-ai-middleware` dependency entirely — replaced by `pydantic-ai-shields>=0.3.0` - `HooksMiddleware` renamed to `HooksCapability` (extends `AbstractCapability`), moved from `pydantic_deep.middleware.hooks` to `pydantic_deep.capabilities.hooks` - `CheckpointMiddleware` now extends `AbstractCapability` instead of `AgentMiddleware` - Removed `	Medium	3/30/2026
0.2.21	## [0.2.21] - 2026-03-19 ### Fixed - Toolset instructions not injected into system prompt — `SkillsToolset`, `ContextToolset`, `AgentMemoryToolset`, and user-provided toolsets (e.g. `LocalContextToolset`) defined `get_instructions()` but pydantic-ai's `AbstractToolset` does not call it automatically. Instructions were silently missing from the agent's system prompt. Fixed by calling `get_instructions()` explicitly in `dynamic_instructions()` and removing unnecessary `async` from the me	Low	3/19/2026
0.2.20	## [0.2.20] - 2026-03-11 ### Fixed - CLI: multi-byte UTF-8 input garbled in raw mode — Chinese, Japanese, Korean and other multi-byte characters appeared as replacement characters when typed in interactive mode. `_read_raw_key()` now reads the full UTF-8 byte sequence before decoding. ([#38](https://github.com/vstorm-co/pydantic-deepagents/pull/38), by [@huapingchen](https://github.com/huapingchen)) ### Changed - Updated `pydantic-ai-backend` dependency to `>=0.1.14` — `DockerSan	Low	3/12/2026
0.2.19	## [0.2.19] - 2026-03-06 ### Fixed - `deps.todos` not synchronized with todo tools — `create_todo_toolset()` was called without `storage=` parameter, creating an isolated `TodoStorage` disconnected from `deps.todos`. Todo tools wrote to their own internal list while `deps.todos`, `get_todo_prompt()`, and `share_todos` remained empty. Fixed with `_DepsTodoProxy` pattern that delegates reads/writes to `deps.todos` at runtime. Subagent todo toolsets use the same proxy pattern for consiste	Low	3/6/2026
0.2.18	## [0.2.18] - 2026-02-27 ### Added - Custom tool descriptions — all toolset factories now accept `descriptions: dict[str, str] \| None` parameter to override any tool's built-in description. Applies to `SkillsToolset`, `AgentMemoryToolset`, `CheckpointToolset`, `create_team_toolset()`, `create_plan_toolset()`, and `create_web_toolset()` - Custom commands — user-triggered slash commands from `.md` files (`cli/commands/`). Built-in commands: `/commit`, `/pr`, `/review`, `/test`, `/fi	Low	2/27/2026
0.2.17	## [0.2.17] - 2026-02-17 ### Added - Checkpointing & Rewind: Save conversation state at intervals, rewind to any checkpoint, or fork into a new session. `Checkpoint`, `CheckpointStore` protocol, `InMemoryCheckpointStore`, `FileCheckpointStore`, `CheckpointMiddleware` (auto-save every tool/turn/manual), `CheckpointToolset` (save_checkpoint, list_checkpoints, rewind_to tools), `RewindRequested` exception for app-level rewind, `fork_from_checkpoint()` utility for session forking. Enable v	Low	2/17/2026
0.2.16	## [0.2.16] - 2025-02-12 ### Changed - Updated `subagents-pydantic-ai` dependency from `>=0.0.3` to `>=0.0.4` — fixes `AttributeError: 'Agent' object has no attribute '_register_toolset'` compatibility issue with pydantic-ai >= 1.38 ([subagents-pydantic-ai#5](https://github.com/vstorm-co/subagents-pydantic-ai/issues/5)) - Removed `_register_toolset` mock from test fixtures (`tests/conftest.py`) — no longer needed after subagents fix	Low	2/12/2026
0.2.15	## [0.2.15] - 2025-02-07 ### Added - `retries` parameter for `create_deep_agent()`: New explicit parameter (default: 3) that controls max retries for tool calls across all built-in toolsets. When the model sends invalid arguments (e.g. missing a required field), the validation error is fed back and the model can self-correct up to `retries` times. Previously, console tools (including `write_file`) were hardcoded to 1 retry via `FunctionToolset` default, making self-correction nearly im	Low	2/7/2026
0.2.14	## [0.2.14] - 2025-01-21 ### Changed - Breaking: Removed local `pydantic_deep/processors/` module - now uses external [summarization-pydantic-ai](https://github.com/vstorm-co/summarization-pydantic-ai) library - Breaking: Removed local `pydantic_deep/toolsets/subagents.py` module - now uses external [subagents-pydantic-ai](https://github.com/vstorm-co/subagents-pydantic-ai) library - Added `summarization-pydantic-ai>=0.0.1` dependency - Added `subagents-pydantic-ai>=0.0.3` depen	Low	1/23/2026
0.2.13	## [0.2.13] - 2025-01-17 ### Changed - Breaking: Updated `pydantic-ai-backend` dependency to `>=0.1.0` - Breaking: Removed `FilesystemBackend` and `LocalSandbox` - use `LocalBackend` instead - Breaking: Removed `FilesystemToolset` - use `create_console_toolset` from pydantic-ai-backend - Replaced custom filesystem toolset with `create_console_toolset` from pydantic-ai-backend - Re-exported `LocalBackend`, `create_console_toolset`, `get_console_system_prompt`, `ConsoleDeps`	Low	1/17/2026
0.2.12	## [0.2.12] - 2025-01-16 ### Changed - Updated `pydantic-ai-backend` dependency to `>=0.0.4` for persistent storage support - `__version__` now dynamically reads from package metadata (pyproject.toml) via `importlib.metadata` ### Documentation - Added persistent storage documentation to `docs/examples/docker-sandbox.md`: - `volumes` parameter for DockerSandbox - `workspace_root` parameter for SessionManager - Added `workspace_root` documentation to `docs/examples/docker-runti	Low	1/16/2026
0.2.11	# Summary Replaced pydantic-ai with pydantic-ai-slim. Thanks to this setting redundant dependencies are not installed. # Changes - replacing pydantic-ai with pydantic-ai-slim as a main package - setting cli, web and sanbox dependencies as optional to avoid installation of unnecessary libraries	Low	1/15/2026
0.2.10	# Summary Replaced pydantic-ai with pydantic-ai-slim. Thanks to this setting redundant dependencies are not installed. # Changes - replacing pydantic-ai with pydantic-ai-slim as a main package - setting cli, web and sanbox dependencies as optional to avoid installation of unnecessary libraries	Low	1/15/2026
0.2.9	## Summary - Migrate from local backends implementation to [`pydantic-ai-backend`](https://github.com/vstorm-co/pydantic-ai-backend) package from PyPI - Remove duplicated code (~1500 lines) from pydantic-deep - Update all imports and documentation to reference the new package ## Context Following the discussion in [pydantic/pydantic-ai#3747](https://github.com/pydantic/pydantic-ai/pull/3747) with @DouweM about extracting reusable components from pydantic-deep into standalone	Low	12/28/2025
0.2.8	# Changes ## Summary Enhanced file handling and metadata support across the core library. Added encoding detection, improved upload logic, and extended file metadata tracking. Business Context This change was needed to support robust handling of various file types (including binary and PDF), improve compatibility, and provide richer metadata for uploaded files. It enables better downstream processing and user experience for file-related features. ## Changes - Updated Docker backend	Low	12/28/2025
0.2.7	- Remove pydantic_deep/toolsets/todo.py (now in separate package) - Add pydantic-ai-todo>=0.1.0 as dependency (from PyPI) - Remove local path override in [tool.uv.sources] - Update imports to use pydantic_ai_todo directly - Fix import ordering (third-party before local) - Fix Todo re-export for mypy compatibility - Update documentation with pydantic-ai-todo references - Add pydantic-ai-todo callout to README - Add Issues URL to pyproject.toml	Low	12/23/2025
0.2.6	Release 0.2.6	Low	12/10/2025
0.2.5	Release 0.2.5	Low	12/10/2025
0.2.4	Release 0.2.4	Low	12/10/2025
0.2.3	Release 0.2.3	Low	12/9/2025
0.2.2	Release 0.2.2	Low	12/8/2025
0.2.1	Release 0.2.1	Low	12/8/2025
0.2.0	Release 0.2.0	Low	12/8/2025
0.1.0	## pydantic-deep v0.1.0 First public release of pydantic-deep - a deep agent framework built on pydantic-ai. ### Features - Agent Factory - `create_deep_agent()` for creating configured agents with sensible defaults - Multiple Backends - StateBackend (in-memory), FilesystemBackend, DockerSandbox, CompositeBackend - Rich Toolsets - TodoToolset, FilesystemToolset, SubAgentToolset, SkillsToolset - Skills System - Extensible skill definitions with markdown prompts and on-de	Low	11/29/2025

Dependencies & License Audit

Loading dependencies...

Similar Packages

langchainThe agent engineering platformlangchain-core==1.4.1

Deepagent-research-context-engineering🔍 Accelerate research using a Multi Agent System for efficient context engineering with DeepAgent and LangChain's library.main@2026-06-05

hermes-agentThe agent that grows with youv2026.6.5

airbyte-agent-connectors🐙 Drop-in tools that give AI agents reliable, permission-aware access to external systems.v0.1.226

airbyte-agent-sdk🐙 Drop-in tools that give AI agents reliable, permission-aware access to external systems.v0.1.226

More from vstorm-co

awesome-pydantic-ai An opinionated list of awesome Pydantic-AI frameworks, libraries, software and resources.

More in MCP Servers

node9-proxyThe Execution Security Layer for the Agentic Era. Providing deterministic "Sudo" governance and audit logs for autonomous AI agents.

mcp-compressorAn MCP server wrapper for reducing tokens consumed by MCP tools.

claude-plugins-officialOfficial, Anthropic-managed directory of high quality Claude Code Plugins.

langchain4jLangChain4j is an open-source Java library that simplifies the integration of LLMs into Java applications through a unified API, providing access to popular LLMs and vector databases. It makes impleme