agent-actions

Declarative framework for orchestrating multi-model LLM pipelines with context engineering and quality gates.

ai-agents anthropic context-engineering-framework llm orchestration prompt-engineering prompt-engineering-tool python yaml

README

Declarative LLM orchestration. Define workflows in YAML — each action gets its own model, context window, schema, and pre-check gate. The framework handles DAG resolution, parallel execution, batch processing, and output validation.

Warning

Experimental — Under active development. Expect breaking changes. Open an issue with feedback.

Agent Actions lifecycle: Define → Validate → Execute

actions:
  - name: extract_features
    intent: "Extract key product features from listing"
    model_vendor: anthropic              # Each action picks its own model
    model_name: claude-sonnet-4-20250514

  - name: generate_description
    dependencies: [extract_features]
    model_vendor: openai                 # Mix vendors in one pipeline
    model_name: gpt-4o-mini
    context_scope:
      observe:
        - extract_features.features      # See only what it needs
      drop:
        - source.raw_html                # Don't waste tokens on noise
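
The `dependencies` field above is what makes DAG resolution possible. As a rough illustration only (this is not the framework's own resolver), the ordering can be sketched with Python's standard `graphlib`. The `actions` list and the `execution_waves` helper are hypothetical stand-ins for the parsed YAML.

```python
from graphlib import TopologicalSorter

# Hypothetical in-memory form of the two actions above (only the fields used here).
actions = [
    {"name": "extract_features", "dependencies": []},
    {"name": "generate_description", "dependencies": ["extract_features"]},
]

def execution_waves(actions):
    """Group action names into waves; every action in a wave has all dependencies met."""
    graph = {a["name"]: set(a.get("dependencies", [])) for a in actions}
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # these could run in parallel
        waves.append(ready)
        sorter.done(*ready)
    return waves

print(execution_waves(actions))
# [['extract_features'], ['generate_description']]
```

Actions that land in the same wave do not depend on each other, which is the property a scheduler needs before running them in parallel.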

Install

pip install agent-actions

Quick start

agac init my-project && cd my-project                # scaffold a project
agac init --example contract_reviewer my-project     # or start from an example
agac run -a my_workflow                              # execute

Why not just write Python?

You will, until you have 15 steps, 3 models, batch retry, and a teammate asks what your pipeline does.

Capability	Agent Actions	Python script	n8n / Make
Per-step model selection	YAML field	Manual wiring	Per-node config
Context isolation per step	`observe` / `drop`	You build it	Not available
Pre-check guards (skip before LLM call)	`guard:`	If-statements	Post-hoc branching
Parallel consensus (3 voters + merge)	2 lines of YAML	Custom code	Many nodes + JS
Schema validation + auto-reprompt	Built in	DIY	Not available
Batch processing (1000s of records)	Built in	For-loops	Loop nodes
The YAML is the documentation	Yes	No	Visual graph
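
To make the "Context isolation per step" row concrete, here is a minimal sketch of an `observe` / `drop` filter. The semantics are assumptions made for illustration: an `observe` entry names either a whole namespace or a single `namespace.field`, `drop` removes specific fields afterwards, and a missing observed field skips the record instead of injecting `None` (as the v0.2.4 release note describes). `build_context` and the record layout are hypothetical, not the framework's API.

```python
def build_context(record, observe, drop=()):
    """Return the flat {path: value} view one action would see, or None to skip the record."""
    dropped = set(drop)
    context = {}
    for path in observe:
        namespace, _, field = path.partition(".")
        if namespace not in record:
            return None
        fields = record[namespace]
        if field:
            if field not in fields:
                return None  # missing observed field: skip rather than inject None
            selected = {field: fields[field]}
        else:
            selected = fields  # a bare namespace observes all of its fields
        for name, value in selected.items():
            full_path = f"{namespace}.{name}"
            if full_path not in dropped:
                context[full_path] = value
    return context

# Hypothetical record: one namespace per upstream action, plus the source data.
record = {
    "source": {"title": "Trail Shoe", "raw_html": "<div>...</div>"},
    "extract_features": {"features": ["waterproof", "lightweight"]},
}
print(build_context(record, ["extract_features.features", "source"], ["source.raw_html"]))
# {'extract_features.features': ['waterproof', 'lightweight'], 'source.title': 'Trail Shoe'}
```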

Examples

Example	Pattern	Key Features
Review Analyzer	Parallel consensus	3 independent scorers, vote aggregation, guard on quality threshold
Contract Reviewer	Map-reduce	Split clauses, analyze each, aggregate risk summary
Product Listing Enrichment	Tool + LLM hybrid	LLM generates copy, tool fetches pricing, LLM optimizes
Book Catalog Enrichment	Multi-step enrichment	BISAC classification, marketing copy, SEO metadata, reading level
Incident Triage	Parallel consensus	Severity classification, impact assessment, team assignment, response plan
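
The parallel consensus pattern used by Review Analyzer and Incident Triage can be sketched in a few lines of plain Python. This is an illustration of the idea, not the framework's implementation: the three scorers are keyword stand-ins for independent LLM calls, and `consensus` and `min_agreement` are hypothetical names.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def consensus(voters, record, min_agreement=2):
    """Run voters in parallel; return (label, votes), with label None if agreement is too low."""
    with ThreadPoolExecutor(max_workers=len(voters)) as pool:
        votes = list(pool.map(lambda vote: vote(record), voters))  # order matches voters
    label, count = Counter(votes).most_common(1)[0]
    return (label if count >= min_agreement else None), votes

# Stand-in scorers; in a real pipeline each would be a separate model call.
def scorer_a(review):
    return "positive" if "great" in review else "negative"

def scorer_b(review):
    return "positive" if "love" in review else "negative"

def scorer_c(review):
    return "negative" if "broke" in review else "positive"

label, votes = consensus([scorer_a, scorer_b, scorer_c], "great shoes, love them")
print(label, votes)
# positive ['positive', 'positive', 'positive']
```

Returning `None` when too few voters agree gives a downstream guard something to check before any further model call is made.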

Providers

Provider	Batch
OpenAI	Yes
Anthropic	Yes
Google Gemini	Yes
Groq	Yes
Mistral	Yes
Cohere	Online only
Ollama (local)	Online only

Switch providers per-action by changing model_vendor.
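
Because batch support differs by provider, switching `model_vendor` can also change how an action is able to run. The sketch below encodes the provider table as a lookup. The dictionary keys are illustrative labels (only `openai` and `anthropic` appear as `model_vendor` values in this README), and `pick_mode` and `batch_threshold` are hypothetical.

```python
# Batch support copied from the provider table. Keys are illustrative labels,
# not necessarily the values that model_vendor accepts.
BATCH_SUPPORT = {
    "openai": True,
    "anthropic": True,
    "google gemini": True,
    "groq": True,
    "mistral": True,
    "cohere": False,  # online only
    "ollama": False,  # online only
}

def pick_mode(vendor, record_count, batch_threshold=100):
    """Return 'batch' for large jobs on batch-capable providers, otherwise 'online'."""
    if vendor not in BATCH_SUPPORT:
        raise ValueError(f"unknown provider: {vendor}")
    if BATCH_SUPPORT[vendor] and record_count >= batch_threshold:
        return "batch"
    return "online"

print(pick_mode("anthropic", 5000))  # batch
print(pick_mode("ollama", 5000))     # online
```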

Key capabilities

Pre-flight validation — schemas, dependencies, templates, and credentials checked before any LLM call
Batch processing — route thousands of records through provider batch APIs
User-defined functions — Python tools for pre/post-processing and custom logic
Reprompting — auto-retry when LLM output doesn't match schema
Observability — per-action timing, token counts, and structured event logs
Interactive docs — agac docs builds and serves a visual workflow dashboard
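
As an illustration of the reprompting capability, the loop below retries a model call until the reply parses as JSON and contains the required keys. It is a sketch under stated assumptions: `call_with_reprompt` is a hypothetical helper, the check is a simple key test and not full schema validation, and `llm` is any callable that takes a prompt string and returns text.

```python
import json

def call_with_reprompt(llm, prompt, required, max_attempts=3):
    """Call llm(prompt) until it returns a JSON object containing every required key."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        raw = llm(prompt + feedback)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            data = None
        if isinstance(data, dict):
            missing = [key for key in required if key not in data]
            if not missing:
                return {"status": "success", "attempts": attempt, "output": data}
            problem = f"missing keys: {missing}"
        else:
            problem = "response was not a JSON object"
        # Tell the model what went wrong before the next attempt.
        feedback = f"\n\nYour previous answer was rejected ({problem}). Return valid JSON."
    return {"status": "exhausted", "attempts": max_attempts, "output": None}

# Stand-in model that fails once, then complies.
replies = iter(["not json", '{"features": ["waterproof"]}'])
result = call_with_reprompt(lambda prompt: next(replies), "List features.", ["features"])
print(result)
# {'status': 'success', 'attempts': 2, 'output': {'features': ['waterproof']}}
```

Reporting an explicit `exhausted` status keeps a record that ran out of attempts from being mistaken for a success.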

Documentation

Full docs — guides, tutorials, reference
Configuration — YAML schema reference
CLI — all commands and options

Contributing

git clone https://github.com/Muizzkolapo/agent-actions.git && cd agent-actions
pip install -e ".[dev]"
pytest

See CONTRIBUTING.md. Report bugs via Issues.

License

Apache License 2.0

Release History

Version	Changes	Urgency	Date
v0.2.4	## Bug Fix * Observe directives on missing fields now skip the record instead of silently injecting None into prompts * Empty dict responses `[{}]` no longer bypass `_is_empty_output` check; reprompt exhaustion now routes through EXHAUSTED disposition instead of SUCCESS ## Under the Hood * Release v0.2.3 — batch changie entries, bump version	High	6/6/2026
v0.2.3	## v0.2.3 - 2026-05-31 ### Enhancement or New Feature * Add incremental checkpointing for online processing — interrupted runs resume from the last checkpoint instead of reprocessing all records * Add field visibility settings panel to HITL review UI — reviewers can configure which fields stay expanded vs collapsed per session ### Under the Hood * Release v0.2.2 — batch changie entries, bump version * Remove dead VendorRegistry class, SINGLE_RESPONSE_CLIENTS empty set, and stale dispatch_task co	High	5/31/2026
v0.2.2	## v0.2.2 - 2026-05-30 ### Bug Fix * Add exponential backoff with jitter to batch retry loop to prevent retry storms when provider is overloaded * Stop silently graduating records on reprompt batch submission failure; transient and permanent errors now handled distinctly with proper metadata * Remove silent os.getenv API key fallbacks in batch client factory — batch clients now raise ConfigurationError when no key is configured instead of silently trying hardcoded vendor env vars * Log recovery	High	5/30/2026
v0.2.0	## v0.2.0 - 2026-05-23 ### Breaking Change * Removed cross-workflow chaining: upstream: config key, --upstream/--downstream CLI flags, WorkflowOrchestrator, and VirtualAction injection. Workflows are self-contained. Users with upstream: in config will see an unknown-key warning. ### Enhancement or New Feature * v0.2.0 Quality Hardening — 25 fixes from comprehensive codebase audit: - Storage errors now propagate instead of silently returning COMPLETED - Schema migration uses ALTER TABLE instead	High	5/23/2026
v0.1.17	## What's new in v0.1.17 40 commits since v0.1.16 — major reliability, observability, and UI improvements. ### Highlights Pipeline reliability - Record-level error isolation and selective retry (#539) - Async batch reprompt recovery loop (#528) - Batch target materialization — DB + disk parity (#527) - Halt workflow when upstream action fails all records (#523) - Unify batch and online guard evaluation, preflight, and lifecycle paths (#531) Observability - Reprompt observability	High	5/16/2026
v0.1.16	## What's Changed ### Breaking Changes - refactor: consolidate on_schema_mismatch into reprompt config (#493) — Remove top-level `on_schema_mismatch` and `strict_schema` config keys. Schema conformance is now declared inside `reprompt: {on_schema_mismatch: reject/reprompt}` ### Bug Fixes - fix: return error dict on JSON parse failure instead of raising (#492) — Return error dict instead of raising `VendorAPIError` on JSON parse failure in ollama cloud and openai online clients, enablin	High	5/4/2026
v0.1.14	## v0.1.14 ### Breaking Changes - `read_target()` now validates `_state` field on all records (fail-closed). Pre-v0.1.14 target data requires deleting `agent_io/target/` and re-running. - `StorageBackend` subclasses must implement `_read_target_raw()` (template method) ### Enhancements - Record State Machine — `RecordState` enum (8 states), `RecordEnvelope.transition()` as single lifecycle mutator, executor boundary reset, `derive_disposition()`, schema enforcement - Unified processing	High	5/2/2026
v0.1.13	## Framework v0.1.13 ### Architectural Unification - Guard evaluation: 5 entry points → 1 unified evaluate() with GuardBehavior enum - RecordProcessor: 4 independent instantiations → single create() factory - Lineage building: deduplicated across exhausted/passthrough builders - Context builder: 235-line monolith → 4 focused functions (59-line orchestrator) - Dead code deleted: HistoricalNodeDataLoader, BatchProcessor, apply_observe_filter (-2992 lines) ### Data Model Fixes - Fan-in merges all	High	4/25/2026
v0.1.12	## v0.1.12 — 2026-04-19 ### Enhancement or New Feature * Self-reflection feedback strategy for reprompt retries * LLM critique escalation for stubborn reprompt failures ### Under the Hood * Unified online and batch reprompt logic via shared helpers ### Bug Fix * Drop directives now correctly exclude fields from passthrough wildcards * Pass compiled schemas to Groq, Cohere, and Mistral APIs * Retry transient OpenAI 400 JSON parsing errors * Cascade failures report as blocked instead of indepen	High	4/19/2026
v0.1.11	## v0.1.11 ### Enhancement or New Feature * Cross-workflow chaining: `--upstream` and `--downstream` CLI flags, virtual action injection, and orchestrator support for multi-workflow pipelines * FILE-mode record identity: tools receive full records with `node_id`, framework matches outputs to inputs by identity instead of heuristics (breaking change for FILE tools — use `record["content"]["field"]` instead of `record["field"]`) * Step-by-step progress display for sequential execution mode *	High	4/13/2026
v0.1.10	### Bug Fixes - fix: HITL FILE granularity + guard pre-filter for FILE mode (#229) - fix: HITL FILE mode truncates lineage to just its own node_id (#231) - fix: resolve false-positive StaticTypeError for Jinja loop variables (#232) - fix: accept cross-workflow dependency syntax, fix 4 crash sites - fix: improve VS Code data card indentation, grammar, and rendering (#223) - fix: improve docs data card indentation, contrast, and grammar (#224) ### Tests - test: deterministic lineage integrity tes	High	4/11/2026
v0.1.9	### Bug Fixes - fix: resolve FileUDFResult lineage by source_mapping index, not source_guid scan (#220) - FILE-mode tools with multiple input records sharing the same source_guid (e.g. after flatten) now produce correct per-record lineage instead of all inheriting the first match - Fixes silent downstream data duplication when using context_scope.observe on pre-flatten ancestors - source_mapping is now fully consumed by the runtime — no longer validated-then-discarded	High	4/8/2026
v0.1.8	### Features - feat: enforce context_scope as sole data gate (#199) - feat: namespace llm_context by action to prevent version data loss (#208) - feat: redesign VS Code data card with 5-section tree view (#215, #216) ### Bug Fixes - fix: deterministic record matcher — exact node_id join, no fallbacks (#195) - fix(vscode): safe JSON injection for card view records (#197) - fix(cla): grant contents write permission for signature commits (#200, #201) - fix: resolve context_scope field prefix patte	Medium	4/8/2026
v0.1.7	### Features - feat(storage): prompt trace storage for compilation-level observability (#184) - feat(workflow): support record_limit on downstream actions (#187) - feat(ui): surface prompt traces in Data Explorer and VS Code extension (#189) ### Bug Fixes - fix(input): remove double-parse of JSON in staging pipeline (#185) - fix: guard error taxonomy and circuit breaker for semantic errors (#186) - fix: preflight guard condition AST validation for unquoted strings (#188) ### Tests - fix(tests)	Medium	4/6/2026
v0.1.6	### Bug Fixes - fix(docs): render images in workflow README tab (#170) - fix(docs): logs badge green when count is zero (#171) - fix(docs): use absolute URLs for README images on PyPI (#173) - fix(validation): scope prompt validator to current workflow (#182) ### Documentation - Add Project Surface and Dependencies to all 17 module manifests - Tighten AGENTS.md link to manifest system (#169)	Medium	4/5/2026
v0.1.5	## What's New - Rich execution summary — `agac run` now renders a structured post-execution tree showing per-action status, type badges, provider, and latency - Configurable seed data path — set `seed_data_path` in `agent_actions.yml` to rename your seed data directory - Unified `agac docs` command — replaces separate `docs generate` + `docs serve` with a single command - Cleaner CLI output — demoted 14x "fallback heuristic" spam and other noise to debug level ## Fixes - Place	Medium	4/4/2026
v0.1.4	## What's New - Runtime tab added to docs site logs screen - Gantt-style execution timeline on runs detail screen - Action metrics enriched with latency, provider, cache, and disposition - Runtime warnings extracted from events.json into catalog - Project name field added to agent_actions.yml - Retry failed/skipped actions on workflow re-run - Pre-flight guard checks for nullable fields flowing into tool schemas - ActionStatus enum and DEFERRED records audit trai	Medium	4/3/2026
v0.1.3	## Added - Record and file limit controls for capped test runs - Clean data card redesign with typography controls - Doc-vs-code audit framework and fix reprompt docs - Centralized data card component ## Fixed - Serialize LSP lifecycle and broaden activation events - Scan all workspace dirs for agac-lsp instead of hardcoded venv names ## Chores - Bump vscode extension to 0.1.5	Medium	3/28/2026
v0.1.2	## v0.1.2 - 2026-03-28 ### Fixed - Fix `agent_actions.llm.batch` missing from PyPI package (#37) - Root cause: `.gitignore` `batch/` pattern silently excluded the module via hatchling's VCS-awareness - Set `ignore-vcs = true` in hatch build config to decouple packaging from `.gitignore` ### Changed - Convert `agac init` to a command group with proper subcommands (#40) - `agac init list` — list available examples from GitHub - `agac init example <name>` — scaffold from a GitHub example	Medium	3/28/2026
v0.1.1	Description: ## What's Changed ### Fixes - Fixed missing `agent_actions.llm.batch` module in package build - Fixed VS Code extension LSP server auto-discovery (no manual path needed) - Fixed `seed_data` vs `seed.` reference syntax in workflow skills ### Added - VS Code extension published to marketplace (`runagac.agent-actions`) - `agac-lsp` entrypoint for Language Server Protocol support - New examples: review_analyzer, contract_reviewer, product_listing_enrichment ###	Medium	3/27/2026
v0.1.0	Description: ## Agent Actions v0.1.0 Declarative framework for orchestrating multi-model LLM pipelines with context engineering and quality gates. ### Highlights - Declarative YAML workflows — define multi-step LLM pipelines as configuration, not code - Context engineering — per-action control over what each LLM sees via `observe`, `drop`, `passthrough` - Multi-model support — OpenAI, Anthropic, Google Gemini, Ollama, Groq, Mistral, Cohere per action - **Parallel	Medium	3/27/2026
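
The v0.2.2 entry mentions exponential backoff with jitter in the batch retry loop. For readers unfamiliar with the technique, here is a generic sketch using the "full jitter" variant. It is not the project's code: `retry`, its parameters, and the use of `ConnectionError` as the transient error are all assumptions for illustration.

```python
import random
import time

def retry(operation, attempts=5, base=1.0, cap=60.0, sleep=time.sleep):
    """Call operation() until it succeeds, waiting a jittered, growing delay between failures."""
    for n in range(attempts):
        try:
            return operation()
        except ConnectionError:  # stand-in for a transient provider error
            if n == attempts - 1:
                raise  # out of attempts: surface the last error
            # Full jitter: uniform delay between 0 and the capped exponential bound.
            sleep(random.uniform(0, min(cap, base * 2 ** n)))

# Demo: fail twice, then succeed. Delays are recorded instead of slept.
calls = {"count": 0}

def flaky_submit():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("provider overloaded")
    return "batch accepted"

delays = []
print(retry(flaky_submit, sleep=delays.append))  # batch accepted
```

Randomizing each delay spreads retries from many clients over time, which is what prevents the retry storms the release note refers to.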
