AgentBox

Run agents inside sandboxes. One API, any provider.

import { Agent, Sandbox } from "agentbox-sdk";

const sandbox = new Sandbox("local-docker", {
  workingDir: "/workspace",
  image: process.env.IMAGE_ID!,
  env: { ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY! },
});

const run = new Agent("claude-code", {
  sandbox,
  cwd: "/workspace",
  approvalMode: "auto",
}).stream({
  model: "claude-sonnet-4-6",
  input: "Create a hello world Express server in /workspace/server.ts",
});

for await (const event of run) {
  if (event.type === "text.delta") process.stdout.write(event.delta);
}

await sandbox.delete();

Providers are mix-and-match:

Agents — claude-code, opencode, codex
Sandboxes — local-docker, e2b, modal, daytona

Swap either one and your app code stays the same.
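
Because providers are plain strings, the choice can live in configuration rather than code. The sketch below validates a pair of names against the lists above before they reach the constructors. AGENT_PROVIDER and SANDBOX_PROVIDER are hypothetical variable names chosen for this example; the SDK is not known to read them itself.

```typescript
// Provider names copied from the lists above.
const AGENT_PROVIDERS = ["claude-code", "opencode", "codex"] as const;
const SANDBOX_PROVIDERS = ["local-docker", "e2b", "modal", "daytona"] as const;

type AgentProvider = (typeof AGENT_PROVIDERS)[number];
type SandboxProvider = (typeof SANDBOX_PROVIDERS)[number];

interface ProviderChoice {
  agent: AgentProvider;
  sandbox: SandboxProvider;
}

// AGENT_PROVIDER and SANDBOX_PROVIDER are hypothetical names for this sketch.
function chooseProviders(
  env: Record<string, string | undefined>,
): ProviderChoice {
  const agent = env.AGENT_PROVIDER ?? "claude-code";
  const sandbox = env.SANDBOX_PROVIDER ?? "local-docker";

  if (!(AGENT_PROVIDERS as readonly string[]).includes(agent)) {
    throw new Error(`Unknown agent provider: ${agent}`);
  }
  if (!(SANDBOX_PROVIDERS as readonly string[]).includes(sandbox)) {
    throw new Error(`Unknown sandbox provider: ${sandbox}`);
  }
  return { agent: agent as AgentProvider, sandbox: sandbox as SandboxProvider };
}
```

Calling chooseProviders(process.env) yields a pair that can be passed to new Agent(choice.agent, ...) and new Sandbox(choice.sandbox, ...). The model string still has to match the agent you pick, as described under Agents.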

Install

npm install agentbox-sdk

Requires Node >= 20. The agent CLI you plan to use (claude, opencode, or codex) must be installed inside your sandbox image.

Getting started

1. Build a sandbox image

AgentBox ships with built-in image presets. Build one for your sandbox provider:

npx agentbox image build --provider local-docker --preset browser-agent

This prints an image reference: a Docker tag, Modal image ID, E2B template, or Daytona snapshot, depending on the provider. Export it as IMAGE_ID:

export IMAGE_ID=<printed value>
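
The snippets in this README read IMAGE_ID with a non-null assertion (process.env.IMAGE_ID!), which hides a missing value until the sandbox fails to start. A small guard can fail earlier with a clearer message. requireEnv is a hypothetical helper for this sketch, not part of the SDK.

```typescript
// Returns the value of a required variable, or throws a descriptive error.
// The env object is passed in so the helper is easy to test.
function requireEnv(
  key: string,
  env: Record<string, string | undefined>,
): string {
  const value = env[key];
  if (value === undefined || value.trim() === "") {
    throw new Error(`${key} is not set`);
  }
  return value;
}
```

With this in place, image: requireEnv("IMAGE_ID", process.env) can stand in for image: process.env.IMAGE_ID!.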

2. Run an agent

import { Agent, Sandbox } from "agentbox-sdk";

const sandbox = new Sandbox("local-docker", {
  workingDir: "/workspace",
  image: process.env.IMAGE_ID!,
  env: { ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY! },
});

const agent = new Agent("claude-code", {
  sandbox,
  cwd: "/workspace",
  approvalMode: "auto",
});

const result = await agent.run({
  model: "claude-sonnet-4-6",
  input:
    "Explain the project structure and write a summary to /workspace/OVERVIEW.md",
});

console.log(result.text);
await sandbox.delete();

3. Stream events

agent.stream() returns an async iterable of normalized events:

const run = agent.stream({
  model: "claude-sonnet-4-6",
  input: "Write a fizzbuzz in Python",
});

for await (const event of run) {
  if (event.type === "text.delta") {
    process.stdout.write(event.delta);
  }
}

const result = await run.finished;
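
If you want the streamed text as one string, the deltas can be accumulated while iterating. The sketch below runs against a stand-in generator so it works without a sandbox. Only the "text.delta" event with a delta field appears in this README; the StreamEventLike type and the "tool.call" event name are assumptions made for illustration.

```typescript
// Minimal event shape for this sketch. Only "text.delta" with a string
// `delta` is taken from the README; everything else is assumed.
interface StreamEventLike {
  type: string;
  delta?: string;
}

// Concatenates every text delta from an async iterable of events.
async function collectText(
  events: AsyncIterable<StreamEventLike>,
): Promise<string> {
  let text = "";
  for await (const streamEvent of events) {
    if (streamEvent.type === "text.delta" && typeof streamEvent.delta === "string") {
      text += streamEvent.delta;
    }
  }
  return text;
}

// Stand-in stream for trying the helper without a real agent run.
async function* fakeStream(): AsyncGenerator<StreamEventLike> {
  yield { type: "text.delta", delta: "Hello, " };
  yield { type: "tool.call" }; // hypothetical non-text event, ignored
  yield { type: "text.delta", delta: "world" };
}
```

In real code the argument would be the value returned by agent.stream(), assuming its events are structurally compatible with the shape above.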

Agents

Three agent providers are supported. Each wraps a CLI that runs inside the sandbox:

Provider	CLI	Model format
`claude-code`	`claude`	`claude-sonnet-4-6`
`opencode`	`opencode`	`anthropic/claude-sonnet-4-6`, `openai/gpt-4.1`
`codex`	`codex`	`gpt-5-codex`

new Agent("claude-code", { sandbox, cwd: "/workspace", approvalMode: "auto" });
new Agent("opencode", { sandbox, cwd: "/workspace", approvalMode: "auto" });
new Agent("codex", { sandbox, cwd: "/workspace", approvalMode: "auto" });
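
The model formats in the table differ in one respect: opencode expects a vendor prefix, while claude-code and codex take the bare model name. When the agent is chosen at runtime, a helper along these lines keeps the model string consistent. modelIdFor is a hypothetical name, and the rule is inferred only from the examples in the table.

```typescript
type AgentName = "claude-code" | "opencode" | "codex";

// Formats a model id following the examples in the table above:
// opencode uses "vendor/model", the other agents use the bare model name.
function modelIdFor(agent: AgentName, vendor: string, model: string): string {
  return agent === "opencode" ? `${vendor}/${model}` : model;
}
```

For example, modelIdFor("opencode", "anthropic", "claude-sonnet-4-6") returns "anthropic/claude-sonnet-4-6", and the same call with "claude-code" returns "claude-sonnet-4-6".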

Sandboxes

Four sandbox providers are supported. Each gives you an isolated environment with the same interface:

Provider	What it is	Auth
`local-docker`	Local Docker container	Docker daemon
`e2b`	Cloud micro-VM	`E2B_API_KEY`
`modal`	Cloud container	`MODAL_TOKEN_ID` + `MODAL_TOKEN_SECRET`
`daytona`	Cloud dev environment	`DAYTONA_API_KEY`

Every sandbox supports: run(), runAsync(), gitClone(), openPort(), getPreviewLink(), snapshot(), stop(), delete().
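
The Auth column in the table above can be checked before a sandbox is created, so a missing credential surfaces as a readable error rather than a provider failure. This is a sketch; missingAuth is a hypothetical helper, and the variable names are the ones listed in the table.

```typescript
type SandboxName = "local-docker" | "e2b" | "modal" | "daytona";

// Environment variables each provider needs, per the table above.
const REQUIRED_ENV: Record<SandboxName, string[]> = {
  "local-docker": [], // relies on the local Docker daemon, no API key
  e2b: ["E2B_API_KEY"],
  modal: ["MODAL_TOKEN_ID", "MODAL_TOKEN_SECRET"],
  daytona: ["DAYTONA_API_KEY"],
};

// Returns the names of required variables that are unset or empty.
function missingAuth(
  sandbox: SandboxName,
  env: Record<string, string | undefined>,
): string[] {
  return REQUIRED_ENV[sandbox].filter((key) => !env[key]);
}
```

A caller might run missingAuth("modal", process.env) and throw if the returned list is not empty.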

Skills

Attach GitHub repos as agent skills. They're cloned into the sandbox and surfaced to the agent:

const agent = new Agent("claude-code", {
  sandbox,
  cwd: "/workspace",
  approvalMode: "auto",
  skills: [
    {
      name: "agent-browser",
      repo: "https://github.com/vercel-labs/agent-browser",
    },
  ],
});

You can also embed skills inline:

skills: [
  {
    source: "embedded",
    name: "lint-fix",
    files: {
      "SKILL.md": "Run `npm run lint:fix` and verify the output is clean.",
    },
  },
],

Sub-agents

Delegate tasks to specialized sub-agents:

const agent = new Agent("claude-code", {
  sandbox,
  cwd: "/workspace",
  approvalMode: "auto",
  subAgents: [
    {
      name: "reviewer",
      description: "Reviews code for bugs and security issues",
      instructions:
        "Flag bugs, security issues, and missing edge cases. Be concise.",
      tools: ["bash", "read"],
    },
  ],
});

MCP servers

Connect MCP servers to give agents access to external tools:

const agent = new Agent("claude-code", {
  sandbox,
  cwd: "/workspace",
  approvalMode: "auto",
  mcps: [
    {
      name: "filesystem",
      type: "local",
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-filesystem", "/workspace"],
    },
    {
      name: "my-api",
      type: "remote",
      url: "https://mcp.example.com/sse",
    },
  ],
});
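
The two entries above differ by their type field: a local server is started from a command, a remote one is reached by URL. If the list is built from user input, a quick shape check before constructing the agent can catch typos. The types below are inferred from the example and are assumptions; the SDK's own types may carry more fields.

```typescript
// Shapes inferred from the example above; not the SDK's published types.
type LocalMcp = { name: string; type: "local"; command: string; args?: string[] };
type RemoteMcp = { name: string; type: "remote"; url: string };
type McpConfig = LocalMcp | RemoteMcp;

// Returns a list of human-readable problems; an empty list means the
// entries pass these basic checks.
function mcpProblems(servers: McpConfig[]): string[] {
  const problems: string[] = [];
  for (const server of servers) {
    if (server.type === "local" && server.command.trim() === "") {
      problems.push(`${server.name}: empty command`);
    }
    if (server.type === "remote") {
      try {
        new URL(server.url); // throws on a malformed URL
      } catch {
        problems.push(`${server.name}: invalid url`);
      }
    }
  }
  return problems;
}
```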

Custom commands

Register named prompt templates that the agent can invoke as commands:

const agent = new Agent("opencode", {
  sandbox,
  cwd: "/workspace",
  approvalMode: "auto",
  commands: [
    {
      name: "triage",
      description: "Triage a bug report into root cause + fix plan",
      template:
        "Analyze the bug report. Return: root cause, files to change, and tests to add.",
    },
  ],
});

Multimodal input

Pass images and files alongside text:

import { pathToFileURL } from "node:url";

const result = await agent.run({
  model: "claude-sonnet-4-6",
  input: [
    { type: "text", text: "Describe this mockup and suggest improvements." },
    { type: "image", image: pathToFileURL("/workspace/mockup.png") },
  ],
});

Provider support: opencode (text, images, files), claude-code (text, images, PDFs), codex (text, images).
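
The support line above can be encoded as a lookup so unsupported inputs are rejected before a run starts. This is a sketch: the "pdf" and "file" labels are hypothetical names for this example (the README only shows "text" and "image" as part types), and opencode's "files" is not assumed to include PDFs.

```typescript
type AgentId = "claude-code" | "opencode" | "codex";
type InputKind = "text" | "image" | "pdf" | "file";

// Mirrors the provider support line above.
const SUPPORTED_INPUTS: Record<AgentId, readonly InputKind[]> = {
  opencode: ["text", "image", "file"],
  "claude-code": ["text", "image", "pdf"],
  codex: ["text", "image"],
};

// Returns the kinds from `kinds` that the given agent is not listed as supporting.
function unsupportedKinds(agent: AgentId, kinds: InputKind[]): InputKind[] {
  return kinds.filter((kind) => !SUPPORTED_INPUTS[agent].includes(kind));
}
```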

Custom sandbox images

Define your own image when the built-in presets don't cover your needs.

Create my-image.mjs:

export default {
  name: "playwright-sandbox",
  base: "node:20-bookworm",
  env: { PLAYWRIGHT_BROWSERS_PATH: "/ms-playwright" },
  run: [
    "apt-get update && apt-get install -y git python3 ca-certificates",
    "npm install -g pnpm @anthropic-ai/claude-code",
    "npx playwright install --with-deps chromium",
  ],
  workdir: "/workspace",
  cmd: ["sleep", "infinity"],
};

Build it:

npx agentbox image build --provider local-docker --file ./my-image.mjs

The same command works with every provider. For cloud providers, the printed value is that provider's native image reference.
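
To make the fields of an image definition concrete, the sketch below renders one as Dockerfile text. It is only an illustration of what each field plausibly corresponds to for local-docker; how agentbox actually builds images is not described here, and the ImageDefinition type and toDockerfile function are assumptions.

```typescript
// Field names taken from my-image.mjs above; optionality is assumed.
interface ImageDefinition {
  name: string;
  base: string;
  env?: Record<string, string>;
  run?: string[];
  workdir?: string;
  cmd?: string[];
}

// Renders a definition as Dockerfile text, one instruction per field.
function toDockerfile(def: ImageDefinition): string {
  const lines = [`FROM ${def.base}`];
  for (const [key, value] of Object.entries(def.env ?? {})) {
    lines.push(`ENV ${key}=${JSON.stringify(value)}`);
  }
  for (const command of def.run ?? []) {
    lines.push(`RUN ${command}`);
  }
  if (def.workdir) lines.push(`WORKDIR ${def.workdir}`);
  if (def.cmd) lines.push(`CMD ${JSON.stringify(def.cmd)}`); // exec form
  return lines.join("\n");
}
```

Read this way, base picks the starting image, each run entry becomes a build step, and cmd keeps the container alive so the agent CLI can be executed inside it.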

Hooks

Hooks let you run code at specific points in the agent lifecycle. Each provider has its own hook format:

Claude Code — native hook settings:

new Agent("claude-code", {
  sandbox,
  cwd: "/workspace",
  provider: {
    hooks: {
      PostToolUse: [
        { matcher: "Bash", hooks: [{ type: "command", command: "echo done" }] },
      ],
    },
  },
});

Codex — similar to Claude Code:

new Agent("codex", {
  sandbox,
  cwd: "/workspace",
  provider: {
    hooks: {
      PostToolUse: [
        { matcher: "Bash", hooks: [{ type: "command", command: "echo done" }] },
      ],
    },
  },
});

OpenCode — plugin-based hooks:

new Agent("opencode", {
  sandbox,
  cwd: "/workspace",
  provider: {
    plugins: [
      {
        name: "session-notifier",
        hooks: [{ event: "session.idle", body: 'return "session-idle";' }],
      },
    ],
  },
});

Examples

The examples/ directory has short, runnable scripts that each demonstrate one feature:

Example	What it shows
`basic.ts`	Minimal agent + sandbox
`streaming.ts`	Stream and handle events
`interactive-approval.ts`	Approve tool calls from stdin
`skills.ts`	Attach a GitHub skill
`sub-agents.ts`	Delegate to sub-agents
`mcp-server.ts`	Connect an MCP server
`multimodal.ts`	Send images to the agent
`custom-image.ts`	Build a custom sandbox image
`cloud-sandbox.ts`	Use E2B, Modal, or Daytona
`git-clone.ts`	Clone a repo into the sandbox

All examples import from "agentbox-sdk" like a normal dependency. Run them with:

npx tsx examples/basic.ts

Package exports

import { Agent, Sandbox } from "agentbox-sdk"; // main entrypoint
import type { AgentRun } from "agentbox-sdk/agents"; // agent types
import type { CommandResult } from "agentbox-sdk/sandboxes"; // sandbox types
import type { NormalizedAgentEvent } from "agentbox-sdk/events"; // event types

Contributing

npm install
npm run build
npm run typecheck
npm test

npm run build generates the dist/ directory. Run it before using the examples or the CLI locally; both depend on the build output.

To test your local build from another project:

npm run build && npm pack
# then in your project:
npm install /path/to/agentbox-sdk-0.1.0.tgz

Tests

npm test                                              # fast, no real providers
AGENTBOX_RUN_SMOKE_TESTS=1 npm run test:smoke         # live smoke tests
AGENTBOX_RUN_MATRIX_E2E=1 npm run test:e2e:matrix     # provider matrix
AGENTBOX_RUN_LOCAL_DOCKER_E2E=1 npm run test:e2e:local-docker  # local Docker e2e

Live test suites are opt-in because they provision real infrastructure.

License

MIT
