Version: 3.6.0-beta
Status: Active
Audience: Developers using AI coding tools, technical leads, engineering teams
Convention: 💬 = paste into AI coding tool chat. 🖥️ = run in terminal.
Cyber Pilot is a traceable delivery system for requirements, design, plans, and code.
Stable identifiers and references connect requirements, design, plans, and implementation so drift is surfaced early instead of being reconstructed ad hoc during review and delivery.
For teams already using an AI coding tool, Cyber Pilot provides the operating controls needed to keep requirements, design, plans, and code traceable, reviewable, and enforceable as artifacts and implementation change:
- stable identifiers and cross-link validation to prove alignment across requirements, design, plans, and code
- deterministic `cpt` validation to check structure, references, consistency, and traceability locally and in CI
- templates, checklists, and staged workflows to gate generation, review, and validation through explicit stages with defined inputs, outputs, and checks
Jump to: Product shape | Fit and non-fit | Operating model | Traceability and validation model | Workflow model | Typical delivery sequence | Supported hosts | Evaluate Cyber Pilot | Installation and setup reference
- Requirements and design artifacts become the approved, file-backed source of scope, intent, and constraints for downstream work.
- Plans turn that approved intent into bounded execution shape before implementation sprawls across one long chat.
- Checklists make review and validation expectations visible instead of leaving them implicit in chat or memory.
- Implementation changes are reviewed against those approved artifacts rather than as isolated code diffs.
After `cpt init` and `cpt generate-agents`, Cyber Pilot typically adds a setup directory named `cypilot/`, generated AI coding tool integration files, and user-editable configuration under `config/` inside that setup directory.
`cypilot/` is the normal default for user projects. Self-hosted development in this repository uses `.bootstrap/` as a repo-specific exception described in CONTRIBUTING.md.
This repo-installed control surface is how Cyber Pilot becomes operationally real inside a repository rather than staying a chat convention. It is also the first concrete proof surface most teams can inspect directly: what is generated, what remains user-editable, what is optional, and what deterministic validation can see.
- Generated: AI coding tool integration files and repository wiring
- User-editable: project configuration, rules, and any installed kit content meant for local use
- Optional: installed kit content extends the base platform only when you want a more opinionated delivery model
- Validator-visible: artifacts, plans, and configuration participate in deterministic `cpt` checks when those configured surfaces are in use
| Surface | Typical location | Ownership |
|---|---|---|
| Setup directory | `cypilot/` | Created by setup; contains both generated and user-editable material |
| Host integration files | `.windsurf/`, `.cursor/`, `.claude/`, `.github/`, `.codex/`, `.agents/` | Generated by `cpt generate-agents`; regenerate when host integration changes |
| Project config | `cypilot/config/` | User-editable and reviewable in the repo |
| Installed kit content | `cypilot/config/kits/{slug}/` | User-editable local delivery surface for that kit |
| Self-hosted bootstrap copy in this repo only | `.bootstrap/` | Contributor-only special case; not the normal user-project layout |
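The ownership table above can be read as a simple path-to-owner mapping. The following is a minimal sketch of that reading, not part of Cyber Pilot itself: the function name `classify`, the label strings, and the example kit slug `my-kit` are all hypothetical, and it assumes the default `cypilot/` setup directory.

```python
from pathlib import PurePosixPath

# Host integration directories named in the ownership table.
HOST_DIRS = {".windsurf", ".cursor", ".claude", ".github", ".codex", ".agents"}


def classify(path: str, setup_dir: str = "cypilot") -> str:
    """Return a hypothetical ownership label for a repository-relative path."""
    parts = PurePosixPath(path).parts
    if not parts:
        return "unknown"
    if parts[0] in HOST_DIRS:
        # Coarse on purpose: a real .github/ directory may also hold
        # hand-written files that are not generated.
        return "generated host integration"
    if parts[0] == ".bootstrap":
        return "contributor-only bootstrap copy"
    if parts[0] == setup_dir:
        if parts[1:3] == ("config", "kits") and len(parts) > 3:
            return "installed kit content"
        if parts[1:2] == ("config",):
            return "user-editable project config"
        return "setup directory (mixed ownership)"
    return "outside the Cyber Pilot surface"


# "my-kit" is a made-up slug used only for illustration.
print(classify("cypilot/config/kits/my-kit/rules.md"))
print(classify(".cursor/rules/example.md"))
```

A mapping like this can help a reviewer decide whether a changed file should be edited by hand or regenerated with `cpt generate-agents`.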
Cyber Pilot has two main parts:
- Core platform — the repository wiring, workflow routing, configuration surfaces, deterministic validation, and chat-facing skill that make the delivery model operational and repeatable
- Kits — optional add-ons that specialize that same delivery model with domain-specific templates, rules, workflows, and validation material
Most teams should start with the core platform and add a kit later only if they want a ready-made delivery model for a specific domain or way of working. Kits extend the same underlying system rather than introducing a separate product shape.
In practice, teams usually encounter and touch Cyber Pilot through four main surfaces in the repository and toolchain:
| Surface | Form | Role |
|---|---|---|
| Primary AI surface | `cypilot <workflow>: <request>` | Main chat entry point for plan, generate, and analyze requests |
| Deterministic CLI | `cpt <command>` | Setup, validation, updates, and repeatable local or CI checks |
| Generated AI coding tool integration files | generated files in the repository | Connect the repository or workspace to supported tools without manual setup in each host |
| Optional kit content | installed kit content | Add domain-specific templates, rules, workflows, and validation material |
Use Cyber Pilot if you already work with an AI coding tool and the cost of ambiguity, rework, or review failure is high enough to justify more structure and control.
- Implementation work that needs to stay bounded, inspectable, and safer to continue across more than one step
- Alignment across handoffs or review checkpoints so requirements, design, plans, and implementation do not drift apart
- Review or delivery accountability when approved scope needs reviewable evidence, clearer status surfaces, or deterministic checks before merge
Cyber Pilot is a good fit when:
- you have a multi-step, higher-risk, or review-sensitive change where the cost of ambiguity or rework is higher than the cost of added structure
- the work needs bounded execution and reviewable alignment across approved inputs, implementation, and review, not just a quick diff
- you are working in a brownfield or unfamiliar area and need understanding before editing, not just speed while editing
- coordination, handoffs, or deterministic validation materially reduce risk before review or merge
Cyber Pilot is a poor fit when:
- the task is a tiny edit, throwaway spike, or open-ended exploration
- the change is already well understood, low risk, and fast to make locally without added coordination or review structure
- speed matters more than structure and the shape of the work is still unclear
- you do not want artifact-backed process, staged review, or deterministic validation overhead even when the task is non-trivial
- your team rejects workflow discipline or does not want to maintain the delivery surfaces that make review and validation easier
Useful first results include:
- a first bounded plan that makes a risky or unfamiliar change easier to execute in stages
- a first inspectable understanding surface for an unfamiliar area before you start changing code
- a first deterministic validation result or drift signal before review or merge instead of trusting one generation pass
- a first reviewable linkage between approved inputs and the implementation under review
Cyber Pilot is best understood as the workflow, context, and validation layer around your AI coding tool.
Four actors shape the operating model: the AI coding tool provides the environment, chat interface, and model access; the agent performs the reasoning and writing inside that environment; Cyber Pilot governs the repo-attached workflow, configuration, and validation surface around the work; and the human decides approval, adequacy, risk acceptance, and whether the result is acceptable to merge or ship.
Cyber Pilot makes that repo-attached surface more explicit by controlling what context and rules are loaded, what structured artifacts or checkpoints the task is expected to use, and what deterministic checks can later be run with cpt. It does not supply the underlying intelligence of the model, and it does not decide whether the final implementation is correct, well-designed, or acceptable to merge.
- Use the agent for
  - reasoning
  - writing
  - transformation
  - implementation judgment
- Use Cyber Pilot for
  - workflow selection and task framing
  - task-matched context and rule loading
  - templates, rules, and checklists
  - governing the repo-attached workflow, configuration, and validation surface around the work
  - bounding larger tasks into more controllable execution steps
For the same configured project surface and the same command or request shape, Cyber Pilot should make the same routing, context-loading, and check-execution decisions.
- Deterministic
  - config and resource resolution
  - routing into workflows and specialized commands
  - loading the same configured context and rules for the same task shape
  - invoking the same checks against the same configured project surface
- Non-deterministic
  - the agent's reasoning, writing quality, design quality, and implementation judgment
  - adequacy of the final solution
  - human approval, review, and merge decisions
This does not imply the same reasoning trace, implementation approach, code, or solution quality from run to run.
Cyber Pilot can constrain process, route work, and surface evidence repeatably, but it cannot guarantee implementation quality or replace human review.
- What tradeoff does Cyber Pilot make?
  - more maintained artifacts, explicit checkpoints, and review surface in exchange for more control, auditability, and repeatability
For the full fit / non-fit guidance, practical anti-patterns, planning heuristics, and workflow-choice rules, use guides/USAGE-GUIDE.md.
Cyber Pilot is strongest when the delivery surface is explicit and checkable.
The inspectable surface is the file-backed repository material a human can open, diff, review, and compare over time. The configured enforcement surface is the validator-visible subset of that material that the repository has explicitly chosen to subject to deterministic cpt checks.
- File-backed artifacts keep requirements, design, plans, and implementation visible as inspectable delivery inputs and outputs.
- Stable identifiers and cross-links connect those artifacts through one shared traceability surface.
- Templates, checklists, and file-backed plans create review surfaces that can be inspected, diffed, and repeated.
- Validation and review outputs become visible evidence alongside the work products they refer to.
- Drift signals become operationally visible through broken links, failed checks, and missing required structure instead of being reconstructed ad hoc later.
- Not every inspectable artifact is automatically enforced; deterministic enforcement applies only to file-backed, validator-visible material the repository has configured `cpt` to check.
- Enforceable means configured + validator-visible + deterministic rather than inferred from everything a human can see in the repository.
- IDs, required links, document structure, plans, and stage completeness become enforceable when they are part of that configured validation surface.
- The same configured surface can be checked locally and in CI so enforcement is repeatable instead of chat-dependent.
- Requirement captures the approved scope.
- Design records the intended structure, constraints, or boundary decisions.
- Plan breaks the change into bounded execution steps.
- Implementation provides traceable linked evidence back to that approved scope.
- Validation result shows whether the configured structure, links, and review surfaces still hold.
The chain exists through explicit linked artifacts, stable identifiers or references, file-backed plans or checkpoints, and validation outputs tied to the configured surface. It helps surface drift and broken alignment operationally; it does not prove semantic equivalence between the requirement and the implementation.
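To make the idea of a checkable chain concrete, here is a minimal sketch of reference-integrity checking. It is an illustration only and does not describe how `cpt` works internally: the ID convention (`REQ-001`, `DES-001`, `PLAN-001`), the function name, and the sample file contents are all hypothetical.

```python
import re

# Hypothetical ID convention for this sketch only, e.g. "REQ-001" or "DES-014".
ID_PATTERN = re.compile(r"\b(?:REQ|DES|PLAN)-\d+\b")


def broken_references(defined_ids: set[str], files: dict[str, str]) -> list[tuple[str, str]]:
    """Return (file, id) pairs where a file mentions an ID that is not defined anywhere."""
    broken = []
    for name in sorted(files):
        # Deduplicate and sort so the report is stable from run to run.
        for ref in sorted(set(ID_PATTERN.findall(files[name]))):
            if ref not in defined_ids:
                broken.append((name, ref))
    return broken


defined = {"REQ-001", "DES-001"}
files = {
    "plan.md": "Phase 1 implements REQ-001 per DES-001.",
    "src/auth.py": "# implements REQ-002\n",
}
print(broken_references(defined, files))  # [('src/auth.py', 'REQ-002')]
```

A check of this kind flags a dangling link, which is a drift signal. It says nothing about whether the code actually satisfies the requirement it points to.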
These are the main deterministic conformance classes applied to that configured surface.
- Artifact and document structure such as required shape, expected sections, and validator-visible files
- Identifier and reference integrity across requirements, design, plans, code, and their cross-links
- Required links and traceability rules that keep artifacts aligned through the same stable identifiers
- TOC and document consistency where those checks are part of the configured validation surface
- Plan, checklist, and stage completeness when those surfaces are file-backed and explicitly configured for checking
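As an illustration of the first conformance class (document structure), the sketch below reports required headings that are missing from a markdown document. The function name and the heading names are hypothetical; the sections a real Cyber Pilot template requires are defined by the project's configuration and kits, not by this example.

```python
def missing_sections(markdown_text: str, required: list[str]) -> list[str]:
    """Return required headings that do not appear in the document, in the order given."""
    present = {
        line.lstrip("#").strip().lower()
        for line in markdown_text.splitlines()
        if line.startswith("#")
    }
    return [name for name in required if name.lower() not in present]


doc = "# Login design\n\n## Scope\ntext\n\n## Constraints\ntext\n"
# "Open questions" is a made-up required section for this example.
print(missing_sections(doc, ["Scope", "Constraints", "Open questions"]))  # ['Open questions']
```

The result depends only on the file content and the required list, which is what makes this class of check repeatable locally and in CI.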
- Behavioral correctness, absence of defects, and implementation quality remain non-deterministic and still require review.
- Soundness of design decisions and adequacy of tests remain judgment-based even when the artifacts, structure, and links are checkable.
- Business or product adequacy remains outside deterministic proof.
- Human approval, merge, and ship decisions remain judgment-based even when the evidence surface is strong.
Cyber Pilot can surface missing, broken, stale, or inconsistent evidence without proving that the implementation is correct or adequate.
Cyber Pilot has three core workflows. Each has a portable chat form and, in some hosts, a matching slash-command alias.
| Workflow | Portable chat form | Matching alias in some hosts | Use it when |
|---|---|---|---|
| Plan | `cypilot plan: ...` | `/cypilot-plan` | the task is too large, risky, or context-heavy for one conversation |
| Generate | `cypilot generate: ...` | `/cypilot-generate` | you want to create, update, implement, or configure something |
| Analyze | `cypilot analyze: ...` | `/cypilot-analyze` | you want to validate, review, inspect, compare, or audit |
The portable `cypilot <workflow>: ...` form is the best default. Slash commands are host-specific aliases, not separate capabilities.
`plan`, `generate`, and `analyze` are reusable workflow modes, not a fixed mandatory sequence. They define how work is framed; the next section shows one common delivery order in which teams often combine them.
For default routing priorities and detailed workflow-choice advice, use guides/USAGE-GUIDE.md.
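The portable chat form has a regular shape, which is part of what makes routing repeatable. The sketch below shows one way such a message could be split into a workflow name and a request. It is an assumption-laden illustration: the source does not describe how the Cyber Pilot skill actually parses requests, and `parse_request` is a hypothetical name.

```python
from typing import Optional

# The three core workflows named in the table above.
WORKFLOWS = ("plan", "generate", "analyze")


def parse_request(message: str) -> Optional[tuple[str, str]]:
    """Split 'cypilot <workflow>: <request>' into (workflow, request), or return None."""
    head, sep, request = message.strip().partition(":")
    if not sep:
        return None  # no colon, so this is not the portable form
    words = head.split()
    if len(words) != 2 or words[0].lower() != "cypilot":
        return None
    workflow = words[1].lower()
    if workflow not in WORKFLOWS:
        return None
    return workflow, request.strip()


print(parse_request("cypilot plan: split the auth refactor into phases"))
print(parse_request("cypilot on"))  # None: activation, not a workflow request
```

Because the function is pure, the same message always maps to the same workflow, which mirrors the deterministic routing behavior described earlier in this document.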
This is one common order for combining the workflows when an early idea or PoC needs to become a production-ready change without losing scope or design intent.
In practice, teams usually move through four visible stages:
- Approve the requirement and design so the change starts from explicit scope and constraints.
- Use `plan` to split larger work into bounded phases before execution sprawls across one long chat.
- Use `generate` within approved scope so implementation stays tied to the intended change.
- Use `analyze` and deterministic checks before merge so review sees both the implementation and its validation surface.
In practice, this creates clearer boundaries, earlier drift detection, and more reliable review evidence than one long mixed-purpose chat.
Cyber Pilot works across multiple AI coding tools through the same portable cypilot workflow model, but some hosts preserve its workflow boundaries more fully than others. The differences are mainly in orchestration control, workflow separation, subagent support, manual discipline burden, and first-run clarity.
| Host | Workflow support profile | Operational tradeoff |
|---|---|---|
| Claude Code | Strongest starting point for the full Cyber Pilot workflow | Preserves workflow separation, subagent-assisted isolation, and separate generation/review passes with the least manual reconstruction |
| Cursor | Good editor-first support for everyday Cyber Pilot use | Portable workflows still work well, but orchestration boundaries and isolation are less explicit than in stronger workflow-oriented hosts |
| GitHub Copilot | Usable for structured GitHub-centered Cyber Pilot work | The same portable workflow model applies, but phase separation and task orchestration need more manual steering than in Claude Code |
| OpenAI Codex | Best for bounded, tightly scoped Cyber Pilot work | Works best when workflow boundaries are narrow and explicit; less natural for broader multi-stage delivery flow |
| Windsurf | Usable when you enforce workflow discipline manually | Portable workflows still apply, but weaker isolation means generation and review should stay in separate chats by convention |
If you are unsure where to start, Claude Code currently gives the clearest first experience for the full Cyber Pilot workflow because it best preserves workflow separation, orchestration control, and subagent-assisted isolation.
For host-specific setup guidance, deeper tradeoffs, and the full support matrix, use guides/AGENT-TOOLS.md.
Use this path if you are evaluating Cyber Pilot in a real repository and want one concrete result quickly.
- Pick one real repository and one narrow real input such as a requirement, design note, or change request that should produce a bounded, reviewable output.
- Complete the one-time setup for that repository using the installation and setup reference below so the repo is initialized and ready for Cyber Pilot.
- Activate Cyber Pilot in chat with 💬 `cypilot on` in the AI coding tool attached to that repository.
- Run one focused request with 💬 `cypilot analyze: ...` when you want an inspectable assessment of the input, or 💬 `cypilot plan: ...` when you want bounded execution steps before implementation.
- Run one deterministic check with 🖥️ `cpt validate --local-only` when you want to verify only the current repository, or 🖥️ `cpt validate` when cross-repo or workspace resolution is part of the trial.
- Use that validation step as the proof surface of the trial; this is where Cyber Pilot shows that it produced deterministic, validator-visible signals instead of only conversational output.
- Treat either a clean pass or an actionable failure as useful evidence; the most useful failures are localized, inspectable, and actionable rather than vague.
- The output stays anchored to the real requirement, design note, or change request you started from.
- One bounded and reviewable output appears such as a plan, inspectable summary, or validation surface you could act on immediately.
- One deterministic validation signal appears as either a clean local pass or a concrete failure you can inspect and act on.
- The next decision is clearer than it was before the trial, whether that means continue, narrow scope, or stop.
- Scope anchoring — whether the output stayed tied to the real requirement, design note, or change request.
- Reviewability — whether the resulting artifacts, plans, or validation outputs are easier to inspect than one long mixed-purpose chat.
- Evidence quality — whether the outputs or failures are localized, inspectable, and usable by someone other than the original chat author.
- Signal-to-effort — whether the trial produced enough useful signal to justify the setup and process overhead.
- Trust signal — whether you would trust the resulting surface enough to continue with a larger change.
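If you want to record the deterministic check from the trial in a script or CI job, a thin wrapper is usually enough. The sketch below is hypothetical: `run_check` is not a Cyber Pilot API, and it assumes that `cpt validate` reports failure through a non-zero exit code, which this document does not state explicitly.

```python
import subprocess


def run_check(command: list[str]) -> dict:
    """Run a command and report its exit code and combined output."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return {
        # Assumption: exit code 0 means the check passed.
        "passed": completed.returncode == 0,
        "exit_code": completed.returncode,
        "output": (completed.stdout + completed.stderr).strip(),
    }


# Example call for the trial (raises FileNotFoundError if cpt is not on PATH):
# result = run_check(["cpt", "validate", "--local-only"])
# print(result["passed"], result["output"])
```

Keeping the raw output alongside the pass/fail flag preserves the localized, inspectable evidence that the evaluation criteria above ask for.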
Jump to: Installation and setup reference | Configuration files | Extended operating modes | Project extensibility | Further reading
For a first trial, you need Python 3.11+, Git, and one supported AI coding tool such as Claude Code, Cursor, Windsurf, GitHub Copilot, or OpenAI Codex.
Python 3.11+ is the runtime for Cyber Pilot's repository-local scripts and CI, even when you do not install `cpt` globally yourself.
`pipx` is recommended when you want to install the `cpt` CLI globally and run it yourself. `gh` is optional for PR review and PR status workflows.
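The prerequisites above can be checked quickly before a trial. This is a minimal sketch using only the Python standard library; the function name and the dictionary keys are hypothetical and are not part of Cyber Pilot.

```python
import shutil
import sys


def check_prerequisites() -> dict[str, bool]:
    """Report which of the documented prerequisites are available on this machine."""
    return {
        "python>=3.11": sys.version_info >= (3, 11),
        "git": shutil.which("git") is not None,
        # pipx is only recommended, and gh is only needed for PR workflows.
        "pipx (recommended)": shutil.which("pipx") is not None,
        "gh (optional)": shutil.which("gh") is not None,
    }


for name, available in check_prerequisites().items():
    print(f"{name}: {'found' if available else 'missing'}")
```

The check does not cover the AI coding tool itself, since each supported host is installed and detected differently.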
Choose the path that matches the repository state.
- If the repository already includes Cyber Pilot
  - ensure Python 3.11+ is available for the repository-local scripts and CI
  - clone or open the repository in your supported AI coding tool
  - activate Cyber Pilot in chat with 💬 `cypilot on`
  - send one focused request with 💬 `cypilot analyze: ...` or 💬 `cypilot plan: ...`
- If the repository does not yet include Cyber Pilot
  - install `cpt` globally if you want to bootstrap the repository yourself
  - run the one-time repository setup steps below
  - then activate Cyber Pilot in chat and send the first focused request
If you need to bootstrap the repository yourself, use this one-time path:
- Install the CLI

  🖥️ Terminal: run `pipx install git+https://github.com/cyberfabric/cyber-pilot.git`, then `cpt --version`.

- Initialize the repository

  🖥️ Terminal: run `cpt init`, then `cpt generate-agents`.

  `cpt init` and `cpt generate-agents` are one-time repository bootstrap steps, not steps every downstream user must repeat.

  This creates a default setup directory `cypilot/`, generated AI coding tool integration files, and user-editable configuration under `config/` inside that setup directory.

- Activate Cyber Pilot in the AI coding tool chat with 💬 `cypilot on`
- Run one focused request with 💬 `cypilot analyze: ...` or 💬 `cypilot plan: ...`
For detailed host-specific setup, troubleshooting, and operational walkthroughs, use guides/AGENT-TOOLS.md and guides/USAGE-GUIDE.md.
The main top-level user-editable configuration lives under config/ inside your Cyber Pilot setup directory. Other parts of the setup directory may contain generated or supporting material, and installed kits can add their own editable surfaces.
A quick ownership rule:
| Surface | Ownership |
|---|---|
| `cypilot/config/` | User-editable control surface |
| `cypilot/config/kits/{slug}/` | Editable installed-kit content |
| Host integration files such as `.windsurf/`, `.cursor/`, `.claude/`, `.github/`, `.codex/`, `.agents/` | Generated by `cpt generate-agents` |
| `.bootstrap/` | Self-hosted contributor-only context |
You do not need full configuration mastery immediately. Treat these as the main top-level control files you can inspect, review, edit, and version in the repository.
| File | What it controls |
|---|---|
| `core.toml` | Top-level project settings, installed kits, and workspace registration |
| `artifacts.toml` | The project's artifact model, codebase mappings, and traceability structure |
| `AGENTS.md` | Task navigation rules that tell the agent which files to load for each job |
| `SKILL.md` | Always-on project instructions that apply across requests |
| `rules/*.md` | Optional topic-specific rules the agent loads for relevant tasks |
For full configuration details, advanced surfaces, and editing patterns, see Configuration guide.
You do not need these on day one. Add them when your use case justifies the extra surface area.
Cyber Pilot supports multi-repo workspaces. Use them when related docs, code, or shared kit assets live in separate repositories and still need to stay aligned.
Workspaces expand the reachable repository set. They do not replace project-level extensibility inside one repository.
For practical guidance, see guides/USAGE-GUIDE.md. For the full model and configuration rules, see requirements/workspace.md.
RalphEx support is optional.
When available, Cyber Pilot can hand off selected execution work to RalphEx under human supervision.
Use this when you want supervised execution handoff for bounded tasks instead of keeping all work interactive inside the current AI coding tool.
For when to delegate and how human review fits, see guides/USAGE-GUIDE.md.
Cyber Pilot supports project-level extensibility, not just installable kits. It can load project-defined skills, subagents, workflows, and rules, so teams can extend behavior without packaging everything as a kit.
Project extensibility changes the behavior available inside one repository. Workspaces connect multiple repositories. Teams can use both together: keep cross-repo traceability through workspaces while extending the local project behavior through project-defined skills, workflows, and rules.
For the full model and examples, see guides/PROJECT-EXTENSIBILITY.md.
Recommended reading path: README -> Usage guide -> Agent tools guide -> Configuration guide.
If you think a workflow is unclear, instructions behave incorrectly, a script behaves incorrectly, or important corner cases are missing, please open a GitHub issue.
- Issues list: github.com/cyberfabric/cyber-pilot/issues
- Create a new issue: github.com/cyberfabric/cyber-pilot/issues/new/choose
The most useful issue reports usually include:
- A short summary
- Affected file, workflow, script, or exact command
- Minimal reproduction steps
- Expected vs actual behavior
- Evidence such as exact command output, logs, validator output, screenshots, or a minimal prompt or plan slice
- Environment details such as OS, AI coding tool, model, and Cyber Pilot version if known
If you want to contribute, start with CONTRIBUTING.md.
Cyber Pilot is licensed under the Apache License 2.0. See LICENSE for details.

