Automated exploratory QA testing for web applications, powered by Playwright and, optionally, LLMs (Claude or GPT-4o).
Point QA Agent at a URL and it explores your application like a real user: clicking buttons, filling forms, navigating with the keyboard, and checking for accessibility issues. Then it reports what it finds. No test scripts to write or maintain.
Need targeted tests? Pass plain-English instructions and an LLM generates custom Playwright steps that run alongside the standard suite.
- Features
- Installation
- Quick Start
- Agentic Testing
- Web Interface
- CLI Reference
- Programmatic Usage
- Test Categories
- Output Formats
- CI/CD Integration
- Architecture
- Development
- Contributing
- Exit Codes
- Troubleshooting
- License
| Category | What it does |
|---|---|
| Agentic testing | Give Claude or GPT-4o a bug report or feature spec; it generates custom Playwright test steps automatically |
| Two modes | focused tests only given URLs; explore crawls and discovers pages |
| Six test suites | Keyboard · mouse · forms · accessibility · error detection (on by default) + WCAG 2.1 AA compliance (opt-in) |
| Auth support | Username/password, cookies, Bearer tokens, custom headers |
| Four output formats | Console, Markdown, JSON, PDF |
| Screenshots & video | On-error or every-interaction screenshots; full session recording |
| Web UI | Dashboard for launching runs, live output, and browsing past sessions |
| CI/CD ready | Exit codes map to pass/fail; JSON output integrates with any pipeline |
Requires Python 3.10+. Check with python --version.
pip install qa-agent # standard testing (Playwright only)
playwright install chromium # required: downloads browser binaries
Optional extras:
pip install "qa-agent[ai]" # agentic testing (convenience marker; no extra packages needed)
pip install "qa-agent[pdf]" # PDF reports (adds WeasyPrint)
pip install "qa-agent[web]" # web UI (adds Flask)
pip install "qa-agent[all]" # everything above
Agentic testing requires an API key for your chosen provider:
export ANTHROPIC_API_KEY=sk-ant-... # Anthropic (default)
export OPENAI_API_KEY=sk-... # OpenAI
playwright install chromium must run once after every fresh install. See Troubleshooting if anything goes wrong.
# Test a single URL
qa-agent https://example.com
# Test multiple URLs
qa-agent https://example.com https://example.com/about
# Crawl and test discovered pages
qa-agent --mode explore --max-depth 2 https://example.com
# Generate reports in a custom directory
qa-agent --output json,markdown --output-dir ./reports https://example.com
# Run via module
python -m qa_agent https://example.com
Pass natural-language instructions and an LLM generates custom test steps that run alongside the standard suite. Supports Anthropic (Claude) and OpenAI (GPT-4o and others). No third-party AI packages are required: all API calls use Python's built-in urllib.
# From a bug report (Anthropic, default)
qa-agent --instructions "The login button does nothing when email is blank" \
https://example.com/login
# Using OpenAI instead
qa-agent --llm openai --instructions "The login button does nothing when email is blank" \
https://example.com/login
# From a feature spec
qa-agent --instructions "The 'Remember me' checkbox should be unchecked by default \
and persist the session across browser restarts." \
https://example.com/login
# From a file
qa-agent --instructions-file feature-spec.txt https://example.com
- The LLM receives your instructions and the target URL.
- It returns a structured plan: summary, focus areas, and custom Playwright test steps.
- The agent runs those steps on every tested page alongside the standard suites.
- Assertion failures become findings in the report with the severity the LLM assigned.
If the API call fails (or the key is missing), a warning is printed and the run continues with standard tests only.
# Choose provider (default: anthropic)
qa-agent --llm anthropic --instructions "Test checkout" https://shop.example.com
qa-agent --llm openai --instructions "Test checkout" https://shop.example.com
# Override model (defaults: anthropic -> claude-sonnet-4-6, openai -> gpt-4o)
qa-agent --llm openai --ai-model gpt-4o-mini --instructions "Test checkout" https://shop.example.com
# Bypass the plan cache
qa-agent --no-cache --instructions "..." https://example.com
Plans are cached to ~/.qa_agent/cache/ (24-hour TTL). Pass --no-cache to force a fresh API call.
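The caching behavior can be pictured with a minimal sketch. This is not qa-agent's actual plan_cache.py: the key derivation, the one-file-per-plan layout, and the function names (`cache_key`, `save_plan`, `load_plan`) are assumptions made for illustration. Only the 24-hour TTL and the use of a filesystem cache come from this README.

```python
import hashlib
import json
import time
from pathlib import Path
from typing import Optional

TTL_SECONDS = 24 * 60 * 60  # 24-hour TTL, as stated in this README


def cache_key(instructions: str, url: str, provider: str, model: str) -> str:
    """Hypothetical key: a hash of every input that could change the plan."""
    raw = json.dumps([instructions, url, provider, model])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def save_plan(cache_dir: Path, key: str, plan: dict) -> None:
    """Write the plan as JSON; the file's mtime records when it was cached."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    (cache_dir / f"{key}.json").write_text(json.dumps(plan), encoding="utf-8")


def load_plan(cache_dir: Path, key: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the cached plan, or None if it is missing or older than the TTL."""
    path = cache_dir / f"{key}.json"
    if not path.exists():
        return None
    now = time.time() if now is None else now
    if now - path.stat().st_mtime > TTL_SECONDS:
        return None  # expired: the caller should make a fresh API call
    return json.loads(path.read_text(encoding="utf-8"))
```

Under this scheme, changing the instructions, URL, provider, or model produces a different key and therefore a cache miss, which is one plausible reason a cache like this would rarely serve a stale plan.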
python -m qa_agent web # http://127.0.0.1:5000
qa-agent-web --host 0.0.0.0 --port 8080 # custom bind
- Configuration form with all CLI options
- Real-time streaming output (Server-Sent Events)
- Stop a running test mid-run
- Browse past sessions grouped by domain
- Session detail: findings table, severity breakdown, screenshot gallery, report downloads
The web UI has no authentication; it is intended for local or internal use only.
Output is written to output/ by default. CLI sessions appear in the web UI automatically (JSON is always written).
qa-agent --version
qa-agent --help
qa-agent --mode focused https://example.com # default: test only given URLs
qa-agent --mode explore https://example.com # crawl and test discovered pages

| Flag | Default | Description |
|---|---|---|
| --max-depth N | 3 | Max link depth |
| --max-pages N | 20 | Max pages to test |
| --allow-external | off | Follow links to other domains |
| --ignore PATTERN | none | URL regex to skip (repeatable) |
qa-agent --auth "user:pass@https://example.com/login" https://example.com/dashboard
qa-agent --auth-file auth.json https://example.com
qa-agent --cookies cookies.json https://example.com
qa-agent --header "Authorization: Bearer token123" https://example.com
auth.json schema
{
  "username": "testuser",
  "password": "testpass",
  "auth_url": "https://example.com/login",
  "username_selector": "input#email",
  "password_selector": "input#password",
  "submit_selector": "button[type=submit]"
}
qa-agent --output console,markdown,json,pdf https://example.com
qa-agent --output-dir ./reports https://example.com
Default: console,markdown. JSON is always written regardless of --output (for web UI discovery). Output is organized as output/{domain}/{session_id}/qa_reports|screenshots|recordings.
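Because reports land in a predictable directory layout, a script can locate them after a run. The sketch below assumes only the output/{domain}/{session_id}/qa_reports layout described in this README. The report file names are not specified here, so it matches any .json file, and the helper name `find_json_reports` is made up for this example.

```python
from pathlib import Path


def find_json_reports(output_dir: str = "output") -> list[Path]:
    """Return JSON reports under output_dir, newest first.

    Assumes the layout {output_dir}/{domain}/{session_id}/qa_reports/*.json.
    """
    reports = Path(output_dir).glob("*/*/qa_reports/*.json")
    return sorted(reports, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for report in find_json_reports():
        # Path parts end with: domain, session_id, "qa_reports", file name.
        domain, session_id = report.parts[-4], report.parts[-3]
        print(f"{domain} {session_id}: {report}")
```

If you pass --output-dir, hand the same directory to `find_json_reports`.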
PDF requires the [pdf] extra. Falls back to Markdown if WeasyPrint is not installed.
qa-agent --screenshots https://example.com # on errors
qa-agent --screenshots-all https://example.com # every interaction
qa-agent --full-page https://example.com # full-page captures
qa-agent --record https://example.com # session video
qa-agent --no-headless # visible browser window
qa-agent --viewport 1920x1080 # default: 1280x720
qa-agent --timeout 60000 # ms, default: 30000
# Skip standard suites
qa-agent --skip-keyboard https://example.com
qa-agent --skip-mouse https://example.com
qa-agent --skip-forms https://example.com
qa-agent --skip-accessibility https://example.com
qa-agent --skip-errors https://example.com
# Enable opt-in suites
qa-agent --wcag-compliance https://example.com

| Flag | Default | Description |
|---|---|---|
| --llm {anthropic,openai} | anthropic | LLM provider for AI instructions |
| --ai-model MODEL | provider default | Model override (claude-sonnet-4-6 / gpt-4o) |
| --no-cache | off | Bypass the 24-hour plan cache |
from qa_agent import QAAgent, TestConfig, TestMode, OutputFormat
from qa_agent.llm_client import LLMProvider
config = TestConfig(
    urls=["https://example.com"],
    mode=TestMode.EXPLORE,
    output_formats=[OutputFormat.CONSOLE, OutputFormat.JSON],
    max_depth=2,
    max_pages=10,
    instructions="Verify the password reset flow.",  # optional
    llm_provider=LLMProvider.OPENAI,  # optional, default: LLMProvider.ANTHROPIC
    ai_model="gpt-4o-mini",  # optional, default: None (uses provider default)
)
agent = QAAgent(config)
session = agent.run()
print(f"Pages tested: {len(session.pages_tested)}")
print(f"Total findings: {session.total_findings}")
for finding in session.get_all_findings():
    print(f" [{finding.severity.value.upper()}] {finding.title}")

Keyboard: TAB order and focusability · Arrow key navigation · Enter key activation · Escape key for modals · Keyboard trap detection · Focus visibility
Mouse: Click targets · Hover states · Double-click · Right-click/context menus · Target size (WCAG 2.5.5, 44×44 px min) · Overlapping elements
Forms: Required field indicators · Validation feedback · Error message accessibility · Label associations · HTML5 input types · Autocomplete attributes
Accessibility: Alt text · Heading structure (h1 to h6) · Link text quality · Color contrast · ARIA usage · Landmark regions · Language attributes · Skip navigation
Error detection: Console errors/warnings · Network errors (4xx, 5xx) · JavaScript exceptions · Broken images · Broken anchors · Mixed content
WCAG 2.1 AA compliance (opt-in): Non-text contrast (1.4.11) · Use of color (1.4.1) · Content on hover/focus (1.4.13) · Meaningful sequence (1.3.2) · Input purpose (1.3.5) · Focus visible (2.4.7) · Label in name (2.5.3) · Target size (2.5.5) · Language of parts (3.1.2) · Error identification (3.3.1) · ARIA role/property validation
{
  "meta": {
    "session_id": "a1b2c3d4",
    "start_time": "2024-01-15T10:30:00",
    "duration_seconds": 45.2
  },
  "summary": {
    "pages_tested": 5,
    "total_findings": 12,
    "findings_by_severity": { "high": 2, "medium": 5, "low": 5 }
  },
  "findings": [...]
}

| Level | Meaning |
|---|---|
| CRITICAL | Security issues, data loss |
| HIGH | Major usability blockers |
| MEDIUM | UX problems, accessibility issues |
| LOW | Minor improvements, best practices |
| INFO | Informational observations |
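A short sketch of consuming the JSON report from a script. It assumes only the `summary.findings_by_severity` field shown in the sample report and that its keys are the lowercase level names. The function name `blocking_count` and the inline sample are illustrative; a real report would be read from disk with json.load.

```python
import json

# Levels treated as blocking here: critical and high.
BLOCKING = ("critical", "high")


def blocking_count(report: dict) -> int:
    """Count critical and high findings from the report summary."""
    by_severity = report.get("summary", {}).get("findings_by_severity", {})
    return sum(by_severity.get(level, 0) for level in BLOCKING)


# Sample mirroring the documented structure; "findings" is left empty here.
sample = json.loads("""
{
  "meta": {"session_id": "a1b2c3d4", "duration_seconds": 45.2},
  "summary": {
    "pages_tested": 5,
    "total_findings": 12,
    "findings_by_severity": {"high": 2, "medium": 5, "low": 5}
  },
  "findings": []
}
""")

print(blocking_count(sample))  # 2
```

Severity levels with no findings may simply be absent from `findings_by_severity`, as "critical" is in the sample, so the lookup defaults to zero rather than assuming every key exists.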
# GitHub Actions example
- name: Run QA Tests
  env:
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }} # or OPENAI_API_KEY
  run: |
    pip install qa-agent
    playwright install chromium
    qa-agent --output json --output-dir ./qa-results https://staging.example.com
- name: Upload Results
  if: always() # upload the report even when the QA step fails
  uses: actions/upload-artifact@v4
  with:
    name: qa-results
    path: ./qa-results/
Exits with code 1 when critical or high severity issues are found, failing the CI step automatically. See Exit Codes.
Omit --instructions/--instructions-file and the API key env vars if you only need standard tests.
qa_agent/
├── cli.py # CLI entry point
├── agent.py # Core orchestrator
├── config.py # Configuration dataclasses
├── models.py # Finding, PageAnalysis, TestSession, TestPlan
├── llm_client.py # Anthropic & OpenAI clients via stdlib urllib
├── ai_planner.py # LLM-powered test plan generation
├── plan_cache.py # Filesystem cache for test plans
├── testers/
│   ├── base.py # BaseTester abstract class
│   ├── keyboard.py # Keyboard navigation
│   ├── mouse.py # Mouse interaction
│   ├── forms.py # Form handling
│   ├── accessibility.py # WCAG accessibility
│   ├── wcag_compliance.py # WCAG 2.1 AA compliance (opt-in)
│   ├── errors.py # Console & network errors
│   └── custom.py # AI-generated test steps
├── reporters/
│   ├── console.py # Colored terminal output
│   ├── markdown.py # Markdown report
│   ├── json_reporter.py # JSON report
│   └── pdf.py # PDF report (requires weasyprint)
└── web/
    ├── server.py # Flask app with SSE streaming
    ├── templates/ # Jinja2 templates
    └── static/ # CSS and JavaScript
- Create testers/my_tester.py extending BaseTester, implement run() -> list[Finding]
- Export from testers/__init__.py
- Add a test_my_feature: bool flag to TestConfig
- Call from agent.py in _test_page()
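The first step might look roughly like the sketch below. To keep it runnable without qa-agent installed, it defines stand-in `Finding` and `BaseTester` classes. The real ones live in models.py and testers/base.py, and their fields and constructor arguments may differ from what is assumed here. `PageTitleTester` and the assumption that a tester holds a Playwright page as `self.page` are hypothetical. Only the `run() -> list[Finding]` contract comes from this README.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


# Stand-ins for illustration only; the real classes may have other fields.
@dataclass
class Finding:
    title: str
    severity: str


class BaseTester(ABC):
    def __init__(self, page):
        self.page = page  # assumed: a Playwright page-like object

    @abstractmethod
    def run(self) -> list[Finding]:
        ...


class PageTitleTester(BaseTester):
    """Hypothetical tester: flags pages whose <title> is empty."""

    def run(self) -> list[Finding]:
        findings: list[Finding] = []
        # title() matches the method on Playwright's synchronous Page.
        if not self.page.title().strip():
            findings.append(Finding(title="Page has no title", severity="low"))
        return findings
```

In the real package you would import `BaseTester` and `Finding` instead of redefining them, then follow the remaining steps to export the tester, add its flag, and call it.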
git clone https://github.com/billrichards/qa-agent.git
cd qa-agent
pip install -e ".[dev,web,ai]"
playwright install chromium
# Unit tests (no browser needed)
pytest -v -m "not integration and not network"
# Integration tests (real Playwright)
pytest -v -m integration --no-cov
# Lint & type check
ruff check .
mypy qa_agent
CI runs unit tests across Python 3.10 to 3.12 on Ubuntu, macOS, and Windows. Integration tests run on Ubuntu with Playwright. See .github/workflows/test.yml.
- Fork the repository
- Create a feature branch (git checkout -b my-feature)
- Make your changes and add tests
- Run pytest -v -m "not integration and not network"
- Open a pull request against main
Code style: Ruff + Black, line length 100.
| Code | Meaning |
|---|---|
| 0 | All tests passed (no critical/high findings) |
| 1 | Critical or high severity issues found |
| 2 | Error running tests |
| 130 | Interrupted (Ctrl+C) |
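When qa-agent is driven from another script rather than directly from a CI step, the exit codes can be interpreted as in this sketch. The meanings are taken from the table of exit codes. The helper names `describe_exit_code` and `run_qa_agent` are invented for this example, and `run_qa_agent` assumes the qa-agent command is on PATH.

```python
import subprocess

# Meanings taken from the exit code table in this README.
EXIT_MEANINGS = {
    0: "passed (no critical/high findings)",
    1: "critical or high severity issues found",
    2: "error running tests",
    130: "interrupted (Ctrl+C)",
}


def describe_exit_code(code: int) -> str:
    """Map a qa-agent exit code to a human-readable meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_qa_agent(url: str) -> int:
    """Run qa-agent on one URL and return its exit code."""
    completed = subprocess.run(["qa-agent", "--output", "json", url])
    return completed.returncode
```

A wrapper could call `run_qa_agent("https://staging.example.com")`, log `describe_exit_code(code)`, and treat code 1 (findings) differently from code 2 (the run itself failed).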
playwright install chromium
Must run once after every fresh install. Easy to forget in CI.
pip install "qa-agent[web]"
Required for qa-agent-web and python -m qa_agent web.
pip install "qa-agent[pdf]"
Falls back to Markdown silently if WeasyPrint is absent.
No extra packages are needed, since LLM calls use Python's built-in urllib. You only need a valid API key for your chosen provider:
export ANTHROPIC_API_KEY=sk-ant-... # for --llm anthropic (default)
export OPENAI_API_KEY=sk-... # for --llm openai
If the key is missing or the API call fails, qa-agent prints a warning and continues with standard tests.
Requires Python 3.10+. Check with python --version.
MIT. Copyright (c) 2026 Bill Richards. See LICENSE.




