promptfoo

Why this rank: Strong adoption, recent release, healthy release cadence

Description

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration. Used by OpenAI and Anthropic.

README

Promptfoo: LLM evals & red teaming

promptfoo is a CLI and library for evaluating and red-teaming LLM apps. Stop the trial-and-error approach - start shipping secure, reliable AI apps.

Website · Getting Started · Red Teaming · Documentation · Discord

Promptfoo is now part of OpenAI. Promptfoo remains open source and MIT licensed. Read the company update.

Quick Start

npm install -g promptfoo
promptfoo init --example getting-started

Also available via brew install promptfoo and pip install promptfoo. You can also use npx promptfoo@latest to run any command without installing.

Most LLM providers require an API key. Set yours as an environment variable:

export OPENAI_API_KEY=sk-abc123
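
To fail fast when the key is missing, a small guard can run before any eval. This is a sketch rather than anything promptfoo ships: `require_env` is a hypothetical helper name, and the only assumption about promptfoo is that the provider reads `OPENAI_API_KEY` from the environment, as described above.

```shell
# Hypothetical helper (not part of promptfoo): return non-zero when the
# named environment variable is unset or empty.
require_env() {
  name="$1"
  # Indirect lookup that works in POSIX sh as well as bash.
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    echo "Missing required environment variable: $name" >&2
    return 1
  fi
  return 0
}

# Example use (commented out so the snippet has no side effects):
# require_env OPENAI_API_KEY && promptfoo eval
```

Chaining the guard with `&&` means the eval command is only attempted once the key is present.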

Once you're in the example directory, run an eval and view results:

cd getting-started
promptfoo eval
promptfoo view
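
For orientation, `promptfoo eval` reads a declarative config from the current directory. The sketch below writes a minimal one into a separate directory so it does not overwrite the example. The directory name `my-eval`, the prompt, the test values, and the model ID `openai:gpt-4o-mini` are illustrative assumptions; check the Getting Started guide for the options your version supports.

```shell
# Write a minimal config to a scratch directory (the name is arbitrary).
mkdir -p my-eval
cat > my-eval/promptfooconfig.yaml <<'EOF'
# Illustrative config: one prompt, one provider, one test case.
description: "Translation smoke test"
prompts:
  - "Translate the following text to French: {{text}}"
providers:
  - openai:gpt-4o-mini   # assumed model ID; swap in any provider you use
tests:
  - vars:
      text: "Hello, world"
    assert:
      # Case-insensitive substring check on the model output.
      - type: icontains
        value: "bonjour"
EOF

# Then, with promptfoo installed and an API key set:
# promptfoo eval -c my-eval/promptfooconfig.yaml
```

Listing more than one entry under `providers` runs every test against each of them, which is what produces a side-by-side comparison.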

See Getting Started (evals) or Red Teaming (vulnerability scanning) for more.

What can you do with Promptfoo?

  • Test your prompts and models with automated evaluations
  • Secure your LLM apps with red teaming and vulnerability scanning
  • Compare models side-by-side (OpenAI, Anthropic, Azure, Bedrock, Ollama, and more)
  • Automate checks in CI/CD
  • Review pull requests for LLM-related security and compliance issues with code scanning
  • Share results with your team
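
As a sketch of the CI/CD item above, a pipeline step can gate on the exit status of the eval command. The helper name `run_eval_gate` and the `EVAL_CMD` override are hypothetical. The mapping of exit code 100 to "at least one failing test case" is an assumption based on the promptfoo CLI documentation at the time of writing, so verify it against your installed version.

```shell
# Hypothetical CI helper: run an eval command and translate its exit
# status into a short message. EVAL_CMD defaults to `promptfoo eval`;
# override it with a simple command name to dry-run the gate.
run_eval_gate() {
  status=0
  ${EVAL_CMD:-promptfoo eval} || status=$?
  case "$status" in
    0)   echo "eval passed" ;;
    100) echo "eval reported failing test cases" ;;  # assumed failure code
    *)   echo "eval errored (exit $status)" ;;
  esac
  # Propagate the original status so the CI step fails when the eval does.
  return "$status"
}
```

Returning the original status keeps the CI system's own pass/fail logic in charge; the message only makes the log easier to scan.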

Here's what it looks like in action:

[Screenshot: prompt evaluation matrix in the web viewer]

It works on the command line too:

[Screenshot: promptfoo on the command line]

It can also generate security vulnerability reports:

[Screenshot: gen AI red team report]

Why Promptfoo?

  • Developer-first: Fast, with features like live reload and caching
  • Private: LLM evals run 100% locally - your prompts never leave your machine
  • Flexible: Works with any LLM API or programming language
  • Battle-tested: Powers LLM apps serving 10M+ users in production
  • Data-driven: Make decisions based on metrics, not gut feel
  • Open source: MIT licensed, with an active community

Contributing

We welcome contributions! Check out our contributing guide to get started.

Join our Discord community for help and discussion.

Release History

Version | Changes | Urgency | Date
0.121.14 | ## [0.121.14](https://github.com/promptfoo/promptfoo/compare/0.121.13...0.121.14) (2026-06-02) ### Features * add A2A provider ([#9586](https://github.com/promptfoo/promptfoo/issues/9586)) ([963b264](https://github.com/promptfoo/promptfoo/commit/963b264ba22d621282d0bf82efdae2b5defe6d59)) * **assertions:** add agent-rubric grader ([#9453](https://github.com/promptfoo/promptfoo/issues/9453)) ([cadb3c5](https://github.com/promptfoo/promptfoo/commit/cadb3c500277464f05244c8bc8525c2725aa5c22)) * ** | High | 6/2/2026
code-scan-action-0.1.7 | ## [0.1.7](https://github.com/promptfoo/promptfoo/compare/code-scan-action-0.1.6...code-scan-action-0.1.7) (2026-05-29) ### Bug Fixes * **code-scan:** emit structured fork PR skip output ([#9426](https://github.com/promptfoo/promptfoo/issues/9426)) ([61c624c](https://github.com/promptfoo/promptfoo/commit/61c624c7f91808a6f59d8b837dbb3896dd9a74c0)) * **code-scan:** honor minimum-severity alias when min-severity is unset ([#9433](https://github.com/promptfoo/promptfoo/issues/9433)) ([ea5ea9e](ht | High | 5/29/2026
code-scan-action-0.1.6 | ## [0.1.6](https://github.com/promptfoo/promptfoo/compare/code-scan-action-0.1.5...code-scan-action-0.1.6) (2026-05-21) ### Features * **code-scan:** add SARIF output support ([#9161](https://github.com/promptfoo/promptfoo/issues/9161)) ([4da26e9](https://github.com/promptfoo/promptfoo/commit/4da26e95e4837ad9fd3363dfb52a86e5e1ceb66d)) * **code-scan:** refine SARIF output ergonomics ([#9159](https://github.com/promptfoo/promptfoo/issues/9159)) ([ea3a655](https://github.com/promptfoo/promptfoo/ | High | 5/21/2026
0.121.11 | ## [0.121.11](https://github.com/promptfoo/promptfoo/compare/0.121.10...0.121.11) (2026-05-08) ### Features * **quiverai:** add Arrow 1.1 models, vectorize endpoint, and GPT Image-2 pipeline ([#9139](https://github.com/promptfoo/promptfoo/issues/9139)) ([ce2c62d](https://github.com/promptfoo/promptfoo/commit/ce2c62d4f9cfd92bd8e48f45db2314271946c467)) ### Bug Fixes * **redteam:** handle MCP target prompt materialization ([#9149](https://github.com/promptfoo/promptfoo/issues/9149)) ([a050023 | High | 5/8/2026
0.121.9 | ## [0.121.9](https://github.com/promptfoo/promptfoo/compare/0.121.8...0.121.9) (2026-04-27) ### Features * **providers:** add gpt-5.5 model support ([#8884](https://github.com/promptfoo/promptfoo/issues/8884)) ([8c5dc92](https://github.com/promptfoo/promptfoo/commit/8c5dc929a15e3f9c859f930cc71a6f7093bf666e)) ### Bug Fixes * **cli:** align command-line reference with CLI ([#8900](https://github.com/promptfoo/promptfoo/issues/8900)) ([c4ce0d4](https://github.com/promptfoo/promptfoo/commit/c4 | High | 4/27/2026
0.121.8 | ## [0.121.8](https://github.com/promptfoo/promptfoo/compare/0.121.7...0.121.8) (2026-04-24) ### Features * **claude-agent-sdk:** bump to 0.2.116 and add title option ([#8858](https://github.com/promptfoo/promptfoo/issues/8858)) ([9bca53a](https://github.com/promptfoo/promptfoo/commit/9bca53a2502be2395690019fad65b5a008f14c05)) * **providers:** add GPT-5.5 OpenAI support ([#8873](https://github.com/promptfoo/promptfoo/issues/8873)) ([6488623](https://github.com/promptfoo/promptfoo/commit/648862 | High | 4/24/2026
0.121.6 | ## [0.121.6](https://github.com/promptfoo/promptfoo/compare/0.121.5...0.121.6) (2026-04-18) ### Features * **anthropic:** add support for Claude Opus 4.7 ([#8763](https://github.com/promptfoo/promptfoo/issues/8763)) ([bcde21d](https://github.com/promptfoo/promptfoo/commit/bcde21d90731ca20781c3d7ebb34567de13e3044)) * **claude-agent-sdk:** bump to 0.2.112 and expose exclude_dynamic_sections ([#8767](https://github.com/promptfoo/promptfoo/issues/8767)) ([7abb3b7](https://github.com/promptfoo/pro | High | 4/22/2026
code-scan-action-0.1.5 | ## [0.1.5](https://github.com/promptfoo/promptfoo/compare/code-scan-action-0.1.4...code-scan-action-0.1.5) (2026-04-14) ### Bug Fixes * **app:** clarify attack success rate label ([#8387](https://github.com/promptfoo/promptfoo/issues/8387)) ([7482eff](https://github.com/promptfoo/promptfoo/commit/7482eff88f193e857822b43da040638eb4ae1565)) * **code-scan:** avoid npm before env for MCP npx ([#8515](https://github.com/promptfoo/promptfoo/issues/8515)) ([7d2eacd](https://github.com/promptfoo/prom | High | 4/14/2026
0.121.5 | ## [0.121.5](https://github.com/promptfoo/promptfoo/compare/0.121.4...0.121.5) (2026-04-14) ### Features * **providers:** add Abliteration provider ([b29fa9a](https://github.com/promptfoo/promptfoo/commit/b29fa9a475315cc97d57a5616d08e9b099d8f66b)) * **providers:** add OpenAI Codex app-server provider ([#8578](https://github.com/promptfoo/promptfoo/issues/8578)) ([a403dd1](https://github.com/promptfoo/promptfoo/commit/a403dd17b012029bbd4323e3d95e44e5366d08a3)) * **providers:** let anthropic:me | Medium | 4/14/2026
0.121.4 | ## [0.121.4](https://github.com/promptfoo/promptfoo/compare/0.121.3...0.121.4) (2026-04-10) ### Features * allow per-test opt-out of defaultTest assertions ([5e5959e](https://github.com/promptfoo/promptfoo/commit/5e5959ecc6984fe34df0c3fa74aa231fdc9ea972)) * **codex:** expand Codex SDK eval controls and docs ([#8433](https://github.com/promptfoo/promptfoo/issues/8433)) ([80c3f7f](https://github.com/promptfoo/promptfoo/commit/80c3f7f25431e7a6319df54b46b4cd283f4b6b8c)) * **eval:** group serial g | High | 4/11/2026
0.121.3 | ## [0.121.3](https://github.com/promptfoo/promptfoo/compare/0.121.2...0.121.3) (2026-03-24) ### Features * add block-no-verify PreToolUse hook to .claude/settings.json ([#8234](https://github.com/promptfoo/promptfoo/issues/8234)) ([29a856a](https://github.com/promptfoo/promptfoo/commit/29a856a8fa2defba5bc8362ea6e14364b7e624ce)) * add new config options to composite jailbreak strategy ([#7693](https://github.com/promptfoo/promptfoo/issues/7693)) ([071d345](https://github.com/promptfoo/promptfo | Medium | 3/24/2026
0.121.2 | ## [0.121.2](https://github.com/promptfoo/promptfoo/compare/0.121.1...0.121.2) (2026-03-12) ### Bug Fixes * add node-addon-api to devDependencies for sharp build ([#8102](https://github.com/promptfoo/promptfoo/issues/8102)) ([1d4e959](https://github.com/promptfoo/promptfoo/commit/1d4e9596f2199ade67e4b65207b8f99b7c2b1b3b)) * **deps:** update dependency @tanstack/react-virtual to ^3.13.20 ([#8083](https://github.com/promptfoo/promptfoo/issues/8083)) ([5e5f774](https://github.com/promptfoo/promp | Low | 3/12/2026
0.121.1 | ## [0.121.1](https://github.com/promptfoo/promptfoo/compare/0.121.0...0.121.1) (2026-03-09) ### Bug Fixes * **providers:** support newer opencode sdk api ([#8060](https://github.com/promptfoo/promptfoo/issues/8060)) ([7ec80b2](https://github.com/promptfoo/promptfoo/commit/7ec80b2e173dc99438002c8f5d16feb7b6643aa1)) | Low | 3/9/2026
0.121.0 | ## [0.121.0](https://github.com/promptfoo/promptfoo/compare/0.120.27...0.121.0) (2026-03-09) ### ⚠ BREAKING CHANGES * **providers:** resolve relative config paths against config dir in claude-agent-sdk ([#8030](https://github.com/promptfoo/promptfoo/issues/8030)) ### Features * **redteam:** generalize insurance plugins for all insurance types ([#8002](https://github.com/promptfoo/promptfoo/issues/8002)) ([945c3bc](https://github.com/promptfoo/promptfoo/commit/945c3bc6725ca8bc7369f7d0efed6f8 | Low | 3/9/2026
0.120.27 | ## [0.120.27](https://github.com/promptfoo/promptfoo/compare/0.120.26...0.120.27) (2026-03-06) ### Features * add promptfoo-evals agent skill for Claude Code and Codex ([#7985](https://github.com/promptfoo/promptfoo/issues/7985)) ([71160fe](https://github.com/promptfoo/promptfoo/commit/71160fea6aaa3471de9c6027b929830bdd7acfb0)) * **app:** add media library page ([#6901](https://github.com/promptfoo/promptfoo/issues/6901)) ([4eba85a](https://github.com/promptfoo/promptfoo/commit/4eba85aac7a310 | Low | 3/6/2026
0.120.26 | ## [0.120.26](https://github.com/promptfoo/promptfoo/compare/0.120.25...0.120.26) (2026-03-03) ### Features * Add financial:sox-compliance plugin ([#7780](https://github.com/promptfoo/promptfoo/issues/7780)) ([b7cfc8e](https://github.com/promptfoo/promptfoo/commit/b7cfc8e47c5594a498c8a472460896424e5ada52)) * add model-identification plugin ([#7883](https://github.com/promptfoo/promptfoo/issues/7883)) ([a2ac7c6](https://github.com/promptfoo/promptfoo/commit/a2ac7c6139aaed061f41fd7ab83c7562e167 | Low | 3/3/2026
0.120.25 | ## [0.120.25](https://github.com/promptfoo/promptfoo/compare/0.120.24...0.120.25) (2026-02-18) ### Features * add regenerate button for suggested policies ([#7652](https://github.com/promptfoo/promptfoo/issues/7652)) ([2b09693](https://github.com/promptfoo/promptfoo/commit/2b096935c26ee148ad28f7bfe47ddad292ce9bd3)) * **app:** add renderOption prop to Combobox component ([#7723](https://github.com/promptfoo/promptfoo/issues/7723)) ([a609016](https://github.com/promptfoo/promptfoo/commit/a60901 | Low | 2/18/2026
0.120.24 | ## [0.120.24](https://github.com/promptfoo/promptfoo/compare/0.120.23...0.120.24) (2026-02-10) ### Features * add --filter-prompts option with MCP alignment ([#7451](https://github.com/promptfoo/promptfoo/issues/7451)) ([e9b53e2](https://github.com/promptfoo/promptfoo/commit/e9b53e2ac83df1f6e98bf9561a6a3c8d87d271af)) * **eval:** add hidden column indicators and schema-based column visibility persistence ([#7536](https://github.com/promptfoo/promptfoo/issues/7536)) ([8fbeb60](https://github.co | Low | 2/10/2026
0.120.23 | ## [0.120.23](https://github.com/promptfoo/promptfoo/compare/0.120.22...0.120.23) (2026-02-06) ### Bug Fixes * **blobs:** restore cloud blob upload for shared evals ([#7484](https://github.com/promptfoo/promptfoo/issues/7484)) ([7eb1009](https://github.com/promptfoo/promptfoo/commit/7eb100939c07b0b414459682cdd057e024b39814)) * **deps:** update dependency @opencode-ai/sdk to ^1.1.48 ([#7499](https://github.com/promptfoo/promptfoo/issues/7499)) ([b081a54](https://github.com/promptfoo/promptfoo/ | Low | 2/6/2026
0.120.22 | ## [0.120.22](https://github.com/promptfoo/promptfoo/compare/0.120.21...0.120.22) (2026-02-04) ### Features * **redteam:** enable multilingual support for audio/video/image strategies ([#7485](https://github.com/promptfoo/promptfoo/issues/7485)) ([01b62ce](https://github.com/promptfoo/promptfoo/commit/01b62cee55c4c06edc510d9684a4696bbc62633b)) ### Bug Fixes * **app:** move rows useMemo after table declaration to fix build ([#7475](https://github.com/promptfoo/promptfoo/issues/7475)) ([d1c2 | Low | 2/4/2026
0.120.21 | ## [0.120.21](https://github.com/promptfoo/promptfoo/compare/0.120.20...0.120.21) (2026-02-03) ### Features * **app:** add print styles to DataTable for light mode printing ([#7365](https://github.com/promptfoo/promptfoo/issues/7365)) ([167b27c](https://github.com/promptfoo/promptfoo/commit/167b27c4b9483173cebe6eb7b3467e72a723fd3f)) * **app:** improve HTTP endpoint request body editor ([#7438](https://github.com/promptfoo/promptfoo/issues/7438)) ([cfadb37](https://github.com/promptfoo/promptf | Low | 2/3/2026
0.120.20 | ## [0.120.20](https://github.com/promptfoo/promptfoo/compare/0.120.19...0.120.20) (2026-01-29) ### Features * **redteam:** add email validation to generate command ([#7314](https://github.com/promptfoo/promptfoo/issues/7314)) ([4fffc3a](https://github.com/promptfoo/promptfoo/commit/4fffc3a2827cfadc9213f06167755fa880a2099d)) ### Bug Fixes * **deps:** update dependency @openai/agents to ^0.4.3 ([#7352](https://github.com/promptfoo/promptfoo/issues/7352)) ([7fbb175](https://github.com/promptf | Low | 1/29/2026
0.120.19 | ## [0.120.19](https://github.com/promptfoo/promptfoo/compare/0.120.18...0.120.19) (2026-01-28) ### Features * **app:** enhance DataTable with column alignment and styling improvements ([#7349](https://github.com/promptfoo/promptfoo/issues/7349)) ([8b8b122](https://github.com/promptfoo/promptfoo/commit/8b8b1223cef96257783caf69079976090db0062c)) * **app:** extend UI component interfaces for data-testid support ([#7339](https://github.com/promptfoo/promptfoo/issues/7339)) ([d9dc48a](https://gith | Low | 1/28/2026
0.120.18 | ## [0.120.18](https://github.com/promptfoo/promptfoo/compare/0.120.17...0.120.18) (2026-01-28) ### Features * **eval:** support multiple --filter-metadata flags with AND logic ([#7317](https://github.com/promptfoo/promptfoo/issues/7317)) ([61d8d17](https://github.com/promptfoo/promptfoo/commit/61d8d174ee756881edac31dcad9861bb14530803)) * **providers:** add collaboration_mode support to OpenAI Codex SDK ([#7275](https://github.com/promptfoo/promptfoo/issues/7275)) ([a3e6d58](https://github.com | Low | 1/28/2026
0.120.17 | ## [0.120.17](https://github.com/promptfoo/promptfoo/compare/0.120.16...0.120.17) (2026-01-23) ### Features * **redteam:** add telecom vertical red team plugins ([#7182](https://github.com/promptfoo/promptfoo/issues/7182)) ([678fd1e](https://github.com/promptfoo/promptfoo/commit/678fd1e9828aeece905f6d17984f3478846749ee)) * **redteam:** add VLSU compositional safety plugin ([#6855](https://github.com/promptfoo/promptfoo/issues/6855)) ([3e30cb0](https://github.com/promptfoo/promptfoo/commit/3e3 | Low | 1/23/2026
0.120.16 | ## [0.120.16](https://github.com/promptfoo/promptfoo/compare/0.120.15...0.120.16) (2026-01-21) ### Features * **config:** add per-test structured output support ([#6239](https://github.com/promptfoo/promptfoo/issues/6239)) ([4629892](https://github.com/promptfoo/promptfoo/commit/4629892c14c8df37d298229209a8a932d607de9f)) * **eval:** re-enable SIGINT graceful shutdown for eval pause/resume ([#7012](https://github.com/promptfoo/promptfoo/issues/7012)) ([06364ef](https://github.com/promptfoo/pro | Low | 1/21/2026
0.120.15 | ## [0.120.15](https://github.com/promptfoo/promptfoo/compare/0.120.14...0.120.15) (2026-01-20) ### Features * **app:** add NavigationSidebar component and enhance Tabs ([#7073](https://github.com/promptfoo/promptfoo/issues/7073)) ([55a3125](https://github.com/promptfoo/promptfoo/commit/55a3125e5627c2f1a4981744f536fde71b21a5de)) * **app:** add Storybook with stories for all UI components ([#7066](https://github.com/promptfoo/promptfoo/issues/7066)) ([53f51cf](https://github.com/promptfoo/promp | Low | 1/20/2026
0.120.14 | ## [0.120.14](https://github.com/promptfoo/promptfoo/compare/0.120.13...0.120.14) (2026-01-14) ### Features * **redteam:** add numTests config option for strategy test capping ([#7030](https://github.com/promptfoo/promptfoo/issues/7030)) ([0ca5ded](https://github.com/promptfoo/promptfoo/commit/0ca5deda234c22482d43fd0a07f7855a444696ff)) ### Bug Fixes * **deps:** update @actions/github to v7 and fix workspace config ([#7037](https://github.com/promptfoo/promptfoo/issues/7037)) ([c6b2496](htt | Low | 1/14/2026
0.120.13 | ## [0.120.13](https://github.com/promptfoo/promptfoo/compare/0.120.12...0.120.13) (2026-01-13) ### Features * **ui:** Add Ink-based interactive list UI foundation ([#7013](https://github.com/promptfoo/promptfoo/issues/7013)) ([84a2ac7](https://github.com/promptfoo/promptfoo/commit/84a2ac7b2f8697d4cdafd0f868778348e6211c0e)) ### Bug Fixes * **ui:** preserve exact getRowId values in DataTable row selection ([#7032](https://github.com/promptfoo/promptfoo/issues/7032)) ([e78f083](https://github | Low | 1/13/2026
0.120.12 | ## [0.120.12](https://github.com/promptfoo/promptfoo/compare/0.120.11...0.120.12) (2026-01-12) ### Features * **app:** show provider config details on hover in eval results ([#6757](https://github.com/promptfoo/promptfoo/issues/6757)) ([c790f80](https://github.com/promptfoo/promptfoo/commit/c790f809a0baab2c186ea5be4a787229baad6d2d)) * **assertions:** add word-count assertion type ([#7028](https://github.com/promptfoo/promptfoo/issues/7028)) ([d21f7a0](https://github.com/promptfoo/promptfoo/co | Low | 1/12/2026
0.120.11 | ## [0.120.11](https://github.com/promptfoo/promptfoo/compare/0.120.10...0.120.11) (2026-01-10) ### Features * **app:** add Combobox component ([#6946](https://github.com/promptfoo/promptfoo/issues/6946)) ([a1fb9ed](https://github.com/promptfoo/promptfoo/commit/a1fb9ed64d49d4fc4cf58c96dfa89454eddc59d0)) * **codeScan:** add fork PR authentication support ([#6958](https://github.com/promptfoo/promptfoo/issues/6958)) ([9c0fee4](https://github.com/promptfoo/promptfoo/commit/9c0fee4904af3135492545 | Low | 1/10/2026
0.120.10 | ## [0.120.10](https://github.com/promptfoo/promptfoo/compare/0.120.9...0.120.10) (2026-01-06) ### Features * **evaluator:** enrich error results with provider context and metadata ([#6913](https://github.com/promptfoo/promptfoo/issues/6913)) ([a004182](https://github.com/promptfoo/promptfoo/commit/a0041825b8149e94d25828a43b77787896ba8dc6)) * **providers:** add Azure AI Foundry video provider (Sora) ([#6890](https://github.com/promptfoo/promptfoo/issues/6890)) ([1479e74](https://github.com/pro | Low | 1/6/2026
0.120.9 | ## [0.120.9](https://github.com/promptfoo/promptfoo/compare/0.120.8...0.120.9) (2025-12-30) ### Features - **app:** add apiBaseUrl field to provider configuration UI ([#6884](https://github.com/promptfoo/promptfoo/issues/6884)) by @mldangelo - **app:** add design system, navigation, model audit, and eval creator ([#6823](https://github.com/promptfoo/promptfoo/issues/6823)) by @faizanminhas - **cli:** add wildcard support for prompt filters ([#6853](https://github.com/promptfoo/promptfoo/issues/6 | Low | 12/30/2025
0.120.8 | ## [0.120.8](https://github.com/promptfoo/promptfoo/compare/0.120.7...0.120.8) (2025-12-21) ### Features * **redteam:** add --description flag to redteam run command ([#6796](https://github.com/promptfoo/promptfoo/issues/6796)) ([95cc2ff](https://github.com/promptfoo/promptfoo/commit/95cc2ffe1075b00620647369beb9bb331af95858)) * **server:** add configurable base path support ([#6758](https://github.com/promptfoo/promptfoo/issues/6758)) ([9395a28](https://github.com/promptfoo/promptfoo/commit/9 | Low | 12/21/2025
0.120.7 | ## [0.120.7](https://github.com/promptfoo/promptfoo/compare/0.120.6...0.120.7) (2025-12-19) ### Features * blob storage ([#6708](https://github.com/promptfoo/promptfoo/issues/6708)) ([73fcd51](https://github.com/promptfoo/promptfoo/commit/73fcd5183bfaa37b76326f21eaeaaaddee264bb9)) | Low | 12/19/2025
0.120.6 | ## [0.120.6](https://github.com/promptfoo/promptfoo/compare/0.120.5...0.120.6) (2025-12-19) ### Features * **auth:** add interactive team selection during login ([#6760](https://github.com/promptfoo/promptfoo/issues/6760)) ([11c7037](https://github.com/promptfoo/promptfoo/commit/11c7037d229d0cfb335fc2e3419de6c210fec7bc)) * **bedrock:** configurable numberOfResults for Bedrock Knowledge Base ([#6738](https://github.com/promptfoo/promptfoo/issues/6738)) ([f8f0b8b](https://github.com/promptfoo/p | Low | 12/19/2025
0.120.5 | ## [0.120.5](https://github.com/promptfoo/promptfoo/compare/0.120.4...0.120.5) (2025-12-16) ### Features * **cli:** support multiple --env-file flags ([#6622](https://github.com/promptfoo/promptfoo/issues/6622)) ([015f2df](https://github.com/promptfoo/promptfoo/commit/015f2dfb76be0710a2c98d87fe957060e18de162)) * **esm:** add resolvePackageEntryPoint for ESM-only packages ([#6586](https://github.com/promptfoo/promptfoo/issues/6586)) ([fbc0eca](https://github.com/promptfoo/promptfoo/commit/fbc0 | Low | 12/16/2025
0.120.4 | ## [0.120.4](https://github.com/promptfoo/promptfoo/compare/0.120.3...0.120.4) (2025-12-11) ### Features * **providers:** add ElevenLabs provider integration ([#6022](https://github.com/promptfoo/promptfoo/issues/6022)) ([8d54faa](https://github.com/promptfoo/promptfoo/commit/8d54faa1c240e28557b1eb652c358a3f8eb4b0a2)) * **providers:** add GPT-5.2 model support ([#6628](https://github.com/promptfoo/promptfoo/issues/6628)) ([b105980](https://github.com/promptfoo/promptfoo/commit/b105980f121d1b0 | Low | 12/11/2025
0.120.3 | ## [0.120.3](https://github.com/promptfoo/promptfoo/compare/0.120.2...0.120.3) (2025-12-10) ### Features * **providers:** add multi-turn session persistence to browser provider ([#6585](https://github.com/promptfoo/promptfoo/issues/6585)) ([873241e](https://github.com/promptfoo/promptfoo/commit/873241ee0b5692edc74fcb33815b99adfab68a52)) ### Bug Fixes * **build:** exclude Nunjucks template fixture from TypeScript ([#6588](https://github.com/promptfoo/promptfoo/issues/6588)) ([6f02eec](https | Low | 12/10/2025
0.120.2 | ## [0.120.2](https://github.com/promptfoo/promptfoo/compare/0.120.1...0.120.2) (2025-12-09) ### Features * **assertions:** tool calling f1 score ([#6548](https://github.com/promptfoo/promptfoo/issues/6548)) ([1327195](https://github.com/promptfoo/promptfoo/commit/13271958b5a48b7d26586daf5f06d98bcdf4d063)) * **providers:** add Amazon Nova 2 model support with reasoning capabilities ([#6531](https://github.com/promptfoo/promptfoo/issues/6531)) ([3a99c2b](https://github.com/promptfoo/promp | Low | 12/9/2025
0.120.1 | ## [0.120.1](https://github.com/promptfoo/promptfoo/compare/0.120.0...0.120.1) (2025-12-08) ### Features * **providers:** update claude-agent-sdk to ^0.1.60 with betas and dontAsk support ([#6557](https://github.com/promptfoo/promptfoo/issues/6557)) ([cc3d857](https://github.com/promptfoo/promptfoo/commit/cc3d85763606facb615965ad9288c33650e01512)) ### Bug Fixes * **ci:** trigger Docker build from release-please workflow ([#6572](https://github.com/promptfoo/promptfoo/issues/6572)) ([6b1790 | Low | 12/8/2025
0.120.0 | ## [0.120.0](https://github.com/promptfoo/promptfoo/compare/0.119.14...0.120.0) (2025-12-08) ### Features - **build:** migrate to ESM (ECMAScript Modules) ([#5594](https://github.com/promptfoo/promptfoo/issues/5594)) ([9cdf09b](https://github.com/promptfoo/promptfoo/commit/9cdf09b1c681454ed3fa047dee41a43fea48028a)) - **cli:** toggle debug log live ([#6517](https://github.com/promptfoo/promptfoo/issues/6517)) ([6beebce](https://github.com/promptfoo/promptfoo/commit/6beebce4134f0e0dfd54e7f1 | Low | 12/8/2025
0.119.14 | ## [0.119.14](https://github.com/promptfoo/promptfoo/compare/0.119.13...0.119.14) (2025-12-01) ### Features * Add web search assertion type ([#5111](https://github.com/promptfoo/promptfoo/issues/5111)) ([11c01cc](https://github.com/promptfoo/promptfoo/commit/11c01cc637efd6867a1e99e44bc8633d324ac66a)) * **examples:** add Strands Agents SDK example ([#6384](https://github.com/promptfoo/promptfoo/issues/6384)) ([28c3d58](https://github.com/promptfoo/promptfoo/commit/28c3d584f2f820de40a17e6 | Low | 12/1/2025
0.119.13 | ## [0.119.13](https://github.com/promptfoo/promptfoo/compare/promptfoo-v0.119.12...promptfoo-v0.119.13) (2025-11-25) ### Features * ecommerce plugin pack ([#6168](https://github.com/promptfoo/promptfoo/issues/6168)) ([152b1ff](https://github.com/promptfoo/promptfoo/commit/152b1ff3f3fdb6ca43a0a5718d463757f63a1814)) ### Bug Fixes * **deps:** bump posthog-node from 5.13.2 to 5.14.0 for sha1-hulud mitigation ([6a44eda](https://github.com/promptfoo/promptfoo/commit/6a44eda819f48273230853cc8692b | Low | 11/25/2025
0.119.12 | ## [0.119.12](https://github.com/promptfoo/promptfoo/compare/promptfoo-v0.119.11...promptfoo-v0.119.12) (2025-11-24) ### Features * changelog automation and validation ([#6252](https://github.com/promptfoo/promptfoo/issues/6252)) ([ee74c4a](https://github.com/promptfoo/promptfoo/commit/ee74c4ae7dc01c35dd52d835a19188f06a334a1a)) * **providers:** add Anthropic structured outputs support ([#6226](https://github.com/promptfoo/promptfoo/issues/6226)) ([1b1b9d2](https://github.com/promptfoo/promptf | Low | 11/24/2025
0.119.11 | ## What's Changed **Bug Fixes** - fix(deps): update dependency @apidevtools/json-schema-ref-parser to v15 by @renovate[bot] in https://github.com/promptfoo/promptfoo/pull/6300 - fix(redteam): fix template bug in agentic strategies by @mldangelo in https://github.com/promptfoo/promptfoo/pull/6240 - fix: avoid sending target output to cloud if excludeTargetOutputFromAgenticAttackGeneration is set to true @MrFlounder in https://github.com/promptfoo/promptfoo/pull/6320 **Chores** - revert: | Low | 11/24/2025
0.119.10 | ## What's Changed ### Bug Fixes - fix(providers): LiteLLM API key authentication with LITELLM_API_KEY env var by @mldangelo in #6322 - fix(webui): Basic strategy checkbox behavior in red team setup by @minhle1291 in #6313 - fix(code-scan): prevent GitHub API error when startLine equals line by @yash2998chhabria in #6314 - fix(app): Test generation tooltips remain visible after dialog is rendered by @will-holley in #6309 - fix(webui): allow thumbs up/down ratings to toggle off and remov | Low | 11/23/2025
0.119.9 | ## What's Changed ### Features - feat(webui): add custom policy generation to red team setup by @typpo in https://github.com/promptfoo/promptfoo/pull/6181 - feat(webui): add strategy test generation to red team setup by @will-holley in https://github.com/promptfoo/promptfoo/pull/6005 - feat(webui): add visibility button for PFX passphrase field in red team target configuration by @faizanminhas in https://github.com/promptfoo/promptfoo/pull/6258 ### Bug Fixes - fix(auth): allow CI e | Low | 11/20/2025
0.119.8 | ## What's Changed ### Features - feat(providers): add Gemini 3 Pro support with thinking configuration by @mldangelo in #6241 - feat(plugins): organize domain-specific risks into vertical suites by @typpo in #6215 ### Bug Fixes - fix(code-scan): point at correct cloud production url by @danenania in #6247 - fix(code-scan): ensure no non-json output in 'code-scans run' command with --json flag by @danenania in #6248 - fix: exclude source maps from npm package to reduce bundle size | Low | 11/19/2025
0.119.7 | ## Features - **feat(assertions)**: add dot product and euclidean distance metrics for similarity assertion - use `similar:dot` and `similar:euclidean` assertion types to match production vector database metrics and support different similarity use cases in [#6202](https://github.com/promptfoo/promptfoo/pull/6202) - **feat(webui)**: expose Hydra strategy configuration (max turns and stateful toggle) in red team setup UI in [#6165](https://github.com/promptfoo/promptfoo/pull/6165) - **feat(p | Low | 11/18/2025
0.119.6 | ## What's Changed ### Bug Fixes - fix(redteam): respect redteam.provider configuration for local grading by @mldangelo in #5959 - fix(cli): correct port type handling in view command by @iitslamaa in #6071 - fix(redteam): dynamically update crescendo system prompt with currentRound and successFlag by @yash2998chhabria in #6133 - fix(cli): format object and array variables with pretty-printed JSON by @zsarkis in #6175 - fix(webui): filter hidden metadata keys from metadata filter dropdo | Low | 11/12/2025
0.119.5 | ## What's Changed **Features** - feat: FERPA red team plugin by @typpo in https://github.com/promptfoo/promptfoo/pull/6130 - feat(redteam): show granular subcategory metrics for harmful plugins by @MrFlounder in https://github.com/promptfoo/promptfoo/pull/6134 - feat: hydra the new advanced multi-turn red team strategy by @MrFlounder in https://github.com/promptfoo/promptfoo/pull/6151 - feat(providers): add variable templating support for initialMessages in simulated-user provider by @mld | Low | 11/10/2025
0.119.4 | ## What's Changed **Features** - feat(redteam): make meta agent a default strategy by @typpo in https://github.com/promptfoo/promptfoo/pull/6109 **Bug Fixes** - fix(redteam): make intent for policy more accurate by @MrFlounder in https://github.com/promptfoo/promptfoo/pull/6116 **Chores** - chore: bump @aws-sdk/client-bedrock-runtime from 3.922.0 to 3.925.0 by @dependabot[bot] in https://github.com/promptfoo/promptfoo/pull/6117 - chore: bump version 0.119.4 by @MrFlounder in h | Low | 11/6/2025
0.119.3 | ## What's Changed **Features** - feat(webui): add eval copy functionality by @mldangelo in https://github.com/promptfoo/promptfoo/pull/6079 - feat(redteam): add timestamp context to all grading rubrics by @MrFlounder in https://github.com/promptfoo/promptfoo/pull/6110 - feat(redteam): add gradingGuidance UI for plugin-specific grading rules by @MrFlounder in https://github.com/promptfoo/promptfoo/pull/6108 - feat(model-audit): add revision tracking and deduplication for model scans by @ | Low | 11/5/2025
0.119.2 | ## [0.119.2] - 2025-11-03 ### Added - feat(integrations): add Microsoft SharePoint dataset support with certificate-based authentication for importing CSV files (#6080) by @tanyapylat - feat(providers): add `initialMessages` support to simulated-user provider for starting conversations from specific states, with support for loading from JSON/YAML files via `file://` syntax (#6090) by @mldangelo - feat(providers): add local config override support for cloud providers - merge local configu | Low | 11/3/2025
0.119.1 | ## What's Changed **Bug Fixes** - fix(csv): handle primitive values directly in red team CSV export by @sklein12 in https://github.com/promptfoo/promptfoo/pull/6040 - fix(build): removing axios as a runtime dependency in google provider by @jameshiester in https://github.com/promptfoo/promptfoo/pull/6050 - fix(init): cleanup directory and show error message when example fails to download by @LizzHale in https://github.com/promptfoo/promptfoo/pull/6051 - fix(redteam): validate custom str | Low | 10/29/2025
0.119.0 | # What's Changed ## Features - feat(webui): filter eval results by metric values with numeric operators (EQ, GT, LTE, etc.) by @will-holley in #6011 - feat(providers): 10-100x performance improvement for Python providers with persistent worker pools by @mldangelo in #5968 - feat(providers): add OpenAI Agents SDK integration with support for agents, tools, and handoffs by @mldangelo in #6009 - feat(providers): add function calling/tool support for Ollama by @mldangelo in #5977 - feat(pr | Low | 10/28/2025

Similar Packages

  • langfuse: 🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23 (v3.178.0)
  • giskard-oss: 🐢 Open-Source Evaluation & Testing library for LLM Agents (giskard-checks/v1.0.2b3)
  • agent-review: Analyze git code changes to generate structured review reports using flexible AI models and integrated workflows. (main@2026-06-04)
  • opik: Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards. (2.0.56)
  • agenta: The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place. (v0.100.9)

More in Testing

  • vector-db-benchmark: Framework for benchmarking vector search engines
  • Gito: An AI-powered GitHub code review tool that uses LLMs to detect high-confidence, high-impact issues, such as security vulnerabilities, bugs, and maintainability concerns.
  • mxcli: Mendix CLI tool, a headless way to work with Mendix projects. Enables Mendix projects for use with third-party agentic coding tools like Claude Code and Copilot. Includes a starlark linter for quality v
  • llm_context_benchmarks: 📊 LLM Context Benchmarks - A comprehensive benchmarking tool for testing LLMs with varying context sizes using Ollama. Features dual benchmark modes (API/CLI), automatic hardware detection (optimiz