freshcrate

Search results for "evaluation-framework"

3 results found
promptfoo 📁 code-scan-action-0.1.5 🌿 Growing 19,943

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and

harness 📁 master@2026-04-21 🌱 Seedling 1

Define and control AI agents in markdown with full prompt transparency, persistent memory, and integrated tools via the Claude Agent SDK.