Tag: #evaluation-framework
3 packages • ⭐ 34,645 total stars
Test your prompts, agents, and RAG pipelines. Red teaming, pentesting, and vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and…
The LLM Evaluation Framework
Define and control AI agents in markdown with full prompt transparency, persistent memory, and integrated tools via the Claude Agent SDK.
