freshcrate

Search results for "agent-evaluation"

4 results found
ai-agents-reality-check 📁 0.0.0 🌿 Growing 57

Benchmarking the gap between AI agent hype and architecture. Three agent archetypes, 73-point performance spread, stress testing, network resilience, and ensemble coordination analysis with statistica

giskard-oss 📁 giskard-checks/v1.0.2b1 🌱 Seedling 5,225

🐢 Open-Source Evaluation & Testing library for LLM Agents

trulens 📁 trulens-2.7.2 🌱 Seedling 3,237

Evaluation and Tracking for LLM Experiments and AI Agents

any-agent 📁 1.18.0 🌱 Seedling 1,141

A single interface to use and evaluate different agent frameworks