Tag: #agent-evaluation
4 packages • ⭐ 9,660 total stars
🐢 Open-Source Evaluation & Testing library for LLM Agents
Evaluation and Tracking for LLM Experiments and AI Agents
A single interface to use and evaluate different agent frameworks
Benchmarking the gap between AI agent hype and architecture. Three agent archetypes, 73-point performance spread, stress testing, network resilience, and ensemble coordination analysis with statistica
