freshcrate

Tag: #agent-evaluation

4 packages • ⭐ 9,760 total stars

giskard-oss giskard-checks/v1.0.2b3 🏛️ Flagship ⭐ 5,289

đŸĸ Open-Source Evaluation & Testing library for LLM Agents

trulens trulens-2.8.1 🌳 Mature ⭐ 3,261

Evaluation and Tracking for LLM Experiments and AI Agents

any-agent 1.18.0 🌳 Mature ⭐ 1,153

A single interface to use and evaluate different agent frameworks

ai-agents-reality-check 0.0.0 🌿 Growing ⭐ 57

Benchmarking the gap between AI agent hype and architecture. Three agent archetypes, 73-point performance spread, stress testing, network resilience, and ensemble coordination analysis with statistica