freshcrate — #agent-evaluation

Tag: #agent-evaluation

4 packages • ⭐ 9,760 total stars

giskard-oss • giskard-scan/v1.0.0b3 • 🏛️ Flagship • ⭐ 5,289

🐢 Open-Source Evaluation & Testing library for LLM Agents

agent-evaluation ai-red-team ai-security ai-testing fairness-ai llm llm-eval llm-evaluation python • by Giskard-AI

trulens • trulens-2.9.0 • 🌳 Mature • ⭐ 3,261

Evaluation and Tracking for LLM Experiments and AI Agents

agent-evaluation agentops ai-agents ai-monitoring ai-observability evals explainable-ml llm-eval python • by truera

any-agent • 1.18.0 • 🌳 Mature • ⭐ 1,153

A single interface to use and evaluate different agent frameworks

a2a agent-evaluation agents ai mcp python • by mozilla-ai

ai-agents-reality-check • 0.0.0 • 🌱 Seedling • ⭐ 57

Benchmarking the gap between AI agent hype and architecture. Three agent archetypes, 73-point performance spread, stress testing, network resilience, and ensemble coordination analysis with statistica

agent-architecture agent-benchmark agent-evaluation agent-performance agentic-ai agentic-workflow ai-benchmarking architectural-evaluation llm-agent python • by Cre4T3Tiv3
