freshcrate — Search

Search results for "agent-evaluation"

4 results found

ai-agents-reality-check 📁 0.0.0 🌿 Growing ⭐ 57

Benchmarking the gap between AI agent hype and architecture. Three agent archetypes, 73-point performance spread, stress testing, network resilience, and ensemble coordination analysis with statistica

agent-architecture agent-benchmark agent-evaluation agent-performance agentic-ai agentic-workflow ai-benchmarking architectural-evaluation llm-agent python

by Cre4T3Tiv3 · Python

giskard-oss 📁 giskard-checks/v1.0.2b1 🌱 Seedling ⭐ 5,225

🐢 Open-Source Evaluation & Testing library for LLM Agents

agent-evaluation ai-red-team ai-security ai-testing fairness-ai llm llm-eval llm-evaluation python

by Giskard-AI · Python

trulens 📁 trulens-2.7.2 🌱 Seedling ⭐ 3,237

Evaluation and Tracking for LLM Experiments and AI Agents

agent-evaluation agentops ai-agents ai-monitoring ai-observability evals explainable-ml llm-eval python

by truera · Python

any-agent 📁 1.18.0 🌱 Seedling ⭐ 1,141

A single interface to use and evaluate different agent frameworks

a2a agent-evaluation agents ai mcp python

by mozilla-ai · Python