freshcrate — Search

Search results for "agent-benchmark"

3 results found (Python)

ai-agents-reality-check 📁 0.0.0 🌿 Growing ⭐ 57

Benchmarking the gap between AI agent hype and architecture. Three agent archetypes, 73-point performance spread, stress testing, network resilience, and ensemble coordination analysis with statistica

agent-architecture agent-benchmark agent-evaluation agent-performance agentic-ai agentic-workflow ai-benchmarking architectural-evaluation llm-agent python
by Cre4T3Tiv3 (Python)

OpenClawProBench 📁 main@2026-04-15 🌿 Growing ⭐ 340

OpenClawProBench is a live-first benchmark harness for evaluating LLM agents in the OpenClaw runtime with deterministic grading and repeated-trial reliability.

agent benchmark evaluation harness leaderboard llm openclaw python
by suyoumo (Python)

Ultimate-Agent-Directory 📁 0.0.0 🌱 Seedling ⭐ 51

🤖 The most comprehensive directory of AI agent frameworks, platforms, tools, and resources - hundreds of curated entries covering open-source, no-code, enterprise, and autonomous solutions. NEW Boil

agent agentic agentic-ai agents boilerplate boilerplate-application boilerplate-template python
by moshehbenavraham (Python)