freshcrate

Search results for "benchmarking"

33 results found
ai-agents-reality-check 0.0.0 🌿 Growing ⭐57

Benchmarking the gap between AI agent hype and architecture. Three agent archetypes, 73-point performance spread, stress testing, network resilience, and ensemble coordination analysis with statistica

openclaw-engram v9.3.142 🌿 Growing ⭐54

Local-first memory plugin for OpenClaw AI agents. LLM-powered extraction, plain markdown storage, hybrid search via QMD. Gives agents persistent long-term memory across conversations.

agent-framework python-1.1.0 🌳 Mature ⭐9,325

A framework for building, orchestrating and deploying AI agents and multi-agent workflows with support for Python and .NET.

LeanKG v0.16.5 🌱 Seedling ⭐32

LeanKG: Stop Burning Tokens. Start Coding Lean.

control-layer v8.41.0 🌿 Growing ⭐62

The world’s fastest AI model gateway (450x less overhead than LiteLLM). Unified access to LLMs across endpoints (OpenAI, self-hosted, etc.) behind a single authentication layer - with API key generati

LLM-Agents-Ecosystem-Handbook 0.0.0 🌳 Mature ⭐508

One-stop handbook for building, deploying, and understanding LLM agents with 60+ skeletons, tutorials, ecosystem guides, and evaluation tools.

cactus 0.0.0 🌿 Growing ⭐50

LLM Agent that leverages cheminformatics tools to provide informed responses.

arthur-engine 2.1.529 🌿 Growing ⭐75

Make AI work for Everyone - Monitoring and governing for your AI/ML

Autonomous-Agents main@2026-04-16 🌿 Growing ⭐1,211

Autonomous Agents (LLMs) research papers. Updated Daily.

Awesome-Context-Engineering 0.0.0 🌳 Mature ⭐3,045

🔥 Comprehensive survey on Context Engineering: from prompt engineering to production-grade AI systems. Hundreds of papers, frameworks, and implementation guides for LLMs and AI agents.

models main@2026-04-21 🌿 Growing ⭐72

This repository contains comprehensive pricing and configuration data for LLMs. It powers cost attribution for 200+ enterprises running 400B+ tokens through Portkey AI Gateway every day.

vector-db-benchmark master@2026-04-17 🌿 Growing ⭐356

Framework for benchmarking vector search engines

claude-flows 0.0.0 🌿 Growing ⭐93

🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade architect

mcp-client-for-ollama v0.28.0 🌿 Growing ⭐599

A text-based user interface (TUI) client for interacting with MCP servers using Ollama. Features include agent mode, multi-server, model switching, streaming responses, tool management, human-in-the-l

agent-client v0.13.0 🌿 Growing ⭐90

Autonomous CLI agent integrations for the Spring AI ecosystem with Claude Code, Gemini CLI, and secure sandbox execution

Awesome-World-Models main@2026-04-21 🌿 Growing ⭐1,473

A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related website

LLM-Agent-Paper-daily main@2026-04-21 🌱 Seedling ⭐20

Automatically update LLM-Agent papers daily using GitHub Actions (updates every 12 hours)

awesome-code-agents main@2026-04-20 🌿 Growing ⭐94

A curated list of products, benchmarks, and research papers on autonomous code agents. Beyond coding, they're redefining how software changes the world.

Awesome-Agent-Memory main@2026-04-16 🌿 Growing ⭐333

Curated systems, benchmarks, papers, etc. on memory for LLMs/MLLMs: long-term context, retrieval, and reasoning.

skill v1.2.1 🌱 Seedling ⭐978

PinchBench is a benchmarking system for evaluating LLM models as OpenClaw coding agents. Made with 🦀 by the humans at https://kilo.ai

ruflo v3.5.80 🌿 Growing ⭐31,236

🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade archit

AutoRAG v0.3.22 🌱 Seedling ⭐4,693

AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

Awesome-Repo-Level-Code-Generation main@2026-04-10 🌿 Growing ⭐274

Must-read papers on Repository-level Code Generation & Issue Resolution 🔥

agent-skills-standard php-v1.3.2 🌱 Seedling ⭐391

A collection of Agent Skills Standard and Best Practice for programming languages and frameworks that help our AI Agent follow best practices on frameworks and programming languages

apitap v1.11.0 🌱 Seedling ⭐78

CLI, MCP server, and npm library that turns any website into an API: no docs, no SDK, no browser.

Open-Sable v1.7.0 🌱 Seedling ⭐18

Open-Sable is a local-first autonomous agent framework with AGI-inspired cognitive subsystems (goals, memory, metacognition, tool use). It can run continuously on your machine, integrate with chat int

VecturaKit 5.3.0 🌱 Seedling ⭐280

Swift-based vector database for on-device RAG using MLTensor and MLX Embedders

objectbox-java V5.4.1 🌱 Seedling ⭐4,606

Database for Android and JVM - first and fast, lightweight on-device vector database

Standard 0.0.0 🌱 Seedling ⭐18

JSON Agents - A universal JSON-native standard for describing AI agents, their capabilities, tools, runtimes, and governance in a portable, framework-agnostic format. Based on RFC 8259, JSON Schema 2

devito v4.8.21 🌱 Seedling ⭐689

DSL and compiler framework for automated finite-differences and stencil computation

Zen-Ai-Pentest v3.0.0 🌱 Seedling ⭐279

🛡️⚔️ AI-Powered Penetration Testing Framework with automated vulnerability scanning, multi-agent system, and compliance reporting 🛡️⚔️

HealthFlow datasets 💀 Dormant ⭐40

HealthFlow: A Self-Evolving AI Agent with Meta Planning for Autonomous Healthcare Research