freshcrate — Search

Search results for "benchmarking"

33 results found

ai-agents-reality-check 📁0.0.0🌿 Growing⭐57

Benchmarking the gap between AI agent hype and architecture. Three agent archetypes, 73-point performance spread, stress testing, network resilience, and ensemble coordination analysis with statistica

agent-architecture agent-benchmark agent-evaluation agent-performance agentic-ai agentic-workflow ai-benchmarking architectural-evaluation llm-agent python · by Cre4T3Tiv3 · Python

openclaw-engram 📁v9.3.142🌿 Growing⭐54

Local-first memory plugin for OpenClaw AI agents. LLM-powered extraction, plain markdown storage, hybrid search via QMD. Gives agents persistent long-term memory across conversations.

ai-agent ai-memory conversational-ai engram knowledge-graph llm local-first long-term-memory typescript · by joshuaswarren · TypeScript

agent-framework 📁python-1.1.0🌳 Mature⭐9,325

A framework for building, orchestrating and deploying AI agents and multi-agent workflows with support for Python and .NET.

agent-framework agentic-ai agents ai dotnet multi-agent orchestration python · by microsoft · Python

LeanKG 📁v0.16.5🌱 Seedling⭐32

LeanKG: Stop Burning Tokens. Start Coding Lean.

ai-agent claude claude-code claude-code-plugin concise-context cursor gemini graph-database rust · by FreePeak · Rust

control-layer 📁v8.41.0🌿 Growing⭐62

The world’s fastest AI model gateway (450x less overhead than LiteLLM). Unified access to LLMs across endpoints (OpenAI, self-hosted, etc.) behind a single authentication layer - with API key generati

gateway llm openai rust · by doublewordai · Rust

LLM-Agents-Ecosystem-Handbook 📁0.0.0🌳 Mature⭐508

One-stop handbook for building, deploying, and understanding LLM agents with 60+ skeletons, tutorials, ecosystem guides, and evaluation tools.

ai ai-agent ai-agents fine-tuning finetuning-llms freamework llm llmops python · by oxbshw · Python

cactus 📁0.0.0🌿 Growing⭐50

LLM Agent that leverages cheminformatics tools to provide informed responses.

cheminformatics chemistry foundation-models jupyter notebook llm llm-agent nlp science · by pnnl · Jupyter Notebook

arthur-engine 📁2.1.529🌿 Growing⭐75

Make AI work for Everyone - Monitoring and governing for your AI/ML

agentic benchmarking evaluation genai guardrails llm ml monitoring python · by arthur-ai · Python

Autonomous-Agents 📁main@2026-04-16🌿 Growing⭐1,211

Autonomous Agents (LLMs) research papers. Updated Daily.

agent agentic agentic-ai agents ai ai-agents aiagent aiagents · by tmgthb

Awesome-Context-Engineering 📁0.0.0🌳 Mature⭐3,045

🔥 Comprehensive survey on Context Engineering: from prompt engineering to production-grade AI systems. Hundreds of papers, frameworks, and implementation guides for LLMs and AI agents.

agent agentic-ai agi awesome-list cognitive-science context-engineering llm rag · by Meirtz

models 📁main@2026-04-21🌿 Growing⭐72

This repository contains comprehensive pricing and configuration data for LLMs. It powers cost attribution for 200+ enterprises running 400B+ tokens through Portkey AI Gateway every day.

ai javascript llms llms-benchmarking models · by Portkey-AI · JavaScript

vector-db-benchmark 📁master@2026-04-17🌿 Growing⭐356

Framework for benchmarking vector search engines

benchmark python vector-database vector-search vector-search-engine · by qdrant · Python

claude-flows 📁0.0.0🌿 Growing⭐93

🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade architect

shell · by xyzthiago · Shell

mcp-client-for-ollama 📁v0.28.0🌿 Growing⭐599

A text-based user interface (TUI) client for interacting with MCP servers using Ollama. Features include agent mode, multi-server, model switching, streaming responses, tool management, human-in-the-l

agentic-ai ai command-line-tool generative-ai linux llm local-llm macos python · by jonigl · Python

agent-client 📁v0.13.0🌿 Growing⭐90

Autonomous CLI agent integrations for the Spring AI ecosystem with Claude Code, Gemini CLI, and secure sandbox execution

java · by spring-ai-community · Java

Awesome-World-Models 📁main@2026-04-21🌿 Growing⭐1,473

A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related website

artificial-intelligence autonomous-driving awesome deep-learning embodied-ai future-prediction video-prediction world-model · by leofan90

LLM-Agent-Paper-daily 📁main@2026-04-21🌱 Seedling⭐20

Automatically updates LLM-Agent papers daily using GitHub Actions (updates every 12 hours)

llm llm-agent python · by Lyz103 · Python

awesome-code-agents 📁main@2026-04-20🌿 Growing⭐94

A curated list of products, benchmarks, and research papers on autonomous code agents. Beyond coding — they're redefining how software changes the world.

python · by EuniAI · Python

Awesome-Agent-Memory 📁main@2026-04-16🌿 Growing⭐333

Curated systems, benchmarks, papers, etc. on memory for LLMs/MLLMs: long-term context, retrieval, and reasoning.

agent-memory ai-agent ai-agent-memory awesome-agent-memory llm-memory memory memory-management multimodal-llm-memory · by TeleAI-UAGI

skill 📁v1.2.1🌱 Seedling⭐978

PinchBench is a benchmarking system for evaluating LLM models as OpenClaw coding agents. Made with 🦀 by the humans at https://kilo.ai

python · by pinchbench · Python

ruflo 📁v3.5.80🌿 Growing⭐31,236

🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade archit

agentic-ai agentic-engineering agentic-framework agentic-rag agentic-workflow agents ai-assistant ai-tools typescript · by ruvnet · TypeScript

AutoRAG 📁v0.3.22🌱 Seedling⭐4,693

AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

analysis automl benchmarking document-parser embeddings evaluation llm llm-evaluation python · by Marker-Inc-Korea · Python

Awesome-Repo-Level-Code-Generation 📁main@2026-04-10🌿 Growing⭐274

Must-read papers on Repository-level Code Generation & Issue Resolution 🔥

ai4se automated-software-engineering code-generation large-language-models llm software-engineering · by YerbaPage

agent-skills-standard 📁php-v1.3.2🌱 Seedling⭐391

A collection of Agent Skills standards and best practices for programming languages and frameworks that helps AI agents follow best practices for those frameworks and programming languages

agent-agentic-ai android angular best-practices coding-standards cursor-rules flutter typescript · by HoangNguyen0403 · TypeScript

apitap 📁v1.11.0🌱 Seedling⭐78

CLI, MCP server, and npm library that turns any website into an API — no docs, no SDK, no browser.

ai-agent api browser-automation mcp mcp-server playwright skill-file typescript web-scraping · by n1byn1kt · TypeScript

Open-Sable 📁v1.7.0🌱 Seedling⭐18

Open-Sable is a local-first autonomous agent framework with AGI-inspired cognitive subsystems (goals, memory, metacognition, tool use). It can run continuously on your machine, integrate with chat int

agentic agentic-ai ai ai-assistant open-source python · by IdeoaLabs · Python

VecturaKit 📁5.3.0🌱 Seedling⭐280

Swift-based vector database for on-device RAG using MLTensor and MLX Embedders

mlx-swift rag swift · by rryam · Swift

objectbox-java 📁V5.4.1🌱 Seedling⭐4,606

Database for Android and JVM - first and fast, lightweight on-device vector database

android database edge embedded java kotlin mobile nosql · by objectbox · Java

Standard 📁0.0.0🌱 Seedling⭐18

JSON Agents - A universal JSON-native standard for describing AI agents, their capabilities, tools, runtimes, and governance in a portable, framework-agnostic format. Based on RFC 8259, JSON Schema 2

agent-governance agent-manifest agent-orchestration agent-specification ai-agents ai-framework interoperability json python · by JSON-Agents · Python

devito 📁v4.8.21🌱 Seedling⭐689

DSL and compiler framework for automated finite-differences and stencil computation

code-generation compiler dsl finite-difference fwi gpu hpc jit python · by devitocodes · Python

Zen-Ai-Pentest 📁v3.0.0🌱 Seedling⭐279

🛡⚔️AI-Powered Penetration Testing Framework with automated vulnerability scanning, multi-agent system, and compliance reporting🛡⚔️

ai automation compliance cybersecurity ethical-hacking framework penetration-testing pentesting python · by SHAdd0WTAka · Python

VectorDBBench 📁v1.0.20🌱 Seedling⭐1,068

Benchmark for vector databases.

benchmark cost-effectiveness performance python vector-database vector-search vectordb · by zilliztech · Python

HealthFlow 📁datasets💤 Dormant⭐40

HealthFlow: A Self-Evolving AI Agent with Meta Planning for Autonomous Healthcare Research

ai-for-healthcare ai-for-science ehr llm llm-agent multi-agent python · by yhzhu99 · Python