freshcrate — Search

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

evaluation hacktoberfest hacktoberfest2025 langchain llama-index llm llm-evaluation llm-observability python | by comet-ml | Python

ragflow 📁v0.25.0🏛️ Flagship⭐78,674

RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs

agent agentic agentic-ai agentic-workflow ai context-engineering context-retrieval deep-research python | by infiniflow | Python

adk-python 📁v1.31.1🏛️ Flagship⭐19,165

An open-source, code-first Python toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.

agent agentic agentic-ai agents agents-sdk ai ai-agents aiagentframework python | by google | Python

evals 📁v0.1.15🌿 Growing⭐106

A comprehensive evaluation framework for AI agents and LLM applications.

agentic agentic-ai ai evaluation machine-learning python strands-agents | by strands-agents | Python

arthur-engine 📁2.1.529🌿 Growing⭐77

Make AI work for Everyone - Monitoring and governing for your AI/ML

agentic benchmarking evaluation genai guardrails llm ml monitoring python | by arthur-ai | Python

AI-Infra-Guard 📁v4.1.4🌳 Mature⭐3,521

A full-stack AI Red Teaming platform securing AI ecosystems via OpenClaw Security Scan, Agent Scan, Skills Scan, MCP scan, AI Infra scan and LLM jailbreak evaluation.

agent agent-security ai-infra ai-red-teaming ai-security llm llm-evaluation llm-jailbreak python | by Tencent | Python

fast-agent 📁v0.6.17🌳 Mature⭐3,750

Code, Build and Evaluate agents - excellent Model and Skills/MCP/ACP Support

acp agent agent-framework agent-skills cli mcp mcp-client mcp-server python | by evalstate | Python

logfire 📁v4.32.1🌳 Mature⭐4,185

AI observability platform for production LLM and agent systems.

agent-observability ai ai-observability ai-tools evals fastapi llm-observability logging python | by pydantic | Python

mlflow 📁ts/v0.2.0-rc.1🏛️ Flagship⭐25,479

The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controllin

agentops agents ai ai-governance apache-spark evaluation langchain llm-evaluation python | by mlflow | Python

giskard-oss 📁giskard-checks/v1.0.2b1🏛️ Flagship⭐5,289

🐢 Open-Source Evaluation & Testing library for LLM Agents

agent-evaluation ai-red-team ai-security ai-testing fairness-ai llm llm-eval llm-evaluation python | by Giskard-AI | Python

trulens 📁trulens-2.7.2🌳 Mature⭐3,261

Evaluation and Tracking for LLM Experiments and AI Agents

agent-evaluation agentops ai-agents ai-monitoring ai-observability evals explainable-ml llm-eval python | by truera | Python

AutoRAG 📁v0.3.22🌳 Mature⭐4,713

AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

analysis automl benchmarking document-parser embeddings evaluation llm llm-evaluation python | by Marker-Inc-Korea | Python

langroid 📁0.61.1🌳 Mature⭐3,976

Harness LLMs with Multi-Agent Programming

agents ai chatgpt function-calling gpt gpt-4 gpt4 information-retrieval llm-agent python | by langroid | Python

RAG-Anything 📁v1.2.10🏛️ Flagship⭐16,790

"RAG-Anything: All-in-One RAG Framework"

multi-modal-rag python retrieval-augmented-generation | by HKUDS | Python

fast-plaid 📁1.4.5🌿 Growing⭐245

High-Performance Engine for Multi-Vector Search

colbert colpali information-retrieval python rust vector-database | by lightonai | Python

txtai 📁v9.7.0🏛️ Flagship⭐12,412

💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows

agents ai ai-agents embeddings information-retrieval language-model large-language-models llm python vector-database | by neuml | Python

llm-wiki 📁v1.1.0-rc8🌿 Growing⭐139

LLM-powered knowledge base from your Claude Code, Codex CLI, Copilot, Cursor & Gemini sessions. Karpathy's LLM Wiki pattern — implemented and shipped.

ai claude-code cli codex-cli copilot cursor developer-tools gemini-cli mcp python | by Pratiyush | Python

qwe-qwe 📁v0.17.6🌱 Seedling⭐35

⚡ Lightweight offline AI agent for local models. No cloud, no API keys — just your GPU.

agent ai ai-agent python | by deepfounder-ai | Python

ai-plugin-scanner 📁v2.0.45🌿 Growing⭐158

Security and best-practices scanner for AI Plugins, covering Codex, Claude, Opencode, Gemini & more. Scores trust for plugins 0-100.

cli codex codex-plugins mcp plugin-scanner python scanner security | by hashgraph-online | Python

cognithor 📁v0.92.3🌿 Growing⭐115

Cognithor - Agent OS: Local-first autonomous agent operating system. 16 LLM providers, 17 channels, 112+ MCP tools, 5-tier memory, A2A protocol, knowledge vault, voice, browser automation, Computer-us

agent-os ai-agent anthropic autonomous-agent discord-bot document-analysis gdpr-compliant gemini python | by Alex8791-cyber | Python

PraisonAI 📁v4.6.27🏛️ Flagship⭐6,969

PraisonAI 🦞 — Hire a 24/7 AI Workforce. Stop writing boilerplate and start shipping autonomous agents that research, plan, code, and execute tasks. Deployed in 5 lines of code with built-in memory, R

agents ai ai-agent-framework ai-agent-sdk ai-agents ai-agents-framework ai-agents-sdk ai-framwork python | by MervinPraison | Python

claude-code-plugins-plus-skills 📁v4.26.0🌳 Mature⭐1,995

423 plugins, 2,849 skills, 177 agents for Claude Code. Open-source marketplace at tonsofskills.com with the ccpi CLI package manager.

agent-skills ai ai-agents anthropic automation claude-code claude-code-plugins developer-tools mcp python | by jeremylongshore | Python

mcp-client-for-ollama 📁v0.28.0🌳 Mature⭐655

A text-based user interface (TUI) client for interacting with MCP servers using Ollama. Features include agent mode, multi-server, model switching, streaming responses, tool management, human-in-the-l

agentic-ai ai command-line-tool generative-ai linux llm local-llm macos python | by jonigl | Python

ha-mcp 📁v7.3.0.dev386🌳 Mature⭐2,465

The Unofficial and Awesome Home Assistant MCP Server

python | by homeassistant-ai | Python

restai 📁v6.1.45🌿 Growing⭐485

RESTai is an AIaaS (AI as a Service) open-source platform. Supports many public and local LLMs supported by Ollama, vLLM, etc. Precise embeddings usage, tuning, analytics etc. Built-in image/audio generat

blocky embeddings fastapi langchain llama llamaindex llm ollama python rag | by apocas | Python

Auto-claude-code-research-in-sleep 📁v0.4.4🏛️ Flagship⭐7,173

ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works wi

ai-research ai-tools aris autonomous-agent claude claude-code claude-code-skills codex python | by wanshuiyin | Python

ISC-Bench 📁v0.0.5🌳 Mature⭐799

Internal Safety Collapse: Turning the LLM or an AI Agent into a sensitive data generator.

adversarial-attacks agent-safety ai-safety benchmark frontier-models jailbreak large-language-models llm-safety python | by wuyoscar | Python

synaptic-memory 📁v0.16.0🌱 Seedling⭐27

Brain-inspired knowledge graph: spreading activation, Hebbian learning, memory consolidation.

ai-agent embedding graph-database hebbian-learning knowledge-graph llm mcp mcp-server python | by PlateerLab | Python

connectonion 📁v0.9.1🌳 Mature⭐863

The Best AI Agent Framework for Agent Collaboration.

agent agentic-ai llm openonion python | by openonion | Python

caveman 📁v1.6.0🏛️ Flagship⭐42,198

🪨 why use many token when few token do trick — Claude Code skill that cuts 65% of tokens by talking like caveman

ai anthropic caveman claude claude-code llm meme prompt-engineering python | by JuliusBrussee | Python

LRAT 📁0.0.0🌱 Seedling⭐39

The implementation for SIGIR 2026: Learning to Retrieve from Agent Trajectories.

agent agentic llm python search | by Yuqi-Zhou | Python

any-agent 📁1.18.0🌳 Mature⭐1,153

A single interface to use and evaluate different agent frameworks

a2a agent-evaluation agents ai mcp python | by mozilla-ai | Python

ai-agents-reality-check 📁0.0.0🌿 Growing⭐57

Benchmarking the gap between AI agent hype and architecture. Three agent archetypes, 73-point performance spread, stress testing, network resilience, and ensemble coordination analysis with statistica

agent-architecture agent-benchmark agent-evaluation agent-performance agentic-ai agentic-workflow ai-benchmarking architectural-evaluation llm-agent python | by Cre4T3Tiv3 | Python

llmware 📁v0.4.6🌿 Growing⭐14,862

Unified framework for building enterprise RAG pipelines with small, specialized models

agents generative-ai-tools llamacpp llm onnx openvino parsing python retrieval-augmented-generation | by llmware-ai | Python

claude-codex-settings 📁v2.3.0🌳 Mature⭐623

My personal Claude Code and OpenAI Codex setup with battle-tested skills, commands, hooks, agents and MCP servers that I use daily.

ai-agents ai-tools claude-ai claude-code claude-code-plugin claude-skills claudecode claudecode-config python | by fcakyon | Python

LIA-Assistant 📁v1.17.1🌱 Seedling⭐17

Open-source multi-agent AI assistant powered by LangGraph, FastAPI & Next.js — 16+ agents, Human-in-the-Loop, MCP integration, voice TTS, RAG, 500+ metrics, 6 languages.

ai ai-agent ai-assistant assistant chatbot claude claude-code clawdbot python | by jgouviergmail | Python

arag 📁v0.1.0🌿 Growing⭐252

A-RAG: Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces. State-of-the-art RAG framework with keyword, semantic, and chunk read tools for multi-hop QA.

agent agentic-ai agenticrag deepresearch evaluation graphrag llm llmagents python | by Ayanami0730 | Python

cyllama 📁0.2.11🌱 Seedling⭐25

A thin cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp

agents cython cython-wrapper llama-cpp python python3 rag stable-diffusion-cpp whisper-cpp | by shakfu | Python

RAGElo 📁0.4.0🌿 Growing⭐128

RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker

python | by zetaalphavector | Python

JRVS 📁0.0.0🌿 Growing⭐236

JRVS AI Agent with JARCORE autonomous coding engine - RAG knowledge base, web scraping, calendar, code generation. Powered by whatever local AI you choose.

python | by Xthebuilder | Python

ragas 📁v0.4.3🌳 Mature⭐13,570

Supercharge Your LLM Application Evaluations 🚀

evaluation llm llmops python | by explodinggradients | Python

Observal 📁v0.2.0🌿 Growing⭐572

Observal is an AI agent registry with a first-in-class observability and eval framework

agents claude-code cli-tool cursor evaluation gemini-cli kiro large-language-models python | by BlazeUp-AI | Python

GTA 📁v0.2.0🌿 Growing⭐143

[NeurIPS 2024 D&B] GTA: A Benchmark for General Tool Agents & [arXiv 2026] GTA-2

llm-agent llm-evaluation python | by open-compass | Python

yao-meta-skill 📁main@2026-04-19🌿 Growing⭐297

YAO = Yielding AI Outcomes. A lightweight but rigorous system for creating, evaluating, packaging, and governing reusable agent skills.

agent-skills ai-agents meta-skill prompt-engineering python skill-engineering workflow-automation | by yaojingang | Python

OpenClawProBench 📁main@2026-04-15🌿 Growing⭐453

OpenClawProBench is a live-first benchmark harness for evaluating LLM agents in the OpenClaw runtime with deterministic grading and repeated-trial reliability.

agent benchmark evaluation harness leaderboard llm openclaw python | by suyoumo | Python

claw-eval 📁main@2026-04-15🌿 Growing⭐465

Claw-Eval is a harness for evaluating LLMs as agents. All tasks verified by humans.

agent harness llm openclaw python | by claw-eval | Python

TrustRAG 📁0.0.0🌳 Mature⭐1,253

TrustRAG: The RAG Framework within Reliable input, Trusted output

deep-research deep-search python rag retrieval-augmented-generation | by gomate-community | Python

AgenticX 📁v0.3.7🌿 Growing⭐114

AgenticX is a unified, production-ready multi-agent platform — Python SDK + CLI (agx) + Studio server + Machi desktop app. Features Meta-Agent orchestration, 15+ LLM providers, MCP Hub, hierarchical m

agent-framework agentic-workflows ai-agent ai-orchestration chatbot desktop-app electron fastapi python | by DemonDamon | Python

PageIndex 📁main@2026-04-10🌿 Growing⭐25,597

📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG

agentic-ai agents ai ai-agents context-engineering information-retrieval llm python rag vector-database | by VectifyAI | Python

atomic-knowledge 📁v0.2.0🌱 Seedling⭐36

Markdown-first work-memory protocol for existing agents, with maintained knowledge, candidate notes, evals, and an example KB.

agent ai filesystem knowledge-base llm markdown personal-agent python rag | by Nimo1987 | Python

tulip_agent 📁0.0.0🌱 Seedling⭐44

Autonomous agent with access to a tool library

autonomous-agent large-language-model python tool-library | by HRI-EU | Python

sec-edgar-mcp 📁v1.0.8🌿 Growing⭐253

A SEC EDGAR MCP (Model Context Protocol) Server

ai artificial-intelligence edgar edgar-database finance genai llm mcp python | by stefanoamorelli | Python

pdd 📁main@2026-04-21🌿 Growing⭐656

Prompt Driven Development Command Line Interface

ai cli code developer-tools development methodology prompt prompt-engineering python | by promptdriven | Python

deer-flow 📁main@2026-04-21🌿 Growing⭐63,234

An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skill, subagents and message gateway, it handles different levels of ta

agent agentic agentic-framework agentic-workflow ai ai-agents deep-research harness python | by bytedance | Python

deepeval 📁v3.9.5🌳 Mature⭐14,911

The LLM Evaluation Framework

evaluation-framework evaluation-metrics llm-evaluation llm-evaluation-framework llm-evaluation-metrics python | by confident-ai | Python

agentic-chatops 📁main@2026-04-20🌿 Growing⭐100

3-tier agentic ChatOps (n8n + GPT-4o + Claude Code) implementing all 21 patterns from "Agentic Design Patterns" — solo operator managing 137 devices

agentic-ai chatops claude-code devops infrastructure-automation matrix mcp multi-agent-systems shell | by papadopouloskyriakos | Python

auto-deep-researcher-24x7 📁main@2026-04-19🌿 Growing⭐622

🔥 An autonomous AI agent that runs your deep learning experiments 24/7 while you sleep. Zero-cost monitoring, Leader-Worker architecture, constant-size memory.

ai-agent autonomous-agent claude-code deep-learning experiment-automation gpu hyperparameter-tuning llm-agent python | by Xiangyue-Zhang | Python

cognitive-dissonance-dspy 📁main@2026-04-14🌿 Growing⭐276

A multi-agent LLM system for detecting and resolving cognitive dissonance.

python | by evalops | Python

cdpilot 📁v0.3.0🌱 Seedling⭐25

Zero-dependency browser automation CLI. 70+ commands, 10 test assertions, smart commands (click/fill by text — no LLM needed). MCP server for AI agents with 500x fewer tokens. Extract, observe, script

ai-agent assertions automation brave-browser browser-automation cdp chrome-devtools-protocol claude python | by mehmetnadir | Python

claude-code-config 📁0.0.0🌱 Seedling⭐88

Claude Code skills, architectural principles, and alternative approaches for AI-assisted development

ai-agents claude claude-code llm machine-learning mcp prompt-engineering python skills | by AnastasiyaW | Python

learn-hermes-agent 📁0.0.0🌱 Seedling⭐16

A 27-chapter hands-on tutorial for building an autonomous AI agent from zero in Python. Agent loop, tool system, memory, skills, MCP, multi-platform gateway, and self-evolution — inspired by Herme

agent-from-scratch agent-tutorial ai-agent chatbot hermes-agent llm-agent python reinforcement-learning self-improving-ai | by longyunfeigu | Python

NanoCoder-Pro 📁0.0.0🌱 Seedling⭐54

NanoCoder Pro — Autonomous Coding Agent with Master-SubAgent Architecture

python | by j61398257-lab | Python

simplenote-mcp-server 📁v1.15.0🌱 Seedling⭐17

MCP Server for Simplenote integration with Claude Desktop

ai backend claude-ai crud electron integration mcp-server open-source python | by docdyhr | Python

llm_context_benchmarks 📁0.0.0🌱 Seedling⭐59

📊 LLM Context Benchmarks - A comprehensive benchmarking tool for testing LLMs with varying context sizes using Ollama. Features dual benchmark modes (API/CLI), automatic hardware detection (optimiz

ai benchmarking llms python | by ivanfioravanti | Python

sinain-hud 📁overlay-v2.8.0🌱 Seedling⭐5

Ambient intelligence that sees what you see, hears what you hear, and acts on your behalf

agent ai audio-transcription hud macos mcp overlay privacy python | by anthillnet | Python

dory 📁v0.1.0🌱 Seedling⭐14

One memory layer for every AI agent. Local-first, markdown source of truth, and CLI/HTTP/MCP native. Your agent forgot who you are. Again. Dory fixes that.

agents ai-agents claude-code codex docker fastapi knowledge-graph llm model-context-protocol python | by deeflect | Python

agent-arch 📁main@2026-04-21🌱 Seedling⭐10

No description

agentic-ai agentic-workflow agents ai ai-architect ai-architecture ai-architecture-compliance architecture llm-agent python | by agent-axiom | Python

agent2 📁v0.1.0🌱 Seedling⭐26

The production runtime for AI agents. Schema in, API out. Built on PydanticAI + FastAPI.

agent-runtime ai-agents ai-framework developer-tools docker enterprise-ai fastapi llm python | by duozokker | Python

LettuceDetect 📁0.1.8💤 Dormant⭐565

Lightweight hallucination detection framework for RAG applications

bert hallucination-detection hallucination-evaluation information-extraction nlp python pytorch token-classification | by KRLabsOrg | Python

claude-ruby-grape-rails 📁v1.13.4🌱 Seedling⭐5

Claude Code plugin for Ruby, Rails, Grape, PostgreSQL, Redis, and Sidekiq development

ai ai-agent ai-agents ai-tools claude claude-code llm python ruby | by slbug | Python

claude-skills 📁v2.0.0🌿 Growing⭐12,208

220+ Claude Code skills & agent plugins for Claude Code, Codex, Gemini CLI, Cursor, and 8 more coding agents — engineering, marketing, product, compliance, C-level advisory.

agent-plugins agent-skills agentic-ai ai-coding-agent anthropic-claude claude-ai claude-code claude-code-plugins prompt-engineering python | by alirezarezvani | Python

deltallm 📁v0.1.21-rc1🌱 Seedling⭐4

Route, manage, and analyze your LLM requests across multiple providers with a unified API interface

ai-gateway ai-infrastructure api-gateway kubernetes llm-gateway llm-proxy llm-routing mcp model-context-protocol python | by deltawi | Python

Geneclaw 📁v0.1.0🌱 Seedling⭐36

Self-evolving AI agent framework with 5-layer safety gatekeeper. Agents observe failures, propose fixes, and safely apply them. Built on HKUDS/nanobot.

ai-agent autonomous-agent evolution llm nanobot python safety self-evolving-ai | by Clawland-AI | Python

Nightshift 📁v0.0.7🌱 Seedling⭐1

Autonomous overnight codebase improvement agent for Claude Code. Run it before bed, wake up to production-ready fixes.

ai-agent anthropic automation claude claude-code code-quality developer-tools overnight-agent python | by Recusive | Python

RagaAI-Catalyst 📁v2.2.4💤 Dormant⭐16,141

Python SDK for Agent AI Observability, Monitoring and Evaluation Framework. Includes features like agent, llm and tools tracing, debugging multi-agentic system, self-hosted dashboard and advanced anal

agentic-ai agentic-ai-development agentneo agents ai-agent-monitoring ai-application-debugging ai-evaluation-tools ai-performance-optimization python | by raga-ai-hub | Python

DOX 📁main@2026-04-15🌱 Seedling⭐2

Broken RAG For The Broken Souls

hallucination llm python rag retrieval-augmented-generation vibecoding | by AmMoPy | Python

surf 📁0.0.0🌱 Seedling⭐1

The open framework for extensible & grounded AI agent orchestration.

agent-framework agent-orchestration ai-agent ai-framework azure multi-agent python | by barney-w | Python

uniAI 📁0.0.0🌱 Seedling⭐1

Syllabus-aware RAG study assistant for university students. Answers strictly from your own notes & PDFs, unit-scoped retrieval, cross-encoder reranking, and a hallucination gate — built to help studen

ai chromadb django genai information-retrieval llm local-llm ollama python vector-database | by git-pratap-shrey | Python

geon-decoder 📁main@2026-04-11🌱 Seedling⭐3

GEON: Structure-first decoding via equivalence classes and field closure

ai code-generation decoding language-models llm machine-learning program-synthesis python | by singhalpm-hub | Python

sawzhang_skills 📁0.0.0🌱 Seedling⭐2

Claude Code skills collection — CCA study guides, Twitter research, MCP review, auto-iteration tools

ai-agent automation cca claude claude-code developer-tools llm mcp prompt-engineering python | by sawzhang | Python

pytorch_template 📁v0.3.0🌱 Seedling⭐10

AI-agent-friendly PyTorch research pipeline — one YAML config drives preflight, training, Optuna HPO, and real-time TUI monitoring

ai-agent claude-code cli deep-learning experiment-management hyperparameter-optimization machine-learning optuna python | by Axect | Python

Agent_Life_Space 📁v1.36.0🌱 Seedling⭐1

Self-hosted autonomous AI agent — 9-layer cascade, Docker sandbox, encrypted vault, review/build/control plane, 1407+ tests

ai-agent anthropic autonomous-agent claude code-review control-plane docker llm python | by B2JK-Industry | Python

evo-agents 📁master@2026-04-19🌱 Seedling⭐3

Complete Workspace Template for OpenClaw - Full agent lifecycle with unified memory system (Markdown + SQLite), self-evolution, RAG. Not for SubAgent/Skill use.

agent ai-memory bge-m3 chinese-nlp fts5 local-ai markdown memory-system python rag | by luoboask | Python

Government-Citizen-Services-Voice-Agent 📁main@2026-04-15🌱 Seedling⭐1

Autonomous, multilingual AI voice agent using ElevenLabs, LangGraph, and RAG for government services

conversational-ai elevenlabs fastapi govtech langgraph python rag voice-agent | by Automaticare | Python

idle-harness 📁main@2026-04-18🌱 Seedling⭐1

GAN-inspired multi-agent system that autonomously builds full-stack web apps from a single prompt using Claude AI agents

agent-sdk ai-agent ai-code-generator anthropic autonomous-coding claude claude-code code-generation python | by jhlee0409 | Python

fastRAG 📁v3.1.2💤 Dormant⭐1,776

Efficient Retrieval Augmentation and Generation Framework

benchmark colbert diffusion generative-ai information-retrieval knowledge-graph llm multi-modal python | by IntelLabs | Python

lmnr 0.7.47🌱 Seedling

Python SDK for Laminar

pypi | by lmnr.ai | Python

boostedblob 1.0.0🌱 Seedling

Command line tool and async library to perform basic file operations on local paths, Google Cloud Storage paths and Azure Blob Storage paths.

pypi | by pypi | Python

looker-sdk 26.6.1🌱 Seedling

Looker REST API

4.0 api looker looker_sdk pypi | by Looker Data Sciences, Inc. | Python