freshcrate — Search

Search results for "evaluation"

147 results found

phoenix 📁arize-phoenix-v14.9.1🌳 Mature⭐9,209

AI Observability & Evaluation

agents ai-monitoring ai-observability aiengineering anthropic datasets evals jupyter notebook langchain prompt-engineering · by Arize-ai · Jupyter Notebook

opik 📁2.0.6🌳 Mature⭐18,767

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

evaluation hacktoberfest hacktoberfest2025 langchain llama-index llm llm-evaluation llm-observability python · by comet-ml · Python

agenta 📁v0.96.7🌳 Mature⭐4,011

The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.

agents evaluation llm-as-a-judge llm-evaluation llm-framework llm-monitoring llm-observability llm-platform prompt-engineering typescript · by Agenta-AI · TypeScript

ouroboros 📁v0.28.8🌳 Mature⭐2,107

Stop prompting. Start specifying.

ai-agent claude-code codex-cli devtools evaluation llm mcp multi-agent prompt-engineering python · by Q00 · Python

WeKnora 📁v0.4.0🌳 Mature⭐13,819

LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm.

agent agentic ai chatbot chatbots embeddings evaluation generative-ai go · by Tencent · Go

LLM-Agents-Ecosystem-Handbook 📁0.0.0🌳 Mature⭐508

One-stop handbook for building, deploying, and understanding LLM agents with 60+ skeletons, tutorials, ecosystem guides, and evaluation tools.

ai ai-agent ai-agents fine-tuning finetuning-llms freamework llm llmops python · by oxbshw · Python

ai-agents-reality-check 📁0.0.0🌿 Growing⭐57

Benchmarking the gap between AI agent hype and architecture. Three agent archetypes, 73-point performance spread, stress testing, network resilience, and ensemble coordination analysis with statistica

agent-architecture agent-benchmark agent-evaluation agent-performance agentic-ai agentic-workflow ai-benchmarking architectural-evaluation llm-agent python · by Cre4T3Tiv3 · Python

openlit 📁openlit-1.18.1🌿 Growing⭐2,358

Open source platform for AI Engineering: OpenTelemetry-native LLM Observability, GPU Monitoring, Guardrails, Evaluations, Prompt Management, Vault, Playground. 🚀💻 Integrates with 50+ LLM Providers,

ai-observability amd-gpu clickhouse distributed-tracing genai gpu-monitoring grafana langchain python · by openlit · Python

CodeGen 📁0.0.0🌳 Mature⭐773

Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from dataset creation to model training and evaluation. Comes with pr

python · by facebookresearch · Python

openclaw-engram 📁v9.3.142🌿 Growing⭐54

Local-first memory plugin for OpenClaw AI agents. LLM-powered extraction, plain markdown storage, hybrid search via QMD. Gives agents persistent long-term memory across conversations.

ai-agent ai-memory conversational-ai engram knowledge-graph llm local-first long-term-memory typescript · by joshuaswarren · TypeScript

agent-framework 📁python-1.1.0🌳 Mature⭐9,325

A framework for building, orchestrating and deploying AI agents and multi-agent workflows with support for Python and .NET.

agent-framework agentic-ai agents ai dotnet multi-agent orchestration python · by microsoft · Python

PraisonAI 📁v4.6.25🌳 Mature⭐6,900

PraisonAI 🦞 — Hire a 24/7 AI Workforce. Stop writing boilerplate and start shipping autonomous agents that research, plan, code, and execute tasks. Deployed in 5 lines of code with built-in memory, R

agents ai ai-agent-framework ai-agent-sdk ai-agents ai-agents-framework ai-agents-sdk ai-framwork python · by MervinPraison · Python

OpenSandbox 📁docker/execd/v1.0.13🌳 Mature⭐9,925

Secure, Fast, and Extensible Sandbox runtime for AI agents.

ai ai-agent ai-infra kubernetes python sandbox · by alibaba · Python

agentmemory 📁v0.9.1🌳 Mature⭐738

Persistent memory for AI coding agents

typescript · by rohitg00 · TypeScript

AgentWard 📁main@2026-04-20🌱 Seedling⭐30

AgentWard – Built for all, hardened for OpenClaw.

agent-security defense-in-depth llm-agent llm-security openclaw openclaw-plugin openclaw-security prompt-injection-defense typescript · by FIND-Lab · TypeScript

neurolink 📁v9.56.0🌿 Growing⭐121

Universal AI Development Platform with MCP server integration, multi-provider support, and professional CLI. Build, test, and deploy AI applications with multiple AI providers.

agents ai ai-development ai-platform automation developer-tools llm local-first typescript · by juspay · TypeScript

piclaw 📁v1.8.3🌿 Growing⭐467

I'm going to build my own OpenClaw, with blackjack... and bun!

adaptive-cards ai-agent bun coding-agent docker llm pi-agent self-hosted typescript · by rcarmo · TypeScript

mentisdb 📁0.9.3.39🌿 Growing⭐56

Memory that lasts and compounds. MentisDB gives agents durable memory so they do not just remember, they improve over time. It stores append-only thought chains plus a Git-like skills registry, lett

ai ai-agents claude codex copilot infinite-memory openai openrouter rust · by CloudLLM-ai · Rust

langchain 📁langchain-core==1.3.0🌳 Mature⭐133,178

The agent engineering platform

agents ai ai-agents anthropic chatgpt deepagents enterprise framework python · by langchain-ai · Python

RAGHub 📁main@2026-04-17🌳 Mature⭐1,712

A community-driven collection of RAG (Retrieval-Augmented Generation) frameworks, projects, and resources. Contribute and explore the evolving RAG ecosystem.

ai artificial-intelligence large-language-models llm machine-learning natural-language-processing nlp open-source · by Andrew-Jang

agentic-memory 📁0.0.0🌿 Growing⭐162

No description

by lhl

RAPTOR 📁0.0.0🌱 Seedling⭐13

RAPTOR (Robust AI-Powered Toolkit for Operational Robots) is an AI-native Content Insight Engine that transforms passive media storage into an intelligent knowledge platform through automated analysis

ai ai-automation ai-framework ai-orchestration artificial-intelligence audio-processing computer-vision content-analysis python · by DHT-AI-Studio · Python

tulip_agent 📁0.0.0🌱 Seedling⭐44

autonomous agent with access to a tool library

autonomous-agent large-language-model python tool-library · by HRI-EU · Python

LRAT 📁0.0.0🌱 Seedling⭐34

The implementation for SIGIR 2026: Learning to Retrieve from Agent Trajectories.

agent agentic llm python search · by Yuqi-Zhou · Python

GEA 📁0.0.0🌱 Seedling⭐23

Group Evolving Agents: Open-Ended Self-Improvement via Experience Sharing

code-generation group-evolving-agents open-ended-evolution open-endedness python research-agents self-evolving-agents · by eric-ai-lab · Python

chinese-llm-benchmark 📁v5.9🌿 Growing⭐5,841

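ReLE Evaluation: capability evaluation of Chinese AI large models (continuously updated). Currently covers 359 large models, including commercial models such as chatgpt, gpt-5.2, o4-mini, Google gemini-3-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE-5.0, qwen3-max, qwen3.5-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as step3.5-flash, kimi-k2.5, ernie4.5, Min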

agentic-ai artificial-intelligence llm-agent llm-evaluation · by jeinlee1991

arthur-engine 📁2.1.529🌿 Growing⭐75

Make AI work for Everyone - Monitoring and governing for your AI/ML

agentic benchmarking evaluation genai guardrails llm ml monitoring python · by arthur-ai · Python

langfuse 📁v3.169.0🌿 Growing⭐24,578

🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

analytics autogen evaluation langchain large-language-models llama-index llm llm-evaluation prompt-engineering typescript · by langfuse · TypeScript

promptfoo 📁code-scan-action-0.1.5🌿 Growing⭐19,943

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and

ci ci-cd cicd evaluation evaluation-framework llm llm-eval llm-evaluation typescript · by promptfoo · TypeScript

cognithor 📁v0.92.2🌿 Growing⭐94

Cognithor - Agent OS: Local-first autonomous agent operating system. 16 LLM providers, 17 channels, 112+ MCP tools, 5-tier memory, A2A protocol, knowledge vault, voice, browser automation, Computer-us

agent-os ai-agent anthropic autonomous-agent discord-bot document-analysis gdpr-compliant gemini python · by Alex8791-cyber · Python

plano 📁0.4.20🌿 Growing⭐6,241

Plano is an AI-native proxy and data plane for agentic apps, with built-in orchestration, safety, observability, and smart LLM routing so you stay focused on your agents' core logic.

ai-gateway ai-gateway-support envoy envoyproxy gateway generative-ai llm-gateway llm-inference rust · by katanemo · Rust

arag 📁v0.1.0🌿 Growing⭐247

A-RAG: Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces. State-of-the-art RAG framework with keyword, semantic, and chunk read tools for multi-hop QA.

agent agentic-ai agenticrag deepresearch evaluation graphrag llm llmagents python · by Ayanami0730 · Python

pydantic-deepagents 📁0.3.15🌿 Growing⭐648

Python Deep Agent framework built on top of Pydantic-AI, designed to help you quickly build production-grade autonomous AI agents with planning, filesystem operations, subagent delegation, skills, and

agent-framework anthropic artificial-intelligence business-intelligence chatgpt clawdbot enterprise framework python · by vstorm-co · Python

helix 📁2.9.30🌿 Growing⭐757

♾️ Private Agent Fleet with Spec Coding. Each agent gets their own GPU-accelerated desktop. Run Claude, Codex, Gemini and open models on a full private AI Stack ♾️

agents api genai glm go golang helm k8s kimi llm-agent · by helixml · Go

Autonomous-Agents 📁main@2026-04-16🌿 Growing⭐1,211

Autonomous Agents (LLMs) research papers. Updated Daily.

agent agentic agentic-ai agents ai ai-agents aiagent aiagents · by tmgthb

ollama 📁v0.21.0🌿 Growing⭐168,597

Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.

deepseek gemma gemma3 glm go golang gpt-oss llama · by ollama · Go

career-ops 📁v1.5.0🌿 Growing⭐30,403

AI-powered job search system built on Claude Code. 14 skill modes, Go dashboard, PDF generation, batch processing.

ai-agent anthropic automation career claude claude-code cli golang javascript · by santifer · JavaScript

Awesome-Context-Engineering 📁0.0.0🌳 Mature⭐3,045

🔥 Comprehensive survey on Context Engineering: from prompt engineering to production-grade AI systems. Hundreds of papers, frameworks, and implementation guides for LLMs and AI agents.

agent agentic-ai agi awesome-list cognitive-science context-engineering llm rag · by Meirtz

MODULAR-RAG-MCP-SERVER 📁0.0.0🌳 Mature⭐783

A modular RAG (Retrieval-Augmented Generation) system with MCP Server architecture. Using Skill to make AI follow each step of the spec and complete the code 100% by AI.

python · by jerry-ai-dev · Python

evals 📁v0.1.15🌿 Growing⭐103

A comprehensive evaluation framework for AI agents and LLM applications.

agentic agentic-ai ai evaluation machine-learning python strands-agents · by strands-agents · Python

langwatch 📁skills@v0.3.0🌿 Growing⭐3,193

The platform for LLM evaluations and AI agent testing

ai analytics datasets dspy evaluation gpt llm llm-ops typescript · by langwatch · TypeScript

AI-Infra-Guard 📁v4.1.4🌿 Growing⭐3,428

A full-stack AI Red Teaming platform securing AI ecosystems via OpenClaw Security Scan, Agent Scan, Skills Scan, MCP scan, AI Infra scan and LLM jailbreak evaluation.

agent agent-security ai-infra ai-red-teaming ai-security llm llm-evaluation llm-jailbreak python · by Tencent · Python

OpenClawProBench 📁main@2026-04-15🌿 Growing⭐340

OpenClawProBench is a live-first benchmark harness for evaluating LLM agents in the OpenClaw runtime with deterministic grading and repeated-trial reliability.

agent benchmark evaluation harness leaderboard llm openclaw python · by suyoumo · Python

claw-eval 📁main@2026-04-15🌿 Growing⭐394

Claw-Eval is an evaluation harness for evaluating LLMs as agents. All tasks verified by humans.

agent harness llm openclaw python · by claw-eval · Python

unsloth-buddy 📁main@2026-04-15🌿 Growing⭐212

Zero-friction LLM fine-tuning skill for Claude Code, Gemini CLI & any ACP agent. Unsloth on NVIDIA · TRL+MPS/MLX on Apple Silicon. Automates env setup, LoRA training (SFT, DPO, GRPO, vision), post-hoc

apple-silicon claude-code dpo fine-tuning gaslamp grpo huggingface lora python · by TYH-labs · Python

latitude-llm 📁claude-code-telemetry-0.0.5🌿 Growing⭐3,955

Latitude is the open-source agent engineering platform

typescript · by latitude-dev · TypeScript

Agentic-RAG-R1 📁0.0.0🌿 Growing⭐412

Agentic RAG R1 Framework via Reinforcement Learning

agentic grpo python rag rl · by jiangxinke · Python

AgenticX 📁v0.3.7🌿 Growing⭐105

AgenticX is a unified, production-ready multi-agent platform — Python SDK + CLI (agx) + Studio server + Machi desktop app. Features Meta-Agent orchestration, 15+ LLM providers, MCP Hub, hierarchical m

agent-framework agentic-workflows ai-agent ai-orchestration chatbot desktop-app electron fastapi python · by DemonDamon · Python

prism-mcp 📁v9.3.0🌿 Growing⭐116

The Mind Palace for AI Agents — Autonomous Cognitive OS with affect-tagged memory (valence engine), token-economic RL (surprisal gate + UBI), Hebbian learning, ACT-R spreading activation, Synapse Engi

agent-memory ai-agent anti-sycophancy claude-desktop cognitive-architecture google-gemini hebbian-learning llm-tools typescript · by dcostenco · TypeScript

agent-client 📁v0.13.0🌿 Growing⭐90

Autonomous CLI agent integrations for the Spring AI ecosystem with Claude Code, Gemini CLI, and secure sandbox execution

java · by spring-ai-community · Java

sample-getting-started-with-strands-agents-course 📁0.0.0🌱 Seedling⭐75

Learn to build AI agents with Strands framework. Covers LLM integration via Amazon Bedrock/Anthropic, AWS service connections, tool implementation with MCP/A2A protocols, and agent evaluation using La

a2a anthropic aws bedrock jupyter notebook litellm mcp python strands-agents · by aws-samples · Jupyter Notebook

ISC-Bench 📁v0.0.5🌿 Growing⭐786

Internal Safety Collapse: Turning the LLM or an AI Agent into a sensitive data generator.

adversarial-attacks agent-safety ai-safety benchmark frontier-models jailbreak large-language-models llm-safety python · by wuyoscar · Python

weaviate 📁v1.37.1🌿 Growing⭐15,988

Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a c

approximate-nearest-neighbor-search generative-search go grpc hnsw hybrid-search image-search information-retrieval mlops · by weaviate · Go

cordum 📁V0.9.9.1🌿 Growing⭐461

The open agent control plane. Govern autonomous AI agents with pre-execution policy enforcement, approval gates, and audit trails. Works with LangChain, CrewAI, MCP, and any framework.

agent-framework agentic-ai ai-agent ai-governance ai-orchestration ai-safety audit-trail autonomous-agents go · by cordum-io · Go

panguard-ai 📁v1.4.19🌱 Seedling⭐37

Open-source security platform for AI agents -- audits skills before install, monitors 24/7, shares threat intelligence across all users.

ai-agent ai-security cybersecurity llm-security mcp open-source prompt-injection sigma-rules typescript · by panguard-ai · TypeScript

mission-control 📁v2.5.0🌿 Growing⭐1,853

The world's first Autonomous Product Engine (APE): AI agents research your market, generate features, and ship code as PRs. Convoy mode, crash recovery, cost tracking, 80+ API endpoints. Self-hosted v

aiagent automation openclaw typescript · by crshdn · TypeScript

memind 📁main@2026-04-21🌿 Growing⭐360

Self-evolving cognitive memory and context engine for AI agents in Java. Empowering 24/7 proactive agents like OpenClaw with understanding and SOTA performance.

ai ai-agent ai-agents ai-memory context-engineering java memory openclaw · by openmemind · Java

Awesome-World-Models 📁main@2026-04-21🌿 Growing⭐1,473

A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related website

artificial-intelligence autonomous-driving awesome deep-learning embodied-ai future-prediction video-prediction world-model · by leofan90

karpathy-llm-wiki 📁main@2026-04-21🌱 Seedling⭐34

The Self-Growing Karpathy LLM Wiki — grown by an AI agent yoyo from Karpathy's founding prompt

ai-agent karpathy knowledge-base llm typescript wiki · by yologdev · TypeScript

awesome-prompts 📁main@2026-04-21🌿 Growing⭐7,572

Curated list of chatgpt prompts from the top-rated GPTs in the GPTs Store. Prompt Engineering, prompt attack & prompt protect. Advanced Prompt Engineering papers.

awesome awesome-list chatgpt gpt4 gpts gptstore papers prompt prompt-engineering · by ai-boost

honcho 📁main@2026-04-21🌿 Growing⭐2,030

Memory library for building stateful agents

agent-memory ai ai-agents ai-memory anthropic context-engineering continual-learning embeddings python · by plastic-labs · Python

aura 📁main@2026-04-21🌱 Seedling⭐47

A sovereign cognitive architecture with IIT 4.0 integrated information, residual-stream affective steering (CAA), Global Workspace Theory, active inference, and 72 consciousness modules — running loca

active-inference affective-computing apple-silicon artificial-consciousness autonomous-agent cognitive-architecture cognitive-science consciousness python · by youngbryan97 · Python

deer-flow 📁main@2026-04-21🌿 Growing⭐60,446

An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skill, subagents and message gateway, it handles different levels of ta

agent agentic agentic-framework agentic-workflow ai ai-agents deep-research harness python · by bytedance · Python

LLM-Agent-Paper-daily 📁main@2026-04-21🌱 Seedling⭐20

Automatically updates LLM-Agent papers daily using GitHub Actions (updates every 12 hours)

llm llm-agent python · by Lyz103 · Python

Cogitator-AI 📁main@2026-04-21🌱 Seedling⭐35

🤖 Kubernetes for AI Agents. Self-hosted, production-grade runtime for orchestrating LLM swarms and autonomous agents. TypeScript-native.

agent agentic-ai agentic-framework agentic-workflow ai ai-framework automation gemini typescript · by cogitator-ai · TypeScript

samples 📁main@2026-04-20🌿 Growing⭐717

Agent samples built using the Strands Agents SDK.

agentic agentic-ai agents ai anthropic autonomous-agents bedrock genai python · by strands-agents · Python

haystack 📁v2.28.0🌿 Growing⭐24,806

Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, m

agent agents ai gemini generative-ai gpt-4 information-retrieval large-language-models mdx · by deepset-ai · MDX

agentscope 📁v1.0.19🌿 Growing⭐23,421

Build and run agents you can see, understand and trust.

agent chatbot large-language-models llm llm-agent mcp multi-agent multi-modal python · by agentscope-ai · Python

deepeval 📁v3.9.5🌳 Mature⭐14,701

The LLM Evaluation Framework

evaluation-framework evaluation-metrics llm-evaluation llm-evaluation-framework llm-evaluation-metrics python · by confident-ai · Python

awesome-code-agents 📁main@2026-04-20🌿 Growing⭐94

A curated list of products, benchmarks, and research papers on autonomous code agents. Beyond coding — they're redefining how software changes the world.

python · by EuniAI · Python

Sign 📁main@2026-04-20🌱 Seedling⭐31

Sign integrity generic notation

aarch64 assembly code-generation compiler compiler-design create-programming-language design functional-programming language · by johnny-shaman · Assembly

vexa 📁v0.10.2🌿 Growing⭐1,862

Open-source meeting transcription API for Google Meet, Microsoft Teams & Zoom. Auto-join bots, real-time WebSocket transcripts, MCP server for AI agents. Self-host or use hosted SaaS.

google-meet meeting-assistant meeting-minutes meeting-notes ms-teams ms-teams-app notetaker python zoom · by Vexa-ai · TypeScript

auto-deep-researcher-24x7 📁main@2026-04-19🌿 Growing⭐261

🔥 An autonomous AI agent that runs your deep learning experiments 24/7 while you sleep. Zero-cost monitoring, Leader-Worker architecture, constant-size memory.

ai-agent autonomous-agent claude-code deep-learning experiment-automation gpu hyperparameter-tuning llm-agent python · by Xiangyue-Zhang · Python

skills-vote 📁main@2026-04-19🌱 Seedling⭐31

The Next-Gen Agent-Native Skill Recommendation Engine

agent-skill agent-skills llm llm-agent python · by MemTensor · Python

medusa 📁v2026.5.5🌿 Growing⭐252

AI-first security scanner with 76 analyzers, 9,600+ detection rules, and repo poisoning detection for AI/ML, LLM agents, and MCP servers. Scan any GitHub repo with: medusa scan --git user/repo

agent-security ai-security code-analysis cve-detection devsecops llm-security mcp nextjs python · by Pantheon-Security · Python

crewAI 📁1.14.2🌿 Growing⭐48,611

Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.

agents ai ai-agents aiagentframework llms python · by crewAIInc · Python

giskard-oss 📁giskard-checks/v1.0.2b1🌱 Seedling⭐5,225

🐢 Open-Source Evaluation & Testing library for LLM Agents

agent-evaluation ai-red-team ai-security ai-testing fairness-ai llm llm-eval llm-evaluation python · by Giskard-AI · Python

maverick-mcp 📁main@2026-04-17🌿 Growing⭐479

MaverickMCP - Personal Stock Analysis MCP Server

anthropic artificial-intelligence claude equities fastmcp finance financial-analysis fintech python · by wshobson · Python

trulens 📁trulens-2.7.2🌱 Seedling⭐3,237

Evaluation and Tracking for LLM Experiments and AI Agents

agent-evaluation agentops ai-agents ai-monitoring ai-observability evals explainable-ml llm-eval python · by truera · Python

AReaL 📁v1.0.3🌿 Growing⭐5,017

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

agent llm llm-agent llm-reasoning machine-learning-systems mlsys python reinforcement-learning rl · by inclusionAI · Python

Awesome-Agent-Memory 📁main@2026-04-16🌿 Growing⭐333

Curated systems, benchmarks, and papers etc. on memory for LLMs/MLLMs --- long-term context, retrieval, and reasoning.

agent-memory ai-agent ai-agent-memory awesome-agent-memory llm-memory memory memory-management multimodal-llm-memory · by TeleAI-UAGI

beads 📁v1.0.2🌿 Growing⭐20,577

Beads - A memory upgrade for your coding agent

agents claude-code coding go · by gastownhall · Go

paiml-mcp-agent-toolkit 📁v3.14.0🌿 Growing⭐148

Pragmatic AI Labs MCP Agent Toolkit - An MCP Server designed to make code with agents more deterministic

agentic c deno kotlin mcp mcp-server paiml paiml-active-tool rust · by paiml · Rust

mlflow 📁v3.11.1🌱 Seedling⭐25,285

The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controllin

agentops agents ai ai-governance apache-spark evaluation langchain llm-evaluation python · by mlflow · Python

LLM-Wiki 📁main@2026-04-18🌱 Seedling⭐7

Autonomous knowledge base plugin for Claude Code - captures research, ideas, and decisions into an interlinked wiki with research-on-miss, semantic search, and a Wikipedia-style web UI. Knowledge compou

ai-tools autonomous-agent claude-code claude-code-plugin fastapi knowledge-base knowledge-management llm python · by Oshayr · Python

cognitive-dissonance-dspy 📁main@2026-04-14🌿 Growing⭐276

A multi-agent LLM system for detecting and resolving cognitive dissonance.

python · by evalops · Python

ai-real-estate-assistant 📁dev@2026-04-13🌿 Growing⭐159

Advanced AI Real Estate Assistant using RAG, LLMs, and Python. Features market analysis, property valuation, and intelligent search.

ai assistant chatbot docker fastapi llm nextjs proptech python vector-database · by AleksNeStu · Python

carapace 📁v0.7.0🌱 Seedling⭐42

A secure, stable Rust alternative to openclaw/moltbot/clawdbot

ai-assistant anthropic chatbot discord-bot gemini llm llm-agent local-first rust · by puremachinery · Rust

awesome-vector-database 📁main@2026-04-13🌿 Growing⭐341

A curated list of awesome works related to high dimensional structure/vector search & database

approximate-nearest-neighbor-search embedding-similarity embeddings-similarity nearest-neighbor-search search-engine similarity-search vector-database vector-search · by dangkhoasdc

Multi-Agent-Custom-Automation-Engine-Solution-Accelerator 📁v4.1.1🌿 Growing⭐770

The Multi-Agent Custom Automation Engine Solution Accelerator is an AI-driven system that manages a group of AI agents to accomplish tasks based on user input. Powered by Microsoft Agent Framework, Az

ai-azd-templates azd-templates python · by microsoft · Python

voltagent 📁@voltagent/server-elysia@2.0.7🌿 Growing⭐7,851

AI Agent Engineering Platform built on an Open Source TypeScript AI Agent Framework

agents ai ai-agents ai-agents-framework aiagentframework chatbots chatgpt framework typescript · by VoltAgent · TypeScript

AutoRAG 📁v0.3.22🌱 Seedling⭐4,693

AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

analysis automl benchmarking document-parser embeddings evaluation llm llm-evaluation python · by Marker-Inc-Korea · Python

cyber-pilot 📁v3.7.0-beta🌿 Growing⭐53

Cyber Pilot is a traceable delivery system for requirements, design, plans, and code.

agents ai architecture code-generation code-review code-validation codegen codegeneration python · by cyberfabric · Python

Awesome-Repo-Level-Code-Generation 📁main@2026-04-10🌿 Growing⭐274

Must-read papers on Repository-level Code Generation & Issue Resolution 🔥

ai4se automated-software-engineering code-generation large-language-models llm software-engineering · by YerbaPage

tensorzero 📁2026.4.0🌱 Seedling⭐11,204

TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.

ai ai-engineering anthropic artificial-intelligence deep-learning genai generative-ai gpt rust · by tensorzero · Rust

ds_ex 📁main@2026-04-09🌱 Seedling⭐17

DSPEx - Declarative Self-improving Elixir | A BEAM-Native AI Program Optimization Framework

ai ai-framework automated-optimization beam declarative-programming dspy elixir erlang-vm · by nshkrdotcom · Elixir

UltraRAG 📁v0.3.0.2🌿 Growing⭐5,480

A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines

deepseek demo easy embedding flask gpt huggingface-transformers llm python · by OpenBMB · Python

clawtrace 📁main@2026-04-16🌱 Seedling⭐10

Make your OpenClaw agents better, cheaper, and faster.

agent-observability agent-telemetry ai-agent ai-agent-observability ai-evaluation ai-observability automomous-agents claude-harness typescript · by epsilla-cloud · TypeScript

sv-excel-agent 📁0.0.0🌱 Seedling⭐179

An Excel AI agent that uses MCP tools to let LLMs read, edit, and automate Excel spreadsheets.

python · by SylvianAI · Python

mastra 📁@mastra/core@1.24.0🌱 Seedling⭐22,899

From the team behind Gatsby, Mastra is a framework for building AI-powered applications and agents with a modern TypeScript stack.

agents ai chatbots evals javascript llm mcp nextjs typescript · by mastra-ai · TypeScript

b2b-sdr-agent-template 📁v3.6.0🌱 Seedling⭐40

Open-source AI SDR template for B2B export. 10-stage sales pipeline, 10 cron jobs, 4-engine memory, multi-channel (WhatsApp+Telegram+Email). Built on OpenClaw.

ai-agent ai-sales ai-sdr b2b b2b-sales cold-email crm export-business shell · by iPythoning · Shell

trpc-agent-go 📁v1.8.0🌱 Seedling⭐1,085

trpc-agent-go is a powerful Go framework for building intelligent agent systems using large language models (LLMs) and tools.

a2a agent ai go llm mcp · by trpc-group · Go

skill 📁v1.2.1🌱 Seedling⭐978

PinchBench is a benchmarking system for evaluating LLM models as OpenClaw coding agents. Made with 🦀 by the humans at https://kilo.ai

python · by pinchbench · Python

kernel 📁v3.97.0🌱 Seedling⭐12

kbot — the AI agent that dreams, learns, and evolves. 764+ tools, 35 agents, 20 providers. Music production, iPhone control, financial analysis, cyber threat intel. Always-on daemon. Runs offline. npm

ai-agent anthropic cli coding-agent cybersecurity defi kbot llm typescript · by isaacsight · TypeScript

everything-claude-code 📁v1.10.0🌱 Seedling⭐151,139

The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.

ai-agents anthropic claude claude-code developer-tools javascript llm mcp productivity · by affaan-m · JavaScript

ai-memecoin-trading-bot 📁0.0.0🌱 Seedling⭐63

AI-powered meme coin trading bot for Solana and Base that automatically scans new tokens, detects honeypots, calculates win probability, executes trades. Built in Go with a multi-agent architecture, r

ai-crypto-trading-agent ai-trading-agent ai-trading-bot autonomous-agent base base-memecoin-sniper crypto-trading-bot go · by Jackhuang166 · Go

llm_intents 📁1.7.1🌱 Seedling⭐111

Exposes internet search tools for use by LLM-backed Assist in Home Assistant

assist hacs hacs-integration hassio hassio-integration home-assistant home-assistant-integration home-assistant-voice python · by skye-harris · Python

Standard 📁0.0.0🌱 Seedling⭐18

JSON Agents - A universal JSON-native standard for describing AI agents, their capabilities, tools, runtimes, and governance in a portable, framework-agnostic format. Based on RFC 8259, JSON Schema 2

agent-governance agent-manifest agent-orchestration agent-specification ai-agents ai-framework interoperability json python · by JSON-Agents · Python

any-agent 📁1.18.0🌱 Seedling⭐1,141

A single interface to use and evaluate different agent frameworks

a2a agent-evaluation agents ai mcp python · by mozilla-ai · Python

camel 📁v0.2.90🌱 Seedling⭐16,654

🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org

agent ai-societies artificial-intelligence communicative-ai cooperative-ai deep-learning large-language-models multi-agent-systems python · by camel-ai · Python

codexmaster 📁prompt-generator🌱 Seedling⭐76

Master Codex with this Framework file system + Prompt Generator consisting of 32 markdown files that will set such strict constraints and rules for Codex that its output is nearly flawless. Files for:

agentsmd ai ai-agent ai-coding ai-coding-tools ai-framework chatgpt codex html · by robbiecalvin · HTML

membrane 📁v0.2.0🌱 Seedling⭐75

A selective learning and memory substrate for agentic systems — typed, revisable, decayable memory with competence learning and trust-aware retrieval.

agent agent-framework agent-memory agent-skills agentic ai-agents autonomous-agents collaborate go · by GustyCube · Go

mattermost-plugin-agents 📁v1.14.0🌱 Seedling⭐217

Mattermost Agents plugin supporting multiple LLMs

ai go llm mattermost mattermost-plugin · by mattermost · Go

KawaiiGPT 📁KawaiiGPT🌱 Seedling⭐831

KawaiiGPT — Open-source LLM gateway accessing DeepSeek, Gemini, and Kimi-K2 through reverse-engineered Pollinations API with no API keys required, built-in prompt injection capabilities for security r

ai-chatbot deepseek free-llm-access gemini kawaiigpt kimi-k2 linux-cli llm-jailbreak python · by MarCmcbri1982 · Python

bisheng 📁v2.3.0🌱 Seedling⭐11,293

BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, Unified model management, Evaluation, SF

agent ai chatbot enterprise finetune genai gpt langchian typescript · by dataelement · TypeScript

RAGElo 📁0.4.0🌱 Seedling⭐128

RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker

python · by zetaalphavector · Python

deltallm 📁v0.1.20-rc2🌱 Seedling⭐3

Route, manage, and analyze your LLM requests across multiple providers with a unified API interface

ai-gateway ai-infrastructure api-gateway kubernetes llm-gateway llm-proxy llm-routing mcp model-context-protocol python · by deltawi · Python

ragas 📁v0.4.3🌱 Seedling⭐13,329

Supercharge Your LLM Application Evaluations 🚀

evaluation llm llmops python · by explodinggradients · Python

LightAgent 📁v0.5.0🌱 Seedling⭐831

LightAgent: Lightweight AI agent framework with memory, tools & tree-of-thought. Supports multi-agent collaboration, self-learning, and major LLMs (OpenAI/DeepSeek/Qwen). Open-source with MCP/SSE prot

python by wanxingai Python

prd-taskmaster 📁v3.0.0🌱 Seedling⭐184

AI-powered PRD generation for Claude Code with taskmaster integration

ai-development claude-code claude-skills prd product-management product-requirements python requirements-engineering taskmaster by anombyte93 Python

PAI-RAG 📁v0.4.3🌱 Seedling⭐450

An easy-to-use framework for modular RAG

python by aigc-apps Python

ai-agents 📁v0.3.0🌱 Seedling⭐20

Multi-agent system for software development

agentic-ai ai-agents ai-assistant anthropic-claude automation ci-cd claude-code code-generation markdown by rjmurillo Markdown

py-gpt 📁v2.7.12🌱 Seedling⭐1,724

Desktop AI Assistant powered by GPT-5, GPT-4, o1, o3, Gemini, Claude, Ollama, DeepSeek, Perplexity, Grok, Bielik, chat, vision, voice, RAG, image and video generation, agents, tools, MCP, plugins, spe

ai ai-assistant artificial-intelligence autonomous-agent chatbot claude deepseek desktop-app python by szczyglis-dev Python

sofia 📁main@2026-04-11🌱 Seedling⭐2

Autonomous local AI assistant in Go — 40+ tools, 20+ LLM providers, multi-agent orchestration, self-improving

ai ai-agent anthropic artificial-intelligence assistant automation autonomous-agent cli go by grasberg Go

aictl 📁v0.28.0🌱 Seedling⭐1

🤖 AI agent in your terminal

ai ai-agent rust by pwittchen Rust

evo-agents 📁master@2026-04-19🌱 Seedling⭐3

Complete Workspace Template for OpenClaw - Full agent lifecycle with unified memory system (Markdown + SQLite), self-evolution, RAG. Not for SubAgent/Skill use.

agent ai-memory bge-m3 chinese-nlp fts5 local-ai markdown memory-system python rag by luoboask Python

uniAI 📁0.0.0🌱 Seedling⭐1

Syllabus-aware RAG study assistant for university students. Answers strictly from your own notes & PDFs, unit-scoped retrieval, cross-encoder reranking, and a hallucination gate — built to help studen

ai chromadb django genai information-retrieval llm local-llm ollama python vector-database by git-pratap-shrey Python

heartbeat-agent-framework 📁0.0.0🌱 Seedling⭐1

The open-source framework that makes AI agents proactive, self-learning, and autonomous. Multi-project tracking, full logging pipeline, message discipline, and memory review system.

agent-framework agent-orchestration ai-agent autonomous-agent heartbeat llm memory-system proactive-agent by muxueqingze

gptme 📁v0.31.0🌱 Seedling⭐4,266

Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web. Make your own persistent autonomous agent on top!

agent agents ai-agents ai-assistant anthropic chatbot chatgpt cli python by gptme Python

ryvos 📁v0.9.0🌱 Seedling⭐2

Open-source autonomous AI assistant with 5-tier security, 62 tools, 14 LLM providers. Written in Rust. Single binary.

ai ai-agent ai-assistant assistant autonomous autonomous-agent chatbot cli rust by Ryvos Rust

CodeRAG 📁main@2026-04-21🌱 Seedling⭐1

Build semantic vector databases from code and docs to enable AI agents to understand and navigate your entire codebase effectively.

ai ai-tools code-analysis embeddings execution-based-evaluation game-development game-programming game-source rag typescript by Eyram233 TypeScript

harness 📁master@2026-04-21🌱 Seedling⭐1

Define and control AI agents in markdown with full prompt transparency, persistent memory, and integrated tools via the Claude Agent SDK.

ai claude claude-code claude-skills code-repository evaluation-framework gemini git llm-agent typescript by heba-ramdan TypeScript

agenttel-sdk 📁v0.2.0-alpha🌱 Seedling⭐6

Agent-ready telemetry SDK — enriches OpenTelemetry across Java, Go, Python, Node.js, and browser with structured context for AI-driven observability.

agentic-ai ai-agents browser-telemetry frontend-observability genai incident-response java langchain4j mcp-server by AgentTel Java

multi-agent-orchestration-framework 📁v0.1.0🌱 Seedling⭐26

Modular multi-agent orchestration framework powered by LangGraph and FastAPI.

agent ai-framework fastapi langchain langgraph llm memory multi-agent python by yx-fan Python

llm-agents.nix 📁assets🌱 Seedling⭐988

Nix packages for AI coding agents and development tools. Automatically updated daily.

buildbot-numtide nix by numtide Nix

Government-Citizen-Services-Voice-Agent 📁main@2026-04-15🌱 Seedling⭐1

Autonomous, multilingual AI voice agent using ElevenLabs, LangGraph, and RAG for government services

conversational-ai elevenlabs fastapi govtech langgraph python rag voice-agent by Automaticare Python

LettuceDetect 📁0.1.8💤 Dormant⭐545

Lightweight hallucination detection framework for RAG applications

bert hallucination-detection hallucination-evaluation information-extraction nlp python pytorch token-classification by KRLabsOrg Python

Neuroverseos-governance 📁v0.3.0🌱 Seedling⭐1

Deterministic governance engine for AI agents. Enforce rules defined in .md governance files across AI systems.

agent-framework agent-guardrails agent-harness ai ai-agents ai-governance ai-guardrails ai-safety mcp-server typescript by NeuroverseOS TypeScript

TSUKUYOMI 📁2.6.0💤 Dormant⭐86

TSUKUYOMI is an advanced modular intelligence framework designed for the democratization of Intelligence Analysis via systematic analysis, processing, and reporting across multiple domains. Built on a

ai ai-agent ai-framework js json osint osint-tool by savannah-i-g

HealthFlow 📁datasets💤 Dormant⭐40

HealthFlow: A Self-Evolving AI Agent with Meta Planning for Autonomous Healthcare Research

ai-for-healthcare ai-for-science ehr llm llm-agent multi-agent python by yhzhu99 Python

RagaAI-Catalyst 📁v2.2.4💤 Dormant⭐16,130

Python SDK for Agent AI Observability, Monitoring and Evaluation Framework. Includes features like agent, LLM and tool tracing, debugging of multi-agent systems, a self-hosted dashboard and advanced anal

agentic-ai agentic-ai-development agentneo agents ai-agent-monitoring ai-application-debugging ai-evaluation-tools ai-performance-optimization python by raga-ai-hub Python

ClosedSSPM 📁v0.4.1🌱 Seedling⭐1

An open-source SSPM tool written in Go

cli entra-id go golang google-workspace mcp open-source saas by PiotrMackowski Go

FlexRAG 📁0.3.0💤 Dormant⭐235

FlexRAG: A RAG Framework for Information Retrieval and Generation.

llms nlp python rag by ictnlp Python

Qwen-Agent 📁v0.0.26💤 Dormant⭐15,963

Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.

python by QwenLM Python

judge0 📁v1.13.1⚰️ Archived⭐4,082

Robust, fast, scalable, and sandboxed open-source online code execution system for humans and AI.

ai-agent-tools ai-agents ai-tools code-execution code-executor code-runner competitive-programming html online-compiler by judge0 HTML

Promptgpt 📁v1.2⚰️ Archived⭐119

PromptGPT is an open-source framework that enables users to automatically generate high-quality prompts with no installation, coding, or technical knowledge required. Promptgpt follows industry best

by howard9192

medicalAI 📁v1.2.9-rc⚰️ Archived⭐21

Medical-AI is an AI framework specifically for medical applications: https://aibharata.github.io/medicalAI/

ai-framework keras medical-applications medical-imaging pdf-report prediction python tensorflow tensorflow2 by aibharata Python