freshcrate

Search results for "eval"

190 results found
llama-index 0.14.21 🏛️ Flagship ⭐ 48,773

Interface between LLMs and your data

mlflow-skinny 3.11.1 🏛️ Flagship ⭐ 25,478

MLflow is an open source platform for the complete machine learning lifecycle

langsmith 0.7.33 🌳 Mature ⭐ 858

Client library to connect to the LangSmith Observability and Evaluation Platform.

google-cloud-aiplatform 1.148.1 🌳 Mature ⭐ 880

Vertex AI API client library

onyx v3.2.6 🏛️ Flagship ⭐ 27,905

Open Source AI Platform - AI Chat with advanced features that works with every LLM

chinese-llm-benchmark v5.10 🏛️ Flagship ⭐ 5,889

ReLE Evaluation: capability evaluation of Chinese AI large models (continuously updated). Currently covers 359 large models, including commercial models such as chatgpt, gpt-5.2, o4-mini, Google gemini-3-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE-5.0, qwen3-max, qwen3.5-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as step3.5-flash, kimi-k2.5, ernie4.5, Min

opik 2.0.9 🏛️ Flagship ⭐ 18,965

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

ragflow v0.25.0 🏛️ Flagship ⭐ 78,674

RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs

voratiq main@2026-04-21 🌿 Growing ⭐ 67

Agent ensembles to design, generate, and select the best code for every task.

adk-python v1.31.1 🏛️ Flagship ⭐ 19,165

An open-source, code-first Python toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.

agenta v0.96.7 🌳 Mature ⭐ 4,045

The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.

haystack v2.28.0 🏛️ Flagship ⭐ 24,941

Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, m

SeekStorm v3.0.0 🌳 Mature ⭐ 1,865

SeekStorm: vector & lexical search - in-process library & multi-tenancy server, in Rust.

mentisdb 0.9.3.39 🌿 Growing ⭐ 64

Memory that lasts and compounds. MentisDB gives agents durable memory so they do not just remember, they improve over time. It stores append-only thought chains plus a Git-like skills registry, lett

langwatch python-sdk@v0.21.0 🌳 Mature ⭐ 3,206

The platform for LLM evaluations and AI agent testing

evals v0.1.15 🌿 Growing ⭐ 106

A comprehensive evaluation framework for AI agents and LLM applications.

arthur-engine 2.1.529 🌿 Growing ⭐ 77

Make AI work for Everyone - Monitoring and governing for your AI/ML

weaviate v1.35.18 🏛️ Flagship ⭐ 16,051

Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a c

langfuse v3.169.0 🏛️ Flagship ⭐ 25,291

🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊 YC W23

AI-Infra-Guard v4.1.4 🌳 Mature ⭐ 3,521

A full-stack AI Red Teaming platform securing AI ecosystems via OpenClaw Security Scan, Agent Scan, Skills Scan, MCP scan, AI Infra scan and LLM jailbreak evaluation.

fast-agent v0.6.17 🌳 Mature ⭐ 3,750

Code, Build and Evaluate agents - excellent Model and Skills/MCP/ACP Support

logfire v4.32.1 🌳 Mature ⭐ 4,185

AI observability platform for production LLM and agent systems.

WeKnora v0.4.0 🏛️ Flagship ⭐ 13,971

LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm.

promptfoo code-scan-action-0.1.5 🏛️ Flagship ⭐ 20,382

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and

mlflow ts/v0.2.0-rc.1 🏛️ Flagship ⭐ 25,479

The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controllin

prism-mcp v9.3.0 🌿 Growing ⭐ 128

The Mind Palace for AI Agents - Autonomous Cognitive OS with affect-tagged memory (valence engine), token-economic RL (surprisal gate + UBI), Hebbian learning, ACT-R spreading activation, Synapse Engi

giskard-oss giskard-checks/v1.0.2b1 🏛️ Flagship ⭐ 5,289

🐢 Open-Source Evaluation & Testing library for LLM Agents

trulens trulens-2.7.2 🌳 Mature ⭐ 3,261

Evaluation and Tracking for LLM Experiments and AI Agents

mastra @mastra/core@1.24.0 🏛️ Flagship ⭐ 23,202

From the team behind Gatsby, Mastra is a framework for building AI-powered applications and agents with a modern TypeScript stack.

AutoRAG v0.3.22 🌳 Mature ⭐ 4,713

AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

langroid 0.61.1 🌳 Mature ⭐ 3,976

Harness LLMs with Multi-Agent Programming

RAG-Anything v1.2.10 🏛️ Flagship ⭐ 16,790

"RAG-Anything: All-in-One RAG Framework"

fast-plaid 1.4.5 🌿 Growing ⭐ 245

High-Performance Engine for Multi-Vector Search

membrane v0.2.0 🌿 Growing ⭐ 80

A selective learning and memory substrate for agentic systems - typed, revisable, decayable memory with competence learning and trust-aware retrieval.

txtai v9.7.0 🏛️ Flagship ⭐ 12,412

💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows

codingbuddy v5.6.3 🌱 Seedling ⭐ 31

Codingbuddy orchestrates 29 specialized AI agents to deliver code quality comparable to a team of human experts through a PLAN → ACT → EVAL workflow.

SageFs v0.6.243 🌿 Growing ⭐ 60

Sage Mode for F# development - REPL with solution or project loading, Live Testing for FREE, Hot Reload, and session management.

llm-wiki v1.1.0-rc8 🌿 Growing ⭐ 139

LLM-powered knowledge base from your Claude Code, Codex CLI, Copilot, Cursor & Gemini sessions. Karpathy's LLM Wiki pattern - implemented and shipped.

qwe-qwe v0.17.6 🌱 Seedling ⭐ 35

⚡ Lightweight offline AI agent for local models. No cloud, no API keys - just your GPU.

ai-plugin-scanner v2.0.45 🌿 Growing ⭐ 158

Security and best-practices scanner for AI Plugins, covering Codex, Claude, Opencode, Gemini & more. Scores trust for plugins 0-100.

cognithor v0.92.3 🌿 Growing ⭐ 115

Cognithor - Agent OS: Local-first autonomous agent operating system. 16 LLM providers, 17 channels, 112+ MCP tools, 5-tier memory, A2A protocol, knowledge vault, voice, browser automation, Computer-us

PraisonAI v4.6.27 🏛️ Flagship ⭐ 6,969

PraisonAI 🦞 - Hire a 24/7 AI Workforce. Stop writing boilerplate and start shipping autonomous agents that research, plan, code, and execute tasks. Deployed in 5 lines of code with built-in memory, R

latitude-llm claude-code-telemetry-0.0.6 🌳 Mature ⭐ 3,957

Latitude is the open-source agent engineering platform

neurolink v9.56.1 🌿 Growing ⭐ 83

Universal AI Development Platform with MCP server integration, multi-provider support, and professional CLI. Build, test, and deploy AI applications with multiple AI providers.

claude-code-plugins-plus-skills v4.26.0 🌳 Mature ⭐ 1,995

423 plugins, 2,849 skills, 177 agents for Claude Code. Open-source marketplace at tonsofskills.com with the ccpi CLI package manager.

mcp-client-for-ollama v0.28.0 🌳 Mature ⭐ 655

A text-based user interface (TUI) client for interacting with MCP servers using Ollama. Features include agent mode, multi-server, model switching, streaming responses, tool management, human-in-the-l

ha-mcp v7.3.0.dev386 🌳 Mature ⭐ 2,465

The Unofficial and Awesome Home Assistant MCP Server

restai v6.1.45 🌿 Growing ⭐ 485

RESTai is an AIaaS (AI as a Service) open-source platform. Supports many public and local LLMs supported by Ollama/vLLM/etc. Precise embeddings usage, tuning, analytics etc. Built-in image/audio generat

strudel-mcp-server v2.0.0 🌿 Growing ⭐ 193

A Model Context Protocol (MCP) server that gives Claude direct control over Strudel.cc for AI-assisted music generation and live coding.

Auto-claude-code-research-in-sleep v0.4.4 🏛️ Flagship ⭐ 7,173

ARIS ⚔️ (Auto-Research-In-Sleep) - Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in - works wi

vespa v8.675.23 🏛️ Flagship ⭐ 6,886

AI + Data, online. https://vespa.ai

OmniRoute v3.6.9 🌳 Mature ⭐ 3,250

OmniRoute is an AI gateway for multi-provider LLMs: an OpenAI-compatible endpoint with smart routing, load balancing, retries, and fallbacks. Add policies, rate limits, caching, and observability for

mcp-devtools v0.59.53 🌿 Growing ⭐ 134

A modular MCP server that provides commonly used developer tools for AI coding agents

node9-proxy v1.11.3 🌿 Growing ⭐ 118

The Execution Security Layer for the Agentic Era. Providing deterministic "Sudo" governance and audit logs for autonomous AI agents.

codebase-context v2.3.0 🌱 Seedling ⭐ 43

Generate a map of your codebase to help AI Agents understand your architecture, coding conventions and patterns. Discoverable with Semantic Search

ISC-Bench v0.0.5 🌳 Mature ⭐ 799

Internal Safety Collapse: Turning the LLM or an AI Agent into a sensitive data generator.

synaptic-memory v0.16.0 🌱 Seedling ⭐ 27

Brain-inspired knowledge graph: spreading activation, Hebbian learning, memory consolidation.

capsule v0.8.8 🌿 Growing ⭐ 278

A secure, durable runtime to sandbox AI agent tasks. Run untrusted code in isolated WebAssembly environments.

vobase create-vobase@0.6.2 🌱 Seedling ⭐ 44

The app framework built for AI coding agents. Own every line. Your AI already knows how to build on it.

connectonion v0.9.1 🌳 Mature ⭐ 863

The Best AI Agent Framework for Agent Collaboration.

caveman v1.6.0 🏛️ Flagship ⭐ 42,198

🪨 why use many token when few token do trick - Claude Code skill that cuts 65% of tokens by talking like caveman

panguard-ai v1.4.19 🌱 Seedling ⭐ 38

Open-source security platform for AI agents: audits skills before install, monitors 24/7, shares threat intelligence across all users.

sentry-mcp 0.32.0 🌳 Mature ⭐ 658

An MCP server for interacting with Sentry via LLMs.

solon-ai v3.10.2 🌿 Growing ⭐ 362

Java AI application development framework (supports LLM-tool,skill; RAG; MCP; Agent-ReAct,Team-Agent). Compatible with java8 ~ java25. It can also be embedded in SpringBoot, jFinal, Vert.x, Quarkus, a

voltagent @voltagent/server-elysia@2.0.7 🏛️ Flagship ⭐ 8,380

AI Agent Engineering Platform built on an Open Source TypeScript AI Agent Framework

agentic-memory 0.0.0 🌿 Growing ⭐ 179

No description

by lhl
agent-skills-standard php-v1.3.2 🌿 Growing ⭐ 428

A collection of Agent Skills standards and best practices for programming languages and frameworks, helping AI agents follow best practices for those frameworks and languages

LRAT 0.0.0 🌱 Seedling ⭐ 39

The implementation for SIGIR 2026: Learning to Retrieve from Agent Trajectories.

trpc-agent-go v1.8.0 🌳 Mature ⭐ 1,110

trpc-agent-go is a powerful Go framework for building intelligent agent systems using large language models (LLMs) and tools.

everything-claude-code v1.10.0 🏛️ Flagship ⭐ 163,083

The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.

Boucle-framework v0.12.0 🌿 Growing ⭐ 75

Autonomous agent framework with structured memory, safety hooks, and loop management. Built by the agent that runs on it.

any-agent 1.18.0 🌳 Mature ⭐ 1,153

A single interface to use and evaluate different agent frameworks

ai-agents-reality-check 0.0.0 🌿 Growing ⭐ 57

Benchmarking the gap between AI agent hype and architecture. Three agent archetypes, 73-point performance spread, stress testing, network resilience, and ensemble coordination analysis with statistica

llmware v0.4.6 🌿 Growing ⭐ 14,862

Unified framework for building enterprise RAG pipelines with small, specialized models

claude-codex-settings v2.3.0 🌳 Mature ⭐ 623

My personal Claude Code and OpenAI Codex setup with battle-tested skills, commands, hooks, agents and MCP servers that I use daily.

LIA-Assistant v1.17.1 🌱 Seedling ⭐ 17

Open-source multi-agent AI assistant powered by LangGraph, FastAPI & Next.js - 16+ agents, Human-in-the-Loop, MCP integration, voice TTS, RAG, 500+ metrics, 6 languages.

vibescan 0.0.0 🌿 Growing ⭐ 52

Security scanner for AI-generated ("vibe-coded") code. Runs SAST, DAST, and sandboxed exploit simulation across 15+ languages using 30+ tools. Catches what LLMs introduce before it ships - wit

turing v2026.2.4 🌿 Growing ⭐ 70

✨ 🧬 Turing ES - Enterprise Search, Semantic Navigation, Chatbot using Search Engine and Generative AI.

ai-agent-handbook 0.0.0 🌿 Growing ⭐ 67

Comprehensive guide to AI agent engineering: how 30+ frameworks actually work under the hood. Context rot, compaction, system prompt assembly, SOUL.md, agent loops, memory systems, tool sprawl, MCP,

arag v0.1.0 🌿 Growing ⭐ 252

A-RAG: Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces. State-of-the-art RAG framework with keyword, semantic, and chunk read tools for multi-hop QA.

career-ops v1.5.0 🌿 Growing ⭐ 37,883

AI-powered job search system built on Claude Code. 14 skill modes, Go dashboard, PDF generation, batch processing.

camofox-browser v2.1.1 🌿 Growing ⭐ 80

Anti-detection browser server for AI agents - REST API wrapping Camoufox engine with OpenClaw plugin support

cyllama 0.2.11 🌱 Seedling ⭐ 25

A thin cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp

seismic v0.4.0 🌿 Growing ⭐ 118

Official repository of the Seismic library.

RAGElo 0.4.0 🌿 Growing ⭐ 128

RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker

JRVS 0.0.0 🌿 Growing ⭐ 236

JRVS AI Agent with JARCORE autonomous coding engine - RAG knowledge base, web scraping, calendar, code generation. Powered by whatever local AI you choose.

ragas v0.4.3 🌳 Mature ⭐ 13,570

Supercharge Your LLM Application Evaluations 🚀

Observal v0.2.0 🌿 Growing ⭐ 572

Observal is an AI agent registry with a first-in-class observability and eval framework

GTA v0.2.0 🌿 Growing ⭐ 143

[NeurIPS 2024 D&B] GTA: A Benchmark for General Tool Agents & [arXiv 2026] GTA-2

MiniSearch main@2026-04-20 🌿 Growing ⭐ 558

Minimalist web-searching platform with an AI assistant that runs directly from your browser. Uses WebLLM, Wllama and SearXNG. Demo: https://felladrin-minisearch.hf.space

yao-meta-skill main@2026-04-19 🌿 Growing ⭐ 297

YAO = Yielding AI Outcomes. A lightweight but rigorous system for creating, evaluating, packaging, and governing reusable agent skills.

OpenClawProBench main@2026-04-15 🌿 Growing ⭐ 453

OpenClawProBench is a live-first benchmark harness for evaluating LLM agents in the OpenClaw runtime with deterministic grading and repeated-trial reliability.

claw-eval main@2026-04-15 🌿 Growing ⭐ 465

Claw-Eval is a harness for evaluating LLMs as agents. All tasks verified by humans.

TrustRAG 0.0.0 🌳 Mature ⭐ 1,253

TrustRAG: The RAG Framework with Reliable Input, Trusted Output

AgenticX v0.3.7 🌿 Growing ⭐ 114

AgenticX is a unified, production-ready multi-agent platform - Python SDK + CLI (agx) + Studio server + Machi desktop app. Features Meta-Agent orchestration, 15+ LLM providers, MCP Hub, hierarchical m

magi-markdown main@2026-04-11 🌿 Growing ⭐ 552

MAGI: Markdown for Agent Guidance & Instruction - A next-generation markdown extension designed specifically for AI systems. MAGI enhances standard markdown with structured metadata, embedded AI instr

PageIndex main@2026-04-10 🌿 Growing ⭐ 25,597

📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG

atomic-knowledge v0.2.0 🌱 Seedling ⭐ 36

Markdown-first work-memory protocol for existing agents, with maintained knowledge, candidate notes, evals, and an example KB.

tulip_agent 0.0.0 🌱 Seedling ⭐ 44

autonomous agent with access to a tool library

sec-edgar-mcp v1.0.8 🌿 Growing ⭐ 253

A SEC EDGAR MCP (Model Context Protocol) Server

ruby_llm-contract v0.7.0 🌱 Seedling ⭐ 25

Handle LLM output variance for ruby_llm - retry on malformed JSON or rule violations, escalate to a smarter model, measure variance on datasets, gate CI on regressions.

awesome-pydantic-ai 0.0.0 🌱 Seedling ⭐ 58

An opinionated list of awesome Pydantic-AI frameworks, libraries, software and resources.

memind main@2026-04-21 🌿 Growing ⭐ 481

Self-evolving cognitive memory and context engine for AI agents in Java. Empowering 24/7 proactive agents like OpenClaw with understanding and SOTA performance.

pdd main@2026-04-21 🌿 Growing ⭐ 656

Prompt Driven Development Command Line Interface

karpathy-llm-wiki main@2026-04-21 🌱 Seedling ⭐ 43

The Self-Growing Karpathy LLM Wiki - grown by an AI agent yoyo from Karpathy's founding prompt

awesome-prompts main@2026-04-21 🌿 Growing ⭐ 7,671

Curated list of chatgpt prompts from the top-rated GPTs in the GPTs Store. Prompt Engineering, prompt attack & prompt protect. Advanced Prompt Engineering papers.

deer-flow main@2026-04-21 🌿 Growing ⭐ 63,234

An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skill, subagents and message gateway, it handles different levels of ta

multi-agent-ralph-loop main@2026-04-20 🌿 Growing ⭐ 126

Autonomous orchestration framework for Claude Code with MemPalace-inspired memory (4-layer stack, 818-token wake-up), parallel-first Agent Teams (6 teammates), Aristotle First Principles methodology,

deep-code-reasoning-mcp main@2026-04-20 🌿 Growing ⭐ 105

A Model Context Protocol (MCP) server that provides advanced code analysis and reasoning capabilities powered by Google's Gemini AI

agentic-chatops main@2026-04-20 🌿 Growing ⭐ 100

3-tier agentic ChatOps (n8n + GPT-4o + Claude Code) implementing all 21 patterns from "Agentic Design Patterns" - solo operator managing 137 devices

auto-deep-researcher-24x7 main@2026-04-19 🌿 Growing ⭐ 622

🔥 An autonomous AI agent that runs your deep learning experiments 24/7 while you sleep. Zero-cost monitoring, Leader-Worker architecture, constant-size memory.

cognitive-dissonance-dspy main@2026-04-14 🌿 Growing ⭐ 276

A multi-agent LLM system for detecting and resolving cognitive dissonance.

hermes-agent-rs v0.0.4 🌱 Seedling ⭐ 17

Hermes Agent rewritten in Rust: production-grade multi-platform AI agent runtime with gateway adapters, tool orchestration, MCP, memory plugins, and cost-safe autonomous loops.

Awesome-Repo-Level-Code-Generation main@2026-04-10 🌿 Growing ⭐ 280

Must-read papers on Repository-level Code Generation & Issue Resolution 🔥

cdpilot v0.3.0 🌱 Seedling ⭐ 25

Zero-dependency browser automation CLI. 70+ commands, 10 test assertions, smart commands (click/fill by text - no LLM needed). MCP server for AI agents with 500x fewer tokens. Extract, observe, script

agentshield v1.4.0 🌿 Growing ⭐ 522

AI agent security scanner. Detect vulnerabilities in agent configurations, MCP servers, and tool permissions. Available as CLI, GitHub Action, ECC plugin, and GitHub App integration. 🛡️

Cogitator-AI main@2026-04-21 🌱 Seedling ⭐ 36

🤖 Kubernetes for AI Agents. Self-hosted, production-grade runtime for orchestrating LLM swarms and autonomous agents. TypeScript-native.

memory_agent_hub main@2026-04-20 🌱 Seedling ⭐ 40

2026, the year of the swarm agent: a collection of AI agent resources covering swarm agents, agent teams, AI coding, skills, memory, evolution, agentic RL, and more

polymarket-trader-mcp v1.6.7 🌱 Seedling ⭐ 5

The most comprehensive MCP server for Polymarket - 48 tools spanning direct trading, market discovery, smart money tracking, copy trading, backtesting, risk management, and portfolio optimization. Wor

nix-ai v1.48.2 🌱 Seedling ⭐ 5

Your AI coding toolkit, declared in Nix - Claude, Gemini, Copilot, 15+ MCP servers, one flake

mayros v0.3.2 🌱 Seedling ⭐ 10

Production-ready AI agent framework - semantic memory, multi-agent mesh, MCP server, intelligent routing, governance, and 67+ platform integrations.

claude-code-config 0.0.0 🌱 Seedling ⭐ 88

Claude Code skills, architectural principles, and alternative approaches for AI-assisted development

learn-hermes-agent 0.0.0 🌱 Seedling ⭐ 16

A 27-chapter hands-on tutorial for building an autonomous AI agent from zero in Python. Agent loop, tool system, memory, skills, MCP, multi-platform gateway, and self-evolution - inspired by Herme

goal-md 0.0.0 🌱 Seedling ⭐ 128

A goal-specification file for autonomous coding agents. Generalizes Karpathy's autoresearch to domains with constructed metrics.

NanoCoder-Pro 0.0.0 🌱 Seedling ⭐ 54

NanoCoder Pro - Autonomous Coding Agent with Master-SubAgent Architecture

simplenote-mcp-server v1.15.0 🌱 Seedling ⭐ 17

MCP Server for Simplenote integration with Claude Desktop

statelessagent v0.12.5 🌱 Seedling ⭐ 16

Your AI forgets everything between sessions. SAME fixes that. Local-first, no API keys, single binary.

llm_context_benchmarks 0.0.0 🌱 Seedling ⭐ 59

📊 LLM Context Benchmarks - A comprehensive benchmarking tool for testing LLMs with varying context sizes using Ollama. Features dual benchmark modes (API/CLI), automatic hardware detection (optimiz

weave-cli v0.12.3 🌱 Seedling ⭐ 21

A universal CLI for Weaviate, Milvus, Chroma, Qdrant, and other vector DBs to help view, list, create, delete, and search collections and documents in collections for development, test, and debugging

elsium-ai elsium-ai@0.10.0 🌱 Seedling ⭐ 8

Production-grade TypeScript AI runtime focused on reliability, governance, and reproducible LLM systems. Multi-provider gateway, agents, RAG, workflows, policy engine, audit trails, and deterministic

kernel v3.97.0 🌱 Seedling ⭐ 12

kbot - the AI agent that dreams, learns, and evolves. 764+ tools, 35 agents, 20 providers. Music production, iPhone control, financial analysis, cyber threat intel. Always-on daemon. Runs offline. npm

dory v0.1.0 🌱 Seedling ⭐ 14

One memory layer for every AI agent. Local-first, markdown source of truth, and CLI/HTTP/MCP native. Your agent forgot who you are. Again. Dory fixes that.

sinain-hud overlay-v2.8.0 🌱 Seedling ⭐ 5

Ambient intelligence that sees what you see, hears what you hear, and acts on your behalf

smalltalk-dev-plugin v1.7.8 🌱 Seedling ⭐ 11

Claude Code plugin for AI-driven Smalltalk (Pharo) development

agent2 v0.1.0 🌱 Seedling ⭐ 26

The production runtime for AI agents. Schema in, API out. Built on PydanticAI + FastAPI.

LettuceDetect 0.1.8 💤 Dormant ⭐ 565

Lightweight hallucination detection framework for RAG applications

claude-ruby-grape-rails v1.13.4 🌱 Seedling ⭐ 5

Claude Code plugin for Ruby, Rails, Grape, PostgreSQL, Redis, and Sidekiq development

DreamServer v2.0.0 🌿 Growing ⭐ 443

Local AI anywhere, for everyone - LLM inference, chat UI, voice, agents, workflows, RAG, and image generation. No cloud, no subscriptions.

claude-skills v2.0.0 🌿 Growing ⭐ 12,208

220+ Claude Code skills & agent plugins for Claude Code, Codex, Gemini CLI, Cursor, and 8 more coding agents - engineering, marketing, product, compliance, C-level advisory.

deltallm v0.1.21-rc1 🌱 Seedling ⭐ 4

Route, manage, and analyze your LLM requests across multiple providers with a unified API interface

rex-cli v0.17.0 🌱 Seedling ⭐ 34

Local-first AI agent bootstrap: Playwright Browser MCP + ContextDB for Codex CLI, Claude Code, Gemini CLI, and OpenCode.

agent-knowledge-cycle v2.0.0 🌱 Seedling ⭐ 3

Memory-centric self-improving harness for AI agents. Six-phase cycle + Security by Absence. ADRs, JSON schemas, and a dependency-free Python reference.

claude-forge v1.0.0 🌱 Seedling ⭐ 659

Supercharge Claude Code with 11 AI agents, 36 commands & 15 skills - the claude-code plugin framework inspired by oh-my-zsh. 6-layer security hooks included. 5-min install.

Geneclaw v0.1.0 🌱 Seedling ⭐ 36

Self-evolving AI agent framework with 5-layer safety gatekeeper. Agents observe failures, propose fixes, and safely apply them. Built on HKUDS/nanobot.

Nightshift v0.0.7 🌱 Seedling ⭐ 1

Autonomous overnight codebase improvement agent for Claude Code. Run it before bed, wake up to production-ready fixes.

OriginDL v1.0.0 🌱 Seedling ⭐ 260

Implement a PyTorch-like DL library in C++ from scratch, step by step

devkit v2.1.29 🌱 Seedling ⭐ 2

A deterministic development harness for Claude Code - MCP workflow engine, enforcement hooks, YAML workflows, and multi-agent consensus (Claude + Codex + Gemini)

RagaAI-Catalyst v2.2.4 💤 Dormant ⭐ 16,141

Python SDK for Agent AI Observability, Monitoring and Evaluation Framework. Includes features like agent, llm and tools tracing, debugging multi-agentic system, self-hosted dashboard and advanced anal

ZimaOS-Blue 0.10.39 🌱 Seedling ⭐ 10

ZimaOS Blue - A Local-First Agent Runtime for Bold Builders. Out-of-the-Box, Open-Source, Universal, Vendor-Neutral

DOX main@2026-04-15 🌱 Seedling ⭐ 2

Broken RAG For The Broken Souls

surf 0.0.0 🌱 Seedling ⭐ 1

The open framework for extensible & grounded AI agent orchestration.

ryvos v0.9.0 🌱 Seedling ⭐ 2

Open-source autonomous AI assistant with 5-tier security, 62 tools, 14 LLM providers. Written in Rust. Single binary.

best-agent v1.0.0 🌱 Seedling ⭐ 6

Self-evolving Claude Code wrapper - handles any computer work a human can do. 94+ skills, 14 agents, computer use, self-improvement.

uniAI 0.0.0 🌱 Seedling ⭐ 1

Syllabus-aware RAG study assistant for university students. Answers strictly from your own notes & PDFs, unit-scoped retrieval, cross-encoder reranking, and a hallucination gate - built to help studen

geon-decoder main@2026-04-11 🌱 Seedling ⭐ 3

GEON: Structure-first decoding via equivalence classes and field closure

ralphglasses v0.2.0 🌱 Seedling ⭐ 3

Multi-LLM agent orchestration TUI - parallel Claude/Gemini/Codex sessions, 126 MCP tools

sawzhang_skills 0.0.0 🌱 Seedling ⭐ 2

Claude Code skills collection - CCA study guides, Twitter research, MCP review, auto-iteration tools

agenttel-sdk v0.3.0-alpha 🌱 Seedling ⭐ 6

Agent-ready telemetry SDK - enriches OpenTelemetry across Java, Go, Python, Node.js, and browser with structured context for AI-driven observability.

cf-browser v2.0.0 🌱 Seedling ⭐ 5

Open-source Cloudflare Browser Rendering proxy - 10 MCP tools for Claude Code (content, screenshot, PDF, markdown, scrape, JSON AI extraction, links, a11y, crawl)

pytorch_template v0.3.0 🌱 Seedling ⭐ 10

AI-agent-friendly PyTorch research pipeline - one YAML config drives preflight, training, Optuna HPO, and real-time TUI monitoring

Agent_Life_Space v1.36.0 🌱 Seedling ⭐ 1

Self-hosted autonomous AI agent - 9-layer cascade, Docker sandbox, encrypted vault, review/build/control plane, 1407+ tests

@poofnew/vibe-check 0.1.1 🌱 Seedling ⭐ 5

AI agent evaluation framework for Claude and beyond

evo-agents master@2026-04-19 🌱 Seedling ⭐ 3

Complete Workspace Template for OpenClaw - Full agent lifecycle with unified memory system (Markdown + SQLite), self-evolution, RAG. Not for SubAgent/Skill use.

RustClaw v0.5.0 🌱 Seedling ⭐ 2

Lean Rust AI agent: 6MB binary, 7.9MB RAM. OpenClaw replacement. Telegram + Discord + GitHub auto-PR. Ollama/Anthropic support.

agent-regression-testing 0.1.14 🌱 Seedling

A standalone library for AI agent regression testing using LLM-as-judge evaluation

self-correcting-rag-chatbot main@2026-04-21 🌱 Seedling ⭐ 2

🤖 Enhance chatbot accuracy with a self-correcting RAG system that ingests documents, retrieves data, and evaluates responses in real-time.

ClosedSSPM v0.4.1 🌱 Seedling ⭐ 1

An open-source SSPM tool written in Go

reasonkit-mem main@2026-04-21 🌱 Seedling ⭐ 1

🚀 Build memory and retrieval infrastructure for ReasonKit, enhancing data management and access for your applications with ease and efficiency.

Government-Citizen-Services-Voice-Agent main@2026-04-15 🌱 Seedling ⭐ 1

Autonomous, multilingual AI voice agent using ElevenLabs, LangGraph, and RAG for government services

CodeRAG main@2026-04-21 🌱 Seedling ⭐ 1

Build semantic vector databases from code and docs to enable AI agents to understand and navigate your entire codebase effectively.

retrieval-augmented-generation v1.0.0 💤 Dormant ⭐ 33

Reference Implementations for the RAG bootcamp

idle-harness main@2026-04-18 🌱 Seedling ⭐ 1

GAN-inspired multi-agent system that autonomously builds full-stack web apps from a single prompt using Claude AI agents

rag-news-summarizer main@2026-04-21 🌱 Seedling ⭐ 1

📰 Fetch and summarize news articles locally using a Retrieval-Augmented Generation system powered by AI models for efficient information access.

harness master@2026-04-21 🌱 Seedling ⭐ 1

Define and control AI agents in markdown with full prompt transparency, persistent memory, and integrated tools via the Claude Agent SDK.

selfmodel v0.3.0 🌱 Seedling ⭐ 1

A self-evolving AI Agent Team - agents that rewrite their own operating manual.

fastRAG v3.1.2 💤 Dormant ⭐ 1,776

Efficient Retrieval Augmentation and Generation Framework

lmnr 0.7.47 🌱 Seedling

Python SDK for Laminar

boostedblob 1.0.0 🌱 Seedling

Command line tool and async library to perform basic file operations on local paths, Google Cloud Storage paths and Azure Blob Storage paths.

@wix/eval-assertions 0.51.0 🌱 Seedling

Assertion framework for AI agent evaluations - supports skill invocation checks, build validation, and LLM-based judging

chat-flow 0.0.0 ⚰️ Archived ⭐ 687

ChatFlow - AI-based chat flow framework: personalize your ChatGPT workflows and build the road to automation.