Search results for "dataset"
AI Observability & Evaluation
OpenSource MCP Marketplace | MCP Servers Tools Meta Dataset | Web API | Web Client Integration
Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from dataset creation to model training and evaluation. Comes with pr
Local-first memory plugin for OpenClaw AI agents. LLM-powered extraction, plain markdown storage, hybrid search via QMD. Gives agents persistent long-term memory across conversations.
Persistent memory for AI coding agents
A community-driven collection of RAG (Retrieval-Augmented Generation) frameworks, projects, and resources. Contribute and explore the evolving RAG ecosystem.
One-stop handbook for building, deploying, and understanding LLM agents with 60+ skeletons, tutorials, ecosystem guides, and evaluation tools.
The implementation for SIGIR 2026: Learning to Retrieve from Agent Trajectories.
LLM Agent that leverages cheminformatics tools to provide informed responses.
Generic RAG framework to apply the power of LLMs on any given dataset
Group Evolving Agents: Open-Ended Self-Improvement via Experience Sharing
Official Code Release of SAGE: Scalable Agentic 3D Scene Generation for Embodied AI
Open source platform for AI Engineering: OpenTelemetry-native LLM Observability, GPU Monitoring, Guardrails, Evaluations, Prompt Management, Vault, Playground. Integrates with 50+ LLM Providers,
Knowledge Engine for AI Agent Memory in 6 lines of code
A secure, durable runtime to sandbox AI agent tasks. Run untrusted code in isolated WebAssembly environments.
Autonomous Agents (LLMs) research papers. Updated Daily.
Semiont supports human+ai collaborative knowledge work. Use it as: a Wiki, Semantic Layer, Context Graph, Knowledge Base, Annotator, Research Tool, or Agentic Memory...
Comprehensive survey on Context Engineering: from prompt engineering to production-grade AI systems. Hundreds of papers, frameworks, and implementation guides for LLMs and AI agents.
Lint your repo for AI agent compatibility.
The platform for LLM evaluations and AI agent testing
Make AI work for Everyone - Monitoring and governing for your AI/ML
Internal Safety Collapse: Turning the LLM or an AI Agent into a sensitive data generator.
Brain-inspired knowledge graph: spreading activation, Hebbian learning, memory consolidation.
Official data.gouv.fr Model Context Protocol (MCP) server that allows AI chatbots to search, explore, and analyze datasets from the French national Open Data platform, directly through conversation.
A-RAG: Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces. State-of-the-art RAG framework with keyword, semantic, and chunk read tools for multi-hop QA.
Self-evolving cognitive memory and context engine for AI agents in Java. Empowering 24/7 proactive agents like OpenClaw with understanding and SOTA performance.
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related website
2026, the year of swarm agents: a collection of AI agent resources covering swarm agents, agent teams, AI coding, skills, memory, evolution, and agentic RL
A curated list of products, benchmarks, and research papers on autonomous code agents. Beyond coding, they're redefining how software changes the world.
An autonomous AI agent that runs your deep learning experiments 24/7 while you sleep. Zero-cost monitoring, Leader-Worker architecture, constant-size memory.
META-AGENTIC α-AGI. Mission, end-to-end: Identify → Out-Learn → Out-Think → Out-Design → Out-Strategise → Out-Execute
AI-first security scanner with 76 analyzers, 9,600+ detection rules, and repo poisoning detection for AI/ML, LLM agents, and MCP servers. Scan any GitHub repo with: medusa scan --git user/repo
BioMCP: Biomedical Model Context Protocol
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
Curated systems, benchmarks, papers, etc. on memory for LLMs/MLLMs: long-term context, retrieval, and reasoning.
Pragmatic AI Labs MCP Agent Toolkit - An MCP Server designed to make code with agents more deterministic
Claw-Eval is an evaluation harness for LLMs acting as agents. All tasks verified by humans.
Unified framework for building enterprise RAG pipelines with small, specialized models
Semantic code searcher and codebase utility
Harness Vibe Research with Self-evolving AI Scientists
NextPlaid, ColGREP: Multi-vector search, from database to coding agents.
Must-read papers on Repository-level Code Generation & Issue Resolution
DSPEx - Declarative Self-improving Elixir | A BEAM-Native AI Program Optimization Framework
A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines
Declarative Self Improving Elixir - DSPy Orchestration in Elixir
No description
The LLM Evaluation Framework
A powerful multi-database server implementing the Model Context Protocol (MCP) to provide AI assistants with structured access to databases.
A lock-free, in-memory fuzzy search engine for Kotlin Multiplatform. L2-normalized sparse vector embeddings with O(1) cosine similarity; handles typos, transpositions, and blind continuation. Zero-al
kbot: the AI agent that dreams, learns, and evolves. 764+ tools, 35 agents, 20 providers. Music production, iPhone control, financial analysis, cyber threat intel. Always-on daemon. Runs offline. npm
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.
A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.
Autonomous VAPT platform. Give it a target (FQDN, IP, CIDR): it hunts, it reports. Inspired by the Obsidian Order.
Your second brain, starting today. CLI + MCP server that helps you build, maintain, and search a knowledge vault that gets better every day. Works with any AI provider. Local-first, zero-prereq instal
"RAG-Anything: All-in-One RAG Framework"
My personal Claude Code and OpenAI Codex setup with battle-tested skills, commands, hooks, agents and MCP servers that I use daily.
High-Performance Engine for Multi-Vector Search
LLM proxy to observe and debug what your AI agents are doing.
CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
A selective learning and memory substrate for agentic systems: typed, revisable, decayable memory with competence learning and trust-aware retrieval.
The fullstack MCP framework to develop MCP Apps for ChatGPT / Claude & MCP Servers for AI Agents.
BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, Unified model management, Evaluation, SF
The official PHP SDK for Model Context Protocol servers and clients. Maintained in collaboration with The PHP Foundation.
An MCP server to use StatCAN data
Benchmark for vector databases.
Implement a PyTorch-like DL library in C++ from scratch, step by step
Capture and manage your team's knowledge effortlessly with Eywa, ensuring no valuable memory is ever lost.
Supercharge Your LLM Application Evaluations
Generate tailored AI training datasets quickly and easily, transforming your domain knowledge into essential training data for model fine-tuning.
Fluid, elastic data abstraction and acceleration for BigData/AI applications in cloud. (Project under CNCF)
The open-source Wikipedia of AI: 2M+ apps, agents, LLMs & datasets. Updated daily with tools, tutorials & news.
BRUNELLA AGENT SYSTEM (BAS): The Digital Organization of the Future
Lightweight hallucination detection framework for RAG applications
HealthFlow: A Self-Evolving AI Agent with Meta Planning for Autonomous Healthcare Research
Python SDK for Agent AI Observability, Monitoring and Evaluation Framework. Includes features like agent, llm and tools tracing, debugging multi-agentic system, self-hosted dashboard and advanced anal
A Model Context Protocol (MCP) server that provides secure, read-only access to BigQuery datasets. Enables Large Language Models (LLMs) to safely query and analyze data through a standardized interfac
Robust, fast, scalable, and sandboxed open-source online code execution system for humans and AI.
Medical-AI is an AI framework specifically for Medical Applications https://aibharata.github.io/medicalAI/
