Search results for "datasets"
AI Observability & Evaluation
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
A little course on Reinforcement Learning Environments for evaluating and training Language Models
A community-driven collection of RAG (Retrieval-Augmented Generation) frameworks, projects, and resources. Contribute and explore the evolving RAG ecosystem.
Code repo for "Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio" at the (CAI2) workshop, jointly held at (COLING 2022)
One-stop handbook for building, deploying, and understanding LLM agents with 60+ skeletons, tutorials, ecosystem guides, and evaluation tools.
The implementation for SIGIR 2026: Learning to Retrieve from Agent Trajectories.
Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. YC W23
Autonomous Agents (LLMs) research papers. Updated Daily.
Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from dataset creation to model training and evaluation. Comes with pr
The platform for LLM evaluations and AI agent testing
Latitude is the open-source agent engineering platform
Agentic RAG R1 Framework via Reinforcement Learning
SRE Agent - CNCF Sandbox Project
Fileless C2 agent written in pure x64 Assembly for Linux. Features stealth ICMP tunneling, memory-only execution via memfd_create, and terminal-independent daemonization.
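ReLE evaluation: a capability evaluation of Chinese AI large models (continuously updated). Currently covers 359 large models, including commercial models such as chatgpt, gpt-5.2, o4-mini, Google gemini-3-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE-5.0, qwen3-max, qwen3.5-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as step3.5-flash, kimi-k2.5, ernie4.5, Min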
Make AI work for Everyone - Monitoring and governing for your AI/ML
Brain-inspired knowledge graph: spreading activation, Hebbian learning, memory consolidation.
Official data.gouv.fr Model Context Protocol (MCP) server that allows AI chatbots to search, explore, and analyze datasets from the French national Open Data platform, directly through conversation.
A-RAG: Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces. State-of-the-art RAG framework with keyword, semantic, and chunk read tools for multi-hop QA.
Comprehensive Vector Data Tooling. The universal interface for all vector databases, datasets and RAG platforms. Easily export, import, backup, re-embed (using any model) or access your vector data fro
Self-evolving cognitive memory and context engine for AI agents in Java. Empowering 24/7 proactive agents like OpenClaw with understanding and SOTA performance.
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related website
A curated list of products, benchmarks, and research papers on autonomous code agents. Beyond coding, they're redefining how software changes the world.
Excalibase GraphQL instantly turns your database into a GraphQL API. Built with Spring Boot, it supports schema discovery, subscriptions, and type handling, with no manual resolvers needed.
The Next-Gen Agent-Native Skill Recommendation Engine
Conversational & memory-enabled AI research partner for multi-omics analysis. From biological idea to full research paper.
BioMCP: Biomedical Model Context Protocol
OpenClawProBench is a live-first benchmark harness for evaluating LLM agents in the OpenClaw runtime with deterministic grading and repeated-trial reliability.
Unified framework for building enterprise RAG pipelines with small, specialized models
A multi-agent LLM system for detecting and resolving cognitive dissonance.
RAG (Retrieval-augmented generation) ChatBot that provides answers based on contextual information extracted from a collection of Markdown files.
A curated list of awesome works related to high dimensional structure/vector search & database
NextPlaid, ColGREP: Multi-vector search, from database to coding agents.
Must-read papers on Repository-level Code Generation & Issue Resolution
DSPEx - Declarative Self-improving Elixir | A BEAM-Native AI Program Optimization Framework
A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines
BigQuery MCP server for Claude: query any BigQuery dataset in natural language, with built-in SEO analysis tools for GSC bulk export data
Multi-agent swing trading system: automated screening, research, and execution with backtesting and live trading
The LLM Evaluation Framework
A collection of Agent Skills standards and best practices for programming languages and frameworks, helping AI agents follow best practices for those frameworks and languages
Self-hosted AI Agent Memory + Code Intelligence Platform: one MCP endpoint for persistent memory, AST-aware code search, shared knowledge, and quality enforcement across all your AI coding agents.
The official TypeScript/Node client for the Pinecone vector database
SQLite-Vector is a cross-platform, ultra-efficient SQLite extension that brings vector search capabilities to your embedded database.
Swift-based vector database for on-device RAG using MLTensor and MLX Embedders
Model Context Protocol (MCP) Server for Jupyter.
structured outputs for llms
TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.
Developer-focused Mapbox MCP Server
A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.
The AI-native database built for LLM applications, providing incredibly fast hybrid search of dense vector, sparse vector, tensor (multi-vector), and full-text.
"RAG-Anything: All-in-One RAG Framework"
High-Performance Engine for Multi-Vector Search
CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
ArcadeDB Multi-Model Database, one DBMS that supports SQL, Cypher, Gremlin, HTTP/JSON, MongoDB and Redis. ArcadeDB is a conceptual fork of OrientDB, the first Multi-Model DBMS. ArcadeDB supports Vecto
Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.
Build AI agents that actually do things. Synapse is an open-source platform for creating, connecting, and orchestrating AI agents powered by any LLM, local or cloud.
Discover and evaluate advanced benchmark datasets for Large Language Model agents to enhance performance assessment in real-world tasks.
Benchmark for vector databases.
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
Capture and manage your team's knowledge effortlessly with Eywa, ensuring no valuable memory is ever lost.
Enable autonomous AI workflows with a local-first, zero-trust Rust framework for high-performance multi-agent orchestration and deterministic execution.
File-based autonomous agentic research swarm template (Planner/Worker/Judge) with contracts, workstreams, and deterministic quality gates.
Supercharge Your LLM Application Evaluations
Generate tailored AI training datasets quickly and easily, transforming your domain knowledge into essential training data for model fine-tuning.
A modular deep learning framework for training and evaluating image classification models on datasets like CIFAR-10 and MNIST. Supports configurable CNN architectures, automated training, and performa
Fluid, elastic data abstraction and acceleration for BigData/AI applications in cloud. (Project under CNCF)
The open-source Wikipedia of AI: 2M+ apps, agents, LLMs & datasets. Updated daily with tools, tutorials & news.
Optimize vector searches with a hyper-efficient cache that uses machine learning for faster, smarter data access and reduced costs.
Build an enterprise-ready RAG system to enhance technical documentation querying with LangGraph and multi-step reasoning workflows.
High-Performance Tokenizer implementation in PHP.
Official Python package for working with the Roboflow API
A record linkage toolkit for linking and deduplication
Microsoft Corporation Azure AI Projects Client Library for Python
dlt is an open-source python-first scalable data loading library that does not require any backend to run.
Toolbox for imbalanced datasets in machine learning
Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
KAG is a logical form-guided reasoning and retrieval framework based on OpenSPG engine and LLMs. It is used to build logical reasoning and factual Q&A solutions for professional domain knowledge base
Python SDK for Agent AI Observability, Monitoring and Evaluation Framework. Includes features like agent, llm and tools tracing, debugging multi-agentic system, self-hosted dashboard and advanced anal
A Python-Script Based Generative AI platform
A Model Context Protocol (MCP) server that provides secure, read-only access to BigQuery datasets. Enables Large Language Models (LLMs) to safely query and analyze data through a standardized interfac
A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and
