freshcrate

Search results for "datasets"

83 results found
opik · 2.0.6 · 🌳 Mature · ⭐ 18,767

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

llm-rl-environments-lil-course · main@2026-04-17 · 🌿 Growing · ⭐ 57

🌱 A little course on Reinforcement Learning Environments for evaluating and training Language Models

RAGHub · main@2026-04-17 · 🌳 Mature · ⭐ 1,712

A community-driven collection of RAG (Retrieval-Augmented Generation) frameworks, projects, and resources. Contribute and explore the evolving RAG ecosystem.

Constrained-Text-Generation-Studio · 0.0.0 · 🌿 Growing · ⭐ 216

Code repo for "Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio" at the (CAI2) workshop, jointly held at (COLING 2022)

LLM-Agents-Ecosystem-Handbook · 0.0.0 · 🌳 Mature · ⭐ 508

One-stop handbook for building, deploying, and understanding LLM agents with 60+ skeletons, tutorials, ecosystem guides, and evaluation tools.

LRAT · 0.0.0 · 🌱 Seedling · ⭐ 34

The implementation for SIGIR 2026: Learning to Retrieve from Agent Trajectories.

langfuse · v3.169.0 · 🌿 Growing · ⭐ 24,578

🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊 YC W23

Autonomous-Agents · main@2026-04-16 · 🌿 Growing · ⭐ 1,211

Autonomous Agents (LLMs) research papers. Updated Daily.

CodeGen · 0.0.0 · 🌳 Mature · ⭐ 773

Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from dataset creation to model training and evaluation. Comes with pr

langwatch · skills@v0.3.0 · 🌿 Growing · ⭐ 3,193

The platform for LLM evaluations and AI agent testing

latitude-llm · claude-code-telemetry-0.0.5 · 🌿 Growing · ⭐ 3,955

Latitude is the open-source agent engineering platform

Agentic-RAG-R1 · 0.0.0 · 🌿 Growing · ⭐ 412

Agentic RAG R1 Framework via Reinforcement Learning

ICMP-Ghost-A-Fileless-x64-Assembly-C2-Agent · v3.6.2 · 🌿 Growing · ⭐ 163

Fileless C2 agent written in pure x64 Assembly for Linux. Features stealth ICMP tunneling, memory-only execution via memfd_create, and terminal-independent daemonization.

chinese-llm-benchmark · v5.9 · 🌿 Growing · ⭐ 5,841

ReLE evaluation: Chinese AI large-model capability benchmark (continuously updated). Currently covers 359 large models, including commercial models such as chatgpt, gpt-5.2, o4-mini, Google gemini-3-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE-5.0, qwen3-max, qwen3.5-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as step3.5-flash, kimi-k2.5, ernie4.5, Min

arthur-engine · 2.1.529 · 🌿 Growing · ⭐ 75

Make AI work for Everyone - Monitoring and governing for your AI/ML

synaptic-memory · v0.16.0 · 🌱 Seedling · ⭐ 25

Brain-inspired knowledge graph: spreading activation, Hebbian learning, memory consolidation.

datagouv-mcp · v0.2.23 · 🌿 Growing · ⭐ 1,216

Official data.gouv.fr Model Context Protocol (MCP) server that allows AI chatbots to search, explore, and analyze datasets from the French national Open Data platform, directly through conversation.

arag · v0.1.0 · 🌿 Growing · ⭐ 247

A-RAG: Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces. State-of-the-art RAG framework with keyword, semantic, and chunk read tools for multi-hop QA.

vector-io · 0.0.0 · 🌱 Seedling · ⭐ 266

Comprehensive Vector Data Tooling. The universal interface for all vector database, datasets and RAG platforms. Easily export, import, backup, re-embed (using any model) or access your vector data fro

memind · main@2026-04-21 · 🌿 Growing · ⭐ 360

Self-evolving cognitive memory and context engine for AI agents in Java. Empowering 24/7 proactive agents like OpenClaw with understanding and SOTA performance.

Awesome-World-Models · main@2026-04-21 · 🌿 Growing · ⭐ 1,473

A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related website

awesome-code-agents · main@2026-04-20 · 🌿 Growing · ⭐ 94

A curated list of products, benchmarks, and research papers on autonomous code agents. Beyond coding - they're redefining how software changes the world.

excalibase-graphql · main@2026-04-19 · 🌱 Seedling · ⭐ 31

Excalibase GraphQL instantly turns your database into a GraphQL API. Built with Spring Boot, it supports schema discovery, subscriptions, and type handling - no manual resolvers needed.

skills-vote · main@2026-04-19 · 🌱 Seedling · ⭐ 31

The Next-Gen Agent-Native Skill Recommendation Engine

OmicsClaw · main@2026-04-18 · 🌿 Growing · ⭐ 116

Conversational & memory-enabled AI research partner for multi-omics analysis. From biological idea to full research paper.

biomcp · v0.8.21 · 🌿 Growing · ⭐ 488

BioMCP: Biomedical Model Context Protocol

OpenClawProBench · main@2026-04-15 · 🌿 Growing · ⭐ 340

OpenClawProBench is a live-first benchmark harness for evaluating LLM agents in the OpenClaw runtime with deterministic grading and repeated-trial reliability.

llmware · v0.4.6 · 🌿 Growing · ⭐ 14,857

Unified framework for building enterprise RAG pipelines with small, specialized models

cognitive-dissonance-dspy · main@2026-04-14 · 🌿 Growing · ⭐ 276

A multi-agent LLM system for detecting and resolving cognitive dissonance.

rag-chatbot · main@2026-04-14 · 🌿 Growing · ⭐ 402

RAG (Retrieval-augmented generation) ChatBot that provides answers based on contextual information extracted from a collection of Markdown files.

awesome-vector-database · main@2026-04-13 · 🌿 Growing · ⭐ 341

A curated list of awesome works related to high dimensional structure/vector search & database

next-plaid · v1.2.0 · 🌿 Growing · ⭐ 331

NextPlaid, ColGREP: Multi-vector search, from database to coding agents.

Awesome-Repo-Level-Code-Generation · main@2026-04-10 · 🌿 Growing · ⭐ 274

Must-read papers on Repository-level Code Generation & Issue Resolution 🔥

ds_ex · main@2026-04-09 · 🌱 Seedling · ⭐ 17

DSPEx - Declarative Self-improving Elixir | A BEAM-Native AI Program Optimization Framework

UltraRAG · v0.3.0.2 · 🌿 Growing · ⭐ 5,480

A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines

Suganthans-BigQuery-MCP-Server · 0.0.0 · 🌱 Seedling · ⭐ 25

BigQuery MCP server for Claude - query any BigQuery dataset in natural language, with built-in SEO analysis tools for GSC bulk export data

swing-trading-agent · 0.0.0 · 🌱 Seedling · ⭐ 7

Multi-agent swing trading system - automated screening, research, and execution with backtesting and live trading

agent-skills-standard · php-v1.3.2 · 🌱 Seedling · ⭐ 391

A collection of Agent Skills Standard and best practices for programming languages and frameworks that help our AI agents follow best practices for frameworks and programming languages

cortex-hub · v0.7.0 · 🌱 Seedling · ⭐ 48

Self-hosted AI Agent Memory + Code Intelligence Platform - one MCP endpoint for persistent memory, AST-aware code search, shared knowledge, and quality enforcement across all your AI coding agents.

pinecone-ts-client · v7.2.0 · 🌱 Seedling · ⭐ 269

The official TypeScript/Node client for the Pinecone vector database

sqlite-vector · 0.9.95 · 🌱 Seedling · ⭐ 832

SQLite-Vector is a cross-platform, ultra-efficient SQLite extension that brings vector search capabilities to your embedded database.

VecturaKit · 5.3.0 · 🌱 Seedling · ⭐ 280

Swift-based vector database for on-device RAG using MLTensor and MLX Embedders

jupyter-mcp-server · v1.0.0 · 🌱 Seedling · ⭐ 1,025

🪐 🔧 Model Context Protocol (MCP) Server for Jupyter.

instructor · v1.15.1 · 🌱 Seedling · ⭐ 12,743

structured outputs for llms

tensorzero · 2026.4.0 · 🌱 Seedling · ⭐ 11,204

TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.

mcp-devkit-server · v0.6.0 · 🌱 Seedling · ⭐ 46

Developer-focused Mapbox MCP Server

spiceai · v1.11.5 · 🌱 Seedling · ⭐ 2,868

A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.

infinity · v0.7.0-dev5 · 🌱 Seedling · ⭐ 4,476

The AI-native database built for LLM applications, providing incredibly fast hybrid search of dense vector, sparse vector, tensor (multi-vector), and full-text.

RAG-Anything · v1.2.10 · 🌱 Seedling · ⭐ 15,557

"RAG-Anything: All-in-One RAG Framework"

fast-plaid · 1.4.5 · 🌱 Seedling · ⭐ 239

High-Performance Engine for Multi-Vector Search

camel · v0.2.90 · 🌱 Seedling · ⭐ 16,654

🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org

arcadedb · 26.3.2 · 🌱 Seedling · ⭐ 793

ArcadeDB Multi-Model Database, one DBMS that supports SQL, Cypher, Gremlin, HTTP/JSON, MongoDB and Redis. ArcadeDB is a conceptual fork of OrientDB, the first Multi-Model DBMS. ArcadeDB supports Vecto

edsl · wasm-wheel · 🌱 Seedling · ⭐ 454

Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.

synapse-ai · v1.0.0 · 🌱 Seedling · ⭐ 1

Build AI agents that actually do things. Synapse is an open-source platform for creating, connecting, and orchestrating AI agents powered by any LLM - local or cloud.

awesome-agent-benchmarks · master@2026-04-21 · 🌱 Seedling · ⭐ 3

🧠 Discover and evaluate advanced benchmark datasets for Large Language Model agents to enhance performance assessment in real-world tasks.

ragflow · v0.24.0 · 🌱 Seedling · ⭐ 77,784

RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs

eywa · main@2026-04-21 · 🌱 Seedling · ⭐ 1

🧠 Capture and manage your team's knowledge effortlessly with Eywa, ensuring no valuable memory is ever lost.

axon · main@2026-04-21 · 🌱 Seedling · ⭐ 2

Enable autonomous AI workflows with a local-first, zero-trust Rust framework for high-performance multi-agent orchestration and deterministic execution.

autonomous-agentic-research-swarm · main@2026-04-11 · 🌱 Seedling · ⭐ 4

File-based autonomous agentic research swarm template (Planner/Worker/Judge) with contracts, workstreams, and deterministic quality gates.

ragas · v0.4.3 · 🌱 Seedling · ⭐ 13,329

Supercharge Your LLM Application Evaluations 🚀

ai-dataset-generator · main@2026-04-21 · 🌱 Seedling · ⭐ 1

🤖 Generate tailored AI training datasets quickly and easily, transforming your domain knowledge into essential training data for model fine-tuning.

modular-image-classification-framework · main@2026-04-20 · 🌱 Seedling · ⭐ 1

A modular deep learning framework for training and evaluating image classification models on datasets like CIFAR-10 and MNIST. Supports configurable CNN architectures, automated training, and performa

fluid · v1.0.8 · 🌱 Seedling · ⭐ 1,908

Fluid, elastic data abstraction and acceleration for BigData/AI applications in cloud. (Project under CNCF)

inAI-wiki · v0.1.0 · 💤 Dormant · ⭐ 50

The open-source Wikipedia of AI - 2M+ apps, agents, LLMs & datasets. Updated daily with tools, tutorials & news.

vector-cache-optimizer · base-setup@2026-04-21 · 🌱 Seedling · ⭐ 1

⚡ Optimize vector searches with a hyper-efficient cache that uses machine learning for faster, smarter data access and reduced costs.

langgraph-rag-assistant · main@2026-04-21 · 🌱 Seedling · ⭐ 1

🚀 Build an enterprise-ready RAG system to enhance technical documentation querying with LangGraph and multi-step reasoning workflows.

tokenizer · 1.0.0 · 💤 Dormant · ⭐ 6

High-Performance Tokenizer implementation in PHP.

roboflow · 1.3.3 · 🌱 Seedling

Official Python package for working with the Roboflow API

recordlinkage · 0.16 · 🌱 Seedling

A record linkage toolkit for linking and deduplication

azure-ai-projects · 2.1.0 · 🌱 Seedling

Microsoft Corporation Azure AI Projects Client Library for Python

dlt · 1.25.0 · 🌱 Seedling

dlt is an open-source python-first scalable data loading library that does not require any backend to run.

imbalanced-learn · 0.14.1 · 🌱 Seedling

Toolbox for imbalanced dataset in machine learning

keras · 3.14.0 · 🌱 Seedling

Multi-backend Keras

transformers · 5.5.4 · 🌱 Seedling

Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

KAG · v0.8.0 · 💤 Dormant · ⭐ 8,668

KAG is a logical form-guided reasoning and retrieval framework based on OpenSPG engine and LLMs. It is used to build logical reasoning and factual Q&A solutions for professional domain knowledge base

RagaAI-Catalyst · v2.2.4 · 💤 Dormant · ⭐ 16,130

Python SDK for Agent AI Observability, Monitoring and Evaluation Framework. Includes features like agent, llm and tools tracing, debugging multi-agentic system, self-hosted dashboard and advanced anal

mcp-bigquery-server · v1.0.3 · 💤 Dormant · ⭐ 136

A Model Context Protocol (MCP) server that provides secure, read-only access to BigQuery datasets. Enables Large Language Models (LLMs) to safely query and analyze data through a standardized interfac

dingo · v0.9.0 · ⚰️ Archived · ⭐ 1,699

A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and