freshcrate

Search results for "datasets"

60 results found (Python)
roboflow📁1.3.3🌳 Mature559

Official Python package for working with the Roboflow API

recordlinkage📁0.16🌳 Mature1,046

A record linkage toolkit for linking and deduplication

imbalanced-learn📁0.14.1🏛️ Flagship7,096

Toolbox for imbalanced dataset in machine learning

keras📁3.14.0🏛️ Flagship64,025

Multi-backend Keras

transformers📁5.5.4🏛️ Flagship159,705

Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

datagouv-mcp📁v0.2.23🌳 Mature1,368

Official data.gouv.fr Model Context Protocol (MCP) server that allows AI chatbots to search, explore, and analyze datasets from the French national Open Data platform, directly through conversation.

opik📁2.0.9🏛️ Flagship18,965

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

ragflow📁v0.25.0🏛️ Flagship78,674

RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs

claude-code-plugins-plus-skills📁v4.26.0🌳 Mature1,995

423 plugins, 2,849 skills, 177 agents for Claude Code. Open-source marketplace at tonsofskills.com with the ccpi CLI package manager.

synapse-ai📁v1.0.0🌱 Seedling41

Build AI agents that actually do things. Synapse is an open-source platform for creating, connecting, and orchestrating AI agents powered by any LLM — local or cloud.

restai📁v6.1.45🌿 Growing485

RESTai is an AIaaS (AI as a Service) open-source platform. Supports many public and local LLMs supported by Ollama/vLLM/etc. Precise embeddings usage, tuning, analytics etc. Built-in image/audio generat

pixeltable📁v0.5.28🌳 Mature1,549

Data Infrastructure providing a declarative, incremental approach for multimodal AI workloads.

arthur-engine📁2.1.529🌿 Growing77

Make AI work for Everyone - Monitoring and governing for your AI/ML

llamafarm📁v0.0.31🌳 Mature819

Deploy any AI model, agent, database, RAG, and pipeline locally or remotely in minutes

synaptic-memory📁v0.16.0🌱 Seedling27

Brain-inspired knowledge graph: spreading activation, Hebbian learning, memory consolidation.

llm-rl-environments-lil-course📁main@2026-04-17🌿 Growing140

🌱 A little course on Reinforcement Learning Environments for evaluating and training Language Models

camel📁v0.2.91a1🏛️ Flagship16,753

🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org

Code repo for "Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio" at the (CAI2) workshop, jointly held at (COLING 2022)

LLM-Agents-Ecosystem-Handbook📁0.0.0🌳 Mature512

One-stop handbook for building, deploying, and understanding LLM agents with 60+ skeletons, tutorials, ecosystem guides, and evaluation tools.

UltraRAG📁v0.3.0.2🌳 Mature5,510

A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines

LRAT📁0.0.0🌱 Seedling39

The implementation for SIGIR 2026: Learning to Retrieve from Agent Trajectories.

jupyter-mcp-server📁v1.0.0🌳 Mature1,025

🪐 🔧 Model Context Protocol (MCP) Server for Jupyter.

instructor📁v1.15.1🏛️ Flagship12,803

structured outputs for llms

RAG-Anything📁v1.2.10🏛️ Flagship16,790

"RAG-Anything: All-in-One RAG Framework"

fast-plaid📁1.4.5🌿 Growing245

High-Performance Engine for Multi-Vector Search

CodeGen📁0.0.0🌳 Mature774

Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from dataset creation to model training and evaluation. Comes with pr

vector-graph-rag📁v0.1.3🌿 Growing66

Graph RAG with pure vector search, achieving SOTA performance in multi-hop reasoning scenarios.

Agentic-RAG-R1📁0.0.0🌿 Growing413

Agentic RAG R1 Framework via Reinforcement Learning

orbit📁v2.6.6🌿 Growing250

One API for 20+ LLM providers, your databases, and your files — self-hosted, open-source AI gateway with RAG, voice, and guardrails.

llmware📁v0.4.6🌿 Growing14,862

Unified framework for building enterprise RAG pipelines with small, specialized models

arag📁v0.1.0🌿 Growing252

A-RAG: Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces. State-of-the-art RAG framework with keyword, semantic, and chunk read tools for multi-hop QA.

ragas📁v0.4.3🌳 Mature13,570

Supercharge Your LLM Application Evaluations 🚀

awesome-code-agents📁main@2026-04-20🌿 Growing98

A curated list of products, benchmarks, and research papers on autonomous code agents. Beyond coding — they're redefining how software changes the world.

awesome-opensource-ai📁main@2026-04-20🌿 Growing2,849

Curated list of the best truly open-source AI projects, models, tools, and infrastructure.

skills-vote📁main@2026-04-19🌿 Growing50

The Next-Gen Agent-Native Skill Recommendation Engine

OmicsClaw📁main@2026-04-18🌿 Growing124

Conversational & memory-enabled AI research partner for multi-omics analysis. From biological idea to full research paper.

vector-db-benchmark📁master@2026-04-17🌿 Growing356

Framework for benchmarking vector search engines

OpenClawProBench📁main@2026-04-15🌿 Growing453

OpenClawProBench is a live-first benchmark harness for evaluating LLM agents in the OpenClaw runtime with deterministic grading and repeated-trial reliability.

cognitive-dissonance-dspy📁main@2026-04-14🌿 Growing276

A multi-agent LLM system for detecting and resolving cognitive dissonance.

rag-chatbot📁main@2026-04-14🌿 Growing407

RAG (Retrieval-augmented generation) ChatBot that provides answers based on contextual information extracted from a collection of Markdown files.

developers-guide-to-ai📁main@2026-04-09🌱 Seedling36

The Developer's Guide to AI - A Field Guide for the Working Developer

learn-hermes-agent📁0.0.0🌱 Seedling16

A 27-chapter hands-on tutorial for building an autonomous AI agent from zero in Python. Agent loop, tool system, memory, skills, MCP, multi-platform gateway, and self-evolution — inspired by Herme

simplenote-mcp-server📁v1.15.0🌱 Seedling17

MCP Server for Simplenote integration with Claude Desktop

edsl📁wasm-wheel🌿 Growing454

Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.

swing-trading-agent📁0.0.0🌱 Seedling7

Multi-agent swing trading system — automated screening, research, and execution with backtesting and live trading

RagaAI-Catalyst📁v2.2.4💤 Dormant16,141

Python SDK for Agent AI Observability, Monitoring and Evaluation Framework. Includes features like agent, llm and tools tracing, debugging multi-agentic system, self-hosted dashboard and advanced anal

KAG📁v0.8.0💤 Dormant8,688

KAG is a logical form-guided reasoning and retrieval framework based on OpenSPG engine and LLMs. It is used to build logical reasoning and factual Q&A solutions for professional domain knowledge base

modular-image-classification-framework📁main@2026-04-20🌱 Seedling1

A modular deep learning framework for training and evaluating image classification models on datasets like CIFAR-10 and MNIST. Supports configurable CNN architectures, automated training, and performa

ai-dataset-generator📁main@2026-04-21🌱 Seedling1

🤖 Generate tailored AI training datasets quickly and easily, transforming your domain knowledge into essential training data for model fine-tuning.

vector-cache-optimizer📁base-setup@2026-04-21🌱 Seedling1

⚡ Optimize vector searches with a hyper-efficient cache that uses machine learning for faster, smarter data access and reduced costs.

langgraph-rag-assistant📁main@2026-04-21🌱 Seedling1

🚀 Build an enterprise-ready RAG system to enhance technical documentation querying with LangGraph and multi-step reasoning workflows.

azure-ai-projects2.1.0🌱 Seedling

Microsoft Corporation Azure AI Projects Client Library for Python

dlt📁1.25.0🌱 Seedling

dlt is an open-source python-first scalable data loading library that does not require any backend to run.