Search results for "experiments"
AI Experiments A public repository of AI/ML projects exploring generative models, NLP, computer vision, and autonomous agents. Includes code, documentation, and demos for educational purposes.
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
Universal AI Development Platform with MCP server integration, multi-provider support, and professional CLI. Build, test, and deploy AI applications with multiple AI providers.
ARIS (Auto-Research-In-Sleep): Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in, works wi
The agent engineering platform
A little course on Reinforcement Learning Environments for evaluating and training Language Models
The implementation for SIGIR 2026: Learning to Retrieve from Agent Trajectories.
Autonomous goal-directed iteration for Gemini CLI. Inspired by Karpathy's autoresearch. Modify → Verify → Keep/Discard → Repeat forever.
Seth's AI Tools: A Unity based front end that uses ComfyUI and LLMs to create stories, images, movies, quizzes and posters
Autonomous Agents (LLMs) research papers. Updated Daily.
Agents and tools for using Quint with LLMs
An autonomous AI agent that runs your deep learning experiments 24/7 while you sleep. Zero-cost monitoring, Leader-Worker architecture, constant-size memory.
Latitude is the open-source agent engineering platform
Agentic RAG R1 Framework via Reinforcement Learning
Convoke extends BMAD Method AI agents with two types of installable modules: Teams bring new agents for a domain, Skills add new capabilities to existing agents. Install them independently or combine
Internal Safety Collapse: Turning the LLM or an AI Agent into a sensitive data generator.
Curated list of ChatGPT prompts from the top-rated GPTs in the GPTs Store. Prompt Engineering, prompt attack & prompt protect. Advanced Prompt Engineering papers.
Multi-Agent workflow running into a Laravel application with Neuron PHP AI framework
Dragon Brain: persistent long-term memory for AI agents via MCP (Model Context Protocol). Knowledge graph (FalkorDB) + vector search (Qdrant) + CUDA GPU embeddings. Works with Claude, Gemini CLI, Cur
META-AGENTIC α-AGI: Mission End-to-end: Identify → Out-Learn → Out-Think → Out-Design → Out-Strategise → Out-Execute
A comprehensive evaluation framework for AI agents and LLM applications.
MaverickMCP - Personal Stock Analysis MCP Server
Evaluation and Tracking for LLM Experiments and AI Agents
Claude Autoresearch Skill: Autonomous goal-directed iteration for Claude Code. Inspired by Karpathy's autoresearch. Modify → Verify → Keep/Discard → Repeat forever.
Unified framework for building enterprise RAG pipelines with small, specialized models
A curated list of awesome works related to high dimensional structure/vector search & database
NextPlaid, ColGREP: Multi-vector search, from database to coding agents.
A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines
Autonomous quantitative trading research platform that transforms stock lists into fully backtested strategies using AI agents, real market data, and mathematical formulations, all without requiring a
Open-Sable is a local-first autonomous agent framework with AGI-inspired cognitive subsystems (goals, memory, metacognition, tool use). It can run continuously on your machine, integrate with chat int
The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controllin
TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.
Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.
Deep research agent built with Neuron PHP AI framework
PolyCouncil is an open-source multi-model deliberation engine for LM Studio. It runs multiple LLMs in parallel, gathers their answers, scores each response using a shared rubric, and produces a final,
CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
A deterministic development harness for Claude Code: MCP workflow engine, enforcement hooks, YAML workflows, and multi-agent consensus (Claude + Codex + Gemini)
Local-first AI agent bootstrap: Playwright Browser MCP + ContextDB for Codex CLI, Claude Code, Gemini CLI, and OpenCode.
RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker
[Community Supported] Perforce P4 MCP Server is a Model Context Protocol (MCP) server that integrates with the Perforce P4 version control system.
File-based autonomous agentic research swarm template (Planner/Worker/Judge) with contracts, workstreams, and deterministic quality gates.
PromptManager is a desktop application for cataloguing, searching, and executing AI prompts, and much more.
Nix packages for AI coding agents and development tools. Automatically updated daily.
aiFlows: The building blocks of your collaborative AI
