Search results for "efficient"
FlashInfer: Kernel Library for LLM Serving
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML.
Python framework for fast Vector Space Modelling
Cutting-edge framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
asyncio rate limiter, a leaky bucket implementation
Core component of the Microsoft Graph Python SDK
A Python Progressbar library to provide visual (yet text based) progress to long running operations.
Open World Holidays Framework
SGLang is a fast serving framework for large language models and vision language models.
Automatic MCP server generator for FastAPI applications - converts FastAPI endpoints to MCP tools for LLM integration
Library for building powerful interactive command lines in Python
The leading, most token-efficient MCP server for GitHub source code exploration via tree-sitter AST parsing
The leading, most token-efficient MCP server for documentation exploration and retrieval via structured section indexing
A high-throughput and memory-efficient inference and serving engine for LLMs
Token-efficient MCP server for tabular data retrieval. Index CSV/Excel files, query rows, aggregate — 99%+ token savings vs raw file reads.
AGiXT is a dynamic AI Agent Automation Platform that seamlessly orchestrates instruction management and complex task execution across diverse AI providers, combining adaptive memory and smart features.
Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
A production-ready runtime framework for agent apps with secure tool sandboxing, Agent-as-a-Service APIs, scalable deployment, full-stack observability, and broad framework compatibility.
Official MCP Servers for AWS
A text-based user interface (TUI) client for interacting with MCP servers using Ollama. Features include agent mode, multi-server support, model switching, streaming responses, tool management, and human-in-the-loop.
Build and run agents you can see, understand and trust.
Vibe-Skills is an all-in-one AI skills package. It seamlessly integrates expert-level capabilities and context management into a general-purpose skills package, enabling any AI agent to instantly upgrade.
🌱 A little course on Reinforcement Learning Environments for evaluating and training Language Models
A simple and well-tailored LLM application framework that enables you to seamlessly integrate LLM capabilities in the most "Code-Centric" manner. LLM As Function, Prompt As Code.
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
MCP server for Fabric Real-Time Intelligence (https://aka.ms/fabricrti) supporting tools for Eventhouse (https://aka.ms/eventhouse), Azure Data Explorer (https://aka.ms/adx), and other RTI services
🪨 why use many token when few token do trick — Claude Code skill that cuts 65% of tokens by talking like caveman
🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
The Multi-Agent Custom Automation Engine Solution Accelerator is an AI-driven system that manages a group of AI agents to accomplish tasks based on user input. Powered by Microsoft Agent Framework and Azure.
Accelerating Long Context LLM Inference with Accuracy-Preserving Context Optimization in SGLang, vLLM, llama.cpp, OpenClaw, RAG, and Agentic AI.
OpenAI- and Anthropic-compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend.
An open-source AI assistant framework with skills and agent architecture
Secure AI conversations with documents, video, audio, and more. Personal workspaces for focused context, group spaces for shared insight. Classify docs, reuse prompts, and extend with modular features
AINL helps turn AI from "a smart conversation" into "a structured worker." It is designed for teams building AI workflows that need multiple steps, state and memory, tool use, and repeatable execution.
Enterprise-ready MCP Gateway & Registry that centralizes AI development tools with secure OAuth authentication, dynamic tool discovery, and unified access for both autonomous AI agents and AI coding assistants.
OpenAI-compatible HTTP LLM proxy / gateway for multi-provider inference (Google, Anthropic, OpenAI, PyTorch). Lightweight, extensible Python/FastAPI—use as library or standalone service.
MCP Server for Computer Use in Windows
Harness LLMs with Multi-Agent Programming
RAGLight is a modular framework for Retrieval-Augmented Generation (RAG). It makes it easy to plug in different LLMs, embeddings, and vector stores, and now includes seamless MCP integration.
High-Performance Engine for Multi-Vector Search
Agentic DOCX Redlining Engine. Enables LLMs to read Word documents and inject native Track Changes (w:ins, w:del) and Comments without breaking formatting. Includes Model Context Protocol (MCP) Server
Lad MCP Server: Autonomous code & system design review for AI coding agents (Claude Code, Cursor, Codex, etc.). Features multi-model consensus via OpenRouter and context-aware reviews via Serena.
A thin Cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp
Structured Outputs
Compact, efficient, and extensible long-term memory for LLM agents.
LightAgent: Lightweight AI agent framework with memory, tools & tree-of-thought. Supports multi-agent collaboration, self-learning, and major LLMs (OpenAI/DeepSeek/Qwen). Open-source with MCP/SSE protocols.
Open Framework for AI Agents to play Red Alert through Reinforcement Learning
Agentic RAG R1 Framework via Reinforcement Learning
Benchmark for vector databases.
DeepCode: Open Agentic Coding (Paper2Code & Text2Web & Text2Backend)
Unified framework for building enterprise RAG pipelines with small, specialized models
Command-line telepathy. An autonomous AI agent for your terminal that turns intent into execution (Windows/Linux/Mac)
A-RAG: Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces. State-of-the-art RAG framework with keyword, semantic, and chunk read tools for multi-hop QA.
Supercharge Your LLM Application Evaluations 🚀
Prompt Driven Development Command Line Interface
Self-evolving agent: grows skill tree from 3.3K-line seed, achieving full system control with 6x less token consumption
A curated list of products, benchmarks, and research papers on autonomous code agents. Beyond coding — they're redefining how software changes the world.
Curated list of the best truly open-source AI projects, models, tools, and infrastructure.
Automated security investigation tool using Microsoft MCP Servers, GitHub Copilot, Python Modules and custom copilot-instructions.
A desktop AI agent that controls your local machine: runs commands, manages files, executes code, and browses the web autonomously. Supports Claude, GPT, Gemini, Llama, DeepSeek, and more. .exe available.
OllamaFreeAPI: Free distributed API for Ollama LLMs. Public gateway to our managed Ollama servers with zero-configuration access to 50+ models, automatic load balancing across global nodes, and a free tier.
A tool for resolving PEP 735 Dependency Group data
Zero-dependency browser automation CLI. 70+ commands, 10 test assertions, smart commands (click/fill by text — no LLM needed). MCP server for AI agents with 500x fewer tokens. Extract, observe, script
🏛️ Hermes Gate — Terminal TUI for managing remote Hermes Agent sessions with auto-reconnect, detach support, and zero config
Synthadoc: An open-source LLM knowledge compilation engine that turns raw documents into structured, local-first wikis. A transparent, human-readable alternative to traditional RAG, which can be self-hosted.
📊 LLM Context Benchmarks - A comprehensive benchmarking tool for testing LLMs with varying context sizes using Ollama. Features dual benchmark modes (API/CLI) and automatic hardware detection.
Next generation FEniCS Form Compiler for finite element forms
A comprehensive suite of protocols, meta-prompts, and orchestration tools designed to streamline software development workflows, project management, and team collaboration. Includes the VibeCode Proto
Local AI server with persistent memory, RAG, and multi-backend inference (MLX / llama.cpp / Ollama). Runs entirely on your machine — zero data sent to external services.
🛠️ Automate penetration testing with SploitGPT, an AI agent using Kali Linux tools for efficient security assessments and minimal user input.
Ship customer-facing AI with isolation, spend controls, and provenance.
Lightweight hallucination detection framework for RAG applications
A command-line interface tool for serving LLM using vLLM.
Automatically Update LLM-Agent Papers Daily using GitHub Actions (updates every 12 hours)
CloneMe is an advanced AI platform that builds your digital twin: an AI that chats like you, remembers details, and supports multiple platforms. Customizable, memory-driven, and hot-reloadable.
💰 Optimize your Claude API usage to save 50-90% on costs with batching techniques and efficient request management.
Python LLM-RAG deep agent using LangChain, LangGraph and LangSmith built on Quart web microframework and served using Hypercorn ASGI and WSGI web server.
KAG is a logical-form-guided reasoning and retrieval framework based on the OpenSPG engine and LLMs. It is used to build logical reasoning and factual Q&A solutions for professional domain knowledge bases.
🔍 Accelerate research using a Multi Agent System for efficient context engineering with DeepAgent and LangChain's library.
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
🔍 Enhance code quality with Argus MCP, an AI-driven code review server using a Zero-Trust model for safe and efficient development.
✍️ Revise and enhance novels with ReNovel-AI, your smart tool for story reimagining and memory-driven writing assistance.
Deploy a local, multi-user RAG system to query PDF and DOCX documents using a local LLM without cloud or API dependencies.
🔍 Implement hybrid search using Vespa and FastAPI, blending BM25 and dense semantic retrieval for efficient, accurate information retrieval.
A code generator for array-based code on CPUs and GPUs
⚡ Optimize vector searches with a hyper-efficient cache that uses machine learning for faster, smarter data access and reduced costs.
🤖 Orchestrate AI agents at scale using the MCP framework, enabling seamless context sharing, communication, and integration for enhanced collaboration.
🖼️ Convert images quickly between formats with ImC, a fast and simple CLI tool built on Pillow for efficient batch processing and clean command usage.
🧠 Qualify leads with an AI-driven system that understands intent, asks key questions, and structures quality leads without hardcoding processes.
🧠 Enhance LLM agents with an agentic memory system, featuring automatic note construction, dynamic memory updates, and intelligent semantic retrieval.
A Python-based framework for building multi-agent systems with LLMs. Currently in pre-launch alpha.
🛒 Build a leading-edge e-commerce recommendation system using RAG architecture, Groq Llama 3, LangChain, and AstraDB, deployed on Kubernetes for scalability.
🚀 Build and scale reliable Retrieval-Augmented Generation (RAG) systems with this curated collection of tools, frameworks, and best practices.
🧠 Enhance your AI coding assistant with a universal knowledge base and rules system, compatible with any project and editor.
Web-Use is a CDP-powered browser agent
🧠 Build an offline RAG chatbot to answer questions from PDFs, adapting responses based on user experience levels with a smooth chat interface.
Efficient Retrieval Augmentation and Generation Framework
Intelligent Model Context Protocol (MCP) server for AI-assisted API development. Generate mock servers from OpenAPI specs with advanced logging, performance analytics, and server discovery.
AI News Scraper & Semantic Search: A Python application that scrapes news articles, uses GenAI to generate summaries and identify topics, and provides semantic search capabilities through vector embeddings.
Demo RAG API (FastAPI, OpenAI, ChromaDB, Docker) automatically generated using the OpenAI Codex CLI tool. Highlights Codex's capability for rapid, complex application development.
