freshcrate — Search

Search results for "serving"

35 results found

llama.cpp 📁b8864🌳 Mature⭐103,119

LLM inference in C/C++

c++ ggml by ggerganov C++

npcpy 📁v1.4.21🌳 Mature⭐1,287

The Python library for research and development in NLP, multimodal LLMs, Agents, ML, Knowledge Graphs, and more.

agents ai llm mcp mcp-client mcp-server ollama perplexity python by NPC-Worldwide Python

LeanKG 📁v0.16.5🌱 Seedling⭐32

LeanKG: Stop Burning Tokens. Start Coding Lean.

ai-agent claude claude-code claude-code-plugin concise-context cursor gemini graph-database rust by FreePeak Rust

mentisdb 📁0.9.3.39🌿 Growing⭐56

Memory that lasts and compounds. MentisDB gives agents durable memory so they do not just remember, they improve over time. It stores append-only thought chains plus a Git-like skills registry, lett

ai ai-agents claude codex copilot infinite-memory openai openrouter rust by CloudLLM-ai Rust

llm-rl-environments-lil-course 📁main@2026-04-17🌿 Growing⭐57

🌱 A little course on Reinforcement Learning Environments for evaluating and training Language Models

course grpo language-models llm llm-agent python reinforcement-learning reinforcement-learning-environments rlvr by anakin87 Python

vllm 📁v0.19.1🌿 Growing⭐76,155

A high-throughput and memory-efficient inference and serving engine for LLMs

amd blackwell cuda deepseek deepseek-v3 gpt gpt-oss inference python by vllm-project Python

sample-genai-on-eks-starter-kit 📁v1.1.0🌿 Growing⭐51

A comprehensive toolkit for deploying production-ready Generative AI infrastructure on Amazon EKS. Includes pre-configured components for: 🚀 AI Gateway (LiteLLM) 🤖 LLM Serving (vLLM, SGLang, Ollama

agentic-ai ai-engineering ai-platform javascript kubernetes llm-inference llmops by aws-samples JavaScript

promptfoo 📁code-scan-action-0.1.5🌿 Growing⭐19,943

Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and

ci ci-cd cicd evaluation evaluation-framework llm llm-eval llm-evaluation typescript by promptfoo TypeScript

coding-proxy 📁v0.3.0🌱 Seedling⭐6

A High-Availability, Transparent, and Smart Multi-Vendor Proxy for Claude Code. Support Claude Plans, GitHub Copilot, Google Antigravity, ZAI/GLM, MiniMax, Qwen, Xiaomi, Kimi, Doubao...

antigravity claude-code copilot doubao glm kimi llm-agent minimax python by ThreeFish-AI Python

awesome-prompts 📁main@2026-04-21🌿 Growing⭐7,572

Curated list of ChatGPT prompts from the top-rated GPTs in the GPTs Store. Prompt Engineering, prompt attack & prompt protect. Advanced Prompt Engineering papers.

awesome awesome-list chatgpt gpt4 gpts gptstore papers prompt prompt-engineering by ai-boost

deer-flow 📁main@2026-04-21🌿 Growing⭐60,446

An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skill, subagents and message gateway, it handles different levels of ta

agent agentic agentic-framework agentic-workflow ai ai-agents deep-research harness python by bytedance Python

claude-code-guide 📁main@2026-04-21🌿 Growing⭐3,908

Claude Code Guide - Setup, commands, workflows, agents, skills & tips-n-tricks. Go from beginner to power user!

ai ai-agent ai-agent-tools anthropic-claude claude claude-ai claude-api claude-code by zebbern

ag2 📁v0.12.0🌿 Growing⭐4,383

AG2 (formerly AutoGen): The Open-Source AgentOS. Join us at: https://discord.gg/sNGSwQME3x

a2a ag2 agent-framework agentic agentic-ai ai ai-agents-framework aiagents python by ag2ai Python

endee 📁v1.3.4🌿 Growing⭐933

Endee.io – A high-performance vector database, designed to handle up to 1B vectors on a single node, delivering significant performance gains through optimized indexing and execution. Also available i

ai-search ai-search-engine ann c++ endee hnsw hybrid-search image-search vector by endee-io C++

langgraphjs 📁@langchain/langgraph-sdk@1.8.9🌿 Growing⭐2,775

Framework to build resilient language agents as graphs.

agents ai artificial-intelligence generative-ai llm node typescript by langchain-ai TypeScript

Awesome-Agent-Memory 📁main@2026-04-16🌿 Growing⭐333

Curated systems, benchmarks, papers, etc. on memory for LLMs/MLLMs: long-term context, retrieval, and reasoning.

agent-memory ai-agent ai-agent-memory awesome-agent-memory llm-memory memory memory-management multimodal-llm-memory by TeleAI-UAGI

mcp-go 📁v0.48.0🌿 Growing⭐8,573

A Go implementation of the Model Context Protocol (MCP), enabling seamless integration between LLM applications and external data sources and tools.

go by mark3labs Go

rag-chatbot 📁main@2026-04-14🌿 Growing⭐402

RAG (Retrieval-augmented generation) ChatBot that provides answers based on contextual information extracted from a collection of Markdown files.

chatbot chromadb gpu lamacpp llama3 llm python qwen3-5 rag by umbertogriffo Python

vllm-mlx 📁v0.2.8🌿 Growing⭐798

OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX bac

anthropic apple-silicon audio-processing claude-code computer-vision image-understanding inference llm python by waybarrios Python

AgenticGoKit 📁v0.5.9🌿 Growing⭐134

Open-source Agentic AI framework in Go for building, orchestrating, and deploying intelligent agents. LLM-agnostic, event-driven, with multi-agent workflows, MCP tool discovery, and production-grade o

agentic-ai agentic-ai-development agentic-coding agentic-framework agentic-rag agentic-workflow agentic-workflows agents go by AgenticGoKit Go

next-plaid 📁v1.2.0🌿 Growing⭐331

NextPlaid, ColGREP: Multi-vector search, from database to coding agents.

agentic-rag cli grep multi-vector rust vector-database by lightonai Rust

datagouv-mcp 📁v0.2.23🌿 Growing⭐1,216

Official data.gouv.fr Model Context Protocol (MCP) server that allows AI chatbots to search, explore, and analyze datasets from the French national Open Data platform, directly through conversation.

mcp mcp-server open-data opendata python by datagouv Python

matrixone 📁v3.0.9🌱 Seedling⭐1,834

AI-native HTAP database with Git-for-Data and built-in vector search, serving as the data and memory backbone for intelligent agents and applications.

agents ai-native cloud-native database distributed-database distributed-systems fulltext-support git-for-data go by matrixorigin Go

memora 📁v0.2.27🌱 Seedling⭐386

Give your AI agents persistent memory.

ai-agent claude knowledge-graph llms mcp mcp-server memory python rag by agentic-box Python

spiceai 📁v1.11.5🌱 Seedling⭐2,868

A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.

artificial-intelligence data data-federation developers full-text-search infrastructure llm-inference machine-learning rust by spiceai Rust

teleton-agent 📁v0.8.6🌱 Seedling⭐66

Teleton: Autonomous AI Agent for Telegram & TON Blockchain

ai-agent autonomous-agent gramjs llm nodejs open-source plugin-sdk rag typescript by TONresistor TypeScript

MCP---Agent-Starter-Kit 📁main@2026-04-21🌱 Seedling⭐4

🚀 Build and explore multi-agent AI workflows with ready-to-use projects for document serving, Q/A bots, and orchestration.

ai automation chatbot demo fastapi mcp mcp-server multi-agent python vector-database by fub05 Python

ragflow 📁v0.24.0🌱 Seedling⭐77,784

RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs

agent agentic agentic-ai agentic-workflow ai context-engineering context-retrieval deep-research python by infiniflow Python

agentic-news-generator 📁main@2026-04-20🌱 Seedling⭐1

Generate a custom newspaper with an AI agent based on your favorite YouTube channels.

agentic generative-ai jupyter notebook news video by florianbuetow Jupyter Notebook

multi-agent-orchestration-framework 📁v0.1.0🌱 Seedling⭐26

Modular multi-agent orchestration framework powered by LangGraph and FastAPI.

agent ai-framework fastapi langchain langgraph llm memory multi-agent python by yx-fan Python

coai 📁v4.0.0💤 Dormant⭐9,059

🚀 Next Generation Multi-tenant AI One-Stop Solution. Built-in Admin & Billing System. Enterprise-Grade Unified LLM Gateway Support for 200+ Models And 35+ Providers, Load Balancing w/ Priority-base Rou

ai-gateway api chat chatgpt cross-platform gemini golang llm-gateway typescript by coaidev TypeScript

Government-Citizen-Services-Voice-Agent 📁main@2026-04-15🌱 Seedling⭐1

Autonomous, multilingual AI voice agent using ElevenLabs, LangGraph, and RAG for government services

conversational-ai elevenlabs fastapi govtech langgraph python rag voice-agent by Automaticare Python

vllm-cli 📁v0.2.5💤 Dormant⭐487

A command-line interface tool for serving LLMs using vLLM.

llm llm-inference llm-tools python vllm by Chen-zexi Python

mcp-servers 📁monorepo-latest-placeholder@0.0.0💤 Dormant⭐63

MCP (Model Context Protocol) Servers authored and maintained by the PulseMCP team. We build reliable servers thoughtfully designed specifically for MCP Client-powered workflows.

typescript by pulsemcp TypeScript

dingo 📁v0.9.0⚰️ Archived⭐1,699

A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and

embedding-search embedding-store hybrid-search java key-value-distributed-store mysql-compatibility real-time-semantic-search serving structured-data by dingodb Java