Search results for "gguf"
A high-throughput and memory-efficient inference and serving engine for LLMs
Deploy any AI model, agent, database, RAG, and pipeline locally or remotely in minutes
Accelerating Long Context LLM Inference with Accuracy-Preserving Context Optimization in SGLang, vLLM, llama.cpp, OpenClaw, RAG, and Agentic AI.
vMLX - Home of JANG_Q - Cont Batch, Prefix, Paged, KV Cache Quant, VL - Powers MLX Studio. Image gen/edit, OpenAI/Anth
A thin Cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp
Unified framework for building enterprise RAG pipelines with small, specialized models
A coding agent optimized for smaller LLMs
A sovereign cognitive architecture with IIT 4.0 integrated information, residual-stream affective steering (CAA), Global Workspace Theory, active inference, and 72 consciousness modules, running loca
Curated list of the best truly open-source AI projects, models, tools, and infrastructure.
META-AGENTIC α-AGI — Mission: End-to-end: Identify → Out-Learn → Out-Think → Out-Design → Out-Strategise → Out-Execute
RAG (Retrieval-augmented generation) ChatBot that provides answers based on contextual information extracted from a collection of Markdown files.
MCP server for OpenAI's Deep Research APIs, Gemini Deep Research Agent, and Hugging Face's Open Deep Research
LLM Context Benchmarks - A comprehensive benchmarking tool for testing LLMs with varying context sizes using Ollama. Features dual benchmark modes (API/CLI), automatic hardware detection (optimiz
Local AI server with persistent memory, RAG, and multi-backend inference (MLX / llama.cpp / Ollama). Runs entirely on your machine: zero data sent to external services.
A command-line interface tool for serving LLMs with vLLM.
A local LLM-based autonomous agent orchestration platform featuring async background tasks, context-isolated sub-agents, dynamic knowledge injection, and strict security approval gates (Plan Mode).
Deploy a local, multi-user RAG system to query PDF and DOCX documents using a local LLM without cloud or API dependencies.
Build intelligent, offline LLM agents with LangGraph and llama-cpp-python using this starter template for local, private tool-calling applications.
