freshcrate

Search results for "gguf"

18 results found (Python)
vllm v0.19.1 🏛️ Flagship ⭐77,587

A high-throughput and memory-efficient inference and serving engine for LLMs

llamafarm v0.0.31 🌳 Mature ⭐819

Deploy any AI model, agent, database, RAG, and pipeline locally or remotely in minutes

ContextPilot v0.4.1 🌿 Growing ⭐79

Accelerating Long Context LLM Inference with Accuracy-Preserving Context Optimization in SGLang, vLLM, llama.cpp, OpenClaw, RAG, and Agentic AI.

vmlx v1.3.34 🌿 Growing ⭐348

vMLX - Home of JANG_Q - Cont Batch, Prefix, Paged, KV Cache Quant, VL - Powers MLX Studio. Image gen/edit, OpenAI/Anth

cyllama 0.2.11 🌱 Seedling ⭐25

A thin cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp

llmware v0.4.6 🌿 Growing ⭐14,862

Unified framework for building enterprise RAG pipelines with small, specialized models

aura main@2026-04-21 🌿 Growing ⭐55

A sovereign cognitive architecture with IIT 4.0 integrated information, residual-stream affective steering (CAA), Global Workspace Theory, active inference, and 72 consciousness modules - running loca

awesome-opensource-ai main@2026-04-20 🌿 Growing ⭐2,849

Curated list of the best truly open-source AI projects, models, tools, and infrastructure.

AGI-Alpha-Agent-v0 main@2026-04-18 🌿 Growing ⭐284

META‑AGENTIC α‑AGI 👁️✨ - Mission 🎯 End‑to‑end: Identify 🔍 → Out‑Learn 📚 → Out‑Think 🧠 → Out‑Design 🎨 → Out‑Strategise ♟️ → Out‑Execute ⚡

rag-chatbot main@2026-04-14 🌿 Growing ⭐407

RAG (Retrieval-augmented generation) ChatBot that provides answers based on contextual information extracted from a collection of Markdown files.

deep-research-mcp main@2026-04-13 🌿 Growing ⭐77

MCP server for OpenAI's Deep Research APIs, Gemini Deep Research Agent, and Hugging Face's Open Deep Research

llm_context_benchmarks 0.0.0 🌱 Seedling ⭐59

📊 LLM Context Benchmarks - A comprehensive benchmarking tool for testing LLMs with varying context sizes using Ollama. Features dual benchmark modes (API/CLI), automatic hardware detection (optimiz

server-nexe v1.0.2-beta 🌱 Seedling ⭐9

Local AI server with persistent memory, RAG, and multi-backend inference (MLX / llama.cpp / Ollama). Runs entirely on your machine - zero data sent to external services.

vllm-cli v0.2.5 💀 Dormant ⭐491

A command-line interface tool for serving LLM using vLLM.

MOP 0.0.0 🌱 Seedling ⭐1

A local LLM-based autonomous agent orchestration platform featuring async background tasks, context-isolated sub-agents, dynamic knowledge injection, and strict security approval gates (Plan Mode).

local-rag-server main@2026-04-21 🌱 Seedling ⭐2

Deploy a local, multi-user RAG system to query PDF and DOCX documents using a local LLM without cloud or API dependencies.

langgraph-llama-cpp-starter main@2026-04-21 🌱 Seedling ⭐1

🤖 Build intelligent, offline LLM agents with LangGraph and llama-cpp-python using this starter template for local, private tool-calling applications.