Search results for "quantization"
FlashInfer: Kernel Library for LLM Serving
Faster Whisper transcription with CTranslate2
A high-throughput and memory-efficient inference and serving engine for LLMs
OpenAI- and Anthropic-compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend.
vMLX - Home of JANG_Q - Continuous Batching, Prefix Caching, Paged KV Cache, KV Cache Quantization, VL - Powers MLX Studio. Image gen/edit, OpenAI/Anthropic APIs.
High-Performance Engine for Multi-Vector Search
RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry
Benchmark for vector databases.
Unified framework for building enterprise RAG pipelines with small, specialized models
Autonomous AI agent that builds full-stack apps. Local models. No cloud. No API keys. Runs on your hardware.
Curated list of the best truly open-source AI projects, models, tools, and infrastructure.
RAG (Retrieval-augmented generation) ChatBot that provides answers based on contextual information extracted from a collection of Markdown files.
A self-improving AI agent that learns from experience. Runs entirely on a local 9B model. Security by absence — dangerous capabilities were never built.
A local LLM-based autonomous agent orchestration platform featuring async background tasks, context-isolated sub-agents, dynamic knowledge injection, and strict security approval gates (Plan Mode).
⚡ Optimize vector searches with a hyper-efficient cache that uses machine learning for faster, smarter data access and reduced costs.
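Several of the results above (FlashInfer, CTranslate2, KV cache quantization in vMLX) rely on low-bit quantization to cut memory and bandwidth. As a minimal sketch of the underlying idea, not any of these projects' actual implementations, symmetric per-tensor int8 quantization maps the largest absolute weight to 127 and rounds everything else to that grid:

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization (illustrative sketch).

    One scale for the whole tensor: scale maps max|w| onto 127, so the
    round-trip error per element is bounded by scale / 2.
    """
    scale = max(abs(w) for w in weights) / 127.0
    # Round to the nearest int8 step and clamp to the representable range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 codes."""
    return [x * scale for x in q]

weights = [0.1, -0.5, 0.25, 1.27]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Real engines typically quantize per-channel or per-group rather than per-tensor, and pair the int8 (or int4) weights with fused dequantize-and-matmul kernels so the weights never materialize in float.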
