freshcrate

Search results for "quantization"

41 results found
flashinfer-python 📁 0.6.8.post1 🏛️ Flagship 5,467

FlashInfer: Kernel Library for LLM Serving

torchao 📁 0.17.0 🌳 Mature 2,790

Package for applying ao techniques to GPU models

faster-whisper 📁 1.2.1 🏛️ Flagship 22,327

Faster Whisper transcription with CTranslate2

ctranslate2 📁 4.7.1 🌳 Mature 4,444

Fast inference engine for Transformer models

llama.cpp 📁 b8871 🏛️ Flagship 105,537

LLM inference in C/C++

SeekStorm 📁 v3.0.0 🌳 Mature 1,865

SeekStorm: vector & lexical search - in-process library & multi-tenancy server, in Rust.

vllm 📁 v0.19.1 🏛️ Flagship 77,587

A high-throughput and memory-efficient inference and serving engine for LLMs

weaviate 📁 v1.35.18 🏛️ Flagship 16,051

Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a c

zvec 📁 v0.3.1 🌳 Mature 9,474

A lightweight, lightning-fast, in-process vector database

milvus 📁 v2.6.15 🏛️ Flagship 43,898

Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search

octocode 📁 0.14.0 🌿 Growing 327

Semantic code searcher and codebase utility

vllm-mlx 📁 v0.2.8 🌳 Mature 917

OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX bac

ruflo 📁 v3.5.80 🏛️ Flagship 32,695

🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade archit

prism-mcp 📁 v9.3.0 🌿 Growing 128

The Mind Palace for AI Agents — Autonomous Cognitive OS with affect-tagged memory (valence engine), token-economic RL (surprisal gate + UBI), Hebbian learning, ACT-R spreading activation, Synapse Engi

next-plaid 📁 v1.2.0 🌿 Growing 383

NextPlaid, ColGREP: Multi-vector search, from database to coding agents.

vmlx 📁 v1.3.34 🌿 Growing 348

vMLX - Home of JANG_Q - Cont Batch, Prefix, Paged, KV Cache Quant, VL - Powers MLX Studio. Image gen/edit, OpenAI/Anth

sqlite-vector 📁 0.9.95 🌳 Mature 855

SQLite-Vector is a cross-platform, ultra-efficient SQLite extension that brings vector search capabilities to your embedded database.

LocalAI 📁 v4.1.3 🏛️ Flagship 45,672

LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.

qdrant 📁 v1.17.1 🏛️ Flagship 30,532

Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

fast-plaid 📁 1.4.5 🌿 Growing 245

High-Performance Engine for Multi-Vector Search

cognita 📁 0.0.0 🌳 Mature 4,405

RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry

VectorChord 📁 1.1.1 🌳 Mature 1,646

Scalable, fast, and disk-friendly vector search in Postgres, the successor of pgvecto.rs.

claude-flows 📁 0.0.0 🌿 Growing 94

🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade architect

llmware 📁 v0.4.6 🌿 Growing 14,862

Unified framework for building enterprise RAG pipelines with small, specialized models

vectorizer 📁 vectorizer-3.0.0 🌱 Seedling 21

A high-performance, in-memory vector database written in Rust, designed for semantic search and top-k nearest neighbor queries in AI-driven applications, with binary file persistence for durability.

Awesome-World-Models 📁 main@2026-04-21 🌿 Growing 1,542

A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related website

tsunami 📁 main@2026-04-21 🌱 Seedling 16

autonomous AI agent that builds full-stack apps. local models. no cloud. no API keys. runs on your hardware.

awesome-prompts 📁 main@2026-04-21 🌿 Growing 7,671

Curated list of ChatGPT prompts from the top-rated GPTs in the GPTs Store. Prompt Engineering, prompt attack & prompt protect. Advanced Prompt Engineering papers.

awesome-opensource-ai 📁 main@2026-04-20 🌿 Growing 2,849

Curated list of the best truly open-source AI projects, models, tools, and infrastructure.

rag-chatbot 📁 main@2026-04-14 🌿 Growing 407

RAG (Retrieval-augmented generation) ChatBot that provides answers based on contextual information extracted from a collection of Markdown files.

oasisdb 📁 v0.1.2 🌿 Growing 97

OasisDB: A minimal and lightweight vector database

coordinode 📁 v0.4.1 🌱 Seedling 4

The graph-native hybrid retrieval engine for AI and GraphRAG. Graph + Vector + Full-Text in a single transactional engine.

contemplative-agent 📁 v2.1.0 🌱 Seedling 4

A self-improving AI agent that learns from experience. Runs entirely on a local 9B model. Security by absence — dangerous capabilities were never built.

vectro 📁 v4.8.0 🌱 Seedling 7

⚡💾 Vectro — Compress LLM embeddings 🧠🚀 Save memory, speed up retrieval, and keep semantic accuracy 🎯✨ Lightning-fast quantization for Python + Mojo, vector DB friendly 🗄️, and perfect for RAG pip

awesome-vector-databases 📁 0.0.0 🌱 Seedling 14

A curated list of vector database solutions, libraries, and resources for AI applications - https://vectordb.works

mddb 📁 v2.9.14 🌱 Seedling 3

A minimal, lightweight structured data store designed for small applications, scripts and automation workflows. Built for simplicity, portability and low overhead.

MOP 📁 0.0.0 🌱 Seedling 1

A local LLM-based autonomous agent orchestration platform featuring async background tasks, context-isolated sub-agents, dynamic knowledge injection, and strict security approval gates (Plan Mode).

vector-cache-optimizer 📁 base-setup@2026-04-21 🌱 Seedling 1

⚡ Optimize vector searches with a hyper-efficient cache that uses machine learning for faster, smarter data access and reduced costs.

ASAN-Architecture 📁 0.0.0 🌱 Seedling 6

ASAN: A conceptual architecture for a self-creating (autopoietic), energy-efficient, and governable multi-agent AI system.