Search results for "quantization"
FlashInfer: Kernel Library for LLM Serving
Faster Whisper transcription with CTranslate2
Fast inference engine for Transformer models
SeekStorm: vector & lexical search - in-process library & multi-tenancy server, in Rust.
A high-throughput and memory-efficient inference and serving engine for LLMs
Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a c
A lightweight, lightning-fast, in-process vector database
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
Semantic code searcher and codebase utility
OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX bac
🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade archit
The Mind Palace for AI Agents — Autonomous Cognitive OS with affect-tagged memory (valence engine), token-economic RL (surprisal gate + UBI), Hebbian learning, ACT-R spreading activation, Synapse Engi
NextPlaid, ColGREP: Multi-vector search, from database to coding agents.
vMLX - Home of JANG_Q - Cont Batch, Prefix, Paged, KV Cache Quant, VL - Powers MLX Studio. Image gen/edit, OpenAI/Anth
SQLite-Vector is a cross-platform, ultra-efficient SQLite extension that brings vector search capabilities to your embedded database.
LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
High-Performance Engine for Multi-Vector Search
RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry
Scalable, fast, and disk-friendly vector search in Postgres, the successor of pgvecto.rs.
🌊 The leading agent orchestration platform for Claude. Deploy intelligent multi-agent swarms, coordinate autonomous workflows, and build conversational AI systems. Features enterprise-grade architect
Benchmark for vector databases.
Unified framework for building enterprise RAG pipelines with small, specialized models
A high-performance, in-memory vector database written in Rust, designed for semantic search and top-k nearest neighbor queries in AI-driven applications, with binary file persistence for durability.
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related website
autonomous AI agent that builds full-stack apps. local models. no cloud. no API keys. runs on your hardware.
Curated list of chatgpt prompts from the top-rated GPTs in the GPTs Store. Prompt Engineering, prompt attack & prompt protect. Advanced Prompt Engineering papers.
Curated list of the best truly open-source AI projects, models, tools, and infrastructure.
RAG (Retrieval-augmented generation) ChatBot that provides answers based on contextual information extracted from a collection of Markdown files.
A curated list of awesome works related to high dimensional structure/vector search & database
OasisDB: A minimal and lightweight vector database
The graph-native hybrid retrieval engine for AI and GraphRAG. Graph + Vector + Full-Text in a single transactional engine.
A self-improving AI agent that learns from experience. Runs entirely on a local 9B model. Security by absence — dangerous capabilities were never built.
⚡💾 Vectro — Compress LLM embeddings 🧠🚀 Save memory, speed up retrieval, and keep semantic accuracy 🎯✨ Lightning-fast quantization for Python + Mojo, vector DB friendly 🗄️, and perfect for RAG pip
A curated list of vector database solutions, libraries, and resources for AI applications - https://vectordb.works
A minimal, lightweight structured data store designed for small applications, scripts and automation workflows. Built for simplicity, portability and low overhead.
A local LLM-based autonomous agent orchestration platform featuring async background tasks, context-isolated sub-agents, dynamic knowledge injection, and strict security approval gates (Plan Mode).
⚡ Optimize vector searches with a hyper-efficient cache that uses machine learning for faster, smarter data access and reduced costs.
ASAN: A conceptual architecture for a self-creating (autopoietic), energy-efficient, and governable multi-agent AI system.
