A curated list of vector database solutions, libraries, and resources for AI applications.
This directory was built and is maintained using the Ever Works Directory Builder platform.
The public-facing website is based on the open-source Directory Website Template.
- Concepts & Definitions (201)
- Machine Learning Models (68)
- LLM Tools (58)
- Vector Database Engines (50)
- Cloud Services (15)
- LLM Tools (5)
- Managed Vector Databases (13)
- Managed Vector Databases (22)
- Multi Model & Hybrid Databases (15)
- SDKs & Libraries (63)
- Cloud Services (7)
- Curated Resource Lists (51)
- Curated Resource Lists (14)
- LLM Frameworks (32)
- Research Papers & Surveys (110)
- Vector Database Engines (40)
- Vector Database Extensions (20)
- Benchmarks & Evaluation (27)
- Data Integration & Migration (9)
- Hybrid Search Engines (8)
- LLM Frameworks (3)
- Machine Learning Models (2)
- Multi Model & Hybrid Databases (5)
- SDKs & Libraries (9)
- Vector Database Extensions (8)
- Vector Search Libraries (8)
- Benchmarks & Evaluation (7)
- Cloud & Managed (3)
- Commerce (1)
- Commerce (6)
- Concepts & Definitions (12)
- Data Integration & Migration (12)
- Data Processing (4)
- Edge Database (2)
- Embedded Vector Databases (9)
- Evaluation & Observability (2)
- Graph Database (4)
- Graph Database (1)
- Integrations & Extensions (3)
- LLM Frameworks (1)
- LLM Tools (5)
- Multi-Model & Hybrid Databases (2)
- Open Source Vector Databases (16)
- Open Source (10)
- RAG Frameworks & Pipelines (1)
- Relational Databases (3)
- Relational Databases (2)
- Research Papers & Surveys (23)
- Rust-based Vector Databases (4)
- SDKs & Libraries (49)
- SDKs & Libraries (31)
- Security & Governance (10)
- Security & Governance (1)
- Tools (7)
- Vector Indexing Research (3)
Agentic RAG - An advanced RAG architecture in which an AI agent autonomously decides which questions to ask, which tools to use, when to retrieve information, and how to aggregate results. Represents a major trend in 2026 for more intelligent and adaptive retrieval systems. [Tags: RAG, AI Agents, LLM]
ASMR Technique - Agentic Search and Memory Retrieval technique from Supermemory that uses parallel reader agents and search agents, achieving ~99% accuracy on the LongMemEval benchmark. [Tags: Agent Memory, Retrieval, Multi-Agent]
Cascading Retrieval - Advanced retrieval approach combining dense vectors, sparse vectors, and reranking in a multi-stage pipeline, achieving up to 48% better performance than single-method retrieval. [Tags: Hybrid Search, RAG, Retrieval]
Dense-Sparse Hybrid Embeddings - Combines dense vector embeddings with sparse representations in a single unified model, capturing both semantic meaning (dense) and exact term matching (sparse) for superior retrieval performance. [Tags: Hybrid, Embeddings, Sparse]
HNSW-IF - Hybrid billion-scale vector search method combining HNSW with inverted file indexes, enabling cost-efficient search by keeping centroids in memory while storing vectors on disk. [Tags: HNSW, Disk-Based, Scalability]
Hybrid Search - A search architecture that combines dense vector embeddings (semantic search) with sparse representations such as BM25 (lexical search) to achieve better overall search quality. The industry-standard approach for production RAG systems in 2026. [Tags: Hybrid, Search, Best Practices]
Matryoshka Embeddings - Representation-learning approach that encodes information at multiple granularities, allowing embeddings to be truncated while maintaining performance. Enables up to 14x smaller embeddings and 5x faster search. [Tags: Embeddings, Optimization, Research]
Multimodal RAG - Retrieval-Augmented Generation extended to handle multiple modalities, including text, images, video, and audio. Uses multimodal embeddings such as Gemini Embedding 2 or CLIP to enable cross-modal search and generation. [Tags: Multimodal, RAG, Embeddings]
RecursiveCharacterTextSplitter - LangChain's hierarchical text-chunking strategy, achieving 85-90% accuracy by recursively splitting text with progressively finer separators to preserve semantic boundaries. [Tags: Chunking, Text Processing, RAG]
Vector Index Comparison Guide (Flat, HNSW, IVF) - Comprehensive comparison of vector indexing strategies, including Flat, HNSW, and IVF approaches. Covers performance characteristics, memory requirements, and use-case recommendations for 2026. [Tags: Indexing, Comparison, Best Practices]
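The recursive splitting idea can be sketched in a few lines: split on the coarsest separator first, and descend to finer separators only for pieces that still exceed the chunk size. This is a minimal illustration, not LangChain's actual implementation; the separator list and size limit are illustrative defaults, and small pieces are not re-merged.

```python
def recursive_split(text, max_len=200, separators=("\n\n", "\n", ". ", " ")):
    """Recursively split text, preferring coarse separators over fine ones."""
    if len(text) <= max_len:
        return [text]
    if not separators:
        # No separator left: fall back to a hard character split.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    sep, rest = separators[0], separators[1:]
    chunks = []
    for piece in text.split(sep):
        if len(piece) <= max_len:
            chunks.append(piece)
        else:
            # Piece still too long: retry with the next, finer separator.
            chunks.extend(recursive_split(piece, max_len, rest))
    return [c for c in chunks if c.strip()]
```

Because paragraph and sentence boundaries are tried before bare spaces, chunks tend to end at semantic boundaries rather than mid-sentence.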
ACORN Algorithm - Performant, predicate-agnostic search algorithm for vector embeddings with structured data. Uses two-hop graph expansion to maintain high recall under selective filters in Weaviate. [Tags: ANN, Graph-Based, Filtering]
ACORN Algorithm for Filtered Vector Search - Advanced algorithm designed to make hybrid searches combining metadata filters and vector similarity more efficient; implemented in Apache Solr and other vector search systems. [Tags: Algorithm, Filtering, Hybrid Search, Optimization]
Agent Orchestrator - System that coordinates multiple AI agents working together on complex tasks, managing task distribution, parallel execution, and result synthesis. A key component in ASMR and other multi-agent systems. [Tags: Multi-Agent, Orchestration, Coordination]
Agentic Chunking - An advanced RAG chunking strategy that uses LLMs to dynamically determine optimal document splitting based on semantic meaning and content structure. Analyzes document characteristics and adapts the chunking approach per document for superior retrieval accuracy. [Tags: Chunking, LLM, RAG, Text Processing]
Anisotropic Vector Quantization - An advanced quantization technique introduced by Google's ScaNN that prioritizes preserving the parallel components of vectors rather than minimizing overall distance. Optimized for Maximum Inner Product Search (MIPS) and significantly improves retrieval accuracy. [Tags: Quantization, Algorithm, Compression]
ANN Algorithm Comparison - Placeholder for comprehensive documentation on ann-algorithm-comparison in vector databases and RAG systems. [Tags: Placeholder]
ANN Algorithm Complexity Analysis - Computational-complexity comparison of approximate nearest neighbor algorithms, covering build time, query time, and space complexity. Essential for understanding performance characteristics and choosing appropriate algorithms at different scales. [Tags: Algorithm, Performance, Complexity]
Approximate Nearest Neighbors (ANN) - Algorithms and techniques for finding nearest neighbors in high-dimensional vector spaces with speed-accuracy trade-offs. ANN methods such as HNSW, IVF, and DiskANN enable billion-scale vector search by sacrificing small amounts of recall for massive performance gains over exact search. [Tags: Algorithm, Approximate, Scalability]
Asymmetric Search - A search paradigm where queries and documents are encoded differently, optimized for scenarios where queries are short and documents are long. Common in information retrieval and in modern embedding models designed specifically for search. [Tags: Search, Embeddings, Retrieval]
Async Vector Search - Placeholder for comprehensive documentation on async-vector-search in vector databases and RAG systems. [Tags: Placeholder]
Ball-Tree - Tree-based spatial data structure that organizes vectors using spherical regions instead of axis-aligned splits, making it better suited to high-dimensional data than KD-trees. [Tags: Tree-Based, Indexing, High-Dimensional]
BBQ Binary Quantization - Elasticsearch and Lucene's implementation of the RaBitQ algorithm for 1-bit vector quantization, renamed BBQ. Provides 32x compression with asymptotically optimal error bounds, enabling efficient vector search at massive scale with minimal accuracy loss. [Tags: Quantization, Compression, Elasticsearch]
Binary Quantization - Extreme vector compression technique that converts each dimension to a single bit (0 or 1), achieving 32x memory reduction and enabling ultra-fast Hamming-distance calculations with acceptable accuracy trade-offs. [Tags: Quantization, Compression, Optimization]
Binary Quantization for Vector Search - Compression technique that converts full-precision vectors to binary representations, achieving 32x storage reduction while maintaining 90-95% recall for efficient large-scale vector search. [Tags: Quantization, Compression, Optimization, Binary]
BM25 - Best Matching 25, a ranking function for information retrieval that scores documents by query term frequency with length normalization. A core component of hybrid-search RAG systems combining keyword and semantic search. [Tags: Information Retrieval, Ranking, Keyword Search]
BM25 (Okapi BM25) - Probabilistic ranking function for estimating document relevance to search queries. The industry standard for keyword search, combining term frequency, term rarity, and length normalization in a single scoring model. [Tags: Ranking, Information Retrieval, Keyword Search]
BM42 - Experimental sparse embedding approach from Qdrant that combines exact keyword search with transformer intelligence, integrating sparse and dense vector searches for improved RAG results. [Tags: Sparse, Hybrid Search, Experimental]
Chunk Overlap Strategy - Text-chunking technique that uses 10-20% overlap between consecutive chunks to preserve context continuity and prevent information loss at chunk boundaries, improving retrieval. [Tags: Chunking, RAG, Text Processing]
Chunk Size Optimization - The process of determining optimal text-segment sizes for embedding and retrieval in vector databases. Chunk size significantly impacts RAG quality, balancing complete context (larger chunks) against retrieval precision (smaller chunks), typically 256 to 1024 tokens. [Tags: RAG, Optimization, Chunking]
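A sliding-window sketch of the overlap idea, operating on a pre-tokenized sequence. The 512/64 defaults are illustrative and give 12.5% overlap, within the 10-20% range mentioned above.

```python
def sliding_chunks(tokens, chunk_size=512, overlap=64):
    """Yield fixed-size chunks where each chunk repeats the last
    `overlap` tokens of the previous one, preserving boundary context."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # last chunk already reaches the end of the sequence
    return chunks
```

A fact that straddles a chunk boundary therefore appears whole in at least one chunk, at the cost of storing the overlapping tokens twice.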
Chunking Strategies for RAG - Methods for splitting documents into optimal pieces for vector embedding and retrieval, including fixed-size, recursive, semantic, and agentic chunking approaches. [Tags: RAG, Document Processing, Chunking]
Co-partitioned Vector Index - Indexing strategy in which vector indexes are stored in the same partitions as the corresponding table rows, ensuring data locality and operational advantages in distributed databases. [Tags: Distributed, Indexing, Architecture]
ColBERT and Late Interaction - Multi-vector retrieval architecture in which queries and documents are represented by multiple vectors, enabling fine-grained matching and improved retrieval quality through late-interaction scoring. [Tags: Retrieval, Multi-Vector, Research]
Cold Start Problem - The challenge of making recommendations or performing similarity search when there is insufficient historical data for new users, items, or embeddings. In vector databases and RAG systems, cold start affects new documents without usage data, requiring strategies such as content-based filtering and hybrid approaches. [Tags: Recommendation, Challenge, System Design]
Cold Start Problem in Vector Search - Strategies for handling the cold start problem in vector databases and recommendation systems, including hybrid approaches, popularity-based fallbacks, and collaborative filtering techniques. [Tags: Cold Start, Recommendations, Bootstrapping]
Compression Ratio Optimization - Techniques for optimizing the trade-off between memory usage and accuracy in vector quantization, achieving 5-40x compression in systems such as Mastra's Observational Memory. [Tags: Compression, Optimization, Memory]
Consistency Levels - Configuration options in distributed vector databases that trade off data consistency, availability, and performance. Critical for understanding read/write behavior in production systems with replication. [Tags: Distributed, Performance, Reliability]
Context Engineering - An emerging discipline encompassing the systematic design, construction, and management of the entire information payload provided to an LLM at inference time. It moves beyond crafting single prompts to architecting the complete environment a model uses to reason and respond, including instructions, retrieved knowledge, tools, memory, state, and the user query as structured components. [Tags: LLM Architecture, Retrieval-Augmented Generation, System Design]
Context Precision - RAG evaluation metric assessing the retriever's ability to rank relevant chunks higher than irrelevant ones, measuring context relevance and ranking quality. [Tags: RAG, Evaluation, Metrics]
Context Recall - RAG evaluation metric measuring whether the retrieved context contains all the information required to produce the ideal output, assessing completeness and sufficiency of retrieval. [Tags: RAG, Evaluation, Retrieval]
Context Window - The maximum number of tokens an embedding model or LLM can process in a single input. A critical parameter for vector databases because it bounds chunk sizes; modern models support 512 to 32,000+ tokens for long-document understanding. [Tags: LLM, Embeddings, Architecture]
Context Window Management in RAG - Strategies for managing LLM context windows in RAG applications, including chunk selection, context compression, and techniques for working within token limits while maintaining answer quality. [Tags: Context Window, RAG, Optimization]
Context Window Strategies - Techniques for managing limited LLM context windows in RAG systems, including chunk selection, summarization, and iterative retrieval. As context windows fill with retrieved documents, these strategies ensure the most relevant information reaches the model while respecting token limits. [Tags: RAG, LLM, Optimization]
Contextual Compression - A RAG optimization technique that compresses retrieved documents by extracting only the portions most relevant to the query, reducing token usage and improving LLM response quality by removing irrelevant context. [Tags: RAG, Optimization, Compression]
Contextual Retrieval - Anthropic's RAG technique that prepends chunk-specific explanatory context before embedding, reducing failed retrievals by 49% (67% with reranking). Uses Contextual Embeddings and Contextual BM25. [Tags: RAG, Retrieval, Context]
Contextual Retrieval - A RAG enhancement technique from Anthropic that adds chunk-specific explanatory context to each document chunk before embedding. Reduces retrieval failure rates by 49%, and by 67% when combined with reranking, compared to traditional RAG methods. [Tags: RAG, Chunking, Retrieval, Accuracy]
Cosine Similarity - Fundamental similarity metric for vector search, measuring the cosine of the angle between vectors. Ranges from -1 to 1, with 1 indicating identical direction regardless of magnitude. [Tags: Similarity, Distance Metric, Vector Search]
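The metric itself is one formula, cos(theta) = (a . b) / (|a| |b|). A minimal pure-Python sketch:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Because magnitude cancels out, a vector and any positive scaling of it score exactly 1.0, while orthogonal vectors score 0.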
Cross Encoder Rerankers - Placeholder for comprehensive documentation on cross-encoder-rerankers in vector databases and RAG systems. [Tags: Placeholder]
Cross-Encoder - Neural reranking architecture that examines full query-document pairs simultaneously for deeper semantic understanding, achieving higher accuracy than bi-encoders at the cost of computational efficiency. [Tags: Reranking, Neural Networks, NLP]
Cross-Encoder Reranking - Two-stage retrieval in which initial results from bi-encoder vector search are reranked with more expensive cross-encoder models for higher accuracy. Used in Hindsight and other systems. [Tags: Reranking, Retrieval, Accuracy]
Cross-Modal Search - Search across different modalities using multimodal embeddings, enabling queries such as text-to-image, image-to-text, or text-to-video. Powered by models like CLIP, ImageBind, and Gemini Embedding 2 that map different modalities into a shared embedding space. [Tags: Multimodal, Cross-Modal, Search]
Cursor-Based Pagination - A pagination technique for efficiently scrolling through large vector-database result sets using cursors instead of offsets. Essential for retrieving all vectors in a collection or iterating through search results without performance degradation. [Tags: Pagination, Performance, Best Practices]
Dense Retrieval - An information retrieval approach that uses dense vector representations (embeddings) to encode queries and documents. Unlike sparse methods such as BM25, dense retrieval captures semantic meaning in continuous vector spaces, enabling neural search and forming the foundation of modern RAG systems. [Tags: Retrieval, Embeddings, Neural Search]
Dense Vector Formats - Placeholder for comprehensive documentation on dense-vector-formats in vector databases and RAG systems. [Tags: Placeholder]
Dense vs Sparse Retrieval - Comparison of dense vector retrieval (neural embeddings) and sparse retrieval (keyword-based), including strengths, weaknesses, and when to use hybrid methods. [Tags: Retrieval, Comparison, Search]
Distance Metrics for Vector Search - Overview of distance metrics including Euclidean, cosine similarity, dot product, and Manhattan distance, with guidance on when to use each for optimal retrieval performance. [Tags: Distance Metrics, Similarity, Algorithms]
Document Chunking Strategies - Placeholder for comprehensive documentation on document-chunking-strategies in vector databases and RAG systems. [Tags: Placeholder]
Document Parsing for RAG - Critical preprocessing step for RAG systems involving extraction of text, tables, and images from document formats such as PDF, DOCX, and HTML, using tools like Unstructured, LlamaParse, and PyPDF. [Tags: Document Processing, RAG, Preprocessing]
Dot Product - Vector similarity metric measuring both directional similarity and magnitude. Used by many LLMs during training; equivalent to cosine similarity for normalized data, since it reflects both angle and magnitude information. [Tags: Similarity, Distance Metric, LLM]
Dot Product (Inner Product) - Similarity metric computing the sum of element-wise products between vectors. Efficient for normalized vectors; equivalent to cosine similarity when vectors are unit length. [Tags: Similarity, Distance Metric, Vector Search]
Dot Product Similarity - Vector similarity metric combining angle and magnitude information into a single measurement; equivalent to cosine similarity when vectors are normalized. [Tags: Similarity Search, Metrics, Algorithm]
Early Termination Strategy for HNSW - Optimization technique that lets HNSW vector searches exit early when the candidate queue remains saturated, reducing latency and resource usage with minimal recall impact. [Tags: Optimization, HNSW, Performance, Algorithm]
Embedding API Latency - The time required to generate vector embeddings from text, images, or other data via API calls or local inference. Embedding latency significantly impacts RAG system performance, typically ranging from 10ms (local, batched) to 500ms+ (API, single request) depending on model size and deployment. [Tags: Performance, Latency, Optimization]
Embedding Cache - Caching mechanism for storing and reusing previously computed embeddings to reduce API costs and latency. An essential optimization for production RAG systems processing repeated or similar content. [Tags: Caching, Optimization, Cost Reduction]
Embedding Cache Warming - Placeholder for comprehensive documentation on embedding-cache-warming in vector databases and RAG systems. [Tags: Placeholder]
Embedding Dimension Selection - Guide to choosing optimal embedding dimensions, balancing accuracy, storage cost, and compute, covering Matryoshka embeddings and dimension-reduction techniques. [Tags: Embeddings, Optimization, Dimensions]
Embedding Dimensionality - The size of vector embeddings, typically 384 to 4096 dimensions. Higher dimensions capture more information but increase storage, compute, and latency costs. [Tags: Embeddings, Optimization, Dimensions]
Embedding Dimensions - The size of vector embeddings, typically 128 to 1536 dimensions for text models. Higher dimensions capture more nuanced semantics but require more storage and computation. Modern techniques such as Matryoshka embeddings allow flexible dimension selection from a single model. [Tags: Embeddings, Architecture, Optimization]
Embedding Fine Tuning - Placeholder for comprehensive documentation on embedding-fine-tuning in vector databases and RAG systems. [Tags: Placeholder]
Embedding Model Distillation - Placeholder for comprehensive documentation on embedding-model-distillation in vector databases and RAG systems. [Tags: Placeholder]
Embedding Models Overview - Neural networks that convert text, images, or other data into dense vector representations, enabling semantic understanding by mapping similar concepts to nearby points in vector space. [Tags: Embeddings, Models, Neural Networks]
Euclidean Distance - Straight-line distance between vectors in multidimensional space, sensitive to both magnitude and direction; ideal when embedding magnitude carries important information. [Tags: Similarity Search, Metrics, Algorithm]
Euclidean Distance (L2 Distance) - Distance metric measuring the straight-line distance between vectors in multi-dimensional space. Lower values indicate higher similarity, with 0 meaning identical vectors. [Tags: Distance Metric, Similarity, Vector Search]
Event-Driven Agent Core - Agent architecture pattern in AG2 where agents respond to events rather than polling, enabling better async execution, scalability, and resource efficiency. [Tags: Event-Driven, Agents, Architecture]
Faithfulness - RAG evaluation metric measuring whether generated answers accurately align with the retrieved context without hallucination, ensuring factual grounding of LLM responses. [Tags: RAG, Evaluation, LLM]
Filtered Vector Search - Combining vector similarity search with metadata filtering, enabling queries such as "find similar documents published after 2023 in category Technology". [Tags: Filtering, Metadata, Hybrid Search]
Filtered Vector Search Guide - Complete guide to metadata filtering in vector search, covering pre-filtering, post-filtering, and hybrid approaches. Addresses the Achilles' heel of vector search with modern solutions. [Tags: Filtering, Metadata, Best Practices]
Graph RAG - RAG architecture that combines knowledge graphs with vector databases, enabling multi-hop reasoning, relationship traversal, and structured knowledge representation for more accurate and explainable AI responses. [Tags: Knowledge Graph, RAG, Relationships]
GraphRAG - Retrieval-Augmented Generation approach that combines graph databases with vector search for enhanced context retrieval, using graph structures to capture relationships between entities while leveraging vector embeddings for semantic search. [Tags: RAG, Graph Database, Hybrid Approach]
GraphRAG - Microsoft's approach to RAG that uses knowledge graphs to enhance retrieval, building structured representations of documents to enable better context understanding and multi-hop reasoning for complex queries. [Tags: GraphRAG, Knowledge Graph, Microsoft]
Hamming Distance - A distance metric that counts the positions at which corresponding elements of two vectors differ. Particularly useful for binary vectors and categorical data; commonly paired with binary quantization in vector search. [Tags: Distance Metric, Binary, Similarity]
Hamming Distance for Binary Vector Search - Distance metric for comparing binary vectors using XOR operations, enabling efficient similarity search with dramatically reduced storage compared to full-precision vectors. [Tags: Distance Metric, Binary, Optimization, Local-First]
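Binary quantization and Hamming distance fit together in a few lines: quantize each dimension to one bit by sign, pack the bits into an integer, and compare with XOR plus a popcount. A minimal sketch, with the sign threshold as an illustrative (and common) choice:

```python
def binarize(vector):
    """Binary quantization: 1 bit per dimension (positive -> 1),
    packed into a single int for fast bitwise comparison."""
    bits = 0
    for x in vector:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits

def hamming(a_bits, b_bits):
    """Hamming distance = number of differing bit positions (XOR + popcount)."""
    return bin(a_bits ^ b_bits).count("1")
```

A 1024-dimensional float32 vector (4096 bytes) packs into 128 bytes this way, the 32x reduction cited above, and the distance computation becomes a handful of word-level XORs.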
HCNNG - Hierarchical Clustering-based Nearest Neighbor Graph that uses minimum spanning trees to connect dataset points across multiple hierarchical clusters, performing efficient guided search instead of traditional greedy routing. [Tags: ANN, Graph-Based, Clustering]
HNSW (Hierarchical Navigable Small World) - Graph-based algorithm for approximate nearest neighbor search that maintains a multi-layer graph structure for efficient vector similarity search with logarithmic complexity; widely used in modern vector databases. [Tags: Algorithm, Graph, ANN]
Hybrid Chunking Strategies - Advanced document-chunking approaches that combine multiple methods (fixed-size, semantic, structural) to optimize retrieval in RAG systems, adapting to document characteristics for superior performance. [Tags: Chunking, RAG, Best Practices, Optimization]
Hybrid Search (BM25 + Vector) - A search approach combining traditional keyword-based BM25 ranking with vector similarity search. By leveraging both lexical matching and semantic understanding, hybrid search delivers superior retrieval quality, typically using reciprocal rank fusion (RRF) to merge results from both methods. [Tags: Hybrid Search, BM25, Semantic Search]
Hybrid Search Best Practices - Comprehensive guide to combining BM25 keyword search with vector semantic search using reciprocal rank fusion and reranking. An essential pattern for production RAG systems in 2026. [Tags: Hybrid Search, RAG, Best Practices]
Hybrid Search Techniques - Best practices for combining vector and keyword search using RRF and weighted fusion for improved retrieval accuracy in RAG systems. [Tags: Hybrid Search, Best Practices, RAG]
Hybrid Search with Reciprocal Rank Fusion - Search technique combining BM25 lexical search and semantic vector search, merging results with Reciprocal Rank Fusion (RRF) to balance the precision of keyword matching with the contextual understanding of neural embeddings. [Tags: Hybrid Search, BM25, Ranking]
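RRF itself is only a few lines: each document's fused score is the sum of 1/(k + rank) over every ranked list it appears in, so it needs only rank positions, never the incompatible raw BM25 and cosine scores. A minimal sketch; k=60 is the constant commonly used in the literature:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked result lists (e.g. one from BM25, one from
    vector search); documents ranked well in multiple lists rise to the top."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

For example, fusing a BM25 list ["a", "b", "c"] with a vector-search list ["b", "c", "d"] puts "b" first, since it ranks highly in both lists.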
HybridRAG - The next evolution in RAG systems, combining vector databases for semantic similarity with graph databases for relationship exploration and multi-hop reasoning. [Tags: RAG, Hybrid Search, Graph Vector]
Inner Product Similarity - A vector similarity metric that calculates the dot product of two vectors, combining magnitude and direction. Equivalent to cosine similarity when vectors are normalized; commonly used for Maximum Inner Product Search (MIPS). [Tags: Distance Metric, Similarity, MIPS]
Inverted File Index (IVF) - A vector indexing technique that partitions the vector space into clusters using k-means, then searches only the nearest clusters at query time. The foundation of efficient approximate nearest neighbor search, often combined with product quantization (IVF-PQ). [Tags: Indexing, IVF, Clustering]
IVF - Inverted File Index vector search algorithm that partitions high-dimensional vectors into k-means clusters, enabling efficient nearest neighbor search by restricting queries to relevant clusters and dramatically reducing the search space. [Tags: Algorithm, Indexing, ANN]
IVF (Inverted File Index) - Clustering-based approximate nearest neighbor algorithm that partitions the vector space into Voronoi cells. Searches quickly via a coarse-to-fine strategy; often combined with Product Quantization (IVF-PQ). [Tags: Algorithm, Clustering, ANN]
IVF-FLAT - Inverted File index with FLAT (uncompressed) vectors, partitioning the vector space into centroid-based clusters and offering a balance between search speed and accuracy for approximate nearest neighbor search. [Tags: Indexing, IVF, Clustering]
IVF-FLAT Index - Inverted File Index with flat vectors that uses k-means clustering to partition high-dimensional space into regions, improving search efficiency by narrowing the search to neighboring partitions. [Tags: Indexing, Algorithm, ANN]
IVF-PQ (Inverted File with Product Quantization) - Vector indexing method combining an inverted file index with product quantization for memory-efficient search, reducing storage from 128x4 bytes to 32x1 bytes per vector (1/16th) while maintaining search quality. [Tags: Quantization, Indexing, Compression]
k-NN Search - k-Nearest Neighbors search finds the k closest vectors to a query vector in high-dimensional space. A fundamental operation in vector databases and machine learning, k-NN can be exact (brute force) or approximate (ANN) depending on performance requirements and dataset size. [Tags: Algorithm, Search, Fundamental]
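The exact (brute-force) variant is the baseline every ANN index is measured against: compute the distance from the query to every stored vector and keep the k smallest. A minimal sketch using Euclidean distance (the metric choice is illustrative):

```python
import heapq
import math

def knn_search(query, vectors, k=3):
    """Exact k-NN: rank every stored vector by Euclidean distance
    to the query and return the indices of the k closest."""
    def dist(v):
        return math.sqrt(sum((q - x) ** 2 for q, x in zip(query, v)))
    return heapq.nsmallest(k, range(len(vectors)), key=lambda i: dist(vectors[i]))
```

This is O(n * d) per query, which is exactly why the ANN methods listed above (HNSW, IVF, LSH) exist for large n.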
KD-Tree - Tree-based data structure that organizes vectors through recursive axis-aligned partitioning, enabling logarithmic-time searches for balanced data but struggling in high-dimensional spaces. [Tags: Tree-Based, Indexing, Data Structure]
L2 Normalization (Vector Normalization) - A preprocessing technique that scales vectors to unit length so that all vectors lie on a hypersphere. Essential for making cosine similarity equivalent to inner product, and improves embedding quality in many applications. [Tags: Normalization, Preprocessing, Embeddings]
Late Chunking - Advanced chunking technique for long-context embeddings in which documents are embedded first as a whole and then chunked, preserving contextual information and improving retrieval quality, especially for technical documents. [Tags: Chunking, Embeddings, RAG]
Late Interaction - Retrieval paradigm in which query and document tokens are encoded separately and their interactions are computed at search time, combining the efficiency of bi-encoders with the expressiveness of cross-encoders. [Tags: Retrieval, ColBERT, Neural Search]
Late Interaction Retrieval - A retrieval paradigm where query and document encodings are kept separate until a late interaction stage, enabling more expressive and efficient similarity computations. Pioneered by ColBERT and extended by ColPali and ColQwen, this approach maintains fine-grained representations while enabling fast retrieval. [Tags: Retrieval, Architecture, ColBERT]
Lazy Loading Filesystem - Modal Labs' FUSE-based filesystem implementation that loads container images and dependencies on demand, enabling sub-second container startup for GPU workloads. [Tags: Optimization, Containers, Performance]
LIRE Protocol - Lightweight incremental rebalancing protocol used in SPFresh for billion-scale vector updates, requiring only 1% of the DRAM and <10% of the cores of global-rebuild approaches. [Tags: Indexing, Incremental, Algorithm]
LLM Caching for Vector Search - Caching strategies for LLM and vector search systems, including semantic caching, embedding caching, and response caching, to reduce costs and improve latency in RAG applications. [Tags: Caching, Performance, Cost Optimization]
LLMOps - Operational practices and tooling for deploying, monitoring, and maintaining LLM applications in production, encompassing prompt management, model versioning, evaluation, and observability. [Tags: Operations, MLOps, Production]
Locality Sensitive Hashing (LSH) - Algorithmic technique for approximate nearest neighbor search in high-dimensional spaces, using hash functions that map similar items to the same buckets with high probability. [Tags: Hashing, ANN, Algorithm]
Locally-Adaptive Vector Quantization - Advanced quantization technique that applies per-vector normalization and scalar quantization, adapting quantization bounds individually for each vector. Achieves a four-fold reduction in vector size while maintaining search accuracy, with a 26-37% reduction in overall memory footprint. [Tags: Quantization, Compression, Optimization]
Manhattan Distance - Vector distance metric calculating the sum of absolute differences between vector components. Measures grid-like distance, is robust to outliers, and remains cheap to compute as dimensionality grows. [Tags: Similarity, Distance Metric, High-Dimensional]
Matryoshka Representation Learning - Training technique enabling flexible embedding dimensions by learning representations whose truncated prefixes still perform well, achieving up to 75% cost savings when smaller dimensions are used. [Tags: Embeddings, Optimization, Machine Learning]
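Consuming a Matryoshka embedding at a smaller dimension is simple: keep the leading components and re-normalize. A minimal sketch, assuming the model was actually trained with MRL (otherwise truncation degrades quality sharply):

```python
import math

def truncate_embedding(vector, dims):
    """Matryoshka-style truncation: keep the first `dims` components,
    then re-normalize to unit length so cosine/dot comparisons stay valid."""
    head = vector[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]
```

Truncating, say, a 1536-dimensional embedding to 384 dimensions cuts storage and distance-computation cost by 4x, which is where the cost savings cited above come from.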
EmbeddingsOptimizationMachine Learning - Maximum Inner Product Search (MIPS) - A search problem focused on finding vectors that maximize the inner product with a query vector. Common in recommendation systems and neural search where magnitude carries semantic meaning, requiring specialized algorithms like those in ScaNN. (Read more)
SearchAlgorithmMips - MaxSim - Maximum Similarity late interaction function introduced by ColBERT for ranking. Calculates cosine similarity between query and document token embeddings, keeping maximum score per query token for highly effective long-document retrieval. (Read more)
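The MaxSim score fits in a few lines of NumPy: for each query token embedding, take the maximum cosine similarity over all document token embeddings, then sum. This sketch uses toy, already-normalized random vectors in place of real ColBERT token embeddings.

```python
import numpy as np

def l2norm(m):
    """L2-normalize each row so that dot product equals cosine similarity."""
    return m / np.linalg.norm(m, axis=1, keepdims=True)

def maxsim(query_tokens, doc_tokens):
    """ColBERT-style MaxSim: best document token per query token, summed."""
    sims = query_tokens @ doc_tokens.T        # cosine similarity matrix
    return sims.max(axis=1).sum()

rng = np.random.default_rng(0)
query = l2norm(rng.standard_normal((4, 8)))   # 4 query tokens, dim 8
doc_a = l2norm(query + 0.05 * rng.standard_normal((4, 8)))  # near-copy of query tokens
doc_b = l2norm(rng.standard_normal((6, 8)))   # unrelated document, 6 tokens

# The document containing a close match for every query token scores higher.
score_a = maxsim(query, doc_a)
score_b = maxsim(query, doc_b)
```

With unit-length rows, each per-token maximum is at most 1, so the score is bounded by the number of query tokens.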
- MaxSim Operator - Scoring function used in late interaction models like ColBERT that computes query-document relevance by finding the maximum similarity between each query token and the document tokens, then summing. (Read more)
- Metadata Filtering - The capability to filter vector search results based on metadata attributes before or during similarity search. Metadata filtering enables hybrid queries combining semantic search with structured constraints like dates, categories, tags, or user permissions, crucial for production RAG and search applications. (Read more)
- MSTG (Multi-Stage Tree Graph) - Hierarchical vector index developed by MyScale that overcomes IVF limitations through a multi-layered design, creating multiple layers unlike IVF's single layer of cluster vectors for improved search performance. (Read more)
- Multi Vector Search - Placeholder: comprehensive documentation for multi-vector-search in vector databases and RAG systems. (Read more)
- Multi-Tenancy in Vector Databases - Architectural patterns for isolating and managing data for multiple customers (tenants) in shared vector database infrastructure. Multi-tenancy strategies include namespace isolation, metadata filtering, and separate collections, each offering different trade-offs between performance, cost, and data isolation. (Read more)
- Multi-Tenancy Patterns - Architectural patterns for isolating data between different tenants (customers/organizations) in vector databases. Includes collection-per-tenant, partition-per-tenant, and filter-based approaches with different trade-offs. (Read more)
- Multi-Vector Embeddings - Embedding approach where documents/images are represented by multiple vectors (one per token/patch) rather than a single vector, enabling fine-grained semantic matching. (Read more)
- Multimodal Embeddings - Vector representations mapping different data types (text, images, audio, video) into a shared embedding space. Enables cross-modal search and understanding. (Read more)
- Multimodal Embeddings (CLIP) - Embeddings that map multiple modalities (text, images, video) into a shared vector space, enabling cross-modal search and retrieval using models like CLIP, SigLIP, and voyage-multimodal-3. (Read more)
- MVCC Vector Indexing - Multi-Version Concurrency Control for vector indexes, enabling transactional guarantees and consistent reads in distributed vector databases like YugabyteDB. (Read more)
- Navigable Small World (NSW) - A graph-based approximate nearest neighbor search algorithm that uses both long-range and short-range links to achieve poly-logarithmic search complexity. Foundation for the more advanced HNSW algorithm. (Read more)
- NSW (Navigable Small World) - Graph-based algorithm for approximate nearest neighbor search where vertices represent vectors and edges are constructed heuristically. Foundation for HNSW, with (poly-)logarithmic search complexity using greedy routing. (Read more)
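Greedy routing, the core of NSW/HNSW search, can be sketched in plain Python: start at an entry node and repeatedly hop to whichever neighbor is closest to the query, stopping at a local minimum. The graph below is an illustrative k-nearest-neighbor graph, not a real NSW construction (which inserts nodes incrementally and keeps long-range links).

```python
import numpy as np

def greedy_search(graph, vectors, entry, query):
    """Walk the graph from `entry`, always moving to the neighbor nearest
    the query; stop when no neighbor improves on the current distance."""
    current = entry
    current_dist = np.linalg.norm(vectors[current] - query)
    while True:
        neighbors = graph[current]
        nb_dists = [np.linalg.norm(vectors[nb] - query) for nb in neighbors]
        best = int(np.argmin(nb_dists))
        if nb_dists[best] >= current_dist:
            return current, current_dist      # local minimum reached
        current, current_dist = neighbors[best], nb_dists[best]

rng = np.random.default_rng(1)
vectors = rng.standard_normal((50, 4))

# Toy proximity graph: connect each node to its 5 exact nearest neighbors.
dists = np.linalg.norm(vectors[:, None] - vectors[None, :], axis=2)
graph = {i: list(np.argsort(dists[i])[1:6]) for i in range(50)}

query = rng.standard_normal(4)
node, d = greedy_search(graph, vectors, entry=0, query=query)
```

Because the walk only ever moves to strictly closer nodes, the returned distance is never worse than the entry point's distance; the long-range links of a real NSW exist precisely to make these walks short.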
- Observer-Reflector Architecture - Memory system architecture used in Mastra's Observational Memory, with two background agents that compress and garbage-collect conversation history, achieving 5-40x compression. (Read more)
- Parent Document Retriever - A RAG technique that indexes small chunks for precise matching but retrieves larger parent documents for LLM context. Balances retrieval precision with comprehensive context by separating indexing granularity from context size. (Read more)
- Perpetual Sandbox - Sandbox architecture that maintains state indefinitely while scaling costs to zero during idle periods. Pioneered by Blaxel, with sub-25ms resume times from standby mode. (Read more)
- Plan-Execute-Verify Framework - Agent orchestration pattern used by Emergence AI that plans tasks, executes them with specialized agents, and verifies results to achieve reliable autonomous workflow automation. (Read more)
- Pluggable Orchestration Strategies - Modular agent coordination patterns in AG2 allowing developers to swap orchestration logic without changing agent code, enabling flexible multi-agent workflows. (Read more)
- Product Quantization (PQ) - Vector compression technique that splits high-dimensional vectors into subvectors and quantizes each independently, achieving significant memory reduction while enabling approximate similarity search. (Read more)
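A minimal PQ sketch, assuming a tiny hand-rolled k-means per subspace (real systems use tuned k-means and precomputed distance tables): each 32-float vector is encoded as 4 one-byte codebook indices, a 32x size reduction at the cost of lossy reconstruction.

```python
import numpy as np

rng = np.random.default_rng(7)

def train_codebooks(data, n_sub, k, iters=5):
    """One tiny k-means per subspace -> list of (k, sub_dim) codebooks."""
    books = []
    for sub in np.split(data, n_sub, axis=1):
        centers = sub[rng.choice(len(sub), k, replace=False)]
        for _ in range(iters):
            assign = np.argmin(((sub[:, None] - centers[None]) ** 2).sum(-1), axis=1)
            for c in range(k):
                if (assign == c).any():
                    centers[c] = sub[assign == c].mean(axis=0)
        books.append(centers)
    return books

def pq_encode(vec, books):
    """Compress a vector to one small code index per subspace."""
    parts = np.split(vec, len(books))
    return [int(np.argmin(((b - p) ** 2).sum(-1))) for b, p in zip(books, parts)]

def pq_decode(code, books):
    """Approximate reconstruction: concatenate the chosen centroids."""
    return np.concatenate([b[c] for b, c in zip(books, code)])

data = rng.standard_normal((500, 32)).astype(np.float32)
books = train_codebooks(data, n_sub=4, k=16)

v = data[0]
code = pq_encode(v, books)          # 4 codebook indices instead of 32 floats
approx = pq_decode(code, books)
rel_err = np.linalg.norm(v - approx) / np.linalg.norm(v)
```

The payoff in a real index is that query-to-centroid distances can be precomputed once per query, turning approximate distance computation into table lookups.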
- Product Quantization Compression - Lossy vector compression dividing vectors into subvectors for independent quantization. Achieves 8-64x storage reduction while enabling fast approximate distance computation via lookup tables. (Read more)
- Progressive K-Annealing - Training technique in CSRv2 that stabilizes sparsity learning by gradually increasing sparsity constraints, reducing dead neurons from >80% to ~20%. (Read more)
- Prompt Engineering for RAG - Best practices and techniques for crafting effective prompts in RAG systems, including context formatting, instruction design, few-shot examples, and prompt optimization strategies. (Read more)
- Query Expansion for Vector Search - Techniques to improve retrieval by expanding user queries with synonyms, related terms, and reformulations, including HyDE, query rewriting, and multi-query approaches. (Read more)
- Query Expansion Techniques - Placeholder: comprehensive documentation for query-expansion-techniques in vector databases and RAG systems. (Read more)
- RAG (Retrieval-Augmented Generation) - AI technique combining information retrieval with LLM generation. Retrieves relevant context from a knowledge base before generating responses, reducing hallucinations and enabling grounded answers. (Read more)
- RAG Evaluation Datasets - Placeholder: comprehensive documentation for rag-evaluation-datasets in vector databases and RAG systems. (Read more)
- RAG Evaluation Metrics - Industry-standard metrics for evaluating Retrieval-Augmented Generation systems, including Answer Relevancy, Faithfulness, Context Relevance, Context Recall, and Context Precision, to ensure quality and reliability. (Read more)
- RAG Pipeline Optimization - Placeholder: comprehensive documentation for rag-pipeline-optimization in vector databases and RAG systems. (Read more)
- Range Search - A vector search operation that retrieves all vectors within a specified distance threshold of the query vector, rather than a fixed number of nearest neighbors. Useful for finding all similar items above a quality threshold. (Read more)
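Unlike top-k search, range search returns a data-dependent number of results; a brute-force sketch makes the difference concrete:

```python
import numpy as np

def range_search(vectors, query, radius):
    """Return indices and distances of all vectors within `radius` of query."""
    dists = np.linalg.norm(vectors - query, axis=1)
    idx = np.where(dists <= radius)[0]
    return idx, dists[idx]

rng = np.random.default_rng(3)
vectors = rng.standard_normal((1000, 8))
query = vectors[0] + 0.01          # query very close to the first vector

idx, dists = range_search(vectors, query, radius=1.0)
# Widening the threshold grows the result set instead of re-ranking a fixed k:
idx_wide, _ = range_search(vectors, query, radius=3.0)
```

Production systems implement the same contract on top of an ANN index rather than a full scan.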
- Reciprocal Rank Fusion - Method for combining ranked lists from multiple retrieval systems in hybrid search. A standard technique in RAG pipelines for fusing BM25 and dense vector results before reranking, creating diverse, high-confidence candidate sets. (Read more)
- Reciprocal Rank Fusion (RRF) - Hybrid search algorithm combining results from multiple ranking systems by summing reciprocal ranks, commonly used to merge dense vector search with sparse keyword search for improved retrieval. (Read more)
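RRF gives each document a score of the sum of 1/(k + rank) over the ranked lists being fused, where k is a smoothing constant (60 is the commonly cited default). A pure-Python sketch:

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of doc ids into one combined ranking."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list get the largest increments.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from a keyword (BM25) and a dense retriever:
bm25 = ["d1", "d2", "d3", "d4"]
dense = ["d3", "d1", "d5", "d2"]
fused = reciprocal_rank_fusion([bm25, dense])
# d1 ranks first: it is near the top of both lists.
```

Because RRF only looks at ranks, not raw scores, it needs no score normalization between heterogeneous retrievers, which is why it is the default fusion step in many hybrid search stacks.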
- Reranking - A two-stage retrieval process where initial candidates from vector search are reordered using more sophisticated models like cross-encoders. Reranking significantly improves result quality by applying computationally expensive models to a small set of candidates, commonly used in RAG systems and search applications. (Read more)
- Retrieval Metrics - Performance measurement framework for vector search and RAG systems, including recall, precision, nDCG, MRR, and context relevance metrics, to evaluate retrieval quality and relevance. (Read more)
- Scalar Quantization - Vector compression technique reducing the precision of each vector component from 32-bit floats to 8-bit integers, achieving 4x memory reduction with minimal accuracy loss for vector search. (Read more)
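A minimal scalar-quantization sketch: map each component from observed per-dimension [min, max] bounds onto the 0-255 range of an unsigned byte (real systems differ in how bounds are estimated and clipped, but the encode/decode shape is the same):

```python
import numpy as np

def sq_train(data):
    """Per-dimension min/max bounds used to scale floats into uint8."""
    return data.min(axis=0), data.max(axis=0)

def sq_encode(vec, lo, hi):
    scaled = (vec - lo) / (hi - lo)                      # -> [0, 1]
    return np.clip(np.round(scaled * 255), 0, 255).astype(np.uint8)

def sq_decode(code, lo, hi):
    return lo + (code.astype(np.float32) / 255.0) * (hi - lo)

rng = np.random.default_rng(5)
data = rng.standard_normal((1000, 16)).astype(np.float32)
lo, hi = sq_train(data)

v = data[0]
code = sq_encode(v, lo, hi)        # 16 bytes instead of 64: 4x reduction
approx = sq_decode(code, lo, hi)
max_err = np.abs(v - approx).max() # bounded by half a quantization step
```

The worst-case per-component error is half the step size (hi - lo) / 255, which is why accuracy loss stays small for well-bounded embedding distributions.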
- Self-Querying Retriever - An intelligent retrieval technique where an LLM decomposes natural language queries into semantic search components and metadata filters. Enables more precise retrieval by automatically extracting structured filters from unstructured queries. (Read more)
- Semantic Caching - AI caching pattern that stores vector embeddings of LLM queries and responses, serving cached results when new queries are semantically similar. Cuts LLM costs by 50%+ with millisecond response times versus seconds for fresh calls. (Read more)
- Semantic Caching - A caching technique that uses vector embeddings to identify and reuse responses for semantically similar queries, reducing LLM costs and latency. Unlike traditional caches based on exact matches, semantic caching achieves cache hit ratios of up to 92% by matching queries based on semantic similarity. (Read more)
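The core loop of a semantic cache fits in a few lines: embed the query, compare against cached query embeddings, and return the stored response when similarity clears a threshold. Toy unit vectors stand in for a real embedding model here, and the class name and threshold are illustrative.

```python
import numpy as np

def unit(v):
    """L2-normalize so that a dot product equals cosine similarity."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

class SemanticCache:
    """Cosine-similarity lookup over previously answered queries."""
    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []                     # list of (embedding, response)

    def get(self, emb):
        for cached_emb, response in self.entries:
            if float(cached_emb @ emb) >= self.threshold:
                return response               # cache hit: skip the LLM call
        return None                           # cache miss: call the LLM, then put()

    def put(self, emb, response):
        self.entries.append((emb, response))

cache = SemanticCache(threshold=0.9)
cache.put(unit([1.0, 0.2, 0.0]), "cached answer")

hit = cache.get(unit([1.0, 0.25, 0.02]))   # paraphrase: nearly the same direction
miss = cache.get(unit([0.0, 0.1, 1.0]))    # unrelated query
```

Production implementations replace the linear scan with a vector index and tune the threshold to trade hit rate against the risk of serving a stale or mismatched answer.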
- Semantic Chunking - Advanced text splitting technique using embeddings to divide documents based on semantic content instead of arbitrary positions, preserving cohesive ideas within chunks for improved RAG performance. (Read more)
- Semantic Search - A search approach that understands the meaning and intent of queries rather than just matching keywords. Using vector embeddings and similarity measures, semantic search finds conceptually relevant results even when exact terms don't match, enabling natural language queries and cross-lingual retrieval. (Read more)
- Sentence Window Retrieval - A RAG technique that indexes individual sentences for precise matching but retrieves surrounding sentences (a window) for context. Provides fine-grained retrieval precision while maintaining adequate context for LLM generation. (Read more)
- SOAR (Spilling with Orthogonality-Amplified Residuals) - A major algorithmic advancement to Google's ScaNN that introduces controlled redundancy to the vector index, leading to improved search efficiency. Enables even faster vector search while maintaining or improving accuracy. (Read more)
- Sparse Retrieval - Information retrieval using high-dimensional sparse vectors where most values are zero, typically based on term frequency methods like BM25. Sparse retrieval excels at exact keyword matching and is interpretable, and is often combined with dense retrieval in hybrid search systems for robust performance. (Read more)
- Sparse Vectors (SPLADE) - Learned sparse representation technique that creates interpretable, high-dimensional sparse vectors for text, combining the benefits of traditional keyword search with neural approaches for improved retrieval. (Read more)
- Statistical Binary Quantization - Compression method developed by Timescale researchers that improves on standard Binary Quantization, reducing vector memory footprint by 32x while maintaining high accuracy for filtered searches. (Read more)
- Streaming Vector Indexing - Real-time indexing of vectors as they arrive in a stream, enabling immediate searchability without batch processing delays. Critical for applications requiring up-to-the-second freshness like social media, news, or real-time recommendations. (Read more)
- Supervised Contrastive Objectives - Training technique in CSRv2 that enhances the representational quality of sparse embeddings by using labeled data to guide the learning process. (Read more)
- Temporal Knowledge Graph - Knowledge graph architecture where facts have validity windows showing when they became true and when they were superseded. Core component of Zep AI's Graphiti and other agent memory systems. (Read more)
- Term Expansion - A retrieval technique that expands queries or documents with related but not literally present terms. A key feature of learned sparse models like SPLADE, enabling identification of relevant documents even when exact terms don't match. (Read more)
- Text Chunking Strategies for RAG - Essential techniques for splitting documents into optimal-sized chunks for Retrieval-Augmented Generation, including fixed-size, recursive, semantic, and document-based chunking with overlap strategies to preserve context. (Read more)
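The simplest of these strategies, fixed-size chunking with overlap, can be sketched directly. This version splits on characters for clarity; production pipelines usually count tokens instead.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size chunks; consecutive chunks share
    `overlap` characters so content cut at a boundary keeps context."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "word " * 200                      # 1000-character toy document
chunks = chunk_text(doc, chunk_size=200, overlap=50)

# Each chunk starts with the last `overlap` characters of the previous one:
assert chunks[1][:50] == chunks[0][-50:]
```

The overlap wastes some storage and embedding calls, but it prevents a sentence straddling a chunk boundary from being unretrievable from either side.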
- Text-to-Cypher - Natural language to Cypher query generation for Neo4j graph databases. Enables users to query knowledge graphs using plain English; a critical component of GraphRAG systems for generating graph traversal queries from natural language questions. (Read more)
- Tree-Based Indexing - A family of vector indexing methods using tree data structures like KD-trees, Ball-trees, and R-trees for spatial partitioning. Provides logarithmic search complexity for low to medium dimensional data, though effectiveness decreases in very high dimensions. (Read more)
- TreeAH - Vector index type based on Google's ScaNN algorithm combining a tree-like structure with Asymmetric Hashing quantization, optimized for batch queries with 10x faster index generation and a smaller memory footprint. (Read more)
- UMAP (Uniform Manifold Approximation and Projection) - A non-linear dimensionality reduction technique that preserves both local and global data structure. More scalable than t-SNE while maintaining superior visualization quality and cluster separation for high-dimensional embeddings. (Read more)
- Vamana - Graph-based indexing algorithm powering Microsoft's DiskANN. Uses a flat graph structure with minimized search diameter for efficient disk-based nearest neighbor search, with a 40x GPU speedup available via NVIDIA cuVS. (Read more)
- Vector Compression Techniques - Placeholder: comprehensive documentation for vector-compression-techniques in vector databases and RAG systems. (Read more)
- Vector Database Backup and Recovery - Best practices for backing up vector databases, disaster recovery planning, point-in-time recovery, and data migration strategies to prevent data loss and ensure business continuity. (Read more)
- Vector Database Backup and Recovery Guide - Best practices for backup and disaster recovery in vector databases. Covers full/incremental backups, replication strategies, and cloud-native approaches for safeguarding high-dimensional embeddings. (Read more)
- Vector Database Backup and Restore - Strategies for backing up vector databases and restoring from failures, including snapshots, incremental backups, and disaster recovery. Proper backup procedures are essential for production vector databases to prevent data loss and ensure business continuity in RAG and search systems. (Read more)
- Vector Database Backup Strategies - Best practices and techniques for backing up vector databases including snapshots, continuous backups, and disaster recovery. Critical for production systems to prevent data loss and enable point-in-time recovery. (Read more)
- Vector Database Cost Optimization - Comprehensive strategies for reducing vector database costs through embedding model selection, quantization, caching, and infrastructure choices. Critical for production deployments at scale. (Read more)
- Vector Database Cost Optimization Guide - Comprehensive strategies for reducing vector database costs including storage management, compute optimization, and monitoring. Covers cloud pricing trends and hidden costs in 2026. (Read more)
- Vector Database Deletion and Updates - Strategies for deleting and updating vectors in production systems including soft deletes, versioning, and rebuild patterns. Critical for maintaining data accuracy and handling GDPR/compliance requirements. (Read more)
- Vector Database Migration - Placeholder: comprehensive documentation for vector-database-migration in vector databases and RAG systems. (Read more)
- Vector Database Migration Strategies - Guide to migrating vector databases including export/import procedures, zero-downtime migration patterns, data validation, and strategies for changing providers or versions. (Read more)
- Vector Database Monitoring - Placeholder: comprehensive documentation for vector-database-monitoring in vector databases and RAG systems. (Read more)
- Vector Database Performance Tuning Guide - Comprehensive guide covering index optimization, quantization, caching, and parameter tuning for vector databases. Includes techniques for balancing performance, cost, and accuracy at scale. (Read more)
- Vector Database Schema Design - Best practices for designing vector database schemas including vector dimensions, metadata structure, indexing strategies, and collection organization. Critical for performance, scalability, and maintainability. (Read more)
- Vector Database Security - Placeholder: comprehensive documentation for vector-database-security in vector databases and RAG systems. (Read more)
- Vector Database Sharding - Distributing vector data across multiple nodes for horizontal scaling. Enables handling billions of vectors by partitioning data and parallelizing queries. (Read more)
- Vector Database Sharding Strategies - Approaches for distributing vectors across multiple nodes including horizontal sharding, data partitioning, and routing strategies for scaling vector search to billions of vectors. (Read more)
- Vector Database Testing - Placeholder: comprehensive documentation for vector-database-testing in vector databases and RAG systems. (Read more)
- Vector Database Testing Strategies - Comprehensive testing approaches for vector databases including unit tests, integration tests, performance tests, and chaos engineering for ensuring reliability and quality in production. (Read more)
- Vector Database Use Cases - Applications of vector databases across industries including semantic search, RAG systems, recommendations, anomaly detection, and multimodal search. (Read more)
- Vector Deduplication - Techniques for identifying and removing duplicate or near-duplicate vectors in databases using similarity thresholds. Deduplication reduces storage costs, improves search quality, and prevents redundant results in RAG systems by detecting semantically identical content even when textual representations differ. (Read more)
- Vector Dimensionality - Number of components in an embedding vector, typically ranging from 128 to 4096 dimensions. Higher dimensions can capture more information but increase storage, computation, and costs. Critical design parameter for vector databases. (Read more)
- Vector Dimensionality Reduction - Techniques for reducing embedding dimensions while preserving semantic information, including PCA, random projection, and learned compression methods like Matryoshka embeddings. Dimensionality reduction enables faster search, lower storage costs, and efficient deployment at scale. (Read more)
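Of the techniques mentioned for dimensionality reduction, random projection is the simplest to sketch: multiplying by a scaled random Gaussian matrix approximately preserves pairwise distances (the Johnson-Lindenstrauss lemma) with no training step at all.

```python
import numpy as np

rng = np.random.default_rng(11)

def random_projection(data, out_dim):
    """Project rows to `out_dim` dimensions; the 1/sqrt(out_dim) scaling
    keeps expected norms and pairwise distances roughly unchanged."""
    in_dim = data.shape[1]
    proj = rng.standard_normal((in_dim, out_dim)) / np.sqrt(out_dim)
    return data @ proj

data = rng.standard_normal((200, 512))
reduced = random_projection(data, out_dim=64)   # 8x smaller vectors

# Pairwise distances survive the projection only approximately:
d_orig = np.linalg.norm(data[0] - data[1])
d_red = np.linalg.norm(reduced[0] - reduced[1])
ratio = d_red / d_orig
```

PCA and Matryoshka truncation trade this simplicity for better accuracy at the same target dimension, since they exploit structure in the data instead of ignoring it.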
- Vector Index Build Strategies - Techniques for efficiently building vector indexes including batch construction, incremental updates, and online indexing. Critical for production systems that need to balance indexing speed, search performance, and resource utilization. (Read more)
- Vector Index Rebuild Strategies - Approaches for updating vector database indexes when data changes significantly, including zero-downtime rebuilds, incremental updates, and blue-green deployments. Index rebuilds are necessary when adding large batches of vectors, changing parameters, or optimizing performance in production systems. (Read more)
- Vector Index Sharding - Placeholder: comprehensive documentation for vector-index-sharding in vector databases and RAG systems. (Read more)
- Vector Index Types - Different indexing strategies for vector databases including HNSW, IVF, LSH, and flat indexes. Each type offers different trade-offs between query speed, build time, accuracy, and memory usage. Understanding index types is crucial for optimizing vector database performance at scale. (Read more)
- Vector Normalization - The process of scaling vectors to unit length (L2 normalization) or other standard forms. Normalized vectors enable cosine similarity computation via a simple dot product and are essential for many embedding models and distance metrics used in vector databases. (Read more)
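A tiny NumPy sketch showing why normalization matters in practice: after L2 normalization, cosine similarity reduces to a plain dot product.

```python
import numpy as np

def l2_normalize(v):
    """Scale a vector to unit length (L2 norm = 1)."""
    return v / np.linalg.norm(v)

a = np.array([3.0, 4.0])           # norm 5
b = np.array([1.0, 0.0])

an, bn = l2_normalize(a), l2_normalize(b)

cosine = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
dot_of_normalized = an @ bn        # same value, cheaper per query

assert np.isclose(np.linalg.norm(an), 1.0)
assert np.isclose(cosine, dot_of_normalized)   # 0.6 for these vectors
```

Normalizing once at ingestion time lets the database answer every subsequent cosine query with a dot product, which is also what makes MIPS and cosine search interchangeable on unit vectors.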
- Vector Normalization (L2 Normalization) - Essential preprocessing technique that scales embedding vectors to unit length using the L2 norm, ensuring consistent magnitude and making cosine similarity equivalent to the dot product for faster computation. (Read more)
- Vector Quantization Techniques - Methods for compressing vector embeddings to reduce storage and memory costs. Includes scalar quantization, product quantization, and binary quantization with varying compression-accuracy tradeoffs. (Read more)
- Vector Query Optimization - Techniques for optimizing vector search queries including parameter tuning, result caching, batch queries, and index selection. Critical for achieving production-grade performance and cost efficiency. (Read more)
- Vector Search at the Edge - Techniques and tools for deploying vector search in edge environments including embedded databases, WASM implementations, and edge-optimized models for privacy and low-latency applications. (Read more)
- Vector Search Caching - Strategies for caching vector search results, embeddings, and frequently accessed data to reduce latency and costs in RAG systems. Effective caching can eliminate redundant embedding API calls and vector searches for common queries, significantly improving performance and reducing infrastructure costs. (Read more)
- Vector Search Explain - Placeholder: comprehensive documentation for vector-search-explain in vector databases and RAG systems. (Read more)
- Vector Similarity Metrics - Mathematical measures for comparing vector similarity including cosine similarity (directional), Euclidean distance (geometric), dot product (magnitude + direction), and Manhattan distance (grid-based) for AI and search applications. (Read more)
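The four metrics side by side on toy vectors; note that cosine and dot product are similarities (higher means closer) while Euclidean and Manhattan are distances (lower means closer):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 2.0, 1.0])

cosine = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))  # direction only
euclidean = np.linalg.norm(a - b)                           # straight-line distance
dot = a @ b                                                 # magnitude + direction
manhattan = np.abs(a - b).sum()                             # grid distance
```

For these vectors the dot product is 9 and the Manhattan distance is 3; which metric is appropriate depends on whether the embedding model was trained for it, so it should match the model, not be chosen after the fact.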
- Vector Similarity Search - Finding nearest vectors in high-dimensional space based on distance or similarity metrics. The core operation of vector databases, enabling semantic search, recommendations, and RAG. (Read more)
- Zero-Shot Classification with Embeddings - Using vector embeddings to classify items into categories without training data for those specific categories. Leverages semantic similarity between text and category descriptions for instant classification. (Read more)
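With an embedding model in hand, zero-shot classification is just a nearest-category lookup. Toy unit vectors stand in for real text embeddings here, and the labels are illustrative.

```python
import numpy as np

def unit(v):
    """L2-normalize so a dot product equals cosine similarity."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def zero_shot_classify(item_emb, category_embs, labels):
    """Pick the label whose description embedding is most similar."""
    sims = category_embs @ item_emb          # cosine sims (all unit-length)
    return labels[int(np.argmax(sims))]

labels = ["sports", "finance", "cooking"]
# In practice these would be embeddings of category descriptions:
category_embs = np.stack([unit([1, 0, 0]), unit([0, 1, 0]), unit([0, 0, 1])])

# An item embedding lying closest to the "finance" direction:
item = unit([0.2, 0.9, 0.1])
label = zero_shot_classify(item, category_embs, labels)   # "finance"
```

Adding a new category requires only embedding its description, no retraining, which is the whole appeal of the technique.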
