A curated list of vector database solutions, libraries, and resources for AI applications.
This directory was built and is maintained using the Ever Works Directory Builder platform.
The public-facing website is based on the open-source Directory Website Template.
- Concepts & Definitions (201)
- Machine Learning Models (68)
- LLM Tools (58)
- Vector Database Engines (50)
- Cloud Services (15)
- LLM Tools (5)
- Managed Vector Databases (13)
- Managed Vector Databases (22)
- Multi Model & Hybrid Databases (15)
- SDKs & Libraries (63)
- Cloud Services (7)
- Curated Resource Lists (51)
- Curated Resource Lists (14)
- LLM Frameworks (32)
- Research Papers & Surveys (110)
- Vector Database Engines (40)
- Vector Database Extensions (20)
- Benchmarks & Evaluation (27)
- Data Integration & Migration (9)
- Hybrid Search Engines (8)
- LLM Frameworks (3)
- Machine Learning Models (2)
- Multi Model & Hybrid Databases (5)
- SDKs & Libraries (9)
- Vector Database Extensions (8)
- Vector Search Libraries (8)
- Benchmarks & Evaluation (7)
- Cloud & Managed (3)
- Commerce (1)
- Commerce (6)
- Concepts & Definitions (12)
- Data Integration & Migration (12)
- Data Processing (4)
- Edge Database (2)
- Embedded Vector Databases (9)
- Evaluation & Observability (2)
- Graph Database (4)
- Graph Database (1)
- Integrations & Extensions (3)
- LLM Frameworks (1)
- LLM Tools (5)
- Multi-Model & Hybrid Databases (2)
- Open Source Vector Databases (16)
- Open Source (10)
- RAG Frameworks & Pipelines (1)
- Relational Databases (3)
- Relational Databases (2)
- Research Papers & Surveys (23)
- Rust-based Vector Databases (4)
- SDKs & Libraries (49)
- SDKs & Libraries (31)
- Security & Governance (10)
- Security & Governance (1)
- Tools (7)
- Vector Indexing Research (3)
Agentic RAG - An advanced RAG architecture in which an AI agent autonomously decides which questions to ask, which tools to use, when to retrieve information, and how to aggregate results. Represents a major trend in 2026 for more intelligent and adaptive retrieval systems. [Tags: RAG, AI Agents, LLM]
ASMR Technique - Agentic Search and Memory Retrieval technique from Supermemory that uses parallel reader agents and search agents, achieving ~99% accuracy on the LongMemEval benchmark. [Tags: Agent Memory, Retrieval, Multi-Agent]
Cascading Retrieval - Advanced retrieval approach combining dense vectors, sparse vectors, and reranking in a multi-stage pipeline, achieving up to 48% better performance than single-method retrieval. [Tags: Hybrid Search, RAG, Retrieval]
Dense-Sparse Hybrid Embeddings - Combines dense vector embeddings with sparse representations in a single unified model, capturing both semantic meaning (dense) and exact term matching (sparse) for superior retrieval performance. [Tags: Hybrid, Embeddings, Sparse]
HNSW-IF - Hybrid billion-scale vector search method combining HNSW with inverted file indexes, enabling cost-efficient search by keeping centroids in memory while storing vectors on disk. [Tags: HNSW, Disk-Based, Scalability]
Hybrid Search - A search architecture that combines dense vector embeddings (semantic search) with sparse representations such as BM25 (lexical search) to achieve better overall search quality. The industry-standard approach for production RAG systems in 2026. [Tags: Hybrid, Search, Best Practices]
Matryoshka Embeddings - Representation-learning approach that encodes information at multiple granularities, allowing embeddings to be truncated while maintaining performance. Enables up to 14x smaller embeddings and 5x faster search. [Tags: Embeddings, Optimization, Research]
Multimodal RAG - Retrieval-Augmented Generation extended to handle multiple modalities, including text, images, video, and audio. Uses multimodal embeddings such as Gemini Embedding 2 or CLIP to enable cross-modal search and generation. [Tags: Multimodal, RAG, Embeddings]
RecursiveCharacterTextSplitter - LangChain's hierarchical text-chunking strategy, achieving 85-90% accuracy by recursively splitting text with progressively finer separators to preserve semantic boundaries. [Tags: Chunking, Text Processing, RAG]
Vector Index Comparison Guide (Flat, HNSW, IVF) - Comprehensive comparison of vector indexing strategies, including Flat, HNSW, and IVF approaches. Covers performance characteristics, memory requirements, and use-case recommendations for 2026. [Tags: Indexing, Comparison, Best Practices]
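The recursive splitting idea can be sketched in a few lines: split on the coarsest separator first, and descend to finer separators only for pieces that still exceed the chunk size. This is a minimal illustration, not LangChain's actual implementation; the separator list and size limit are illustrative defaults, and small pieces are not re-merged.

```python
def recursive_split(text, max_len=200, separators=("\n\n", "\n", ". ", " ")):
    """Recursively split text, preferring coarse separators over fine ones."""
    if len(text) <= max_len:
        return [text]
    if not separators:
        # No separator left: fall back to a hard character split.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    sep, rest = separators[0], separators[1:]
    chunks = []
    for piece in text.split(sep):
        if len(piece) <= max_len:
            chunks.append(piece)
        else:
            # Piece still too long: retry with the next, finer separator.
            chunks.extend(recursive_split(piece, max_len, rest))
    return [c for c in chunks if c.strip()]
```

Because paragraph and sentence boundaries are tried before bare spaces, chunks tend to end at semantic boundaries rather than mid-sentence.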
ACORN Algorithm - Performant, predicate-agnostic search algorithm for vector embeddings with structured data. Uses two-hop graph expansion to maintain high recall under selective filters in Weaviate. [Tags: ANN, Graph-Based, Filtering]
ACORN Algorithm for Filtered Vector Search - Advanced algorithm designed to make hybrid searches combining metadata filters and vector similarity more efficient; implemented in Apache Solr and other vector search systems. [Tags: Algorithm, Filtering, Hybrid Search, Optimization]
Agent Orchestrator - System that coordinates multiple AI agents working together on complex tasks, managing task distribution, parallel execution, and result synthesis. A key component in ASMR and other multi-agent systems. [Tags: Multi-Agent, Orchestration, Coordination]
Agentic Chunking - An advanced RAG chunking strategy that uses LLMs to dynamically determine optimal document splitting based on semantic meaning and content structure. Analyzes document characteristics and adapts the chunking approach per document for superior retrieval accuracy. [Tags: Chunking, LLM, RAG, Text Processing]
Anisotropic Vector Quantization - An advanced quantization technique introduced by Google's ScaNN that prioritizes preserving the parallel components of vectors rather than minimizing overall distance. Optimized for Maximum Inner Product Search (MIPS) and significantly improves retrieval accuracy. [Tags: Quantization, Algorithm, Compression]
ANN Algorithm Comparison - Placeholder for comprehensive documentation on ann-algorithm-comparison in vector databases and RAG systems. [Tags: Placeholder]
ANN Algorithm Complexity Analysis - Computational-complexity comparison of approximate nearest neighbor algorithms, covering build time, query time, and space complexity. Essential for understanding performance characteristics and choosing appropriate algorithms at different scales. [Tags: Algorithm, Performance, Complexity]
Approximate Nearest Neighbors (ANN) - Algorithms and techniques for finding nearest neighbors in high-dimensional vector spaces with speed-accuracy trade-offs. ANN methods such as HNSW, IVF, and DiskANN enable billion-scale vector search by sacrificing small amounts of recall for massive performance gains over exact search. [Tags: Algorithm, Approximate, Scalability]
Asymmetric Search - A search paradigm where queries and documents are encoded differently, optimized for scenarios where queries are short and documents are long. Common in information retrieval and in modern embedding models designed specifically for search. [Tags: Search, Embeddings, Retrieval]
Async Vector Search - Placeholder for comprehensive documentation on async-vector-search in vector databases and RAG systems. [Tags: Placeholder]
Ball-Tree - Tree-based spatial data structure that organizes vectors using spherical regions instead of axis-aligned splits, making it better suited to high-dimensional data than KD-trees. [Tags: Tree-Based, Indexing, High-Dimensional]
BBQ Binary Quantization - Elasticsearch and Lucene's implementation of the RaBitQ algorithm for 1-bit vector quantization, renamed BBQ. Provides 32x compression with asymptotically optimal error bounds, enabling efficient vector search at massive scale with minimal accuracy loss. [Tags: Quantization, Compression, Elasticsearch]
Binary Quantization - Extreme vector compression technique that converts each dimension to a single bit (0 or 1), achieving 32x memory reduction and enabling ultra-fast Hamming-distance calculations with acceptable accuracy trade-offs. [Tags: Quantization, Compression, Optimization]
Binary Quantization for Vector Search - Compression technique that converts full-precision vectors to binary representations, achieving 32x storage reduction while maintaining 90-95% recall for efficient large-scale vector search. [Tags: Quantization, Compression, Optimization, Binary]
BM25 - Best Matching 25, a ranking function for information retrieval that scores documents by query term frequency with length normalization. A core component of hybrid-search RAG systems combining keyword and semantic search. [Tags: Information Retrieval, Ranking, Keyword Search]
BM25 (Okapi BM25) - Probabilistic ranking function for estimating document relevance to search queries. The industry standard for keyword search, combining term frequency, term rarity, and length normalization in a single scoring model. [Tags: Ranking, Information Retrieval, Keyword Search]
BM42 - Experimental sparse embedding approach from Qdrant that combines exact keyword search with transformer intelligence, integrating sparse and dense vector searches for improved RAG results. [Tags: Sparse, Hybrid Search, Experimental]
Chunk Overlap Strategy - Text-chunking technique that uses 10-20% overlap between consecutive chunks to preserve context continuity and prevent information loss at chunk boundaries, improving retrieval. [Tags: Chunking, RAG, Text Processing]
Chunk Size Optimization - The process of determining optimal text-segment sizes for embedding and retrieval in vector databases. Chunk size significantly impacts RAG quality, balancing complete context (larger chunks) against retrieval precision (smaller chunks), typically 256 to 1024 tokens. [Tags: RAG, Optimization, Chunking]
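A sliding-window sketch of the overlap idea, operating on a pre-tokenized sequence. The 512/64 defaults are illustrative and give 12.5% overlap, within the 10-20% range mentioned above.

```python
def sliding_chunks(tokens, chunk_size=512, overlap=64):
    """Yield fixed-size chunks where each chunk repeats the last
    `overlap` tokens of the previous one, preserving boundary context."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # last chunk already reaches the end of the sequence
    return chunks
```

A fact that straddles a chunk boundary therefore appears whole in at least one chunk, at the cost of storing the overlapping tokens twice.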
Chunking Strategies for RAG - Methods for splitting documents into optimal pieces for vector embedding and retrieval, including fixed-size, recursive, semantic, and agentic chunking approaches. [Tags: RAG, Document Processing, Chunking]
Co-partitioned Vector Index - Indexing strategy in which vector indexes are stored in the same partitions as the corresponding table rows, ensuring data locality and operational advantages in distributed databases. [Tags: Distributed, Indexing, Architecture]
ColBERT and Late Interaction - Multi-vector retrieval architecture in which queries and documents are represented by multiple vectors, enabling fine-grained matching and improved retrieval quality through late-interaction scoring. [Tags: Retrieval, Multi-Vector, Research]
Cold Start Problem - The challenge of making recommendations or performing similarity search when there is insufficient historical data for new users, items, or embeddings. In vector databases and RAG systems, cold start affects new documents without usage data, requiring strategies such as content-based filtering and hybrid approaches. [Tags: Recommendation, Challenge, System Design]
Cold Start Problem in Vector Search - Strategies for handling the cold start problem in vector databases and recommendation systems, including hybrid approaches, popularity-based fallbacks, and collaborative filtering techniques. [Tags: Cold Start, Recommendations, Bootstrapping]
Compression Ratio Optimization - Techniques for optimizing the trade-off between memory usage and accuracy in vector quantization, achieving 5-40x compression in systems such as Mastra's Observational Memory. [Tags: Compression, Optimization, Memory]
Consistency Levels - Configuration options in distributed vector databases that trade off data consistency, availability, and performance. Critical for understanding read/write behavior in production systems with replication. [Tags: Distributed, Performance, Reliability]
Context Engineering - An emerging discipline encompassing the systematic design, construction, and management of the entire information payload provided to an LLM at inference time. It moves beyond crafting single prompts to architecting the complete environment a model uses to reason and respond, including instructions, retrieved knowledge, tools, memory, state, and the user query as structured components. [Tags: LLM Architecture, Retrieval-Augmented Generation, System Design]
Context Precision - RAG evaluation metric assessing the retriever's ability to rank relevant chunks higher than irrelevant ones, measuring context relevance and ranking quality. [Tags: RAG, Evaluation, Metrics]
Context Recall - RAG evaluation metric measuring whether the retrieved context contains all the information required to produce the ideal output, assessing completeness and sufficiency of retrieval. [Tags: RAG, Evaluation, Retrieval]
Context Window - The maximum number of tokens an embedding model or LLM can process in a single input. A critical parameter for vector databases because it bounds chunk sizes; modern models support 512 to 32,000+ tokens for long-document understanding. [Tags: LLM, Embeddings, Architecture]
Context Window Management in RAG - Strategies for managing LLM context windows in RAG applications, including chunk selection, context compression, and techniques for working within token limits while maintaining answer quality. [Tags: Context Window, RAG, Optimization]
Context Window Strategies - Techniques for managing limited LLM context windows in RAG systems, including chunk selection, summarization, and iterative retrieval. As context windows fill with retrieved documents, these strategies ensure the most relevant information reaches the model while respecting token limits. [Tags: RAG, LLM, Optimization]
Contextual Compression - A RAG optimization technique that compresses retrieved documents by extracting only the portions most relevant to the query, reducing token usage and improving LLM response quality by removing irrelevant context. [Tags: RAG, Optimization, Compression]
Contextual Retrieval - Anthropic's RAG technique that prepends chunk-specific explanatory context before embedding, reducing failed retrievals by 49% (67% with reranking). Uses Contextual Embeddings and Contextual BM25. [Tags: RAG, Retrieval, Context]
Contextual Retrieval - A RAG enhancement technique from Anthropic that adds chunk-specific explanatory context to each document chunk before embedding. Reduces retrieval failure rates by 49%, and by 67% when combined with reranking, compared to traditional RAG methods. [Tags: RAG, Chunking, Retrieval, Accuracy]
Cosine Similarity - Fundamental similarity metric for vector search, measuring the cosine of the angle between vectors. Ranges from -1 to 1, with 1 indicating identical direction regardless of magnitude. [Tags: Similarity, Distance Metric, Vector Search]
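The metric itself is one formula, cos(theta) = (a . b) / (|a| |b|). A minimal pure-Python sketch:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Because magnitude cancels out, a vector and any positive scaling of it score exactly 1.0, while orthogonal vectors score 0.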
Cross Encoder Rerankers - Placeholder for comprehensive documentation on cross-encoder-rerankers in vector databases and RAG systems. [Tags: Placeholder]
Cross-Encoder - Neural reranking architecture that examines full query-document pairs simultaneously for deeper semantic understanding, achieving higher accuracy than bi-encoders at the cost of computational efficiency. [Tags: Reranking, Neural Networks, NLP]
Cross-Encoder Reranking - Two-stage retrieval in which initial results from bi-encoder vector search are reranked with more expensive cross-encoder models for higher accuracy. Used in Hindsight and other systems. [Tags: Reranking, Retrieval, Accuracy]
Cross-Modal Search - Search across different modalities using multimodal embeddings, enabling queries such as text-to-image, image-to-text, or text-to-video. Powered by models like CLIP, ImageBind, and Gemini Embedding 2 that map different modalities into a shared embedding space. [Tags: Multimodal, Cross-Modal, Search]
Cursor-Based Pagination - A pagination technique for efficiently scrolling through large vector-database result sets using cursors instead of offsets. Essential for retrieving all vectors in a collection or iterating through search results without performance degradation. [Tags: Pagination, Performance, Best Practices]
Dense Retrieval - An information retrieval approach that uses dense vector representations (embeddings) to encode queries and documents. Unlike sparse methods such as BM25, dense retrieval captures semantic meaning in continuous vector spaces, enabling neural search and forming the foundation of modern RAG systems. [Tags: Retrieval, Embeddings, Neural Search]
Dense Vector Formats - Placeholder for comprehensive documentation on dense-vector-formats in vector databases and RAG systems. [Tags: Placeholder]
Dense vs Sparse Retrieval - Comparison of dense vector retrieval (neural embeddings) and sparse retrieval (keyword-based), including strengths, weaknesses, and when to use hybrid methods. [Tags: Retrieval, Comparison, Search]
Distance Metrics for Vector Search - Overview of distance metrics including Euclidean, cosine similarity, dot product, and Manhattan distance, with guidance on when to use each for optimal retrieval performance. [Tags: Distance Metrics, Similarity, Algorithms]
Document Chunking Strategies - Placeholder for comprehensive documentation on document-chunking-strategies in vector databases and RAG systems. [Tags: Placeholder]
Document Parsing for RAG - Critical preprocessing step for RAG systems involving extraction of text, tables, and images from document formats such as PDF, DOCX, and HTML, using tools like Unstructured, LlamaParse, and PyPDF. [Tags: Document Processing, RAG, Preprocessing]
Dot Product - Vector similarity metric measuring both directional similarity and magnitude. Used by many LLMs during training; equivalent to cosine similarity for normalized data, since it reflects both angle and magnitude information. [Tags: Similarity, Distance Metric, LLM]
Dot Product (Inner Product) - Similarity metric computing the sum of element-wise products between vectors. Efficient for normalized vectors; equivalent to cosine similarity when vectors are unit length. [Tags: Similarity, Distance Metric, Vector Search]
Dot Product Similarity - Vector similarity metric combining angle and magnitude information into a single measurement; equivalent to cosine similarity when vectors are normalized. [Tags: Similarity Search, Metrics, Algorithm]
Early Termination Strategy for HNSW - Optimization technique that lets HNSW vector searches exit early when the candidate queue remains saturated, reducing latency and resource usage with minimal recall impact. [Tags: Optimization, HNSW, Performance, Algorithm]
Embedding API Latency - The time required to generate vector embeddings from text, images, or other data via API calls or local inference. Embedding latency significantly impacts RAG system performance, typically ranging from 10ms (local, batched) to 500ms+ (API, single request) depending on model size and deployment. [Tags: Performance, Latency, Optimization]
Embedding Cache - Caching mechanism for storing and reusing previously computed embeddings to reduce API costs and latency. An essential optimization for production RAG systems processing repeated or similar content. [Tags: Caching, Optimization, Cost Reduction]
Embedding Cache Warming - Placeholder for comprehensive documentation on embedding-cache-warming in vector databases and RAG systems. [Tags: Placeholder]
Embedding Dimension Selection - Guide to choosing optimal embedding dimensions, balancing accuracy, storage cost, and compute, covering Matryoshka embeddings and dimension-reduction techniques. [Tags: Embeddings, Optimization, Dimensions]
Embedding Dimensionality - The size of vector embeddings, typically 384 to 4096 dimensions. Higher dimensions capture more information but increase storage, compute, and latency costs. [Tags: Embeddings, Optimization, Dimensions]
Embedding Dimensions - The size of vector embeddings, typically 128 to 1536 dimensions for text models. Higher dimensions capture more nuanced semantics but require more storage and computation. Modern techniques such as Matryoshka embeddings allow flexible dimension selection from a single model. [Tags: Embeddings, Architecture, Optimization]
Embedding Fine Tuning - Placeholder for comprehensive documentation on embedding-fine-tuning in vector databases and RAG systems. [Tags: Placeholder]
Embedding Model Distillation - Placeholder for comprehensive documentation on embedding-model-distillation in vector databases and RAG systems. [Tags: Placeholder]
Embedding Models Overview - Neural networks that convert text, images, or other data into dense vector representations, enabling semantic understanding by mapping similar concepts to nearby points in vector space. [Tags: Embeddings, Models, Neural Networks]
Euclidean Distance - Straight-line distance between vectors in multidimensional space, sensitive to both magnitude and direction; ideal when embedding magnitude carries important information. [Tags: Similarity Search, Metrics, Algorithm]
Euclidean Distance (L2 Distance) - Distance metric measuring the straight-line distance between vectors in multi-dimensional space. Lower values indicate higher similarity, with 0 meaning identical vectors. [Tags: Distance Metric, Similarity, Vector Search]
Event-Driven Agent Core - Agent architecture pattern in AG2 where agents respond to events rather than polling, enabling better async execution, scalability, and resource efficiency. [Tags: Event-Driven, Agents, Architecture]
Faithfulness - RAG evaluation metric measuring whether generated answers accurately align with the retrieved context without hallucination, ensuring factual grounding of LLM responses. [Tags: RAG, Evaluation, LLM]
Filtered Vector Search - Combining vector similarity search with metadata filtering, enabling queries such as "find similar documents published after 2023 in category Technology". [Tags: Filtering, Metadata, Hybrid Search]
Filtered Vector Search Guide - Complete guide to metadata filtering in vector search, covering pre-filtering, post-filtering, and hybrid approaches. Addresses the Achilles' heel of vector search with modern solutions. [Tags: Filtering, Metadata, Best Practices]
Graph RAG - RAG architecture that combines knowledge graphs with vector databases, enabling multi-hop reasoning, relationship traversal, and structured knowledge representation for more accurate and explainable AI responses. [Tags: Knowledge Graph, RAG, Relationships]
GraphRAG - Retrieval-Augmented Generation approach that combines graph databases with vector search for enhanced context retrieval, using graph structures to capture relationships between entities while leveraging vector embeddings for semantic search. [Tags: RAG, Graph Database, Hybrid Approach]
GraphRAG - Microsoft's approach to RAG that uses knowledge graphs to enhance retrieval, building structured representations of documents to enable better context understanding and multi-hop reasoning for complex queries. [Tags: GraphRAG, Knowledge Graph, Microsoft]
Hamming Distance - A distance metric that counts the positions at which corresponding elements of two vectors differ. Particularly useful for binary vectors and categorical data; commonly paired with binary quantization in vector search. [Tags: Distance Metric, Binary, Similarity]
Hamming Distance for Binary Vector Search - Distance metric for comparing binary vectors using XOR operations, enabling efficient similarity search with dramatically reduced storage compared to full-precision vectors. [Tags: Distance Metric, Binary, Optimization, Local-First]
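Binary quantization and Hamming distance fit together in a few lines: quantize each dimension to one bit by sign, pack the bits into an integer, and compare with XOR plus a popcount. A minimal sketch, with the sign threshold as an illustrative (and common) choice:

```python
def binarize(vector):
    """Binary quantization: 1 bit per dimension (positive -> 1),
    packed into a single int for fast bitwise comparison."""
    bits = 0
    for x in vector:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits

def hamming(a_bits, b_bits):
    """Hamming distance = number of differing bit positions (XOR + popcount)."""
    return bin(a_bits ^ b_bits).count("1")
```

A 1024-dimensional float32 vector (4096 bytes) packs into 128 bytes this way, the 32x reduction cited above, and the distance computation becomes a handful of word-level XORs.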
HCNNG - Hierarchical Clustering-based Nearest Neighbor Graph that uses minimum spanning trees to connect dataset points across multiple hierarchical clusters, performing efficient guided search instead of traditional greedy routing. [Tags: ANN, Graph-Based, Clustering]
HNSW (Hierarchical Navigable Small World) - Graph-based algorithm for approximate nearest neighbor search that maintains a multi-layer graph structure for efficient vector similarity search with logarithmic complexity; widely used in modern vector databases. [Tags: Algorithm, Graph, ANN]
Hybrid Chunking Strategies - Advanced document-chunking approaches that combine multiple methods (fixed-size, semantic, structural) to optimize retrieval in RAG systems, adapting to document characteristics for superior performance. [Tags: Chunking, RAG, Best Practices, Optimization]
Hybrid Search (BM25 + Vector) - A search approach combining traditional keyword-based BM25 ranking with vector similarity search. By leveraging both lexical matching and semantic understanding, hybrid search delivers superior retrieval quality, typically using reciprocal rank fusion (RRF) to merge results from both methods. [Tags: Hybrid Search, BM25, Semantic Search]
Hybrid Search Best Practices - Comprehensive guide to combining BM25 keyword search with vector semantic search using reciprocal rank fusion and reranking. An essential pattern for production RAG systems in 2026. [Tags: Hybrid Search, RAG, Best Practices]
Hybrid Search Techniques - Best practices for combining vector and keyword search using RRF and weighted fusion for improved retrieval accuracy in RAG systems. [Tags: Hybrid Search, Best Practices, RAG]
Hybrid Search with Reciprocal Rank Fusion - Search technique combining BM25 lexical search and semantic vector search, merging results with Reciprocal Rank Fusion (RRF) to balance the precision of keyword matching with the contextual understanding of neural embeddings. [Tags: Hybrid Search, BM25, Ranking]
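RRF itself is only a few lines: each document's fused score is the sum of 1/(k + rank) over every ranked list it appears in, so it needs only rank positions, never the incompatible raw BM25 and cosine scores. A minimal sketch; k=60 is the constant commonly used in the literature:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked result lists (e.g. one from BM25, one from
    vector search); documents ranked well in multiple lists rise to the top."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

For example, fusing a BM25 list ["a", "b", "c"] with a vector-search list ["b", "c", "d"] puts "b" first, since it ranks highly in both lists.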
HybridRAG - The next evolution in RAG systems, combining vector databases for semantic similarity with graph databases for relationship exploration and multi-hop reasoning. [Tags: RAG, Hybrid Search, Graph Vector]
Inner Product Similarity - A vector similarity metric that calculates the dot product of two vectors, combining magnitude and direction. Equivalent to cosine similarity when vectors are normalized; commonly used for Maximum Inner Product Search (MIPS). [Tags: Distance Metric, Similarity, MIPS]
Inverted File Index (IVF) - A vector indexing technique that partitions the vector space into clusters using k-means, then searches only the nearest clusters at query time. The foundation of efficient approximate nearest neighbor search, often combined with product quantization (IVF-PQ). [Tags: Indexing, IVF, Clustering]
IVF - Inverted File Index vector search algorithm that partitions high-dimensional vectors into k-means clusters, enabling efficient nearest neighbor search by restricting queries to relevant clusters and dramatically reducing the search space. [Tags: Algorithm, Indexing, ANN]
IVF (Inverted File Index) - Clustering-based approximate nearest neighbor algorithm that partitions the vector space into Voronoi cells. Searches quickly via a coarse-to-fine strategy; often combined with Product Quantization (IVF-PQ). [Tags: Algorithm, Clustering, ANN]
IVF-FLAT - Inverted File index with FLAT (uncompressed) vectors, partitioning the vector space into centroid-based clusters and offering a balance between search speed and accuracy for approximate nearest neighbor search. [Tags: Indexing, IVF, Clustering]
IVF-FLAT Index - Inverted File Index with flat vectors that uses k-means clustering to partition high-dimensional space into regions, improving search efficiency by narrowing the search to neighboring partitions. [Tags: Indexing, Algorithm, ANN]
IVF-PQ (Inverted File with Product Quantization) - Vector indexing method combining an inverted file index with product quantization for memory-efficient search, reducing storage from 128x4 bytes to 32x1 bytes per vector (1/16th) while maintaining search quality. [Tags: Quantization, Indexing, Compression]
k-NN Search - k-Nearest Neighbors search finds the k closest vectors to a query vector in high-dimensional space. A fundamental operation in vector databases and machine learning, k-NN can be exact (brute force) or approximate (ANN) depending on performance requirements and dataset size. [Tags: Algorithm, Search, Fundamental]
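The exact (brute-force) variant is the baseline every ANN index is measured against: compute the distance from the query to every stored vector and keep the k smallest. A minimal sketch using Euclidean distance (the metric choice is illustrative):

```python
import heapq
import math

def knn_search(query, vectors, k=3):
    """Exact k-NN: rank every stored vector by Euclidean distance
    to the query and return the indices of the k closest."""
    def dist(v):
        return math.sqrt(sum((q - x) ** 2 for q, x in zip(query, v)))
    return heapq.nsmallest(k, range(len(vectors)), key=lambda i: dist(vectors[i]))
```

This is O(n * d) per query, which is exactly why the ANN methods listed above (HNSW, IVF, LSH) exist for large n.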
KD-Tree - Tree-based data structure that organizes vectors through recursive axis-aligned partitioning, enabling logarithmic-time searches for balanced data but struggling in high-dimensional spaces. [Tags: Tree-Based, Indexing, Data Structure]
L2 Normalization (Vector Normalization) - A preprocessing technique that scales vectors to unit length so that all vectors lie on a hypersphere. Essential for making cosine similarity equivalent to inner product, and improves embedding quality in many applications. [Tags: Normalization, Preprocessing, Embeddings]
Late Chunking - Advanced chunking technique for long-context embeddings in which documents are embedded first as a whole and then chunked, preserving contextual information and improving retrieval quality, especially for technical documents. [Tags: Chunking, Embeddings, RAG]
Late Interaction - Retrieval paradigm in which query and document tokens are encoded separately and their interactions are computed at search time, combining the efficiency of bi-encoders with the expressiveness of cross-encoders. [Tags: Retrieval, ColBERT, Neural Search]
Late Interaction Retrieval - A retrieval paradigm where query and document encodings are kept separate until a late interaction stage, enabling more expressive and efficient similarity computations. Pioneered by ColBERT and extended by ColPali and ColQwen, this approach maintains fine-grained representations while enabling fast retrieval. [Tags: Retrieval, Architecture, ColBERT]
Lazy Loading Filesystem - Modal Labs' FUSE-based filesystem implementation that loads container images and dependencies on demand, enabling sub-second container startup for GPU workloads. [Tags: Optimization, Containers, Performance]
LIRE Protocol - Lightweight incremental rebalancing protocol used in SPFresh for billion-scale vector updates, requiring only 1% of the DRAM and <10% of the cores of global-rebuild approaches. [Tags: Indexing, Incremental, Algorithm]
LLM Caching for Vector Search - Caching strategies for LLM and vector search systems, including semantic caching, embedding caching, and response caching, to reduce costs and improve latency in RAG applications. [Tags: Caching, Performance, Cost Optimization]
LLMOps - Operational practices and tooling for deploying, monitoring, and maintaining LLM applications in production, encompassing prompt management, model versioning, evaluation, and observability. [Tags: Operations, MLOps, Production]
Locality Sensitive Hashing (LSH) - Algorithmic technique for approximate nearest neighbor search in high-dimensional spaces, using hash functions that map similar items to the same buckets with high probability. [Tags: Hashing, ANN, Algorithm]
Locally-Adaptive Vector Quantization - Advanced quantization technique that applies per-vector normalization and scalar quantization, adapting quantization bounds individually for each vector. Achieves a four-fold reduction in vector size while maintaining search accuracy, with a 26-37% reduction in overall memory footprint. [Tags: Quantization, Compression, Optimization]
Manhattan Distance - Vector distance metric calculating the sum of absolute differences between vector components. Measures grid-like distance, is robust to outliers, and remains cheap to compute as dimensionality grows. [Tags: Similarity, Distance Metric, High-Dimensional]
Matryoshka Representation Learning - Training technique enabling flexible embedding dimensions by learning representations whose truncated prefixes still perform well, achieving up to 75% cost savings when smaller dimensions are used. [Tags: Embeddings, Optimization, Machine Learning]
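Consuming a Matryoshka embedding at a smaller dimension is simple: keep the leading components and re-normalize. A minimal sketch, assuming the model was actually trained with MRL (otherwise truncation degrades quality sharply):

```python
import math

def truncate_embedding(vector, dims):
    """Matryoshka-style truncation: keep the first `dims` components,
    then re-normalize to unit length so cosine/dot comparisons stay valid."""
    head = vector[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]
```

Truncating, say, a 1536-dimensional embedding to 384 dimensions cuts storage and distance-computation cost by 4x, which is where the cost savings cited above come from.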
EmbeddingsOptimizationMachine Learning - Maximum Inner Product Search (MIPS) - A search problem focused on finding vectors that maximize the inner product with a query vector. Common in recommendation systems and neural search where magnitude carries semantic meaning, requiring specialized algorithms like those in ScaNN. (Read more)
SearchAlgorithmMips - MaxSim - Maximum Similarity late interaction function introduced by ColBERT for ranking. Calculates cosine similarity between query and document token embeddings, keeping maximum score per query token for highly effective long-document retrieval. (Read more)
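The MaxSim score fits in a few lines of NumPy: for each query token embedding, take the maximum cosine similarity over all document token embeddings, then sum. This sketch uses toy, already-normalized random vectors in place of real ColBERT token embeddings.

```python
import numpy as np

def l2norm(m):
    """L2-normalize each row so that dot product equals cosine similarity."""
    return m / np.linalg.norm(m, axis=1, keepdims=True)

def maxsim(query_tokens, doc_tokens):
    """ColBERT-style MaxSim: best document token per query token, summed."""
    sims = query_tokens @ doc_tokens.T        # cosine similarity matrix
    return sims.max(axis=1).sum()

rng = np.random.default_rng(0)
query = l2norm(rng.standard_normal((4, 8)))   # 4 query tokens, dim 8
doc_a = l2norm(query + 0.05 * rng.standard_normal((4, 8)))  # near-copy of query tokens
doc_b = l2norm(rng.standard_normal((6, 8)))   # unrelated document, 6 tokens

# The document containing a close match for every query token scores higher.
score_a = maxsim(query, doc_a)
score_b = maxsim(query, doc_b)
```

With unit-length rows, each per-token maximum is at most 1, so the score is bounded by the number of query tokens.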
- MaxSim Operator - Scoring function used in late interaction models like ColBERT that computes query-document relevance by finding the maximum similarity between each query token and the document tokens, then summing. (Read more)
- Metadata Filtering - The capability to filter vector search results based on metadata attributes before or during similarity search. Metadata filtering enables hybrid queries combining semantic search with structured constraints like dates, categories, tags, or user permissions, crucial for production RAG and search applications. (Read more)
- MSTG (Multi-Stage Tree Graph) - Hierarchical vector index developed by MyScale that overcomes IVF limitations through a multi-layered design, creating multiple layers unlike IVF's single layer of cluster vectors for improved search performance. (Read more)
- Multi Vector Search - Placeholder: comprehensive documentation for multi-vector-search in vector databases and RAG systems. (Read more)
- Multi-Tenancy in Vector Databases - Architectural patterns for isolating and managing data for multiple customers (tenants) in shared vector database infrastructure. Multi-tenancy strategies include namespace isolation, metadata filtering, and separate collections, each offering different trade-offs between performance, cost, and data isolation. (Read more)
- Multi-Tenancy Patterns - Architectural patterns for isolating data between different tenants (customers/organizations) in vector databases. Includes collection-per-tenant, partition-per-tenant, and filter-based approaches with different trade-offs. (Read more)
- Multi-Vector Embeddings - Embedding approach where documents/images are represented by multiple vectors (one per token/patch) rather than a single vector, enabling fine-grained semantic matching. (Read more)
- Multimodal Embeddings - Vector representations mapping different data types (text, images, audio, video) into a shared embedding space. Enables cross-modal search and understanding. (Read more)
- Multimodal Embeddings (CLIP) - Embeddings that map multiple modalities (text, images, video) into a shared vector space, enabling cross-modal search and retrieval using models like CLIP, SigLIP, and voyage-multimodal-3. (Read more)
- MVCC Vector Indexing - Multi-Version Concurrency Control for vector indexes, enabling transactional guarantees and consistent reads in distributed vector databases like YugabyteDB. (Read more)
- Navigable Small World (NSW) - A graph-based approximate nearest neighbor search algorithm that uses both long-range and short-range links to achieve poly-logarithmic search complexity. Foundation for the more advanced HNSW algorithm. (Read more)
- NSW (Navigable Small World) - Graph-based algorithm for approximate nearest neighbor search where vertices represent vectors and edges are constructed heuristically. Foundation for HNSW, with (poly-)logarithmic search complexity using greedy routing. (Read more)
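Greedy routing, the core of NSW/HNSW search, can be sketched in plain Python: start at an entry node and repeatedly hop to whichever neighbor is closest to the query, stopping at a local minimum. The graph below is an illustrative k-nearest-neighbor graph, not a real NSW construction (which inserts nodes incrementally and keeps long-range links).

```python
import numpy as np

def greedy_search(graph, vectors, entry, query):
    """Walk the graph from `entry`, always moving to the neighbor nearest
    the query; stop when no neighbor improves on the current distance."""
    current = entry
    current_dist = np.linalg.norm(vectors[current] - query)
    while True:
        neighbors = graph[current]
        nb_dists = [np.linalg.norm(vectors[nb] - query) for nb in neighbors]
        best = int(np.argmin(nb_dists))
        if nb_dists[best] >= current_dist:
            return current, current_dist      # local minimum reached
        current, current_dist = neighbors[best], nb_dists[best]

rng = np.random.default_rng(1)
vectors = rng.standard_normal((50, 4))

# Toy proximity graph: connect each node to its 5 exact nearest neighbors.
dists = np.linalg.norm(vectors[:, None] - vectors[None, :], axis=2)
graph = {i: list(np.argsort(dists[i])[1:6]) for i in range(50)}

query = rng.standard_normal(4)
node, d = greedy_search(graph, vectors, entry=0, query=query)
```

Because the walk only ever moves to strictly closer nodes, the returned distance is never worse than the entry point's distance; the long-range links of a real NSW exist precisely to make these walks short.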
- Observer-Reflector Architecture - Memory system architecture used in Mastra's Observational Memory, with two background agents that compress and garbage-collect conversation history, achieving 5-40x compression. (Read more)
- Parent Document Retriever - A RAG technique that indexes small chunks for precise matching but retrieves larger parent documents for LLM context. Balances retrieval precision with comprehensive context by separating indexing granularity from context size. (Read more)
- Perpetual Sandbox - Sandbox architecture that maintains state indefinitely while scaling costs to zero during idle periods. Pioneered by Blaxel, with sub-25ms resume times from standby mode. (Read more)
- Plan-Execute-Verify Framework - Agent orchestration pattern used by Emergence AI that plans tasks, executes them with specialized agents, and verifies results to achieve reliable autonomous workflow automation. (Read more)
- Pluggable Orchestration Strategies - Modular agent coordination patterns in AG2 allowing developers to swap orchestration logic without changing agent code, enabling flexible multi-agent workflows. (Read more)
- Product Quantization (PQ) - Vector compression technique that splits high-dimensional vectors into subvectors and quantizes each independently, achieving significant memory reduction while enabling approximate similarity search. (Read more)
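A minimal PQ sketch, assuming a tiny hand-rolled k-means per subspace (real systems use tuned k-means and precomputed distance tables): each 32-float vector is encoded as 4 one-byte codebook indices, a 32x size reduction at the cost of lossy reconstruction.

```python
import numpy as np

rng = np.random.default_rng(7)

def train_codebooks(data, n_sub, k, iters=5):
    """One tiny k-means per subspace -> list of (k, sub_dim) codebooks."""
    books = []
    for sub in np.split(data, n_sub, axis=1):
        centers = sub[rng.choice(len(sub), k, replace=False)]
        for _ in range(iters):
            assign = np.argmin(((sub[:, None] - centers[None]) ** 2).sum(-1), axis=1)
            for c in range(k):
                if (assign == c).any():
                    centers[c] = sub[assign == c].mean(axis=0)
        books.append(centers)
    return books

def pq_encode(vec, books):
    """Compress a vector to one small code index per subspace."""
    parts = np.split(vec, len(books))
    return [int(np.argmin(((b - p) ** 2).sum(-1))) for b, p in zip(books, parts)]

def pq_decode(code, books):
    """Approximate reconstruction: concatenate the chosen centroids."""
    return np.concatenate([b[c] for b, c in zip(books, code)])

data = rng.standard_normal((500, 32)).astype(np.float32)
books = train_codebooks(data, n_sub=4, k=16)

v = data[0]
code = pq_encode(v, books)          # 4 codebook indices instead of 32 floats
approx = pq_decode(code, books)
rel_err = np.linalg.norm(v - approx) / np.linalg.norm(v)
```

The payoff in a real index is that query-to-centroid distances can be precomputed once per query, turning approximate distance computation into table lookups.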
- Product Quantization Compression - Lossy vector compression dividing vectors into subvectors for independent quantization. Achieves 8-64x storage reduction while enabling fast approximate distance computation via lookup tables. (Read more)
- Progressive K-Annealing - Training technique in CSRv2 that stabilizes sparsity learning by gradually increasing sparsity constraints, reducing dead neurons from >80% to ~20%. (Read more)
- Prompt Engineering for RAG - Best practices and techniques for crafting effective prompts in RAG systems, including context formatting, instruction design, few-shot examples, and prompt optimization strategies. (Read more)
- Query Expansion for Vector Search - Techniques to improve retrieval by expanding user queries with synonyms, related terms, and reformulations, including HyDE, query rewriting, and multi-query approaches. (Read more)
- Query Expansion Techniques - Placeholder: comprehensive documentation for query-expansion-techniques in vector databases and RAG systems. (Read more)
- RAG (Retrieval-Augmented Generation) - AI technique combining information retrieval with LLM generation. Retrieves relevant context from a knowledge base before generating responses, reducing hallucinations and enabling grounded answers. (Read more)
- RAG Evaluation Datasets - Placeholder: comprehensive documentation for rag-evaluation-datasets in vector databases and RAG systems. (Read more)
- RAG Evaluation Metrics - Industry-standard metrics for evaluating Retrieval-Augmented Generation systems, including Answer Relevancy, Faithfulness, Context Relevance, Context Recall, and Context Precision, to ensure quality and reliability. (Read more)
- RAG Pipeline Optimization - Placeholder: comprehensive documentation for rag-pipeline-optimization in vector databases and RAG systems. (Read more)
- Range Search - A vector search operation that retrieves all vectors within a specified distance threshold of the query vector, rather than a fixed number of nearest neighbors. Useful for finding all similar items above a quality threshold. (Read more)
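Unlike top-k search, range search returns a data-dependent number of results; a brute-force sketch makes the difference concrete:

```python
import numpy as np

def range_search(vectors, query, radius):
    """Return indices and distances of all vectors within `radius` of query."""
    dists = np.linalg.norm(vectors - query, axis=1)
    idx = np.where(dists <= radius)[0]
    return idx, dists[idx]

rng = np.random.default_rng(3)
vectors = rng.standard_normal((1000, 8))
query = vectors[0] + 0.01          # query very close to the first vector

idx, dists = range_search(vectors, query, radius=1.0)
# Widening the threshold grows the result set instead of re-ranking a fixed k:
idx_wide, _ = range_search(vectors, query, radius=3.0)
```

Production systems implement the same contract on top of an ANN index rather than a full scan.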
- Reciprocal Rank Fusion - Method for combining ranked lists from multiple retrieval systems in hybrid search. A standard technique in RAG pipelines for fusing BM25 and dense vector results before reranking, creating diverse, high-confidence candidate sets. (Read more)
- Reciprocal Rank Fusion (RRF) - Hybrid search algorithm combining results from multiple ranking systems by summing reciprocal ranks, commonly used to merge dense vector search with sparse keyword search for improved retrieval. (Read more)
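RRF gives each document a score of the sum of 1/(k + rank) over the ranked lists being fused, where k is a smoothing constant (60 is the commonly cited default). A pure-Python sketch:

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of doc ids into one combined ranking."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list get the largest increments.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from a keyword (BM25) and a dense retriever:
bm25 = ["d1", "d2", "d3", "d4"]
dense = ["d3", "d1", "d5", "d2"]
fused = reciprocal_rank_fusion([bm25, dense])
# d1 ranks first: it is near the top of both lists.
```

Because RRF only looks at ranks, not raw scores, it needs no score normalization between heterogeneous retrievers, which is why it is the default fusion step in many hybrid search stacks.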
- Reranking - A two-stage retrieval process where initial candidates from vector search are reordered using more sophisticated models like cross-encoders. Reranking significantly improves result quality by applying computationally expensive models to a small set of candidates, commonly used in RAG systems and search applications. (Read more)
- Retrieval Metrics - Performance measurement framework for vector search and RAG systems, including recall, precision, nDCG, MRR, and context relevance metrics, to evaluate retrieval quality and relevance. (Read more)
- Scalar Quantization - Vector compression technique reducing the precision of each vector component from 32-bit floats to 8-bit integers, achieving 4x memory reduction with minimal accuracy loss for vector search. (Read more)
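A minimal scalar-quantization sketch: map each component from observed per-dimension [min, max] bounds onto the 0-255 range of an unsigned byte (real systems differ in how bounds are estimated and clipped, but the encode/decode shape is the same):

```python
import numpy as np

def sq_train(data):
    """Per-dimension min/max bounds used to scale floats into uint8."""
    return data.min(axis=0), data.max(axis=0)

def sq_encode(vec, lo, hi):
    scaled = (vec - lo) / (hi - lo)                      # -> [0, 1]
    return np.clip(np.round(scaled * 255), 0, 255).astype(np.uint8)

def sq_decode(code, lo, hi):
    return lo + (code.astype(np.float32) / 255.0) * (hi - lo)

rng = np.random.default_rng(5)
data = rng.standard_normal((1000, 16)).astype(np.float32)
lo, hi = sq_train(data)

v = data[0]
code = sq_encode(v, lo, hi)        # 16 bytes instead of 64: 4x reduction
approx = sq_decode(code, lo, hi)
max_err = np.abs(v - approx).max() # bounded by half a quantization step
```

The worst-case per-component error is half the step size (hi - lo) / 255, which is why accuracy loss stays small for well-bounded embedding distributions.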
- Self-Querying Retriever - An intelligent retrieval technique where an LLM decomposes natural language queries into semantic search components and metadata filters. Enables more precise retrieval by automatically extracting structured filters from unstructured queries. (Read more)
- Semantic Caching - AI caching pattern that stores vector embeddings of LLM queries and responses, serving cached results when new queries are semantically similar. Cuts LLM costs by 50%+ with millisecond response times versus seconds for fresh calls. (Read more)
- Semantic Caching - A caching technique that uses vector embeddings to identify and reuse responses for semantically similar queries, reducing LLM costs and latency. Unlike traditional caches based on exact matches, semantic caching achieves cache hit ratios of up to 92% by matching queries based on semantic similarity. (Read more)
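The core loop of a semantic cache fits in a few lines: embed the query, compare against cached query embeddings, and return the stored response when similarity clears a threshold. Toy unit vectors stand in for a real embedding model here, and the class name and threshold are illustrative.

```python
import numpy as np

def unit(v):
    """L2-normalize so that a dot product equals cosine similarity."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

class SemanticCache:
    """Cosine-similarity lookup over previously answered queries."""
    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []                     # list of (embedding, response)

    def get(self, emb):
        for cached_emb, response in self.entries:
            if float(cached_emb @ emb) >= self.threshold:
                return response               # cache hit: skip the LLM call
        return None                           # cache miss: call the LLM, then put()

    def put(self, emb, response):
        self.entries.append((emb, response))

cache = SemanticCache(threshold=0.9)
cache.put(unit([1.0, 0.2, 0.0]), "cached answer")

hit = cache.get(unit([1.0, 0.25, 0.02]))   # paraphrase: nearly the same direction
miss = cache.get(unit([0.0, 0.1, 1.0]))    # unrelated query
```

Production implementations replace the linear scan with a vector index and tune the threshold to trade hit rate against the risk of serving a stale or mismatched answer.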
- Semantic Chunking - Advanced text splitting technique using embeddings to divide documents based on semantic content instead of arbitrary positions, preserving cohesive ideas within chunks for improved RAG performance. (Read more)
- Semantic Search - A search approach that understands the meaning and intent of queries rather than just matching keywords. Using vector embeddings and similarity measures, semantic search finds conceptually relevant results even when exact terms don't match, enabling natural language queries and cross-lingual retrieval. (Read more)
- Sentence Window Retrieval - A RAG technique that indexes individual sentences for precise matching but retrieves surrounding sentences (a window) for context. Provides fine-grained retrieval precision while maintaining adequate context for LLM generation. (Read more)
- SOAR (Spilling with Orthogonality-Amplified Residuals) - A major algorithmic advancement to Google's ScaNN that introduces controlled redundancy to the vector index, leading to improved search efficiency. Enables even faster vector search while maintaining or improving accuracy. (Read more)
- Sparse Retrieval - Information retrieval using high-dimensional sparse vectors where most values are zero, typically based on term frequency methods like BM25. Sparse retrieval excels at exact keyword matching and is interpretable, and is often combined with dense retrieval in hybrid search systems for robust performance. (Read more)
- Sparse Vectors (SPLADE) - Learned sparse representation technique that creates interpretable, high-dimensional sparse vectors for text, combining the benefits of traditional keyword search with neural approaches for improved retrieval. (Read more)
- Statistical Binary Quantization - Compression method developed by Timescale researchers that improves on standard Binary Quantization, reducing vector memory footprint by 32x while maintaining high accuracy for filtered searches. (Read more)
- Streaming Vector Indexing - Real-time indexing of vectors as they arrive in a stream, enabling immediate searchability without batch processing delays. Critical for applications requiring up-to-the-second freshness like social media, news, or real-time recommendations. (Read more)
- Supervised Contrastive Objectives - Training technique in CSRv2 that enhances the representational quality of sparse embeddings by using labeled data to guide the learning process. (Read more)
- Temporal Knowledge Graph - Knowledge graph architecture where facts have validity windows showing when they became true and when they were superseded. Core component of Zep AI's Graphiti and other agent memory systems. (Read more)
- Term Expansion - A retrieval technique that expands queries or documents with related but not literally present terms. A key feature of learned sparse models like SPLADE, enabling identification of relevant documents even when exact terms don't match. (Read more)
- Text Chunking Strategies for RAG - Essential techniques for splitting documents into optimal-sized chunks for Retrieval-Augmented Generation, including fixed-size, recursive, semantic, and document-based chunking with overlap strategies to preserve context. (Read more)
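The simplest of these strategies, fixed-size chunking with overlap, can be sketched directly. This version splits on characters for clarity; production pipelines usually count tokens instead.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size chunks; consecutive chunks share
    `overlap` characters so content cut at a boundary keeps context."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "word " * 200                      # 1000-character toy document
chunks = chunk_text(doc, chunk_size=200, overlap=50)

# Each chunk starts with the last `overlap` characters of the previous one:
assert chunks[1][:50] == chunks[0][-50:]
```

The overlap wastes some storage and embedding calls, but it prevents a sentence straddling a chunk boundary from being unretrievable from either side.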
- Text-to-Cypher - Natural language to Cypher query generation for Neo4j graph databases. Enables users to query knowledge graphs using plain English; a critical component of GraphRAG systems for generating graph traversal queries from natural language questions. (Read more)
- Tree-Based Indexing - A family of vector indexing methods using tree data structures like KD-trees, Ball-trees, and R-trees for spatial partitioning. Provides logarithmic search complexity for low to medium dimensional data, though effectiveness decreases in very high dimensions. (Read more)
- TreeAH - Vector index type based on Google's ScaNN algorithm combining a tree-like structure with Asymmetric Hashing quantization, optimized for batch queries with 10x faster index generation and a smaller memory footprint. (Read more)
- UMAP (Uniform Manifold Approximation and Projection) - A non-linear dimensionality reduction technique that preserves both local and global data structure. More scalable than t-SNE while maintaining superior visualization quality and cluster separation for high-dimensional embeddings. (Read more)
- Vamana - Graph-based indexing algorithm powering Microsoft's DiskANN. Uses a flat graph structure with minimized search diameter for efficient disk-based nearest neighbor search, with a 40x GPU speedup available via NVIDIA cuVS. (Read more)
- Vector Compression Techniques - Placeholder: comprehensive documentation for vector-compression-techniques in vector databases and RAG systems. (Read more)
- Vector Database Backup and Recovery - Best practices for backing up vector databases, disaster recovery planning, point-in-time recovery, and data migration strategies to prevent data loss and ensure business continuity. (Read more)
- Vector Database Backup and Recovery Guide - Best practices for backup and disaster recovery in vector databases. Covers full/incremental backups, replication strategies, and cloud-native approaches for safeguarding high-dimensional embeddings. (Read more)
- Vector Database Backup and Restore - Strategies for backing up vector databases and restoring from failures, including snapshots, incremental backups, and disaster recovery. Proper backup procedures are essential for production vector databases to prevent data loss and ensure business continuity in RAG and search systems. (Read more)
- Vector Database Backup Strategies - Best practices and techniques for backing up vector databases including snapshots, continuous backups, and disaster recovery. Critical for production systems to prevent data loss and enable point-in-time recovery. (Read more)
- Vector Database Cost Optimization - Comprehensive strategies for reducing vector database costs through embedding model selection, quantization, caching, and infrastructure choices. Critical for production deployments at scale. (Read more)
- Vector Database Cost Optimization Guide - Comprehensive strategies for reducing vector database costs including storage management, compute optimization, and monitoring. Covers cloud pricing trends and hidden costs in 2026. (Read more)
- Vector Database Deletion and Updates - Strategies for deleting and updating vectors in production systems including soft deletes, versioning, and rebuild patterns. Critical for maintaining data accuracy and handling GDPR/compliance requirements. (Read more)
- Vector Database Migration - Placeholder: comprehensive documentation for vector-database-migration in vector databases and RAG systems. (Read more)
- Vector Database Migration Strategies - Guide to migrating vector databases including export/import procedures, zero-downtime migration patterns, data validation, and strategies for changing providers or versions. (Read more)
- Vector Database Monitoring - Placeholder: comprehensive documentation for vector-database-monitoring in vector databases and RAG systems. (Read more)
- Vector Database Performance Tuning Guide - Comprehensive guide covering index optimization, quantization, caching, and parameter tuning for vector databases. Includes techniques for balancing performance, cost, and accuracy at scale. (Read more)
- Vector Database Schema Design - Best practices for designing vector database schemas including vector dimensions, metadata structure, indexing strategies, and collection organization. Critical for performance, scalability, and maintainability. (Read more)
- Vector Database Security - Placeholder: comprehensive documentation for vector-database-security in vector databases and RAG systems. (Read more)
- Vector Database Sharding - Distributing vector data across multiple nodes for horizontal scaling. Enables handling billions of vectors by partitioning data and parallelizing queries. (Read more)
- Vector Database Sharding Strategies - Approaches for distributing vectors across multiple nodes including horizontal sharding, data partitioning, and routing strategies for scaling vector search to billions of vectors. (Read more)
- Vector Database Testing - Placeholder: comprehensive documentation for vector-database-testing in vector databases and RAG systems. (Read more)
- Vector Database Testing Strategies - Comprehensive testing approaches for vector databases including unit tests, integration tests, performance tests, and chaos engineering for ensuring reliability and quality in production. (Read more)
- Vector Database Use Cases - Applications of vector databases across industries including semantic search, RAG systems, recommendations, anomaly detection, and multimodal search. (Read more)
- Vector Deduplication - Techniques for identifying and removing duplicate or near-duplicate vectors in databases using similarity thresholds. Deduplication reduces storage costs, improves search quality, and prevents redundant results in RAG systems by detecting semantically identical content even when textual representations differ. (Read more)
- Vector Dimensionality - Number of components in an embedding vector, typically ranging from 128 to 4096 dimensions. Higher dimensions can capture more information but increase storage, computation, and costs. Critical design parameter for vector databases. (Read more)
- Vector Dimensionality Reduction - Techniques for reducing embedding dimensions while preserving semantic information, including PCA, random projection, and learned compression methods like Matryoshka embeddings. Dimensionality reduction enables faster search, lower storage costs, and efficient deployment at scale. (Read more)
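Of the techniques mentioned for dimensionality reduction, random projection is the simplest to sketch: multiplying by a scaled random Gaussian matrix approximately preserves pairwise distances (the Johnson-Lindenstrauss lemma) with no training step at all.

```python
import numpy as np

rng = np.random.default_rng(11)

def random_projection(data, out_dim):
    """Project rows to `out_dim` dimensions; the 1/sqrt(out_dim) scaling
    keeps expected norms and pairwise distances roughly unchanged."""
    in_dim = data.shape[1]
    proj = rng.standard_normal((in_dim, out_dim)) / np.sqrt(out_dim)
    return data @ proj

data = rng.standard_normal((200, 512))
reduced = random_projection(data, out_dim=64)   # 8x smaller vectors

# Pairwise distances survive the projection only approximately:
d_orig = np.linalg.norm(data[0] - data[1])
d_red = np.linalg.norm(reduced[0] - reduced[1])
ratio = d_red / d_orig
```

PCA and Matryoshka truncation trade this simplicity for better accuracy at the same target dimension, since they exploit structure in the data instead of ignoring it.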
- Vector Index Build Strategies - Techniques for efficiently building vector indexes including batch construction, incremental updates, and online indexing. Critical for production systems that need to balance indexing speed, search performance, and resource utilization. (Read more)
- Vector Index Rebuild Strategies - Approaches for updating vector database indexes when data changes significantly, including zero-downtime rebuilds, incremental updates, and blue-green deployments. Index rebuilds are necessary when adding large batches of vectors, changing parameters, or optimizing performance in production systems. (Read more)
- Vector Index Sharding - Placeholder: comprehensive documentation for vector-index-sharding in vector databases and RAG systems. (Read more)
- Vector Index Types - Different indexing strategies for vector databases including HNSW, IVF, LSH, and flat indexes. Each type offers different trade-offs between query speed, build time, accuracy, and memory usage. Understanding index types is crucial for optimizing vector database performance at scale. (Read more)
- Vector Normalization - The process of scaling vectors to unit length (L2 normalization) or other standard forms. Normalized vectors enable cosine similarity computation via a simple dot product and are essential for many embedding models and distance metrics used in vector databases. (Read more)
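A tiny NumPy sketch showing why normalization matters in practice: after L2 normalization, cosine similarity reduces to a plain dot product.

```python
import numpy as np

def l2_normalize(v):
    """Scale a vector to unit length (L2 norm = 1)."""
    return v / np.linalg.norm(v)

a = np.array([3.0, 4.0])           # norm 5
b = np.array([1.0, 0.0])

an, bn = l2_normalize(a), l2_normalize(b)

cosine = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
dot_of_normalized = an @ bn        # same value, cheaper per query

assert np.isclose(np.linalg.norm(an), 1.0)
assert np.isclose(cosine, dot_of_normalized)   # 0.6 for these vectors
```

Normalizing once at ingestion time lets the database answer every subsequent cosine query with a dot product, which is also what makes MIPS and cosine search interchangeable on unit vectors.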
- Vector Normalization (L2 Normalization) - Essential preprocessing technique that scales embedding vectors to unit length using the L2 norm, ensuring consistent magnitude and making cosine similarity equivalent to the dot product for faster computation. (Read more)
- Vector Quantization Techniques - Methods for compressing vector embeddings to reduce storage and memory costs. Includes scalar quantization, product quantization, and binary quantization with varying compression-accuracy tradeoffs. (Read more)
- Vector Query Optimization - Techniques for optimizing vector search queries including parameter tuning, result caching, batch queries, and index selection. Critical for achieving production-grade performance and cost efficiency. (Read more)
- Vector Search at the Edge - Techniques and tools for deploying vector search in edge environments including embedded databases, WASM implementations, and edge-optimized models for privacy and low-latency applications. (Read more)
- Vector Search Caching - Strategies for caching vector search results, embeddings, and frequently accessed data to reduce latency and costs in RAG systems. Effective caching can eliminate redundant embedding API calls and vector searches for common queries, significantly improving performance and reducing infrastructure costs. (Read more)
- Vector Search Explain - Placeholder: comprehensive documentation for vector-search-explain in vector databases and RAG systems. (Read more)
- Vector Similarity Metrics - Mathematical measures for comparing vector similarity including cosine similarity (directional), Euclidean distance (geometric), dot product (magnitude + direction), and Manhattan distance (grid-based) for AI and search applications. (Read more)
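The four metrics side by side on toy vectors; note that cosine and dot product are similarities (higher means closer) while Euclidean and Manhattan are distances (lower means closer):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 2.0, 1.0])

cosine = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))  # direction only
euclidean = np.linalg.norm(a - b)                           # straight-line distance
dot = a @ b                                                 # magnitude + direction
manhattan = np.abs(a - b).sum()                             # grid distance
```

For these vectors the dot product is 9 and the Manhattan distance is 3; which metric is appropriate depends on whether the embedding model was trained for it, so it should match the model, not be chosen after the fact.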
- Vector Similarity Search - Finding nearest vectors in high-dimensional space based on distance or similarity metrics. The core operation of vector databases, enabling semantic search, recommendations, and RAG. (Read more)
- Zero-Shot Classification with Embeddings - Using vector embeddings to classify items into categories without training data for those specific categories. Leverages semantic similarity between text and category descriptions for instant classification. (Read more)
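With an embedding model in hand, zero-shot classification is just a nearest-category lookup. Toy unit vectors stand in for real text embeddings here, and the labels are illustrative.

```python
import numpy as np

def unit(v):
    """L2-normalize so a dot product equals cosine similarity."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def zero_shot_classify(item_emb, category_embs, labels):
    """Pick the label whose description embedding is most similar."""
    sims = category_embs @ item_emb          # cosine sims (all unit-length)
    return labels[int(np.argmax(sims))]

labels = ["sports", "finance", "cooking"]
# In practice these would be embeddings of category descriptions:
category_embs = np.stack([unit([1, 0, 0]), unit([0, 1, 0]), unit([0, 0, 1])])

# An item embedding lying closest to the "finance" direction:
item = unit([0.2, 0.9, 0.1])
label = zero_shot_classify(item, category_embs, labels)   # "finance"
```

Adding a new category requires only embedding its description, no retraining, which is the whole appeal of the technique.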
