freshcrate
Home > Databases > vectorizer

vectorizer

A high-performance, in-memory vector database written in Rust, designed for semantic search and top-k nearest neighbor queries in AI-driven applications, with binary file persistence for durability.

Description

A high-performance, in-memory vector database written in Rust, designed for semantic search and top-k nearest neighbor queries in AI-driven applications, with binary file persistence for durability.

README

Vectorizer

Rust Rust Edition License Crates.io GitHub release Production Ready

High-performance vector database and search engine in Rust for semantic search, document indexing, and AI applications. Ships as a Cargo workspace (5 crates) with binary RPC + HTTP transports, a React dashboard, and native SDKs for Rust, Python, TypeScript, Go, and C#.

โœจ Key Features

Transport & API

  • VectorizerRPC (default, port 15503) โ€” binary MessagePack over TCP, multiplexed connection pool. See wire spec.
  • REST API (port 15002) โ€” universal HTTP fallback, powers the dashboard and any caller that doesn't speak raw TCP.
  • gRPC โ€” Qdrant-compatible service.
  • GraphQL โ€” full REST parity with async-graphql + GraphiQL playground.
  • MCP โ€” 31 focused tools for AI model integration (Cursor, Claude Desktop, etc.).
  • UMICP Protocol โ€” native JSON types + tool discovery endpoint.

Performance

  • SIMD acceleration โ€” AVX2-optimized vector ops with runtime CPU detection (5-10x faster).
  • Metal GPU โ€” macOS Apple Silicon via hive-gpu 0.2; logs render real device name, driver, VRAM.
  • Sub-3ms search (CPU) / <1ms (GPU) via HNSW indexing.
  • 4-5x faster than Qdrant in head-to-head benchmarks (0.16-0.23ms vs 0.80-0.87ms avg latency).

Storage

  • .vecdb unified format โ€” 20-30% space savings, automatic snapshots.
  • Memory-mapped storage โ€” datasets larger than RAM, efficient OS paging.
  • Product Quantization โ€” 64x memory reduction with minimal accuracy loss.
  • Scalar Quantization + cache hit ratio metrics.

High Availability & Scaling

  • Raft consensus via openraft (pinned =0.10.0-alpha.17) โ€” automatic leader election in 1-5s, write-redirect via HTTP 307, WAL-backed durable replication, DNS discovery for Kubernetes headless services.
  • Master-Replica โ€” TCP streaming replication with full/partial sync, exponential reconnect backoff (5sโ†’60s).
  • Distributed sharding โ€” horizontal scaling with automatic routing; distributed hybrid search via RemoteHybridSearch RPC with dense-only fallback for mixed-version clusters.
  • HiveHub cluster mode โ€” multi-tenant with quotas, usage tracking, tenant isolation, mandatory MMap storage, 1GB cache cap.

Search

  • Semantic similarity โ€” Cosine, Euclidean, Dot Product.
  • Hybrid search โ€” Dense + Sparse with Reciprocal Rank Fusion (RRF).
  • Intelligent search โ€” query expansion, semantic reranking.
  • Multi-collection search across projects.
  • Graph relationships โ€” automatic edge discovery, neighbor exploration, shortest-path finding.

Embeddings & Docs

  • Built-in providers โ€” TF-IDF, BM25, FastEmbed, BERT, MiniLM, custom models.
  • Document conversion โ€” PDF, DOCX, XLSX, PPTX, HTML, XML, images (14 formats).
  • Qdrant API compatibility โ€” Snapshots, Sharding, Cluster Management, Query (with prefetch), Search Groups, Matrix, Named Vectors (partial), PQ/Binary quantization config.
  • Summarization โ€” extractive, keyword, sentence, abstractive (OpenAI GPT).

Security

  • JWT + API Key authentication with RBAC.
  • JWT secret is mandatory โ€” boot refuses to start with empty / default / <32 char secrets when auth is enabled.
  • First-run root credentials written to {data_dir}/.root_credentials (0o600), never logged.
  • Payload encryption โ€” optional ECC-P256 + AES-256-GCM, zero-knowledge, per-collection policies (docs).
  • TLS 1.2/1.3 with mTLS, configurable cipher suites, ALPN.
  • Per-API-key rate limiting with tiers + overrides.
  • Path-traversal guard on file discovery; canonicalized base, symlink-escape refusal.

UI

  • Web Dashboard โ€” React + TypeScript; JWT login, graph CRUD (edges, neighbors, paths), collection management, API sandbox, setup wizard with glassmorphism design. Embedded in the binary (~26MB, no external assets needed).
  • Desktop GUI โ€” Electron + vis-network for visual database management.

๐ŸŽ‰ Latest Release: v3.0.0

Highlights โ€” see CHANGELOG.md for the full breakdown.

Breaking

  • RPC is default transport (rpc.enabled: true, port 15503). REST stays on 15002. Migration guide: docs/migration/rpc-default.md. Opt out with rpc.enabled: false.
  • gRPC SearchResult.score narrowed double โ†’ float. Clients on the pre-v3 proto must regenerate.
  • JWT secret must be explicitly configured โ€” no more insecure default. Generate via openssl rand -hex 64 and inject via VECTORIZER_JWT_SECRET.
  • Configs moved under config/ โ€” config.yml โ†’ config/config.yml, presets under config/presets/. Legacy ./config.yml still works with a deprecation warning (removed in v3.1).
  • Cargo workspace split โ€” vectorizer-core, vectorizer-protocol, vectorizer, vectorizer-server, vectorizer-cli. Callers reaching into the server layer need to switch from vectorizer::{server,api,grpc,logging,umicp}::* to vectorizer_server::*.

Removed

  • Standalone JavaScript SDK dropped โ€” TypeScript SDK ships compiled CJS + ESM, usable from plain JS. Migrate @hivehub/vectorizer-sdk-js โ†’ @hivehub/vectorizer-sdk.
  • TypeScript SDK scope is @hivehub, not @hivellm (docs corrected).
  • Framework integration packages dropped โ€” langchain, langchain-js, langflow, n8n, tensorflow, pytorch adapters. Published versions stay installable; integrate against native SDKs directly.

Added

  • Layered config loader โ€” VECTORIZER_MODE=dev|production merges config/modes/<mode>.yml over base. Deep YAML merge with null-clear semantics. See docs/deployment/configuration.md.
  • Docker collapsed to one compose with profiles โ€” docker compose --profile <default|dev|ha|hub> up -d.
  • C# SDK RPC transport (Vectorizer.Sdk.Rpc 3.0.0) โ€” TCP + MessagePack framing, connection pool, ASP.NET Core DI.
  • #![deny(missing_docs)] + cargo doc -D warnings CI gate โ€” cleared 2,219 missing-docs warnings to 0.
  • unwrap_used / expect_used denied workspace-wide โ€” every production .unwrap() either returns Result or sits behind a documented #[allow].

Changed

  • rmcp 0.10 โ†’ 1.5 โ€” MCP SDK major rewrite; builder-based construction across every handler.
  • Second-pass dep migrations โ€” reqwest 0.13, arrow/parquet 58, zip 8, tantivy 0.26, hmac 0.13 + sha2 0.11, hf-hub 0.5, sysinfo 0.38, candle 0.10.2, bcrypt 0.19, openraft pinned =0.10.0-alpha.17.
  • Frontend majors โ€” React 19, react-router 7, TypeScript 6 (dashboard), vitest 4, eslint 10, Electron 41, Vue-router 5 (GUI).
  • parking_lot migration complete โ€” all std::sync::{Mutex,RwLock} off the hot path; CI grep gate prevents regression.
  • Hot-path rand / hmac / tonic 0.14 / prost 0.14 / bincode 2.0 upgraded.

๐Ÿš€ Quick Start

Install Script (Linux/macOS)

curl -fsSL https://raw.githubusercontent.com/hivellm/vectorizer/main/scripts/install.sh | bash

Installs CLI + systemd service. Commands: sudo systemctl {status|restart|stop} vectorizer, sudo journalctl -u vectorizer -f.

Install Script (Windows)

powershell -c "irm https://raw.githubusercontent.com/hivellm/vectorizer/main/scripts/install.ps1 | iex"

Installs CLI + Windows Service (requires Admin). Commands: Get-Service Vectorizer, {Start|Stop|Restart}-Service Vectorizer.

Docker

docker run -d \
  --name vectorizer \
  -p 15002:15002 -p 15503:15503 \
  -v $(pwd)/vectorizer-data:/vectorizer/data \
  -e VECTORIZER_AUTH_ENABLED=true \
  -e VECTORIZER_ADMIN_USERNAME=admin \
  -e VECTORIZER_ADMIN_PASSWORD=your-secure-password \
  -e VECTORIZER_JWT_SECRET=$(openssl rand -hex 64) \
  --restart unless-stopped \
  hivehub/vectorizer:latest

Docker Compose with profiles:

cp .env.example .env
# Edit .env with your credentials
docker compose --profile default up -d          # standalone
docker compose --profile dev up -d              # dev overlay
docker compose --profile ha up -d               # Raft cluster
docker compose --profile hub up -d              # multi-tenant

Profiles are mutually exclusive on host port 15002.

Images: Docker Hub ยท GHCR

Build from Source

git clone https://github.com/hivellm/vectorizer.git
cd vectorizer

cargo build --release                          # Basic
cargo build --release --features hive-gpu      # macOS Metal
cargo build --release --features full          # All features
./target/release/vectorizer

Access Points

Surface URL Notes
VectorizerRPC (primary) vectorizer://localhost:15503 Binary MessagePack over TCP โ€” see operator guide
REST API http://localhost:15002 Universal HTTP fallback
Web Dashboard http://localhost:15002/dashboard/ React UI, embedded in binary
MCP Server http://localhost:15002/mcp 31 tools for AI agents
GraphQL http://localhost:15002/graphql GraphiQL at /graphql
UMICP Discovery http://localhost:15002/umicp/discover
Health Check http://localhost:15002/health

Upgrading from v2.x? RPC is now on by default on port 15503. REST is unchanged. If you can't expose the new port, set rpc.enabled: false. See v3.x migration guide.

Configuration

Configs live under config/:

config/
โ”œโ”€โ”€ config.yml             # Base config (your deployment)
โ”œโ”€โ”€ config.example.yml     # Reference
โ”œโ”€โ”€ modes/
โ”‚   โ”œโ”€โ”€ dev.yml            # Layered override: verbose logs, loopback, watcher on
โ”‚   โ””โ”€โ”€ production.yml     # Layered override: warn logs, larger threads/cache, zstd, scheduled snapshots
โ””โ”€โ”€ presets/               # Standalone full configs (legacy style)
    โ”œโ”€โ”€ production.yml
    โ”œโ”€โ”€ cluster.yml
    โ”œโ”€โ”€ hub.yml
    โ””โ”€โ”€ development.yml

Layered loader (recommended):

VECTORIZER_MODE=production ./target/release/vectorizer

Merges config/modes/production.yml over config/config.yml. Typos in the mode override fail fast at boot.

Authentication

Auth is enabled by default in Docker. Default creds โ€” change in production.

# Login
curl -X POST http://localhost:15002/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username":"admin","password":"admin"}'

# JWT in requests
curl http://localhost:15002/collections \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

# Create API key (JWT required)
curl -X POST http://localhost:15002/auth/keys \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -d '{"name":"Production","permissions":["read","write"],"expires_in_days":90}'

# API key in requests (NO Bearer prefix)
curl http://localhost:15002/collections \
  -H "Authorization: YOUR_API_KEY"
Method Header Use case
JWT Authorization: Bearer <token> Dashboard, short-lived sessions
API Key Authorization: <key> MCP, CLI, long-lived integrations

Production must set:

  • VECTORIZER_JWT_SECRET โ€” โ‰ฅ32 chars, not the historical default. Boot aborts otherwise.
  • VECTORIZER_ADMIN_PASSWORD โ€” strong, โ‰ฅ32 chars.

First-run root credentials are written to {data_dir}/.root_credentials (0o600), never printed to stdout. Read and delete after first login.

See Docker Authentication Guide and Security Policy.

๐Ÿ“Š Performance

Metric Value
Search latency (CPU) < 3ms
Search latency (Metal GPU) < 1ms
Throughput 4,400-6,000 QPS (vs Qdrant 1,100-1,300)
Storage reduction 20-30% (.vecdb) + PQ 64x
MCP tools 31
Document formats 14

Benchmark vs Qdrant

  • Search: 4-5x faster (0.16-0.23ms vs 0.80-0.87ms avg latency).
  • Insert: Fire-and-forget pattern, configurable batch / body limits, background processing.
  • Scenarios: Small (1K) / Medium (5K) / Large (10K) vectors ร— dimensions 384 / 512 / 768.

See Benchmark Documentation.

๐Ÿ”„ Feature Comparison

Feature Vectorizer Qdrant pgvector Pinecone Weaviate Milvus Chroma
Core
Language Rust Rust C C++/Go Go C++/Go Python
License Apache 2.0 Apache 2.0 PostgreSQL Proprietary BSD Apache 2.0 Apache 2.0
APIs
REST โœ… โœ… via PG โœ… โœ… โœ… โœ…
gRPC (Qdrant-compat) โœ… โœ… โŒ โœ… โœ… โœ… โŒ
GraphQL โœ… + GraphiQL โŒ โŒ โŒ โœ… โŒ โŒ
MCP โœ… 31 tools โŒ โŒ โŒ โŒ โŒ โŒ
Binary RPC โœ… MessagePack โŒ โŒ โŒ โŒ โŒ โŒ
SDKs Rust, Python, TS, Go, C# All All Most Most Most Python
Performance
Search latency < 3ms CPU / < 1ms GPU 1-5ms 5-50ms 50-100ms 10-50ms 5-20ms 10-100ms
SIMD โœ… AVX2 โœ… โœ… โœ… โŒ โœ… โŒ
GPU โœ… Metal โœ… CUDA โŒ โœ… Cloud โŒ โœ… CUDA โŒ
Storage
HNSW โœ… โœ… โœ… โœ… โœ… โœ… โœ…
PQ (64x) โœ… โœ… โŒ โœ… โŒ โœ… โŒ
Scalar Quantization โœ… โœ… โŒ โœ… โŒ โœ… โŒ
MMap โœ… โœ… โœ… โŒ โœ… โœ… โŒ
Advanced
Graph Relationships โœ… auto + GUI โŒ โŒ โŒ โœ… โŒ โŒ
Document Processing โœ… 14 formats โŒ โŒ โŒ โœ… โŒ โœ…
Hybrid Search โœ… โœ… โœ… โœ… โœ… โœ… โŒ
Query Expansion โœ… โŒ โŒ โŒ โŒ โŒ โŒ
Qdrant API compat โœ… + migration N/A โŒ โŒ โŒ โŒ โŒ
Scaling
Sharding โœ… โœ… via PG โœ… Cloud โœ… โœ… โŒ
Replication โœ… Raft + Master-Replica โœ… via PG โœ… Cloud โœ… โœ… โŒ
Management
Dashboard โœ… React + graph GUI โœ… basic pgAdmin โœ… Cloud โœ… โœ… โœ… basic
Desktop GUI โœ… Electron โŒ โŒ โŒ โŒ โŒ โŒ
Security
JWT + API Keys โœ… โœ… via PG โœ… Cloud โœ… โœ… โœ…
Payload Encryption โœ… ECC-P256 + AES-GCM โŒ via PG โœ… Cloud โŒ โŒ โŒ

Key Differentiators

  • MCP integration (31 tools) โ€” native AI-agent protocol.
  • Graph relationships โ€” auto-discovery + full GUI (edges, path-finding, neighbor exploration).
  • GraphQL โ€” full REST parity + GraphiQL.
  • Document processing โ€” 14 formats built in.
  • Qdrant compatibility โ€” full API + migration tools.
  • Performance โ€” 4-5x faster than Qdrant in benchmarks.
  • Binary RPC default โ€” MessagePack over TCP on port 15503 for low-overhead client traffic.
  • Complete SDK coverage โ€” Rust, Python, TypeScript (+JS), Go, C# โ€” all on v3.0.0.

Best fit: AI apps needing MCP, document ingestion, graph relationships, and sub-ms search with an embedded dashboard.

๐ŸŽฏ Use Cases

  • RAG systems โ€” semantic search with automatic document conversion.
  • Document search โ€” PDFs, Office, web content.
  • Code analysis โ€” semantic code navigation.
  • Knowledge bases โ€” enterprise multi-format search.

๐Ÿ”ง MCP Integration

Cursor / Claude Desktop config:

{
  "mcpServers": {
    "vectorizer": {
      "url": "http://localhost:15002/mcp",
      "type": "streamablehttp"
    }
  }
}

Available Tools (31)

Core operations (9) list_collections ยท create_collection ยท get_collection_info ยท insert_text ยท get_vector ยท update_vector ยท delete_vector ยท search ยท multi_collection_search

Advanced search (4) search_intelligent (query expansion) ยท search_semantic (reranking) ยท search_extra (combined) ยท search_hybrid (dense + sparse RRF)

Discovery & files (7) filter_collections ยท expand_queries ยท get_file_content ยท list_files ยท get_file_chunks ยท get_project_outline ยท get_related_files

Graph (8) graph_list_nodes ยท graph_get_neighbors ยท graph_find_related ยท graph_find_path ยท graph_create_edge ยท graph_delete_edge ยท graph_discover_edges ยท graph_discover_status

Maintenance (3) list_empty_collections ยท cleanup_empty_collections ยท get_collection_stats

Cluster-management operations are REST-only for security.

๐Ÿ“ฆ Client SDKs

All SDKs synchronized at v3.0.0. The TypeScript SDK ships compiled CJS + ESM โ€” usable from plain JavaScript, no separate JS package needed.

SDK Install
Python pip install vectorizer-sdk
TypeScript / JS npm install @hivehub/vectorizer-sdk
Rust cargo add vectorizer-sdk
C# dotnet add package Vectorizer.Sdk (REST) ยท Vectorizer.Sdk.Rpc (RPC)
Go go get github.com/hivellm/vectorizer-sdk-go

Every SDK accepts both vectorizer://host[:port] (RPC, default port 15503) and http(s)://host[:port] (REST) URLs through the same endpoint parser.

๐Ÿ”„ Qdrant Migration

  • Config migration โ€” parse Qdrant YAML/JSON โ†’ Vectorizer format.
  • Data migration โ€” export from Qdrant, import into Vectorizer.
  • Validation โ€” integrity + compatibility checks.
  • REST compatibility โ€” full Qdrant API at /qdrant/*.
use vectorizer::migration::qdrant::{QdrantDataExporter, QdrantDataImporter};

let exported = QdrantDataExporter::export_collection(
    "http://localhost:6333",
    "my_collection"
).await?;

let result = QdrantDataImporter::import_collection(&store, &exported).await?;

See Qdrant Migration Guide.

โ˜๏ธ HiveHub Cloud

Multi-tenant cluster mode integration with HiveHub.Cloud.

  • Tenant isolation โ€” owner-scoped collections.
  • Quota enforcement โ€” collections / vectors / storage per tenant.
  • Usage tracking โ€” automatic reporting.
  • User-scoped backups.
hub:
  enabled: true
  api_url: "https://api.hivehub.cloud"
  tenant_isolation: "collection"
  usage_report_interval: 300
export HIVEHUB_SERVICE_API_KEY="your-service-api-key"

Cluster-mode requirements (enforced at boot):

Requirement Default
MMap storage (Memory storage rejected) Enforced
Max cache memory across all caches 1 GB
File watcher Disabled
Strict config validation Enabled
cluster:
  enabled: true
  node_id: "node-1"
  memory:
    max_cache_memory_bytes: 1073741824
    enforce_mmap_storage: true
    disable_file_watcher: true
    strict_validation: true

See HiveHub Integration and Cluster Memory Limits.

๐Ÿ—๏ธ Workspace Layout

crates/
โ”œโ”€โ”€ vectorizer-core/       # Foundation: error, codec, quantization, simd, compression, paths
โ”œโ”€โ”€ vectorizer-protocol/   # RPC wire types + tonic-generated gRPC
โ”œโ”€โ”€ vectorizer/            # Engine (umbrella): db, embedding, models, cache, persistence, search, ...
โ”œโ”€โ”€ vectorizer-server/     # Transport: HTTP / gRPC / MCP / RPC + binary
โ””โ”€โ”€ vectorizer-cli/        # CLI binaries
sdks/rust/                 # Rust SDK โ€” re-exports vectorizer-protocol wire types

Runtime directories resolve to platform-standard locations (~/.local/share/vectorizer/ on Linux, ~/Library/Application Support/vectorizer/ on macOS, %APPDATA%\vectorizer\ on Windows), overridable via VECTORIZER_DATA_DIR / VECTORIZER_LOGS_DIR.

๐Ÿ“š Documentation

๐Ÿ“„ License

Apache License 2.0 โ€” see LICENSE.

๐Ÿค Contributing

See CONTRIBUTING.md.

Release History

VersionChangesUrgencyDate
vectorizer-3.0.0A Helm chart for Vectorizer - High-performance vector databaseHigh4/21/2026
v3.0.0### Removed - **Standalone JavaScript SDK dropped (`sdks/javascript/`).** The TypeScript SDK ships compiled CommonJS + ESM and is fully usable from plain JavaScript, so maintaining a parallel JS-only package doubled CI/release work for no functional difference. The published `@hivehub/vectorizer-sdk-js` package on npm remains installable; no new releases will ship. Migrate by replacing `@hivehub/vectorizer-sdk-js` with `@hivehub/vectorizer-sdk` โ€” the runtime API and import paths are identical. High4/21/2026
vectorizer-2.5.16A Helm chart for Vectorizer - High-performance vector databaseMedium4/3/2026
v2.5.16Fix: POST search/scroll/recommend/count are now served locally on followers instead of being redirected to leader. These are read operations. Image: ghcr.io/hivellm/vectorizer:2.5.16Medium4/3/2026
vectorizer-2.5.15A Helm chart for Vectorizer - High-performance vector databaseMedium4/3/2026
v2.5.15Fix: vector inserts now replicate to followers in Raft mode. Root cause: upsert_points() only checked static master_node (always None in Raft), not ha_manager.master_node(). Also includes NetworkError fix for Raft elections. Image: ghcr.io/hivellm/vectorizer:2.5.15Medium4/3/2026
vectorizer-2.5.14A Helm chart for Vectorizer - High-performance vector databaseMedium4/3/2026
v2.5.14Root cause fix: Raft RPC errors now use NetworkError (transient, immediate retry) instead of Unreachable (permanent backoff). This was why elections never completed โ€” initial DNS failures marked all peers as permanently dead. Image: ghcr.io/hivellm/vectorizer:2.5.14Medium4/3/2026
vectorizer-2.5.13A Helm chart for Vectorizer - High-performance vector databaseMedium4/3/2026
v2.5.13Debug: log before/after initialize_cluster. Image: ghcr.io/hivellm/vectorizer:2.5.13Medium4/3/2026
vectorizer-2.5.12A Helm chart for Vectorizer - High-performance vector databaseMedium4/3/2026
v2.5.12Fix: trigger().elect() every 10s until leader elected. Image: ghcr.io/hivellm/vectorizer:2.5.12Medium4/3/2026
vectorizer-2.5.11A Helm chart for Vectorizer - High-performance vector databaseMedium4/3/2026
v2.5.11Debug: warn-level logs for cluster servers count and DNS wait. Image: ghcr.io/hivellm/vectorizer:2.5.11Medium4/3/2026
vectorizer-2.5.10A Helm chart for Vectorizer - High-performance vector databaseMedium4/3/2026
v2.5.10Debug: logs config parse result for cluster_enabled. Image: ghcr.io/hivellm/vectorizer:2.5.10Medium4/3/2026
vectorizer-2.5.9A Helm chart for Vectorizer - High-performance vector databaseMedium4/2/2026
v2.5.9Fix: waits up to 60s for peer DNS resolution before Raft initialize. Prevents election failures due to pods not yet in headless service DNS. Image: ghcr.io/hivellm/vectorizer:2.5.9Medium4/2/2026
vectorizer-2.5.8A Helm chart for Vectorizer - High-performance vector databaseMedium4/2/2026
v2.5.8Fix: Raft RPCs now use connect() with 3s timeout per call instead of connect_lazy(). DNS is re-resolved on each election attempt so peers are found after startup. Image: ghcr.io/hivellm/vectorizer:2.5.8Medium4/2/2026
vectorizer-2.5.7A Helm chart for Vectorizer - High-performance vector databaseMedium4/2/2026
v2.5.7Fix Raft transport error with connect_lazy(). Clean tag for ArgoCD. Image: ghcr.io/hivellm/vectorizer:2.5.7Medium4/2/2026
v2.5.6## Fix: Raft leader election now works in Kubernetes **Root cause:** Each Raft vote/append RPC created a new tonic Channel with `connect()`, doing a full HTTP/2 handshake per call. When peers weren't ready yet, the handshake failed with "transport error" and was never retried on the same connection. **Fix:** Use `connect_lazy()` to create a persistent Channel per peer. The HTTP/2 handshake happens on first actual RPC and tonic handles reconnection automatically. This eliminates: - "transport eMedium4/2/2026
vectorizer-2.5.6A Helm chart for Vectorizer - High-performance vector databaseMedium4/2/2026
vectorizer-2.5.5A Helm chart for Vectorizer - High-performance vector databaseMedium4/1/2026
v2.5.5## v2.5.5 โ€” Production-ready Kubernetes HA cluster Clean release with all fixes since v2.5.1. Use this version (not v2.5.4 which was re-tagged during development). ### Bug Fixes - **MMap storage in cluster mode** โ€” collections now use MMap (disk-backed) instead of Memory when cluster is active. Data survives pod restarts. - **Collection replication in Raft mode** โ€” collections created on the leader are now replicated to followers via the HaManager's dynamic master node. - **Compaction crash onMedium4/1/2026
v2.5.4Complete HA fixes: MMap storage enforcement, collection replication in Raft mode, compaction crash fix, Raft bootstrap, DNS re-resolution, SO_REUSEADDR, Axum router, probes, parallel pods, config mount path.Medium4/1/2026
vectorizer-2.5.4A Helm chart for Vectorizer - High-performance vector databaseMedium3/31/2026
vectorizer-2.5.3A Helm chart for Vectorizer - High-performance vector databaseMedium3/24/2026
v2.5.3## Bug Fix ### Config parse failures for logging and top-level sections Extends the serde defaults fix from v2.5.2 to cover: - `LoggingConfig` fields: `level`, `log_requests`, `log_responses`, `log_errors` - Top-level `VectorizerConfig` sections: `server`, `file_watcher`, `logging` Without these defaults, a minimal config.yml like: ```yaml cluster: enabled: true replication: enabled: true role: "master" ``` would fail to parse because `logging.log_requests` was required, causing silentMedium3/24/2026
vectorizer-2.5.2A Helm chart for Vectorizer - High-performance vector databaseMedium3/24/2026
v2.5.2## Bug Fix ### Config parse failure silently disabled cluster mode `AuthConfig.jwt_secret` and `ServerConfig.mcp_port` were missing `#[serde(default)]` attributes. When these fields were omitted from `config.yml`, the entire config failed to parse, silently falling back to defaults with: - `cluster.enabled: false` - `replication.role: standalone` This caused all pods to run as **standalone** nodes instead of master/replica, even when the ConfigMap had correct replication settings. ### Fix AlMedium3/24/2026
vectorizer-2.5.1A Helm chart for Vectorizer - High-performance vector databaseMedium3/24/2026
v2.5.1## What's New ### Automatic HA Failover - **Raft leadership watcher** (`src/cluster/raft_watcher.rs`): Subscribes to openraft server metrics and automatically triggers master/replica transitions when leadership changes in the Raft cluster - Zero-cost `tokio::sync::watch` subscription โ€” no polling, detects transitions in milliseconds - Kubernetes-native: uses `HOSTNAME`, `VECTORIZER_SERVICE_NAME`, `POD_IP` env vars for routable leader URLs ### Bug Fixes - **Snapshot spam eliminated**: Empty repMedium3/24/2026
v2.5.0## What's New ### HA Cluster with Raft Consensus - Raft-based leader election and log replication - TCP state replication across nodes - Automatic shard migration on node join/leave - DNS-based peer discovery for Kubernetes - Collection sync across cluster nodes - WAL-based durable replication log ### CI/CD - Container images published to GitHub Container Registry (`ghcr.io/hivellm/vectorizer`) - Helm chart published to GitHub Pages (`helm repo add vectorizer https://hivellm.github.io/vectorizMedium3/23/2026
vectorizer-2.5.0A Helm chart for Vectorizer - High-performance vector databaseMedium3/22/2026
v2.4.4Release v2.4.4Low2/3/2026
v2.4.3**Full Changelog**: https://github.com/hivellm/vectorizer/compare/v2.4.2...v2.4.3Low2/3/2026
v2.4.2**Full Changelog**: https://github.com/hivellm/vectorizer/compare/v2.4.1...v2.4.2Low2/3/2026
v2.4.1**Full Changelog**: https://github.com/hivellm/vectorizer/compare/v2.4.0...v2.4.1Low2/3/2026
v2.4.0## ๐ŸŽ‰ Latest Release: v2.4.0 - Transmutation Default & Enhanced File Upload **New in v2.4.0:** - **Transmutation enabled by default**: Document conversion (PDF, DOCX, XLSX, PPTX, images) now included in default build - **Increased file upload limits**: Support for files up to 200MB (previously 100MB) - **Enhanced file upload configuration**: Improved config loading with better error handling and logging - **Extended file format support**: Added PDF, DOCX, XLSX, PPTX, and image formats to Low1/21/2026
v2.3.0# Vectorizer v2.3.0 ## โœจ New Features ### Setup Wizard with Glassmorphism UI - Beautiful glassmorphism design with blur effects and animated backgrounds - Multi-step wizard for easy project configuration - Skip wizard option for experienced users - Automatic project analysis and collection suggestions ### Embedded Dashboard - Dashboard assets now embedded directly in the binary - **Zero external dependencies** - single binary distribution - Faster startup and simplified deploymenLow1/6/2026
v2.2.0Release v2.2.0Low12/11/2025
v2.0.3Release v2.0.3Low12/8/2025
v2.0.0Release v2.0.0Low12/8/2025
v1.8.5Release v1.8.5Low12/6/2025
v1.8.4Release v1.8.4Low12/6/2025
v1.7.0Release v1.7.0Low11/30/2025
v1.6.0Release v1.6.0Low11/28/2025
v1.5.1- vector insert fixesLow11/25/2025

Dependencies & License Audit

Loading dependencies...

Similar Packages

contextdbEmbedded database for agentic memory โ€” relational, graph, and vector under unified MVCC transactionsv0.3.4
coordinodeThe graph-native hybrid retrieval engine for AI and GraphRAG. Graph + Vector + Full-Text in a single transactional engine.v0.4.1
chromaData infrastructure for AI1.5.8
meilisearchA lightning-fast search engine API bringing AI-powered hybrid search to your sites and applications.v1.42.1
oasisdbOasisDB: A minimal and lightweight vector databasev0.1.2