VectorDBBench

Benchmark for vector databases.

VectorDBBench (VDBBench): A Benchmark Tool for VectorDB

What is VDBBench

VDBBench offers more than benchmark results for mainstream vector databases and cloud services: it is a tool for comparing their performance and cost-effectiveness yourself. Designed for ease of use, VDBBench helps users, including non-specialists, reproduce results or test new systems, which simplifies the choice among the many cloud services and open-source vector databases.

We provide an intuitive visual interface that lets users start benchmarks with ease and view comparative result reports, so benchmark results are straightforward to reproduce. For cloud services in particular, we also provide cost-effectiveness reports, which make the benchmarking process more realistic and applicable.

Closely mimicking real-world production environments, we've set up diverse testing scenarios including insertion, searching, and filtered searching. To provide you with credible and reliable data, we've included public datasets from actual production scenarios, such as SIFT, GIST, Cohere, and a dataset generated by OpenAI from an opensource raw dataset. It's fascinating to discover how a relatively unknown open-source database might excel in certain circumstances!

Prepare to delve into the world of VDBBench, and let it guide you in uncovering your perfect vector database match.

VDBBench is sponsored by Zilliz, the leading open-source vectorDB company behind Milvus. Choose smarter with VDBBench - start your free test on Zilliz Cloud today!

Leaderboard: https://zilliz.com/benchmark

Quick Start

Prerequisite

python >= 3.11

Install

Install vectordb-bench with only PyMilvus

pip install vectordb-bench

Install the specific database client

pip install 'vectordb-bench[pinecone]'

All supported database clients

Optional database client                                  Install command
pymilvus, zilliz_cloud (default)                          pip install vectordb-bench
qdrant                                                    pip install vectordb-bench[qdrant]
pinecone                                                  pip install vectordb-bench[pinecone]
weaviate                                                  pip install vectordb-bench[weaviate]
elastic, aliyun_elasticsearch                             pip install vectordb-bench[elastic]
pgvector, pgvectorscale, pgdiskann, alloydb, vectorchord  pip install vectordb-bench[pgvector]
pgvecto.rs                                                pip install vectordb-bench[pgvecto_rs]
redis                                                     pip install vectordb-bench[redis]
memorydb                                                  pip install vectordb-bench[memorydb]
chromadb                                                  pip install vectordb-bench[chromadb]
cockroachdb                                               pip install vectordb-bench[cockroachdb]
awsopensearch                                             pip install vectordb-bench[opensearch]
aliyun_opensearch                                         pip install vectordb-bench[aliyun_opensearch]
mongodb                                                   pip install vectordb-bench[mongodb]
tidb                                                      pip install vectordb-bench[tidb]
vespa                                                     pip install vectordb-bench[vespa]
oceanbase                                                 pip install vectordb-bench[oceanbase]
hologres                                                  pip install vectordb-bench[hologres]
tencent_es                                                pip install vectordb-bench[tencent_es]
alisql                                                    pip install 'vectordb-bench[alisql]'
polardb                                                   pip install vectordb-bench[polardb]
doris                                                     pip install vectordb-bench[doris]
zvec                                                      pip install vectordb-bench[zvec]
endee                                                     pip install vectordb-bench[endee]
lindorm                                                   pip install vectordb-bench[lindorm]

Run

init_bench

OR:

Run from the command line.

vectordbbench [OPTIONS] COMMAND [ARGS]...

To list the clients that are runnable via the commandline option, execute: vectordbbench --help

$ vectordbbench --help
Usage: vectordbbench [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  pgvectorhnsw
  pgvectorivfflat
  vectorchordrq
  test
  weaviate

To list the options for each command, execute vectordbbench [command] --help

$ vectordbbench pgvectorhnsw --help
Usage: vectordbbench pgvectorhnsw [OPTIONS]

Options:
  --config-file PATH              Read configuration from yaml file
  --drop-old / --skip-drop-old    Drop old or skip  [default: drop-old]
  --load / --skip-load            Load or skip  [default: load]
  --search-serial / --skip-search-serial
                                  Search serial or skip  [default: search-
                                  serial]
  --search-concurrent / --skip-search-concurrent
                                  Search concurrent or skip  [default: search-
                                  concurrent]
  --case-type [CapacityDim128|CapacityDim960|Performance768D100M|Performance768D10M|Performance768D1M|Performance768D10M1P|Performance768D1M1P|Performance768D10M99P|Performance768D1M99P|Performance1536D500K|Performance1536D5M|Performance1536D500K1P|Performance1536D5M1P|Performance1536D500K99P|Performance1536D5M99P|Performance1536D50K]
                                  Case type
  --db-label TEXT                 Db label, default: date in ISO format
                                  [default: 2024-05-20T20:26:31.113290]
  --dry-run                       Print just the configuration and exit
                                  without running the tasks
  --k INTEGER                     K value for number of nearest neighbors to
                                  search  [default: 100]
  --concurrency-duration INTEGER  Adjusts the duration in seconds of each
                                  concurrency search  [default: 30]
  --num-concurrency TEXT          Comma-separated list of concurrency values
                                  to test during concurrent search  [default:
                                  1,10,20]
  --concurrency-timeout INTEGER   Timeout (in seconds) to wait for a
                                  concurrency slot before failing. Set to a
                                  negative value to wait indefinitely.
                                  [default: 3600]
  --user-name TEXT                Db username  [required]
  --password TEXT                 Db password  [required]
  --host TEXT                     Db host  [required]
  --db-name TEXT                  Db name  [required]
  --maintenance-work-mem TEXT     Sets the maximum memory to be used for
                                  maintenance operations (index creation). Can
                                  be entered as string with unit like '64GB'
                                  or as an integer number of KB.This will set
                                  the parameters:
                                  max_parallel_maintenance_workers,
                                  max_parallel_workers &
                                  table(parallel_workers)
  --max-parallel-workers INTEGER  Sets the maximum number of parallel
                                  processes per maintenance operation (index
                                  creation)
  --m INTEGER                     hnsw m
  --ef-construction INTEGER       hnsw ef-construction
  --ef-search INTEGER             hnsw ef-search
  --quantization-type [none|bit|halfvec]
                                  quantization type for vectors (in index)
  --table-quantization-type [none|bit|halfvec]
                                  quantization type for vectors (in table). If
                                  equal to bit, the parameter
                                  quantization_type will be set to bit too.
  --reranking / --skip-reranking  Enable reranking for HNSW search for binary
                                  quantization
  --reranking-metric [L2|COSINE|IP|DP]
                                  Distance metric for reranking  [default:
                                  COSINE]
  --quantized-fetch-limit INTEGER
                                  Limit of fetching quantized vector ranked by
                                  distance for reranking                 --
                                  bound by ef_search
  --custom-case-name TEXT         Custom case name i.e. PerformanceCase1536D50K
  --custom-case-description TEXT  Custom name description
  --custom-case-load-timeout INTEGER
                                  Custom case load timeout [default: 36000]
  --custom-case-optimize-timeout INTEGER
                                  Custom case optimize timeout [default: 36000]
  --custom-dataset-name TEXT
                                  Dataset name i.e OpenAI
  --custom-dataset-dir TEXT       Dataset directory i.e. openai_medium_500k
  --custom-dataset-size INTEGER   Dataset size i.e. 500000
  --custom-dataset-dim INTEGER    Dataset dimension
  --custom-dataset-metric-type TEXT
                                  Dataset distance metric [default: COSINE]
  --custom-dataset-file-count INTEGER
                                  Dataset file count
  --custom-dataset-use-shuffled / --skip-custom-dataset-use-shuffled
                                  Use shuffled custom dataset or skip  [default: custom-dataset-
                                  use-shuffled]
  --custom-dataset-with-gt / --skip-custom-dataset-with-gt
                                  Custom dataset with ground truth or skip  [default: custom-dataset-
                                  with-gt]
  --help                          Show this message and exit.
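
As a rough planning aid for the --num-concurrency and --concurrency-duration options, the sketch below estimates a lower bound on the time spent in the concurrent search phase. It assumes that each listed concurrency level runs one after another for the full duration, which the help text suggests but does not state outright. The function names are hypothetical and are not part of VDBBench.

```python
def parse_concurrency(text: str) -> list[int]:
    """Split a comma-separated --num-concurrency value into integers."""
    return [int(part) for part in text.split(",") if part.strip()]


def min_concurrent_search_seconds(num_concurrency: str, duration_s: int) -> int:
    """Lower bound: one run of `duration_s` seconds per concurrency level.

    Assumption: levels run sequentially; setup and teardown time is ignored.
    """
    return len(parse_concurrency(num_concurrency)) * duration_s


# Defaults taken from the pgvectorhnsw help output: 1,10,20 and 30 seconds.
default_estimate = min_concurrent_search_seconds("1,10,20", 30)
print(default_estimate)
```

With the defaults this gives 90 seconds; longer lists such as 1,5,10,20,30,40,60,80 scale the estimate linearly.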

Run VectorChord (vchordrq) from command line

VectorChord is a PostgreSQL extension for scalable vector similarity search using IVF + RaBitQ indexing. It is fully compatible with pgvector data types and provides faster queries and index builds.

vectordbbench vectorchordrq \
  --user-name postgres --password '<password>' \
  --host localhost --port 5432 --db-name vectordb \
  --case-type Performance1536D50K \
  --lists 1000 --probes 10 --epsilon 1.9 \
  --spherical-centroids --build-threads 8 \
  --max-parallel-workers 15

Key VectorChord-specific options:

Option                    Description
--lists                   Number of IVF lists for vchordrq index
--probes                  Number of probes during search (default: 10)
--epsilon                 Reranking precision factor, 0.0-4.0 (default: 1.9)
--residual-quantization   Enable residual quantization
--spherical-centroids     L2-normalize centroids (recommended for cosine/IP)
--build-threads           Number of threads for index building (1-255)
--degree-of-parallelism   Degree of parallelism for index build (1-256)
--max-parallel-workers    Sets max_parallel_workers & max_parallel_maintenance_workers
--max-scan-tuples         Max tuples to scan before stopping (-1 for unlimited)
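
The numeric ranges listed above can be checked before launching a long run. The following is a minimal sketch, not VDBBench code: the helper name and dictionary are hypothetical, the keys use the underscore form of the option names, and the ranges are assumed to be inclusive at both ends.

```python
# Ranges copied from the VectorChord option list; (low, high) assumed inclusive.
VCHORDRQ_RANGES = {
    "epsilon": (0.0, 4.0),
    "build_threads": (1, 255),
    "degree_of_parallelism": (1, 256),
}


def out_of_range(options: dict) -> list[str]:
    """Return one message per option whose value falls outside its range."""
    problems = []
    for name, (low, high) in VCHORDRQ_RANGES.items():
        value = options.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name}={value} is outside {low}-{high}")
    return problems


# Values from the example command above pass the check.
print(out_of_range({"epsilon": 1.9, "build_threads": 8}))
# A value beyond the documented range is reported.
print(out_of_range({"epsilon": 5.0}))
```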

Run awsopensearch from command line

vectordbbench awsopensearch --db-label awsopensearch \
--m 16 --ef-construction 256 \
--host search-vector-db-prod-h4f6m4of6x7yp2rz7gdmots7w4.us-west-2.es.amazonaws.com --port 443 \
--user vector --password '<password>' \
--case-type Performance1536D5M --number-of-indexing-clients 10  \
--skip-load --num-concurrency 75

To list the options for awsopensearch, execute vectordbbench awsopensearch --help

$ vectordbbench awsopensearch --help
Usage: vectordbbench awsopensearch [OPTIONS]

Options:
  # Sharding and Replication
  --number-of-shards INTEGER      Number of primary shards for the index
  --number-of-replicas INTEGER    Number of replica copies for each primary
                                  shard
  # Indexing Performance
  --index-thread-qty INTEGER      Thread count for native engine indexing
  --index-thread-qty-during-force-merge INTEGER
                                  Thread count during force merge operations
  --number-of-indexing-clients INTEGER
                                  Number of concurrent indexing clients
  # Index Management
  --number-of-segments INTEGER    Target number of segments after merging
  --refresh-interval TEXT         How often to make new data available for
                                  search
  --force-merge-enabled BOOLEAN   Whether to perform force merge operation
  --flush-threshold-size TEXT     Size threshold for flushing the transaction
                                  log
  --engine TEXT                   type of engine to use valid values [faiss, lucene, s3vector]
  # Memory Management
  --cb-threshold TEXT             k-NN Memory circuit breaker threshold

  --ondisk                        Ondisk mode with binary quantization(32x compression)
  --oversample-factor             Oversampling factor for rescoring in on-disk (quantized) search: more candidates than k are retrieved and then rescored with full-precision vectors.(default 1.0)

  # Quantization Type
  --quantization-type TEXT        which type of quantization to use valid values [fp32, fp16, bq]
  --help                          Show this message and exit.

Run Elastic Cloud from command line

Elastic Cloud supports multiple index types: HNSW, HNSW_INT8, HNSW_INT4, and HNSW_BBQ.

Example: Run HNSW index test

vectordbbench elasticcloudhnsw --db-label elastic-cloud-test \
--cloud-id <your-cloud-id> --password '<your-password>' \
--m 16 --ef-construction 100 --num-candidates 100 \
--case-type Performance768D1M --number-of-shards 1 \
--number-of-replicas 0 --refresh-interval 30s

Example: Run HNSW_INT8 index test

vectordbbench elasticcloudhnswint8 --db-label elastic-cloud-int8 \
--cloud-id <your-cloud-id> --password '<your-password>' \
--m 16 --ef-construction 200 --num-candidates 200 \
--case-type Performance1536D50K --element-type float

Example: Run HNSW_INT4 index test

vectordbbench elasticcloudhnswint4 --db-label elastic-cloud-int4 \
--cloud-id <your-cloud-id> --password '<your-password>' \
--m 16 --ef-construction 200 --num-candidates 200 \
--case-type Performance768D10M --use-rescore --oversample-ratio 2.0

Example: Run HNSW_BBQ index test

vectordbbench elasticcloudhnswbbq --db-label elastic-cloud-bbq \
--cloud-id <your-cloud-id> --password '<your-password>' \
--m 16 --ef-construction 200 --num-candidates 200 \
--case-type Performance1536D5M --use-routing --use-force-merge

Example: Run Label Filter Performance test

vectordbbench elasticcloudhnsw --db-label elastic-cloud-label-filter \
--cloud-id <your-cloud-id> --password '<your-password>' \
--case-type LabelFilterPerformanceCase \
--dataset-with-size-type "Medium OpenAI (1536dim, 500K)" \
--label-percentage 0.001 \
--m 16 --ef-construction 128 --num-candidates 100 \
--num-concurrency 1,5 --number-of-shards 1

To list all options for Elastic Cloud, execute vectordbbench elasticcloudhnsw --help. The following are Elastic Cloud-specific command-line options:

$ vectordbbench elasticcloudhnsw --help
Usage: vectordbbench elasticcloudhnsw [OPTIONS]

Options:
  # Connection
  --cloud-id TEXT                 Elastic Cloud ID  [required]
  --password TEXT                 Elastic Cloud password  [required]

  # HNSW Index Parameters
  --m INTEGER                     HNSW M parameter  [default: 16]
  --ef-construction INTEGER       HNSW efConstruction parameter  [default: 100]
  --num-candidates INTEGER        Number of candidates for search  [default: 100]
  --element-type [float|byte]     Element type for vectors (float: 4 bytes, byte: 1 byte)  [default: float]

  # Index Configuration
  --number-of-shards INTEGER      Number of shards  [default: 1]
  --number-of-replicas INTEGER    Number of replicas  [default: 0]
  --refresh-interval TEXT         Index refresh interval  [default: 30s]
  --merge-max-thread-count INTEGER
                                  Maximum thread count for merge  [default: 8]
  --use-force-merge BOOLEAN       Whether to use force merge  [default: True]
  --use-routing BOOLEAN           Whether to use routing  [default: False]
  --use-rescore BOOLEAN           Whether to use rescore  [default: False]
  --oversample-ratio FLOAT        Oversample ratio for rescore  [default: 2.0]

  # Common Options
  --case-type [CapacityDim128|CapacityDim960|Performance768D100M|...]
                                  Case type
  --db-label TEXT                 Db label, default: date in ISO format
  --k INTEGER                     K value for number of nearest neighbors to search  [default: 100]
  --num-concurrency TEXT          Comma-separated list of concurrency values  [default: 1,5,10,20,30,40,60,80]
  --help                          Show this message and exit.

Run OceanBase from command line

Execute tests for the index types: HNSW, HNSW_SQ, or HNSW_BQ.

vectordbbench oceanbasehnsw --host xxx --port xxx --user root@mysql_tenant --database test \
--m 16 --ef-construction 200 --case-type Performance1536D50K \
--index-type HNSW --ef-search 100

To list the options for OceanBase, execute vectordbbench oceanbasehnsw --help. The following are some OceanBase-specific command-line options.

$ vectordbbench oceanbasehnsw --help
Usage: vectordbbench oceanbasehnsw [OPTIONS]

Options:
  [...]
  --host TEXT                     OceanBase host
  --user TEXT                     OceanBase username  [required]
  --password TEXT                 OceanBase database password
  --database TEXT                 DataBase name  [required]
  --port INTEGER                  OceanBase port  [required]
  --m INTEGER                     hnsw m  [required]
  --ef-construction INTEGER       hnsw ef-construction  [required]
  --ef-search INTEGER             hnsw ef-search  [required]
  --index-type [HNSW|HNSW_SQ|HNSW_BQ]
                                  Type of index to use. Supported values:
                                  HNSW, HNSW_SQ, HNSW_BQ  [required]
  --help                          Show this message and exit.

Execute tests for the index types: IVF_FLAT, IVF_SQ8, or IVF_PQ.

vectordbbench oceanbaseivf --host xxx --port xxx --user root@mysql_tenant --database test \
--nlist 1000 --sample_per_nlist 256 --case-type Performance768D1M \
--index-type IVF_FLAT --ivf_nprobes 100

To list the options for OceanBase, execute vectordbbench oceanbaseivf --help. The following are some OceanBase-specific command-line options.

$ vectordbbench oceanbaseivf --help
Usage: vectordbbench oceanbaseivf [OPTIONS]

Options:
  [...]
  --host TEXT                     OceanBase host
  --user TEXT                     OceanBase username  [required]
  --password TEXT                 OceanBase database password
  --database TEXT                 DataBase name  [required]
  --port INTEGER                  OceanBase port  [required]
  --index-type [IVF_FLAT|IVF_SQ8|IVF_PQ]
                                  Type of index to use. Supported values:
                                  IVF_FLAT, IVF_SQ8, IVF_PQ  [required]
  --nlist INTEGER                 Number of cluster centers  [required]
  --sample_per_nlist INTEGER      The cluster centers are calculated by total
                                  sampling sample_per_nlist * nlist vectors
                                  [required]
  --ivf_nprobes TEXT              How many clustering centers to search during
                                  the query  [required]
  --m INTEGER                     The number of sub-vectors that each data
                                  vector is divided into during IVF-PQ
  --help                          Show this message and exit.

Run Hologres from command line

We recommend installing with the following command:

pip install 'vectordb-bench[hologres]' 'psycopg[binary]' pgvector

Execute tests for the HGraph index type.

NUM_PER_BATCH=10000 vectordbbench hologreshgraph --host Hologres_Endpoint --port 80 \
--user ACCESS_ID --password ACCESS_KEY --database DATABASE_NAME \
--m 64 --ef-construction 400 --case-type Performance768D10M \
--index-type HGraph --ef-search 400 --k 10 --num-concurrency 1,60,70,75,80,90,95,100,110,120

To list the options for Hologres, execute vectordbbench hologreshgraph --help. The following are some Hologres-specific command-line options.

$ vectordbbench hologreshgraph --help
Usage: vectordbbench hologreshgraph [OPTIONS]

Options:
  [...]
  --host TEXT                     Hologres host
  --user TEXT                     Hologres username  [required]
  --password TEXT                 Hologres database password
  --database TEXT                 Hologres database name  [required]
  --port INTEGER                  Hologres port  [required]
  --m INTEGER                     hnsw m  [required]
  --ef-construction INTEGER       hnsw ef-construction  [required]
  --ef-search INTEGER             hnsw ef-search  [required]
  --index-type [HGraph]           Type of index to use. Supported values:
                                  HGraph [required]
  --help                          Show this message and exit.

Run Zvec from command line

vectordbbench zvec --path Performance768D10M --db-label 16c64g-v0.1 \
    --case-type Performance768D10M --num-concurrency 12,14,16,18,20 \
    --quantize-type int8 --ef-search 118 --is-using-refiner

To list the options for zvec, execute vectordbbench zvec --help

  --path TEXT                     collection path  [required]
  --m INTEGER                     HNSW index parameter m.
  --ef-construction INTEGER       HNSW index parameter ef_construction
  --ef-search INTEGER             HNSW index parameter ef for search
  --quantize-type TEXT            HNSW index quantize type, fp16/int8
                                  supported
  --is-using-refiner              is using refiner, suitable for quantized
                                  index, recall `ef-search` results then
                                  refine with unquantized vector to `topk`
                                  results

Run Doris from command line

Doris supports ANN indexes of type HNSW starting from version 4.0.x.

NUM_PER_BATCH=1000000 vectordbbench doris --http-port=8030 --port=9030 --db-name=vector_test --case-type=Performance768D1M --stream-load-rows-per-batch=500000

Use the --session-var flag to test Doris with custom session variables. For example:

NUM_PER_BATCH=1000000 vectordbbench doris --http-port=8030 --port=9030 --db-name=vector_test --case-type=Performance768D1M --stream-load-rows-per-batch=500000 --session-var enable_profile=True

More options:

--m INTEGER                     hnsw m
--ef-construction INTEGER       hnsw ef-construction
--username TEXT                 Username  [default: root; required]
--password TEXT                 Password  [default: ""]
--host TEXT                     Db host  [default: 127.0.0.1; required]
--port INTEGER                  Query Port  [default: 9030; required]
--http-port INTEGER             Http Port  [default: 8030; required]
--db-name TEXT                  Db name  [default: test; required]
--ssl / --no-ssl                Enable or disable SSL, for Doris Serverless
                                SSL must be enabled  [default: no-ssl]
--index-prop TEXT               Extra index PROPERTY as key=value
                                (repeatable)
--session-var TEXT              Session variable key=value applied to each
                                SQL session (repeatable)
--stream-load-rows-per-batch INTEGER
                                Rows per single stream load request; default
                                uses NUM_PER_BATCH
--no-index                      Create table without ANN index

Run Lindorm from command line

Lindorm supports index types: hnsw, ivfpq, or ivfbq.

Example: Run hnsw index test

vectordbbench lindormhnsw --case-type Performance768D10M --index-name <index_name> --k 10 \
--host <lindorm_host> --port <lindorm_port> --user <username> --password <password> --m 32 \
--ef-construction 400 --ef-search 150

Example: Run ivfpq index test

vectordbbench lindormivfpq --case-type Performance768D10M \
--index-name <index_name> --k 10 --host <lindorm_host> --port <lindorm_port> \
--user <username> --password <password> --lists <nlist> --probes <nprobe> \
--m 32 --ef-construction 500 --ef-search 200 --reorder-factor 2

Example: Run ivfbq index test

vectordbbench lindormivfbq --case-type Performance768D10M --index-name <index_name> \
--k 10 --host <lindorm_host> --port <lindorm_port> \
--user <username> --password <password> --lists <nlist> --probes <nprobe> \
--exbits 2 --m 32 --ef-construction 500 --ef-search 200 --reorder-factor 2

To list the options for Lindorm, execute vectordbbench lindormhnsw --help. The following are some Lindorm-specific command-line options.

  --host TEXT                     host connection string  [required]
  --port INTEGER                  Db Port  [required]
  --user TEXT                     Db username  [required]
  --password TEXT                 Db password  [required]
  --index-name TEXT               Db index name  [required]
  --filter-type TEXT              post_filter|pre_filter|efficient_filter
  --number-of-regions INTEGER     Vector number of regions
  --m INTEGER                     hnsw m  [required]
  --ef-construction INTEGER       hnsw ef-construction  [required]
  --ef-search INTEGER             hnsw ef-search  [required]

Run PolarDB from command line

PolarDB supports index types: faiss_hnsw_flat, faiss_hnsw_pq, and faiss_hnsw_sq.

Example: Run faiss_hnsw_flat benchmark

vectordbbench polardbhnswflat \
  --case-type Performance768D1M \
  --username <db_user> \
  --password '<db_password>' \
  --host <db_host> \
  --port 3306 \
  --m 16 \
  --ef-construction 256 \
  --ef-search 256 \
  --insert-workers 64 \
  --num-concurrency '10,20,40,60,80' \
  --concurrency-duration 60 \
  --task-label <task_label> \
  --db-label <db_label> \
  --skip-search-serial \
  --post-load-index

To list the options for PolarDB, execute vectordbbench polardbhnswflat --help. The following are some PolarDB-specific command-line options.

  --username TEXT                  Username  [required]
  --password TEXT                  Password
  --host TEXT                      Db host  [default: 127.0.0.1]
  --port INTEGER                   Db Port  [default: 3306]
  --database TEXT                  Database name  [default: vectordbbench]
  --m INTEGER                      M parameter (max_degree) in HNSW
  --ef-construction INTEGER        ef_construction parameter in HNSW
  --ef-search INTEGER              polar_vector_index_hnsw_ef_search session variable
  --insert-workers INTEGER         Number of concurrent threads for data insertion
  --post-load-index / --inline-index
                                   Create index after load or inline at table creation

Using a configuration file.

The vectordbbench command can optionally read some or all of its options from a YAML-formatted configuration file.

By default, configuration files are expected to be in vectordb_bench/config-files/. This can be overridden by setting the environment variable CONFIG_LOCAL_DIR or by passing the full path to the file.

The required format is:

commandname:
   parameter_name: parameter_value
   parameter_name: parameter_value

Example:

pgvectorhnsw:
  db_label: pgConfigTest
  user_name: vectordbbench
  password: vectordbbench
  db_name:  vectordbbench
  host: localhost
  m: 16
  ef_construction: 128
  ef_search: 128
milvushnsw:
  skip_search_serial: True
  case_type: Performance1536D50K
  uri: http://localhost:19530
  m: 16
  ef_construction: 128
  ef_search: 128
  drop_old: False
  load: False
elasticcloudhnsw:
  db_label: elastic-cloud-hnsw
  cloud_id: <your-cloud-id>
  password: <your-password>
  case_type: Performance768D1M
  m: 16
  ef_construction: 100
  num_candidates: 100
  number_of_shards: 1
  number_of_replicas: 0
  refresh_interval: 30s
  element_type: float

Notes:

  • Options passed on the command line will override the configuration file
  • Parameter names use underscores (_), not hyphens (-)
  • For LabelFilterPerformanceCase and NewIntFilterPerformanceCase, you must specify dataset_with_size_type in addition to case_type
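
The first two notes can be illustrated with a small sketch. Both helpers below are hypothetical and only mirror the documented behavior (underscore parameter names, command-line values taking precedence); they are not how VDBBench itself is implemented.

```python
def flag_to_key(flag: str) -> str:
    """Turn a command-line flag such as --ef-construction into its
    configuration-file parameter name (ef_construction)."""
    return flag.lstrip("-").replace("-", "_")


def merge_options(file_options: dict, cli_options: dict) -> dict:
    """Combine both sources; values given on the command line win."""
    merged = dict(file_options)
    merged.update(cli_options)
    return merged


# Values from the pgvectorhnsw example above, with ef_search overridden
# as if --ef-search 256 had been passed on the command line.
file_options = {"m": 16, "ef_construction": 128, "ef_search": 128}
cli_options = {flag_to_key("--ef-search"): 256}
effective = merge_options(file_options, cli_options)
print(effective)
```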

Using a batch configuration file.

The vectordbbench command can read a batch configuration file and run all the test cases defined in that YAML-formatted file.

By default, configuration files are expected to be in vectordb_bench/config-files/. This can be overridden by setting the environment variable CONFIG_LOCAL_DIR or by passing the full path to the file.

The required format is:

commandname:
  - parameter_name: parameter_value
    another_parameter_name: parameter_value

Example:

pgvectorhnsw:
  - db_label: pgConfigTest
    user_name: vectordbbench
    password: vectordbbench
    db_name:  vectordbbench
    host: localhost
    m: 16
    ef_construction: 128
    ef_search: 128
milvushnsw:
  - skip_search_serial: True
    case_type: Performance1536D50K
    uri: http://localhost:19530
    m: 16
    ef_construction: 128
    ef_search: 128
    drop_old: False
    load: False
elasticcloudhnsw:
  - db_label: elastic-cloud-hnsw-test-1
    cloud_id: <your-cloud-id>
    password: <your-password>
    case_type: Performance768D1M
    m: 16
    ef_construction: 100
    num_candidates: 100
  - db_label: elastic-cloud-label-filter-0.1
    cloud_id: <your-cloud-id>
    password: <your-password>
    case_type: LabelFilterPerformanceCase
    dataset_with_size_type: "Medium OpenAI (1536dim, 500K)"
    label_percentage: 0.001
    m: 16
    ef_construction: 128
    num_candidates: 100
    num_concurrency: "1,5"

Notes:

  • Options can only be passed through configuration files
  • Parameter names use underscores (_), not hyphens (-)
  • For LabelFilterPerformanceCase and NewIntFilterPerformanceCase, you must specify dataset_with_size_type in addition to case_type
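
Because a batch file cannot be corrected from the command line, it can help to check its structure before starting a run. The sketch below works on an already-parsed configuration (a plain dictionary, as a YAML loader would produce) and checks only the rules stated in the notes above. The function name is hypothetical, and this is not a validator shipped with VDBBench.

```python
FILTER_CASES = {"LabelFilterPerformanceCase", "NewIntFilterPerformanceCase"}


def check_batch_config(config: dict) -> list[str]:
    """Return a list of problems found in a parsed batch configuration."""
    problems = []
    for command, runs in config.items():
        # The batch format maps each command name to a list of runs.
        if not isinstance(runs, list):
            problems.append(f"{command}: expected a list of runs")
            continue
        for index, run in enumerate(runs):
            for key in run:
                if "-" in key:
                    problems.append(f"{command}[{index}]: '{key}' should use _ not -")
            needs_dataset = run.get("case_type") in FILTER_CASES
            if needs_dataset and "dataset_with_size_type" not in run:
                problems.append(f"{command}[{index}]: missing dataset_with_size_type")
    return problems


example = {
    "elasticcloudhnsw": [
        {"case_type": "Performance768D1M", "m": 16},
        {"case_type": "LabelFilterPerformanceCase", "label-percentage": 0.001},
    ]
}
for problem in check_batch_config(example):
    print(problem)
```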

How to use?

vectordbbench batchcli --batch-config-file <your-yaml-configuration-file>

Leaderboard

Introduction

To present test results and provide a comprehensive performance analysis, we offer a leaderboard page. It lets you choose among QPS, QP$, and latency metrics, and gives a comprehensive assessment of each system's performance based on its test results across cases and the scoring rules described below. On the leaderboard, you can select the systems and models to compare and filter out cases you do not want to consider. Comprehensive scores are always ranked from best to worst, and the specific test results of each query are presented in the list below the scores.

Scoring Rules

  1. For each case, select a base value and score each system based on relative values.

    • For QPS and QP$, we use the highest value as the reference, denoted as base_QPS or base_QP$, and the score of each system is (QPS/base_QPS) * 100 or (QP$/base_QP$) * 100.
    • For Latency, we use the lowest value as the reference, that is, base_Latency, and the score of each system is (base_Latency + 10ms)/(Latency + 10ms) * 100.

    We want to give equal weight to different cases, and not let a case with high absolute result values become the sole reason for the overall scoring. Therefore, when scoring different systems in each case, we need to use relative values.

    Also, for Latency, we add 10ms to the numerator and denominator to ensure that if every system performs particularly well in a case, its advantage will not be infinitely magnified when latency tends to 0.

  2. For systems that fail or time out in a particular case, we assign a score based on a value that is worse than the worst result by a factor of two. For example, for QPS or QP$, it would be half the lowest value. For Latency, it would be twice the maximum value.

  3. For each system, we will take the geometric mean of its scores in all cases as its comprehensive score for a particular metric.
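
The three rules above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the leaderboard's actual implementation. The system names and values are made up, and None marks a run that failed or timed out.

```python
from math import prod

LATENCY_OFFSET_MS = 10.0  # added to numerator and denominator (rule 1)


def _fill_failures(values: dict, penalty: float) -> dict:
    """Replace None (failed or timed-out runs) with the penalty value."""
    return {name: penalty if v is None else v for name, v in values.items()}


def throughput_scores(values: dict) -> dict:
    """Score QPS or QP$ for one case; the highest value is the base."""
    completed = [v for v in values.values() if v is not None]
    base = max(completed)
    filled = _fill_failures(values, min(completed) / 2)  # rule 2: half the lowest
    return {name: v / base * 100 for name, v in filled.items()}


def latency_scores(values_ms: dict) -> dict:
    """Score latency for one case; the lowest value is the base."""
    completed = [v for v in values_ms.values() if v is not None]
    base = min(completed)
    filled = _fill_failures(values_ms, max(completed) * 2)  # rule 2: twice the highest
    return {
        name: (base + LATENCY_OFFSET_MS) / (v + LATENCY_OFFSET_MS) * 100
        for name, v in filled.items()
    }


def comprehensive_score(case_scores: list) -> float:
    """Rule 3: geometric mean of one system's scores across all cases."""
    return prod(case_scores) ** (1 / len(case_scores))


# Hypothetical results for three systems in a single case.
qps = throughput_scores({"A": 1000.0, "B": 500.0, "C": None})
latency = latency_scores({"A": 10.0, "B": 30.0, "C": None})
print(qps)
print(latency)
# A system scoring 25 in one case and 100 in another.
print(comprehensive_score([25.0, 100.0]))
```

In this example, system C failed the QPS case, so it is scored as if it had reached 250 QPS (half of B's 500), which gives it 25 points. The geometric mean in rule 3 keeps one very high score from hiding a weak result in another case.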

Build on your own

Install requirements

pip install -e '.[test]'

pip install -e '.[pinecone]'

Run test server

python -m vectordb_bench

OR:

init_bench

OR:

If you are using a dev container, create the following dataset directory first:

# Mount local ~/vectordb_bench/dataset to the container's /tmp/vectordb_bench/dataset.
# If you are not comfortable with the path name, feel free to change it in devcontainer.json
mkdir -p ~/vectordb_bench/dataset

After reopening the repository in the container, run python -m vectordb_bench in the container's bash shell.

Check coding styles

make lint

To fix the coding styles automatically

make format

How does it work?

Result Page

This is the main page of VDBBench, which displays the standard benchmark results we provide. Additionally, results of all tests performed by users themselves will also be shown here. We also offer the ability to select and compare results from multiple tests simultaneously.

The standard benchmark results displayed here include all 15 cases that we currently support for 6 of our clients (Milvus, Zilliz Cloud, Elastic Search, Qdrant Cloud, Weaviate Cloud and PgVector). However, as some systems may not be able to complete all the tests successfully due to issues like Out of Memory (OOM) or timeouts, not all clients are included in every case.

All standard benchmark results are generated by a client running on an 8 core, 32 GB host, which is located in the same region as the server being tested. The client host is equipped with an Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz processor. Also all the servers for the open-source systems tested in our benchmarks run on hosts with the same type of processor.

Run Test Page

  1. Initially, you select the systems to be tested - multiple selections are allowed. Once selected, corresponding forms will pop up to gather necessary information for using the chosen databases. The db_label is used to differentiate different instances of the same system. We recommend filling in the host size or instance type here (as we do in our standard results).
  2. The next step is to select the test cases you want to perform. You can select multiple cases at once, and a form to collect corresponding parameters will appear.
  3. Finally, you'll need to provide a task label to distinguish different test results. Using the same label for different tests will result in the previous results being overwritten. Currently, only one task can run at a time.

Module

Code Structure

image

Client

Our client module is designed with flexibility and extensibility in mind, aiming to integrate APIs from different systems seamlessly. As of now, it supports Milvus, Zilliz Cloud, Elastic Search, Pinecone, Qdrant Cloud, Weaviate Cloud, PgVector, VectorChord, Redis, Chroma, CockroachDB, etc. Stay tuned for more options, as we are consistently working on extending our reach to other systems.

Benchmark Cases

We've developed lots of comprehensive benchmark cases to test vector databases' various capabilities, each designed to give you a different piece of the puzzle. These cases are categorized into four main types:

Capacity Case

  • Large Dim: Tests the database's loading capacity by inserting large-dimension vectors (GIST 100K vectors, 960 dimensions) until fully loaded. The final number of inserted vectors is reported.
  • Small Dim: Similar to the Large Dim case but uses small-dimension vectors (SIFT 500K vectors, 128 dimensions).

Search Performance Case

  • XLarge Dataset: Measures search performance with a massive dataset (LAION 100M vectors, 768 dimensions) at varying parallel levels. The results include index building time, recall, latency, and maximum QPS.
  • Large Dataset: Similar to the XLarge Dataset case, but uses a slightly smaller dataset (10M-1024dim, 10M-768dim, 5M-1536dim).
  • Medium Dataset: A case using a medium dataset (1M-1024dim, 1M-768dim, 500K-1536dim).
  • Small Dataset: For development (100K-768dim, 50K-1536dim).

Filtering Search Performance Case

  • Int-Filter Cases: Evaluates search performance with int-based filter expressions (e.g., "id >= 2,000").
  • Label-Filter Cases: Evaluates search performance with label-based filter expressions (e.g., "color == 'red'"). The test includes randomly generated labels to simulate real-world filtering scenarios.

Streaming Cases

  • Insertion-Under-Load Case: Evaluates search performance while maintaining a constant insertion workload. VDBBench applies a steady stream of insert requests at a fixed rate to simulate real-world scenarios where search operations must perform reliably under continuous data ingestion.

Each case provides an in-depth examination of a vector database's abilities, providing you a comprehensive view of the database's performance.

Custom Dataset for Performance case

Through the /custom page, users can customize their own performance case using local datasets. After saving, the corresponding case can be selected from the /run_test page to perform the test.


We have strict requirements for the dataset format; please follow them.

  • Folder Path - The path to the folder containing all the files. Please ensure that all files in the folder are in the Parquet format.

    • Vectors data files: The file must be named train.parquet and should have two columns: id as an incrementing int and emb as an array of float32.
    • Query test vectors: The file must be named test.parquet and should have two columns: id as an incrementing int and emb as an array of float32.
      • We recommend limiting the number of test query vectors, for example to 1,000. When conducting concurrent query tests, VDBBench creates a large number of processes. To minimize additional communication overhead during testing, we prepare a complete set of test queries for each process, allowing them to run independently. However, this means that as the number of concurrent processes increases, the number of copied query vectors also increases significantly, which can place substantial pressure on memory resources.
    • Ground truth file: The file must be named neighbors.parquet and should have two columns: id corresponding to query vectors and neighbors_id as an array of int.
  • Train File Count - If the vector file is too large, you can consider splitting it into multiple files. The naming format for the split files should be train-[index]-of-[file_count].parquet. For example, train-01-of-10.parquet represents the second file (0-indexed) among 10 split files.

  • Use Shuffled Data - If you check this option, the vector data files need to be modified. VDBBench will load the data labeled with shuffle. For example, use shuffle_train.parquet instead of train.parquet and shuffle_train-04-of-10.parquet instead of train-04-of-10.parquet. The id column in the shuffled data can be in any order.
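The naming rules above can be checked with a small helper before pointing the /custom page at a folder. This is a minimal sketch, not code from VDBBench: `train_file_names` and `query_copy_bytes` are hypothetical names, and the two-digit zero padding of the file index is an assumption inferred from the examples train-01-of-10.parquet and train-04-of-10.parquet. The second function gives a rough lower bound on the memory used by the per-process copies of the query vectors described above.

```python
def train_file_names(file_count: int, use_shuffled: bool = False) -> list[str]:
    """List the train file names expected in a custom dataset folder."""
    if file_count < 1:
        raise ValueError("file_count must be at least 1")
    # Shuffled data uses the shuffle_ prefix on every train file.
    prefix = "shuffle_train" if use_shuffled else "train"
    if file_count == 1:
        return [f"{prefix}.parquet"]
    # Assumption: two-digit, zero-padded, 0-indexed file index.
    return [f"{prefix}-{i:02d}-of-{file_count}.parquet" for i in range(file_count)]


def query_copy_bytes(num_queries: int, dim: int, processes: int) -> int:
    """Raw float32 payload when each process holds its own copy of the queries."""
    bytes_per_float32 = 4
    return num_queries * dim * bytes_per_float32 * processes
```

For example, 1,000 query vectors of 768 dimensions take about 3 MB of raw float32 data per process, so the total grows linearly with the number of concurrent processes. Actual memory use will be higher because of Python and Parquet overhead.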

Goals

Our goals for this benchmark are:

Reproducibility & Usability

One of the primary goals of VDBBench is to enable users to reproduce benchmark results swiftly and easily, or to test their customized scenarios. We believe that lowering the barriers to entry for conducting these tests will enhance the community's understanding and improvement of vector databases. We aim to create an environment where any user, regardless of their technical expertise, can quickly set up and run benchmarks, and view and analyze results in an intuitive manner.

Representability & Realism

VDBBench aims to provide a more comprehensive, multi-faceted testing environment that accurately represents the complexity of vector databases. By moving beyond a simple speed test for algorithms, we hope to contribute to a better understanding of vector databases in real-world scenarios. By incorporating as many complex scenarios as possible, including a variety of test cases and datasets, we aim to reflect realistic conditions and offer tangible significance to our community. Our goal is to deliver benchmarking results that can drive tangible improvements in the development and usage of vector databases.

Contribution

General Guidelines

  1. Fork the repository and create a new branch for your changes.
  2. Adhere to coding conventions and formatting guidelines.
  3. Use clear commit messages to document the purpose of your changes.

Adding New Clients

Step 1: Creating New Client Files

  1. Navigate to the vectordb_bench/backend/clients directory.
  2. Create a new folder for your client, for example, "new_client".
  3. Inside the "new_client" folder, create two files: new_client.py and config.py.

Step 2: Implement new_client.py and config.py

  1. Open new_client.py and define the NewClient class, which should inherit from the VectorDB abstract class defined in clients/api.py. The VectorDB class serves as the API for benchmarking, and all DB clients must implement this abstract class. Example implementation in new_client.py:
from ..api import VectorDB


class NewClient(VectorDB):
    # Implement the abstract methods defined in the VectorDB class
    ...
  2. Open config.py and implement the DBConfig and optional DBCaseConfig classes.
  3. The DBConfig class should be an abstract class that provides information necessary to establish connections with the database. It is recommended to use the pydantic.SecretStr data type to handle sensitive data such as tokens, URIs, or passwords.
  4. The DBCaseConfig class is optional and allows for providing case-specific configurations for the database. If not provided, it defaults to EmptyDBCaseConfig. Example implementation in config.py:
from pydantic import SecretStr

from ..api import DBConfig, DBCaseConfig


class NewClientConfig(DBConfig):
    # Implement the required configuration fields for the database connection
    # ...
    token: SecretStr
    uri: str


class NewClientCaseConfig(DBCaseConfig):
    # Implement optional case-specific configuration fields
    ...

Step 3: Importing the DB Client and Updating Initialization

In this final step, you will import your DB client into clients/__init__.py and update the initialization process.

  1. Open clients/__init__.py and import your NewClient from new_client.py.
  2. Add your NewClient to the DB enum.
  3. Update the db2client dictionary by adding an entry for your NewClient. Example implementation in clients/__init__.py:
#clients/__init__.py

# Add NewClient to the DB enum
class DB(Enum):
    ...
    NewClient = "NewClient"

    @property
    def init_cls(self) -> Type[VectorDB]:
        ...
        if self == DB.NewClient:
            from .new_client.new_client import NewClient
            return NewClient
        ...

    @property
    def config_cls(self) -> Type[DBConfig]:
        ...
        if self == DB.NewClient:
            from .new_client.config import NewClientConfig
            return NewClientConfig
        ...

    def case_config_cls(self, ...):
        if self == DB.NewClient:
            from .new_client.config import NewClientCaseConfig
            return NewClientCaseConfig

Step 4: Implement new_client/cli.py and vectordb_bench/cli/vectordbbench.py

In this (optional, but encouraged) step you will enable the test to be run from the command line.

  1. Navigate to the vectordb_bench/backend/clients/"client" directory.
  2. Inside the "client" folder, create a cli.py file. Using zilliz as an example cli.py:
from typing import Annotated, Unpack

import click
import os
from pydantic import SecretStr

from vectordb_bench.cli.cli import (
    CommonTypedDict,
    cli,
    click_parameter_decorators_from_typed_dict,
    run,
)
from vectordb_bench.backend.clients import DB


class ZillizTypedDict(CommonTypedDict):
    uri: Annotated[
        str, click.option("--uri", type=str, help="uri connection string", required=True)
    ]
    user_name: Annotated[
        str, click.option("--user-name", type=str, help="Db username", required=True)
    ]
    password: Annotated[
        str,
        click.option("--password",
                     type=str,
                     help="Zilliz password",
                     default=lambda: os.environ.get("ZILLIZ_PASSWORD", ""),
                     show_default="$ZILLIZ_PASSWORD",
                     ),
    ]
    level: Annotated[
        str,
        click.option("--level", type=str, help="Zilliz index level", required=False),
    ]


@cli.command()
@click_parameter_decorators_from_typed_dict(ZillizTypedDict)
def ZillizAutoIndex(**parameters: Unpack[ZillizTypedDict]):
    from .config import ZillizCloudConfig, AutoIndexConfig

    run(
        db=DB.ZillizCloud,
        db_config=ZillizCloudConfig(
            db_label=parameters["db_label"],
            uri=SecretStr(parameters["uri"]),
            user=parameters["user_name"],
            password=SecretStr(parameters["password"]),
        ),
        db_case_config=AutoIndexConfig(
            params={parameters["level"]},
        ),
        **parameters,
    )
  3. Update cli by adding:
    1. Add database specific options as an Annotated TypedDict, see ZillizTypedDict above.
    2. Add index configuration specific options as an Annotated TypedDict. (example: vectordb_bench/backend/clients/pgvector/cli.py)
      1. May not be needed if there is only one index config.
      2. Repeat for each index configuration, nesting them if possible.
    3. Add an index-config-specific function for each index type, see Zilliz above. The function name, in lowercase, will be the command name passed to the vectordbbench command.
    4. Update db_config and db_case_config to match client requirements
    5. Continue to add new functions for each index config.
    6. Import the client cli module and command to vectordb_bench/cli/vectordbbench.py (for databases with multiple commands (index configs), this only needs to be done for one command)
    7. Import the get_custom_case_config function from vectordb_bench/cli/cli.py and use it to add a new key custom_case to the parameters variable within the command.

cli modules with multiple index configs:

  • pgvector: vectordb_bench/backend/clients/pgvector/cli.py
  • milvus: vectordb_bench/backend/clients/milvus/cli.py

That's it! You have successfully added a new DB client to the vectordb_bench project.

Rules

Installation

The system under test can be installed in any form to achieve optimal performance. This includes but is not limited to binary deployment, Docker, and cloud services.

Fine-Tuning

For the system under test, we use the default server-side configuration to maintain the authenticity and representativeness of our results. For the Client, we welcome any parameter tuning to obtain better results.

Incomplete Results

Many databases may not be able to complete all test cases due to issues such as Out of Memory (OOM), crashes, or timeouts. In these scenarios, we will clearly state these occurrences in the test results.

Mistake Or Misrepresentation

We strive for accuracy in learning and supporting various vector databases, yet there might be oversights or misapplications. For any such occurrences, feel free to raise an issue or make amendments on our GitHub page.

Timeout

In our pursuit to ensure that our benchmark reflects the reality of a production environment while guaranteeing the practicality of the system, we have implemented a timeout plan based on our experiences for various tests.

1. Capacity Case:

  • For Capacity Case, we have assigned an overall timeout.

2. Other Cases:

For other cases, we have set two timeouts:

  • Data Loading Timeout: This timeout is designed to filter out systems that are too slow in inserting data, thus ensuring that we only consider systems that are able to cope with the demands of a real-world production environment within a reasonable time frame.

  • Optimization Preparation Timeout: This timeout is established to avoid excessive optimization strategies that might work for benchmarks but fail to deliver in real production environments. By doing this, we ensure that the systems we consider are not only suitable for testing environments but also applicable and efficient in production scenarios.

This multi-tiered timeout approach allows our benchmark to be more representative of actual production environments and assists us in identifying systems that can truly perform in real-world scenarios.

| Case | Data Size | Timeout Type | Value |
| --- | --- | --- | --- |
| Capacity Case | N/A | Loading timeout | 24 hours |
| Other Cases | 1M vectors, 768 dimensions; 500K vectors, 1536 dimensions | Loading timeout | 2.5 hours |
| | | Optimization timeout | 15 mins |
| Other Cases | 10M vectors, 768 dimensions; 5M vectors, 1536 dimensions | Loading timeout | 25 hours |
| | | Optimization timeout | 2.5 hours |
| Other Cases | 100M vectors, 768 dimensions | Loading timeout | 250 hours |
| | | Optimization timeout | 25 hours |

Note: Some datapoints in the standard benchmark results that violate this timeout will be kept for now for reference. We will remove them in the future.
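For scripting around these limits, the timeout values can be expressed as a simple lookup. This is only a sketch of the listed values, not VDBBench's internal configuration: the names `PERFORMANCE_TIMEOUTS`, `CAPACITY_TIMEOUT`, and `timeouts_for` are hypothetical, and dataset sizes that are not listed (for example the 1024-dimension datasets) are deliberately left out rather than guessed.

```python
HOUR = 3600
MINUTE = 60

# Overall timeout for the Capacity Case, in seconds.
CAPACITY_TIMEOUT = 24 * HOUR

# (vectors, dimensions) -> (loading timeout, optimization timeout), in seconds.
PERFORMANCE_TIMEOUTS = {
    (1_000_000, 768): (int(2.5 * HOUR), 15 * MINUTE),
    (500_000, 1536): (int(2.5 * HOUR), 15 * MINUTE),
    (10_000_000, 768): (25 * HOUR, int(2.5 * HOUR)),
    (5_000_000, 1536): (25 * HOUR, int(2.5 * HOUR)),
    (100_000_000, 768): (250 * HOUR, 25 * HOUR),
}


def timeouts_for(vectors: int, dim: int) -> tuple[int, int]:
    """Return (loading, optimization) timeouts; raises KeyError if unlisted."""
    return PERFORMANCE_TIMEOUTS[(vectors, dim)]
```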

Release History

| Version | Changes | Urgency | Date |
| --- | --- | --- | --- |
| v1.0.22 | ## What's Changed * feat(seekdb): add SeekDB backend and HNSW benchmark support by @liuhao6741 in https://github.com/zilliztech/VectorDBBench/pull/770 * fix: Require pymilvus<3.0.0 and fix the overflow size by @XuanYang-cn in https://github.com/zilliztech/VectorDBBench/pull/781 * feat(oceanbase): configurable index params, KEY partitioning, HNSW_BQ cosine support by @wyfanxiao in https://github.com/zilliztech/VectorDBBench/pull/776 ## New Contributors * @liuhao6741 made their first cont | High | 5/15/2026 |
| v1.0.21 | VDBBench 1.0.21 adds Lindorm, PolarDB, Apache Pinot, VectorChord, and Intel SVS support; migrates Milvus benchmarks to `MilvusClient`; adds concurrent insert support; upgrades to Pydantic v2; improves Streamlit error surfacing; and refreshes benchmark results for Milvus, ElasticCloud, ZillizCloud, and turbopuffer. | High | 4/27/2026 |
| v1.0.20 | ## What's Changed * Added Endee Client Support by @MithunEndee in https://github.com/zilliztech/VectorDBBench/pull/711 * fix: Use StrEnum instead of str, Enum by @XuanYang-cn in https://github.com/zilliztech/VectorDBBench/pull/722 * Feat/endee version by @MithunEndee in https://github.com/zilliztech/VectorDBBench/pull/715 * fix: Fix benchmark results display for paswordless AWS OpenSearch benchmarks by @javiervegas in https://github.com/zilliztech/VectorDBBench/pull/721 * feat: add Pinecone | Low | 2/12/2026 |
| v1.0.19 | ## What's Changed * Add Chroma cli to VectorDBBench by @bjpietrzak in https://github.com/zilliztech/VectorDBBench/pull/610 * enhance: Change the Authors and Emails by @XuanYang-cn in https://github.com/zilliztech/VectorDBBench/pull/697 * feat: Update CockroachDB logo for better visibility by @viragtripathi in https://github.com/zilliztech/VectorDBBench/pull/702 * enhance: fix coding styles by @XuanYang-cn in https://github.com/zilliztech/VectorDBBench/pull/703 * fix: update logging messages | Low | 1/29/2026 |
| v1.0.18 | ## What's Changed * feat: add OceanBase UI config settings by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/691 * support restful by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/690 **Full Changelog**: https://github.com/zilliztech/VectorDBBench/compare/v1.0.17...v1.0.18 | Low | 1/4/2026 |
| v1.0.17 | ## What's Changed * feat(oss-opensearch): Add memory-optimized search configuration option by @Akhil-Pathivada in https://github.com/zilliztech/VectorDBBench/pull/673 * feat: Add QPS-Latency tradeoff metrics for Streaming Tests by @Akhil-Pathivada in https://github.com/zilliztech/VectorDBBench/pull/670 * feat(oss-opensearch): Add Disk-based vector search support by @Akhil-Pathivada in https://github.com/zilliztech/VectorDBBench/pull/680 * feat(milvus): add SCANN index support by @JackLCL in | Low | 12/26/2025 |
| v1.0.16 | ## What's Changed * expose turbopuffer through CLI and make compatible with latest sdk by @lantingchiang in https://github.com/zilliztech/VectorDBBench/pull/667 * Custom log file placement by @mottosen in https://github.com/zilliztech/VectorDBBench/pull/669 * make collection name configurable by @lantingchiang in https://github.com/zilliztech/VectorDBBench/pull/671 ## New Contributors * @lantingchiang made their first contribution in https://github.com/zilliztech/VectorDBBench/pull/667 * | Low | 12/12/2025 |
| v1.0.15 | ## What's Changed * fix(cockroachdb): Handle 30s timeout and ensure vector index usage by @viragtripathi in https://github.com/zilliztech/VectorDBBench/pull/647 * fix es quant & rescore settings by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/648 * Update cli.py fix bug (#649) by @WSL0809 in https://github.com/zilliztech/VectorDBBench/pull/651 * Several modifications to the AliSQL client by @JoeJRW in https://github.com/zilliztech/VectorDBBench/pull/657 * feat(cockroa | Low | 12/5/2025 |
| v1.0.14 | ## What's Changed * fix v1.0.13 by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/644 * Feature: support doris benchmark by @zhiqiang-hhhh in https://github.com/zilliztech/VectorDBBench/pull/631 * feat: Add CockroachDB vector database support by @viragtripathi in https://github.com/zilliztech/VectorDBBench/pull/630 * feat: support turbopuffer client by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/646 * fix lint issues; remove client dependency fro | Low | 11/26/2025 |
| v1.0.13 | ## What's Changed * feat(oss-opensearch): Add version compatibility for 2.x and 3.x by @Akhil-Pathivada in https://github.com/zilliztech/VectorDBBench/pull/635 * Add TencentES client by @morning-color in https://github.com/zilliztech/VectorDBBench/pull/623 * feat: add ujson dependency and fix lint issues by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/643 * set default milvus-partition-key to false by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull | Low | 11/25/2025 |
| v1.0.12 | ## What's Changed * fixing README.md file to update pip install command by @towfeeqfayaz11 in https://github.com/zilliztech/VectorDBBench/pull/622 * aws opensearch added ondisk mode and binary quantization 32x compression by @norrishuang in https://github.com/zilliztech/VectorDBBench/pull/625 * cli support run LabelFilterPerformanceCase by @egolearner in https://github.com/zilliztech/VectorDBBench/pull/626 * feat(oss-opensearch): Add replication type configuration option by @Akhil-Pathivada | Low | 11/7/2025 |
| v1.0.11 | ## What's Changed * Update cli.py fix bug by @stitchWzc in https://github.com/zilliztech/VectorDBBench/pull/608 * Update cli.py fix bug by @stitchWzc in https://github.com/zilliztech/VectorDBBench/pull/607 * Update cli.py by @stitchWzc in https://github.com/zilliztech/VectorDBBench/pull/609 * Do not store vector data in source, fixed compatible issue of latest version of opensearch-py by @norrishuang in https://github.com/zilliztech/VectorDBBench/pull/613 * feat: Add concurrency duration co | Low | 10/23/2025 |
| v1.0.10 | ## What's Changed * Add S3vectors cli support by @acanadil in https://github.com/zilliztech/VectorDBBench/pull/600 * fix: remove psycopg dependency from rate_runner by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/606 **Full Changelog**: https://github.com/zilliztech/VectorDBBench/compare/v1.0.9...v1.0.10 | Low | 9/18/2025 |
| v1.0.9 | ## What's Changed * Optimize the Hologres test execution method in the README document. by @TimothyDing in https://github.com/zilliztech/VectorDBBench/pull/597 * fix ruff lint issue by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/598 * feat(pgvector): add new label/int-filter support and refactor client by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/586 ## New Contributors * @TimothyDing made their first contribution in https://github.com/zil | Low | 9/12/2025 |
| v1.0.8 | ## What's Changed * Add: CLI support for --dataset-with-size-type and --filter-rate in NewIntFilterPerformanceCase by @pitiless0514 in https://github.com/zilliztech/VectorDBBench/pull/593 * update hologres client, use hybrid index by @biaozy in https://github.com/zilliztech/VectorDBBench/pull/594 * bugfix: remove metric_type from GPUBruteForceConfig by @lee-taejun in https://github.com/zilliztech/VectorDBBench/pull/595 ## New Contributors * @pitiless0514 made their first contribution in h | Low | 9/5/2025 |
| v1.0.7 | ## What's Changed * add env parameters of dataset download from AWS S3 Or Aliyun OSS, cli add num-shards of zilliz cloud and Modify the OpenSearch cluster parameter configuration method to be compatible with opensearch-py 3.0 by @norrishuang in https://github.com/zilliztech/VectorDBBench/pull/583 * optimize:milvus add replica-number parameter by @liyunqiu666 in https://github.com/zilliztech/VectorDBBench/pull/588 * Added Alibaba Cloud Hologres support to VectorDBBench. by @xiaolanlianhua in h | Low | 8/29/2025 |
| v1.0.6 | ## What's Changed * Add S3vector Engine for AWS OpenSearch, fixed some bugs of AWS OpenSearch cli parameters by @norrishuang in https://github.com/zilliztech/VectorDBBench/pull/569 * Fix missing assets by @emmanuel-ferdman in https://github.com/zilliztech/VectorDBBench/pull/575 * Add nbits parameter to IVF_PQ index and adapt new filter logic by @wyfanxiao in https://github.com/zilliztech/VectorDBBench/pull/576 * Added support for Product Quantization in pg_diskann by @wahajali in https://git | Low | 8/25/2025 |
| v1.0.5 | ## What's Changed * Fixed the issue where the welcome page image could not be loaded. by @zhuwenxing in https://github.com/zilliztech/VectorDBBench/pull/556 * Fix back to results page error by @zhuwenxing in https://github.com/zilliztech/VectorDBBench/pull/557 * Fix running `vectordbbench tidb` will return error by @JaySon-Huang in https://github.com/zilliztech/VectorDBBench/pull/559 * feat: Add OSS OpenSearch client support by @akhilpathivada in https://github.com/zilliztech/VectorDBBench/p | Low | 7/25/2025 |
| v1.0.4 | ## What's Changed * upgrade aliyun opensearch client by @hust-xing in https://github.com/zilliztech/VectorDBBench/pull/552 * Feat add ivf rabitq for command #553 by @MageChiu in https://github.com/zilliztech/VectorDBBench/pull/554 ## New Contributors * @MageChiu made their first contribution in https://github.com/zilliztech/VectorDBBench/pull/554 **Full Changelog**: https://github.com/zilliztech/VectorDBBench/compare/v1.0.3...v1.0.4 | Low | 7/4/2025 |
| v1.0.3 | ## What's Changed * update elastic_cloud results by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/548 * Fix: Correct typos in README by @triplechecker-com in https://github.com/zilliztech/VectorDBBench/pull/550 * fix bugs: remove None from download_files by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/551 ## New Contributors * @triplechecker-com made their first contribution in https://github.com/zilliztech/VectorDBBench/pull/550 **Full Chan | Low | 7/2/2025 |
| v1.0.2 | ## What's Changed * fix bug: set default num_shards to 1 by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/547 **Full Changelog**: https://github.com/zilliztech/VectorDBBench/compare/v1.0.1...v1.0.2 | Low | 6/19/2025 |
| v1.0.1 | ## What's Changed * generate leaderboard_v2 data by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/546 * update some docs by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/544 **Full Changelog**: https://github.com/zilliztech/VectorDBBench/compare/v1.0.0...v1.0.1 | Low | 6/18/2025 |
| v1.0.0 | ### VDBBench 1.0 is Here! We're excited to launch **VectorDBBench(VDBBench) 1.0**, a major update focused on a better user experience and more realistic testing. **What's New in 1.0:** * **🚀 Redesigned UI:** A brand new homepage and integrated analytics pages make it easier than ever to visualize and compare test results. * **🏷️ Label-Filter Tests:** Simulate real-world queries with new tests for filtered search (e.g., `color == "red"`). * **🌊 Streaming Scenarios:** Measure s | Low | 6/16/2025 |
| v0.0.30 | ## What's Changed * add --num-shards option for milvus performance test case by @LoveYou3000 in https://github.com/zilliztech/VectorDBBench/pull/526 * Add a batch cli to support the batch execution of multiple cases. by @LoveYou3000 in https://github.com/zilliztech/VectorDBBench/pull/530 * Fixing bugs in aws opensearch client and added fp16 support by @navneet1v in https://github.com/zilliztech/VectorDBBench/pull/529 * Bugfix: add num_shards option to MilvusHNSW by @LoveYou3000 in https://g | Low | 6/11/2025 |
| v0.0.29 | ## What's Changed * Add qdrant CLI support and fix support for optional fields by @s-h-a-d-o-w in https://github.com/zilliztech/VectorDBBench/pull/519 * update readme by @yuyuankang in https://github.com/zilliztech/VectorDBBench/pull/522 * Fixing Bugs in Benchmarking ClickHouse with vectordbbench by @yuyuankang in https://github.com/zilliztech/VectorDBBench/pull/523 * Add --concurrency-timeout option to avoid long time waiting by @LoveYou3000 in https://github.com/zilliztech/VectorDBBench/pu | Low | 5/16/2025 |
| v0.0.28 | ## What's Changed * fix: prevent the frontend from crashing on invalid indexes in results by @s-h-a-d-o-w in https://github.com/zilliztech/VectorDBBench/pull/513 * Add lancedb support by @s-h-a-d-o-w in https://github.com/zilliztech/VectorDBBench/pull/518 * Add --task-label option for cli by @LoveYou3000 in https://github.com/zilliztech/VectorDBBench/pull/517 ## New Contributors * @s-h-a-d-o-w made their first contribution in https://github.com/zilliztech/VectorDBBench/pull/513 * @LoveYo | Low | 5/7/2025 |
| v0.0.27 | ## What's Changed * fix bugs when use custom_dataset without groundtruth file by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/511 **Full Changelog**: https://github.com/zilliztech/VectorDBBench/compare/v0.0.26...v0.0.27 | Low | 4/30/2025 |
| v0.0.26 | ## What's Changed * add more milvus index types: hnsw sq/pq/prq; ivf rabitq by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/505 * add more milvus index types: ivf_pq by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/508 * Add HNSW support for Clickhouse client by @MansorY23 in https://github.com/zilliztech/VectorDBBench/pull/500 **Full Changelog**: https://github.com/zilliztech/VectorDBBench/compare/v0.0.25...v0.0.26 | Low | 4/28/2025 |
| v0.0.25 | ## What's Changed * fix cli crush by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/502 * fix bug with streamlit when running tests by @le-codeur-rapide in https://github.com/zilliztech/VectorDBBench/pull/504 ## New Contributors * @le-codeur-rapide made their first contribution in https://github.com/zilliztech/VectorDBBench/pull/504 **Full Changelog**: https://github.com/zilliztech/VectorDBBench/compare/v0.0.24...v0.0.25 | Low | 4/17/2025 |
| v0.0.24 | ## What's Changed * remove duplicated code by @yuyuankang in https://github.com/zilliztech/VectorDBBench/pull/490 * add Clickhouse benchmark by @MansorY23 in https://github.com/zilliztech/VectorDBBench/pull/495 * Add vespa client by @nuvotex-tk in https://github.com/zilliztech/VectorDBBench/pull/498 * remove redundant empty_field config check for qdrant and tidb by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/499 ## New Contributors * @yuyuankang made their first c | Low | 4/14/2025 |
| v0.0.23 | ## What's Changed * Support GPU_BRUTE_FORCE index for Milvus by @Rachit-Chaudhary11 in https://github.com/zilliztech/VectorDBBench/pull/476 * Add table quantization option for pgvector by @lucagiac81 in https://github.com/zilliztech/VectorDBBench/pull/427 * Support MariaDB database by @HugoWenTD in https://github.com/zilliztech/VectorDBBench/pull/375 * Add TiDB backend by @breezewish in https://github.com/zilliztech/VectorDBBench/pull/484 * CLI fix for GPU index by @Rachit-Chaudhary11 in ht | Low | 3/14/2025 |
| v0.0.22 | ## What's Changed * add mongodb client by @zhuwenxing in https://github.com/zilliztech/VectorDBBench/pull/449 * add some risk warnings for custom dataset by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/452 * Bump grpcio from 1.53.0 to 1.53.2 in /install by @dependabot in https://github.com/zilliztech/VectorDBBench/pull/453 * add mongodb config by @zhuwenxing in https://github.com/zilliztech/VectorDBBench/pull/451 * Opensearch interal configuration parameters by @Xavie | Low | 2/21/2025 |
| v0.0.21 | ## What's Changed * fix bug by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/442 * fix: Unable to run vebbench and cli by @XuanYang-cn in https://github.com/zilliztech/VectorDBBench/pull/447 * enhance: Unify optimize and remove ready_to_load by @XuanYang-cn in https://github.com/zilliztech/VectorDBBench/pull/448 **Full Changelog**: https://github.com/zilliztech/VectorDBBench/compare/v0.0.20...v0.0.21 | Low | 1/14/2025 |
| v0.0.20 | ## What's Changed * add aliyun Opensearch requirements by @hust-xing in https://github.com/zilliztech/VectorDBBench/pull/425 * update readme by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/432 * Removed the Filter Path from Opensearch Search, so we can get the full response from search by @Xavierantony1982 in https://github.com/zilliztech/VectorDBBench/pull/434 * add support to provide custom port in pgvector by @shaharuk-yb in https://github.com/zilliztech/VectorDBBen | Low | 1/9/2025 |
| v0.0.19 | ## What's Changed * fix: invalid value for --max-num-levels when using CLI (AlloyDB) by @Sheharyar570 in https://github.com/zilliztech/VectorDBBench/pull/415 * Add Milvus auth support through user_name and password fields by @teynar in https://github.com/zilliztech/VectorDBBench/pull/416 * support alibaba cloud elasticsearch by @xingshaomin in https://github.com/zilliztech/VectorDBBench/pull/418 * enhance: refine read write cases by @XuanYang-cn in https://github.com/zilliztech/VectorDBBenc | Low | 12/13/2024 |
| v0.0.18 | ## What's Changed * Added AlloyDB client by @Sheharyar570 in https://github.com/zilliztech/VectorDBBench/pull/412 * fix: Donot refresh load by @XuanYang-cn in https://github.com/zilliztech/VectorDBBench/pull/414 **Full Changelog**: https://github.com/zilliztech/VectorDBBench/compare/v0.0.17...v0.0.18 | Low | 11/28/2024 |
| v0.0.17 | ## What's Changed * Add rate runner by @XuanYang-cn in https://github.com/zilliztech/VectorDBBench/pull/403 * fix conc_latency_p99 calculation; add conc_latency_avg metric; conc_test first by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/410 **Full Changelog**: https://github.com/zilliztech/VectorDBBench/compare/v0.0.16...v0.0.17 | Low | 11/26/2024 |
| v0.0.16 | ## What's Changed * Fix code for custom dataset usage by @acanadil in https://github.com/zilliztech/VectorDBBench/pull/395 * remove older zillizcloud test results from leaderboard by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/399 * Fix pgvectorivfflat reranking key bug by @Sheharyar570 in https://github.com/zilliztech/VectorDBBench/pull/401 ## New Contributors * @acanadil made their first contribution in https://github.com/zilliztech/VectorDBBench/pull/395 **Fu | Low | 11/6/2024 |
| v0.0.15 | ## What's Changed * Support for pgdiskann client by @wahajali in https://github.com/zilliztech/VectorDBBench/pull/388 * increase timeout by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/390 * Binary Quantization Support for pgvector HNSW Algorithm by @Sheharyar570 in https://github.com/zilliztech/VectorDBBench/pull/389 * fix weaviate client bug by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/392 **Full Changelog**: https://github.com/zillizte | Low | 10/30/2024 |
v0.0.14## What's Changed * fix redis benchmark execution bug by @angus0wang in https://github.com/zilliztech/VectorDBBench/pull/358 * Add support for filtered search in pgvector by @wahajali in https://github.com/zilliztech/VectorDBBench/pull/362 * Add CLI Support for pgvectorscale by @Sheharyar570 in https://github.com/zilliztech/VectorDBBench/pull/365 * Add support for filtered search in pgvectorscale by @Sheharyar570 in https://github.com/zilliztech/VectorDBBench/pull/364 * Add quantization optLow10/28/2024
v0.0.13## What's Changed * add new db_config for better labeling: version, note by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/350 * Support for MemoryDB client by @baswanth09 in https://github.com/zilliztech/VectorDBBench/pull/339 * refactor: migrate to new pgvecto_rs sdk by @cutecutecat in https://github.com/zilliztech/VectorDBBench/pull/353 * Added pgvectorscale client by @Sheharyar570 in https://github.com/zilliztech/VectorDBBench/pull/355 ## New Contributors * @baswLow8/2/2024
v0.0.12## What's Changed * support custom_dataset by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/320 * Add client aws_opensearch by @cydrain in https://github.com/zilliztech/VectorDBBench/pull/342 ## New Contributors * @cydrain made their first contribution in https://github.com/zilliztech/VectorDBBench/pull/342 **Full Changelog**: https://github.com/zilliztech/VectorDBBench/compare/v0.0.11...v0.0.12Low7/18/2024
v0.0.11## What's Changed * the dataset_local_dir should be kept consistent by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/331 * Introduce a command line interface. by @greenhal in https://github.com/zilliztech/VectorDBBench/pull/330 * pgvector: ensure vector is sent in binary representation by @jkatz in https://github.com/zilliztech/VectorDBBench/pull/335 * new metric support: ndcg@100, performance under conc_test by @alwayslove2013 in https://github.com/zilliztech/VectorDBBLow7/1/2024
v0.0.10## What's Changed * fix bugs: should normalize cosine dataset when test with milvus gpu_index by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/326 * Increase the optimization time limit. by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/332 **Full Changelog**: https://github.com/zilliztech/VectorDBBench/compare/v0.0.9...v0.0.10Low6/7/2024
v0.0.9## What's Changed * Support pgvector-hnsw by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/311 * Remove last batch in insert_embeddings by @XuanYang-cn in https://github.com/zilliztech/VectorDBBench/pull/314 * Optimize pgvector test for semi-recent enhancements by @jkatz in https://github.com/zilliztech/VectorDBBench/pull/319 ## Bug fixes: * Pinning Streamlit to version 1.31.1 to avoid error #315 by @yorek in https://github.com/zilliztech/VectorDBBench/pull/316 * SkLow5/16/2024
v0.0.8## What's Changed * update zillizcloud results (2024-01) by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/295 * add zillizcloud failed results by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/296 * Add support for HNSW index in pgvector client by @wahajali in https://github.com/zilliztech/VectorDBBench/pull/294 ## Bug fixes * fix qdrant_cloud import bugs by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/300 * fix the label Low4/23/2024
v0.0.7## What's Changed * Improving the performance of PGVectors by @yugeeklab in https://github.com/zilliztech/VectorDBBench/pull/235 * add new search_params for zillzcloud: level by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/274 * add new index_type for milvus: ivf_sq8 by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/277 * add dev container support by @zhjwpku in https://github.com/zilliztech/VectorDBBench/pull/280 * enhance: Support for windows byLow3/13/2024
v0.0.6## New Benchmark Results * add test results of zilliz_cloud_beta_202401 by @alwayslove2013 in https://github.com/zilliztech/VectorDBBench/pull/258 ## New DB client * feat: add new db pgvecto.rs by @cutecutecat in https://github.com/zilliztech/VectorDBBench/pull/242 ## New features * Import database client while in use by @XuanYang-cn in https://github.com/zilliztech/VectorDBBench/pull/223 * [Milvus client]support gpu_index_type for milvus by @alwayslove2013 in https://github.com/zillLow1/18/2024
v0.0.5## What's Changed Fix some data points lost issues **Full Changelog**: https://github.com/zilliztech/VectorDBBench/compare/v0.0.4...v0.0.5Low8/15/2023
v0.0.4 (Low, 8/14/2023)
1. Add Chroma and Redis Client
2. Add 6 more cases and results

|cases|dataset|filter|description|
|-- | -- | -- | --|
| Search Performance Case | OpenAI generated 500K vectors, 1536 dimensions | N/A | Index building time, recall, latency, maximum QPS |
| Search Performance Case | OpenAI generated 5M vectors, 1536 dimensions | N/A | Index building time, recall, latency, maximum QPS |
| Filtering Search Performance Case | OpenAI generated 500K vectors, 1536 dimensions | 1% vectors | Index building… |

v0.0.3 (Low, 7/12/2023)
## What's New
1. Add pgvector client and benchmark results
2. Support Leaderboard: https://zilliz.com/benchmark
3. Retest and update qdrant, pinecone, milvus, and zillizcloud results
4. Add Timeout for all cases
## Enhancement
1. Sync all processes before multi-processing search starts.
2. Refine DataSet…

v0.0.2 (Low, 6/21/2023)
- Add Pinecone results
- Remove results of 100K
- Enable saving results as one image
- Enable selecting all or deselecting all filters
