
dingo


embedding-search embedding-store hybrid-search java key-value-distributed-store mysql-compatibility real-time-semantic-search serving structured-data

Why this rank: Strong adoption, healthy release cadence

Description

A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and ultra-low latency.

README

DingoDB

DingoDB is an open-source, distributed, multi-modal vector database designed and developed by DataCanvas. It integrates real-time strong consistency, relational semantics, and vector semantics into a unified platform. With horizontal scalability and elastic scaling, it meets enterprise-grade high-availability requirements. DingoDB also offers multi-language interfaces and compatibility with the MySQL protocol, which gives users flexibility in how they connect. Together, these capabilities make DingoDB a robust option for modern data-driven applications.

Key Features

1. Comprehensive access interface

DingoDB provides comprehensive access interfaces, supporting various flexible access modes such as SQL, SDK, and API to meet the needs of different developers. Additionally, it introduces Table and Vector as first-class citizen data models, providing users with efficient and powerful data processing capabilities.

2. Built-in data high availability

DingoDB provides a fully functional, highly available configuration out of the box, with no external components to deploy. This significantly reduces deployment, operation, and maintenance costs.

3. Fully automatic elastic data sharding

DingoDB supports dynamic configuration of the data shard size and automatic splitting and merging of shards, which allows efficient resource allocation and lets the system adapt to business growth.
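
To make the idea of size-driven splitting and merging concrete, here is a minimal sketch in Java. It is not DingoDB's implementation: the class `ShardSizingSketch`, the method `rebalance`, and the `minSize`/`maxSize` thresholds are all hypothetical names chosen for illustration, and the real system may use different criteria entirely.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of size-based shard splitting and merging; not DingoDB code. */
public class ShardSizingSketch {

    /**
     * Splits every shard larger than maxSize into halves until each piece fits,
     * then merges a shard into its left neighbor when their combined size
     * does not exceed minSize.
     */
    public static List<Long> rebalance(List<Long> shardSizes, long minSize, long maxSize) {
        if (maxSize < 1 || minSize > maxSize) {
            throw new IllegalArgumentException("require 1 <= maxSize and minSize <= maxSize");
        }
        // Pass 1: split oversized shards.
        List<Long> split = new ArrayList<>();
        for (long size : shardSizes) {
            splitInto(split, size, maxSize);
        }
        // Pass 2: merge small adjacent shards.
        List<Long> merged = new ArrayList<>();
        for (long size : split) {
            int last = merged.size() - 1;
            if (last >= 0 && merged.get(last) + size <= minSize) {
                merged.set(last, merged.get(last) + size);
            } else {
                merged.add(size);
            }
        }
        return merged;
    }

    // Recursively halves a shard until every piece is at most maxSize.
    private static void splitInto(List<Long> out, long size, long maxSize) {
        if (size <= maxSize) {
            out.add(size);
            return;
        }
        long left = size / 2;
        splitInto(out, left, maxSize);
        splitInto(out, size - left, maxSize);
    }
}
```

In this sketch, changing `minSize` or `maxSize` between calls stands in for the dynamic shard-size configuration described above; a real system would also have to move data and update routing metadata, which is omitted here.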

4. Scalar-vector hybrid retrieval

DingoDB supports both traditional database index types and various vector index types, so a single query can combine scalar filtering with vector search. It also supports distributed transaction processing over fused scalar and vector data.
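
The following Java sketch illustrates what hybrid retrieval means in principle: apply a scalar predicate first, then rank the surviving rows by vector distance and keep the top k. It is a brute-force, in-memory illustration under assumed names (`HybridSearchSketch`, `Item`, `search`), not DingoDB's API or query engine, which would rely on real scalar and vector indexes.

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical brute-force illustration of scalar filtering plus vector ranking. */
public class HybridSearchSketch {

    /** A row with one scalar column and one vector column (assumed schema). */
    public static final class Item {
        public final long id;
        public final String category;
        public final float[] embedding;

        public Item(long id, String category, float[] embedding) {
            this.id = id;
            this.category = category;
            this.embedding = embedding;
        }
    }

    /** Squared Euclidean distance; ordering by it matches ordering by L2 distance. */
    static double squaredL2(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector dimensions differ");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the ids of the k nearest items that pass the scalar filter. */
    public static List<Long> search(List<Item> items, Predicate<Item> scalarFilter,
                                    float[] query, int k) {
        return items.stream()
                .filter(scalarFilter)
                .sorted(Comparator.comparingDouble((Item item) -> squaredL2(item.embedding, query)))
                .limit(k)
                .map(item -> item.id)
                .collect(Collectors.toList());
    }
}
```

For example, calling `search(items, item -> item.category.equals("doc"), query, 2)` returns the two closest items whose `category` is `"doc"`. Filtering before ranking is only one possible strategy; which order a production engine uses depends on its indexes and on how selective the filter is.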

5. Built-in real-time index optimization

DingoDB builds scalar and vector indexes in real time and optimizes them automatically in the background, transparently to users, without delaying data retrieval.

6. Cold-hot tiered retrieval for massive datasets

DingoDB provides disk-based vector search capabilities to minimize memory consumption, and supports dynamic switching between different indexes based on data scale requirements.

Getting Started

Docs

All documentation: Docs

Install

How to install and deploy: Docker or Ansible

Usage

How to use DingoDB: Usage

Developing DingoDB

VS Code

We recommend VS Code for developing the DingoDB codebase.

Java Profiler tools: YourKit

We recommend YourKit Java Profiler for any performance-critical application you build.

Check it out at https://www.yourkit.com/

Projects about DingoDB

The main projects about DingoDB are as follows:

Dingo-Store: A strongly consistent distributed storage system based on the Raft protocol.
Dingo-Deploy: The deployment project of compute nodes and storage nodes.

How to make a clean pull request

Create a personal fork of dingo on GitHub.
Clone the fork on your local machine. Your remote repo on GitHub is called origin.
Add the original repository as a remote called upstream.
If you created your fork a while ago, be sure to pull upstream changes into your local repository.
Create a new branch to work on. Branch from develop.
Implement/fix your feature, comment your code.
Follow the Google code style, including indentation.
If the project has tests, run them!
Add unit tests that test your new code.
In general, avoid changing existing tests, as they also make sure the existing public API is unchanged.
Add or change the documentation as needed.
Squash your commits into a single commit with git's interactive rebase.
Push your branch to your fork on GitHub, the remote origin.
From your fork, open a pull request against the correct branch. Target DingoDB's develop branch.
Once the pull request is approved and merged you can pull the changes from upstream to your local repo and delete your branch.
Last but not least: Always write your commit messages in the present tense. Your commit message should describe what the commit, when applied, does to the code – not what you did to the code.

Special Thanks

DataCanvas

DingoDB is sponsored by DataCanvas, a platform for real-time data science and data processing.

DingoDB is an open-source project licensed under the Apache License Version 2.0. We welcome any feedback from the community. For support or suggestions, please contact us.

Contact us

If you have any technical questions or business needs, please contact us.

Attach the WeChat QR Code

Release History

Version	Changes	Urgency	Date
v0.9.0	# Release Notes v0.9.0 ## 1. New Features ### 1）License Management Mechanism Introduced a License management feature to protect DingoDB's intellectual property. With the License activation and management tools, users can easily manage and monitor software usage, ensuring legal and compliant use. ### 2）Single Machine Lite Version of DingoDB Implemented a Single Machine Lite version of DingoDB, lowering the usage threshold for users. This version can run on a single machine without complex	Low	6/14/2024
v0.8.0	# Release Notes v0.8.0 ## Major New Features ### 1. Distributed Transaction The addition of distributed transaction capabilities meets the core ACID features of the database, ensuring the integrity and reliability of the database, and expands the range of applications. * Transaction-related interfaces are added to the Store layer/Index layer/Executor layer. * Provides the ability for garbage collection of distributed transaction data, cleaning up completed and no longer needed transaction	Low	3/20/2024
v0.7.0	# Release Notes v0.7.0 ## 1.Store Storage Layer ### 1.1 Distributed Storage * Provide the ability to manage IndexRegions, supporting dynamic creation and deletion of IndexRegions. * Add functionality for Raft Snapshot creation and installation for IndexRegions, which helps generate and load snapshot data for IndexRegions, enhancing system reliability and recovery capabilities. * Introduce the Build, Rebuild, and Load functions for VectorIndex to enable efficient creation, reconstr	Low	10/15/2023
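v0.6.0	# Release Notes v0.6.0 ## 1 Architecture ### 1.1 Separation of storage and compute 1. Compute engine (Executor): accepts SQL over the MySQL protocol and DingoDB's own protocol, performs SQL parsing, generates logical and execution plans, and interfaces with the underlying Store storage layer. 2. Distributed storage engine (Store): an efficient C++-based distributed store. The storage layer is divided into metadata storage and data storage; its design is flexibly extensible and allows multiple storage engines, such as Rocksdb, memory, and xdp-rocks. 3. Compute pushdown: to get more value from aggregation and filter operations and improve computing efficiency, the storage layer implements pushdown logic and supports filter, count, sum, min, max, and other operations. ### 1.2 Raft upgrade 1. Provides a leader election mechanism that supports multi-node elections. 2. Provides log replication, which ensures system reliability and effectively prevents data loss. 3. Provides high-performance Raft that uses multithreading and asynchronous I/O to improve system throughput and response time. 4. Provides a Snapshot mechanism, used to restore	Low	6/18/2023
v0.5.0	## Release Note - V0.5.0 #### 1. SQL features 1. Support fuzzy queries with the like keyword 2. Support user authentication: create, delete, update, and query users 3. Support granting user privileges 3. Support cluster authentication 4. Support SQL batch insert 5. Optimize the Calcite function validation mechanism 6. Refactor error code messages #### 2. Metadata management 1. Move table-level cluster management to the executor 2. Deprecate the original Dingo-jraft module 3. Migrate the Coordinator from Dingo-jraft to Dingo-mpu 4. Support SQL-based queries on metadata tables #### 3. Indexes 1. Support creating, deleting, updating, and querying indexes to improve query performance 2. Support multiple index types: non-primary-key indexes and composite indexes #### 4. SDK features 1. Support computation based on chained expressions, enabling aggregation, updates, and more after various range lookups 2. Support scanning and filter computation on non-primary-key columns 3. List of metric computation features: \| No. \| Function \| Description	Low	2/10/2023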
v0.4.1	## 1. Feature and Optimization about SQL ### 1.1 Features about SQL #### 1.1.1 Extended SQL Syntax - Support TTL when creating a table using options - Support assigning partitions when creating a table #### 1.1.2 Features about Complex Data Type - Support Operations about MAP - Support Operations about MultiSet - Support Operations about Array #### 1.1.3 Support using variables in SQL statements, such as insert, select, delete. #### 1.1.4 Support strategy to control messages tran	Low	10/12/2022
v0.3.0	## 1.Semantics and Function of SQL ### 1.1 New data type - Boolean - Date: default format yyyy-MM-dd - Time: default format HH:mm:ss - Timestamp: default format yyyy-MM-dd HH:mm:ss.SSS ### 1.2 Allow assigning a default value to column, either constant or internal functions ### 1.3 Support Join operation - Inner Join - Left Join - Right Join - Full Join - Cross Join ### 1.4 Function list about String \| No \| Function Names \| Notes a	Low	7/2/2022
v0.1.0	# DingoDB 0.1.0 Release Notes * Cluster 1. Distributed computing. Cluster nodes are classified into coordinator role and executor role. 1. Distributed meta data storage. Support creating and dropping meta data of tables. 1. Coordinators support SQL parsing and optimizing, job creating and distributing, result collecting. 1. Executors support task executing. * Data store 1. Using RocksDB storage. 1. Encoding and decoding in Apache Avro format. 1. Table parti	Low	7/1/2022
v0.2.0	* Architecture 1. Refactor DingoDB architecture abandon Zookeeper, Kafka and Helix. 1. Using raft as the consensus protocol to make agreement across multiple nodes on membership selection and data replication. 1. Region is proposed as the unit of data replication, it can be scheduled, split, managed by `coordinator`. 1. The distributed file system is replaced by distributed key-value implemented by raft and rocksdb. * Distributed Storage 1. Support region to replica	Low	7/1/2022
