Vector DB From Scratch Vector Databases

Educational and minimalist vector database implementations built to understand core concepts and internals. Includes toy/learning projects, lightweight engines, and pure-Python implementations prioritizing clarity over production features. Does NOT include enterprise databases, managed services, or specialized implementations (embedded SQLite variants, REST API wrappers, or domain-specific systems like NFT databases).

There are 157 vector-db-from-scratch tools tracked. One scores above 70 (verified tier). The highest-rated is MariaDB/server at 76/100 with 7,297 stars. 4 of the top 10 are actively maintained.

Get all 157 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=vector-db&subcategory=vector-db-from-scratch&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
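For scripted access, the same request can be made from Python. This is a minimal sketch using only the standard library: the endpoint and query parameters are copied from the curl command above, while the helper names `build_url` and `fetch` are hypothetical. The shape of the JSON response is not documented here, so the sketch returns the parsed body without assuming any fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="vector-db", subcategory="vector-db-from-scratch", limit=20):
    """Build the dataset URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch(url, timeout=10):
    """Perform the GET request and return the parsed JSON body (schema not assumed)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Building the URL needs no network access; call fetch(url) to send the request.
url = build_url()
print(url)
```

Calling `fetch(build_url())` counts against the daily request limit mentioned above, so cache the result locally if you plan to re-run the script.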

# Tool Score Tier
1 MariaDB/server

MariaDB server is a community developed fork of MySQL server. Started by...

76
Verified
2 infiniflow/infinity

The AI-native database built for LLM applications, providing incredibly fast...

69
Established
3 AlayaDB-AI/AlayaLite

AlayaLite – A Fast, Flexible Vector Database for Everyone.

66
Established
4 oceanbase/seekdb

The AI-Native Search Database. Unifies vector, text, structured and...

61
Established
5 schwabauerbriantomas-gif/m2m-vector-search

Edge Vector search engine with Vulkan GPU acceleration, hierarchical...

61
Established
6 gusye1234/nano-vectordb

A simple, easy-to-hack Vector Database

60
Established
7 nnethercott/hannoy

Production-ready KV-backed HNSW implementation in Rust using LMDB

60
Established
8 dingodb/dingo

A multi-modal vector database that supports upserts and vector queries using...

58
Established
9 endee-io/endee

Endee.io – A high-performance vector database, designed to handle up to 1B...

57
Established
10 zilliztech/knowhere

Vector search engine inside Milvus, integrating FAISS, HNSW, DiskANN.

57
Established
11 dingodb/dingo-store

A distributed Key-Value Storage using Raft

55
Established
12 MinishLab/vicinity

Lightweight Nearest Neighbors with Flexible Backends

52
Established
13 muxi-ai/faissx

High-performance remote FAISS server for vector similarity search, with full...

50
Established
14 jina-ai/vectordb

A Python vector database you just need - no more, no less.

50
Established
15 VectorDB-NTU/RaBitQ-Library

A lightweight library for the RaBitQ algorithm and its applications in vector search.

50
Established
16 datawhalechina/easy-vecdb

📚 A from-scratch tutorial on vector database principles and practice; read it online at https://easy-vecdb.datawhale.cc/

50
Established
17 thustorage/PipeANN

A low-latency, billion-scale, and updatable graph-based vector store on SSD.

49
Emerging
18 varshith-Git/valori

A high-performance vector database library for Python that provides...

46
Emerging
19 Veeresh-Hanni/DBDuck

Universal Data Object Model in Python for SQL, NoSQL, Graph, Vector DBMS

45
Emerging
20 vortezwohl/Bhakti

An easy-to-use vector database.

45
Emerging
21 rapidsai/cuvs-lucene

A Lucene codec for vector search and clustering on the GPU

43
Emerging
22 pomagrenate/pomaidb

PomaiDB Vector Database for low performance devices

43
Emerging
23 ejaasaari/lorann

Approximate Nearest Neighbor search using reduced-rank regression, with...

43
Emerging
24 AutoCookies/pomaidb

PomaiDB Vector Database for low performance devices

42
Emerging
25 nickna/Neighborly

An open-source vector database

40
Emerging
26 BBC-Esq/VectorDB-Plugin

Program that lets you ask questions about your documents including audio and...

39
Emerging
27 syalia-srl/beaver

All-in-one, pure-python, embedded database for relational data, documents,...

39
Emerging
28 vinerya/faiss_vector_aggregator

This Python library provides a suite of advanced methods for aggregating...

38
Emerging
29 epsilla-cloud/vectordb

Epsilla is a high performance Vector Database Management System

38
Emerging
30 VectorDB-NTU/Extended-RaBitQ

[SIGMOD 2025] Practical and Asymptotically Optimal Quantization of...

37
Emerging
31 ZeusDB/zeusdb

High-performance database management system

37
Emerging
32 makr-code/ThemisDB

Themis Database System - High-performance C++ hybrid-database...

37
Emerging
33 cgtuebingen/ggnn

GGNN: State of the Art Graph-based GPU Nearest Neighbor Search

37
Emerging
34 feather-store/feather

Embedded vector database + living context engine Part of Hawky.ai —...

37
Emerging
35 1yefuwang1/vectorlite

Fast, SQL powered, in-process vector search for any language with an SQLite driver

37
Emerging
36 knowusuboaky/VectrixDB

Where vectors come alive - A lightweight, visual-first vector database with...

36
Emerging
37 vitrivr/cottontaildb

Cottontail DB is a column store vector database aimed at multimedia...

35
Emerging
38 ShravanSunder/hnswlib-wasm

hnswlib-wasm attempts to create a browser friendly version of hnswlib

35
Emerging
39 Dripfarm/SVDB

Swift Vector Database. On-device, local vector database for building the...

35
Emerging
40 sauravniraula/fastembed-vectorstore

In-memory vector store with FastEmbed integration for Python applications.

34
Emerging
41 mmilunovic/m2vdb

vector db built by someone with no idea how to build a vector db

33
Emerging
42 MenxLi/tiny_vectordb

A small and fast Python JIT vector database

33
Emerging
43 BirchKwok/lynsedb

A pure Python-implemented, lightweight, server-optional, multi-end...

33
Emerging
44 krishcdbry/nexadb

NexaDB - A lightweight NoSQL database with vector search, TOON format, and...

33
Emerging
45 0xDebabrata/citrus

(distributed) vector database

32
Emerging
46 MChatzakis/DARTH

[SIGMOD 2026] DARTH: Declarative Recall Through Early Termination for...

31
Emerging
47 wibyuan/easyANN

This project implements 30+ variants of ANN algorithms to find the K nearest...

30
Emerging
48 atasoglu/sqlite-vec-client

A lightweight Python client around sqlite-vec for CRUD and similarity search.

30
Emerging
49 tylerpuig/tinyvec

TinyVecDB is an ultra fast embedded vector database.

30
Emerging
50 replikativ/proximum

Versioned, fast and scalable nearest neighbor search.

28
Experimental
51 thewebscraping/crossvector

Production-ready Python vector database library with unified API for...

28
Experimental
52 lynnlangit/learning-nosql

Companion repository to LinkedIn Learning course 'Cloud NoSQL for SQL Pros'

27
Experimental
53 firstbatchxyz/hollowdb-vector

A decentralized vector database for building vector search applications

27
Experimental
54 mihirahuja1/vectorwrap

Universal vector search wrapper for Postgres, MySQL, SQLite (pgvector,...

27
Experimental
55 sarabesh/PuppyDB

This is an experimental learning project to explore how vector databases...

26
Experimental
56 prrao87/db-hub-fastapi

Async bulk data ingestion and querying in various document, graph and vector...

26
Experimental
57 ToucanDB/ToucanDB

ToucanDB is a brand-new micro ML-first database engine 🦜

26
Experimental
58 antarys-ai/python

Python client for Antarys vector database, optimized for large-scale vector...

26
Experimental
59 JadenGeller/similarity-topology

Efficient nearest neighbor search in Swift

26
Experimental
60 EmbedInAI/EmbedInDB

A vector database that empowers AI with persistent memory

26
Experimental
61 ashvardanian/JaccardIndex

Optimizing bit-level Jaccard Index and Population Counts for large-scale...

25
Experimental
62 mantzaris/LMDiskANN.jl

Julia Implementation of Low Memory Disk ANN (LM-DiskANN)

24
Experimental
63 maurocanuto/mempack

MemPack is a blazing-fast, lightweight alternative to heavy vector...

24
Experimental
64 klu-ai/EmbedKit

Swift library extending MLX Embed

24
Experimental
65 vortezwohl/Dipamkara

A light-weight vector database engine.

24
Experimental
66 skyzh/write-you-a-vector-db

A Vector Database Tutorial (over CMU-DB's BusTub system)

23
Experimental
67 rajathshttgr/zoro-db

A Vector Search Engine Built from Scratch in C++

23
Experimental
68 vital-ai/vital-vitalsigns-python

Knowledge Model Runtime, Ontology management, and interface to Graph and...

23
Experimental
69 nhevers/vecstore

lightweight vector store with HNSW indexing

23
Experimental
70 ericmillsio/whiplash

Serverless, lightweight, and fast vector database on top of DynamoDB

23
Experimental
71 torinriley/VecStream

Efficient, scalable, and lightweight vector database

23
Experimental
72 NachoBrito/vulcano

An in-process, lightweight vector database written in modern Java

22
Experimental
73 UnrealJon/DTDR

Transform-domain representation enabling 3–4× storage reduction with direct...

22
Experimental
74 MukundaKatta/thoth

Thoth — Embedded Vector Database. Embedded vector database (SQLite for vectors)

22
Experimental
75 gsavla6-hue/java-vector-database

High-performance Java vector database implementation with HNSW indexing,...

22
Experimental
76 lexxai/django-mariadb-vector-demo

A minimal demo project showing how to build article recommendations using...

22
Experimental
77 leitoooatr/PythonVectorDB

🗄️ Manage and search large vector datasets efficiently with this pure Python...

22
Experimental
78 britorbs/consciousdb

🗄️ Streamline data analysis with ConsciousDB, a vector database that...

22
Experimental
79 ribagolx10/crossvector

🔗 Simplify vector database operations with CrossVector, a unified Python...

22
Experimental
80 bosekarmegam/vecforge

VecForge is a universal, local-first Python vector database with enterprise...

22
Experimental
81 kroq86/mcp_vector_db

VectorDB MCP server

22
Experimental
82 rizquuula/pyvectordb

Python wrapper for many Vector Databases

22
Experimental
83 JaneaSystems/jecq

Faiss-based library for efficient similarity search

22
Experimental
84 ksm26/vector-databases-embeddings-applications

Unlock the power of vector databases with the "Vector Databases: from...

22
Experimental
85 tsvet01/quiverdb

Embeddable vector database for edge AI. Lightning-fast semantic search that...

22
Experimental
86 NDXDeveloper/formation-mariadb

🐬 Complete MariaDB 11.8 LTS training course in French. SQL, HA, DevOps,...

21
Experimental
87 QDL123/Periplus

A remote cache for vector databases which allows for a dynamically updated...

21
Experimental
88 starkdg/hftrie

index binary vectors for efficient nearest neighbor search

21
Experimental
89 ehsanghaffar/vector-store-api

This project aims to provide an efficient and scalable API for embedding and...

21
Experimental
90 krejciad/kramdb

Simple in-RAM database system

21
Experimental
91 oneKn8/VectorVault

HNSW approximate nearest neighbor engine from scratch in C++20. AVX2...

20
Experimental
92 jmelovich/VectorDatabasePluginUE

A vector 'database' plugin for Unreal Engine 5. Built for leveraging the...

20
Experimental
93 gifton/VectorCore

CPU-bound vector math library with SIMD optimization, distance metrics, and...

20
Experimental
94 PranavBhatP/velox-db

A hobby project to construct a fully functioning vector database from...

20
Experimental
95 doganarif/vectordb

In-memory vector database with pluggable indexing algorithms, metadata...

20
Experimental
96 vectordbpipe/vectorDBpipe

A modular text embedding and vector database pipeline for local and cloud...

20
Experimental
97 maticly/LabHub

OLTP to OLAP ETL + Semantic Search Engine

19
Experimental
98 jwill9999/Vector-DB-Service

A microservice that allows upload of documents from google services, and...

19
Experimental
99 jerryli99/jerry_vectorDB

A lightweight vector database

19
Experimental
100 amhoba/vector-search-db

A high-performance, persistent vector search engine written in C++17 with...

19
Experimental
101 Icingworld/dreamdb

Lightweight vector database

19
Experimental
102 shlokkvaishnav/nano-db

Persistent Vector Search Engine built from scratch featuring disk-based HNSW...

19
Experimental
103 mingyu-hkustgz/Res-Infer

Distance Computation for Vector Databases

19
Experimental
104 VQLite/VQLite

VQLite - Simple and Lightweight Vector Search Engine based on Google ScaNN

19
Experimental
105 AlexHaborets/vectordb

A minimalistic, pure-Python vector database for semantic search and RAG...

19
Experimental
106 maylad31/vector_sqlite

Faiss with sqlite

18
Experimental
107 capybara-brain346/capybaradb

capybaradb - a toy Vector DB implementation from scratch in Python. Explore...

18
Experimental
108 oscarcitoz/vector-db

A FastAPI-based API for managing vector database operations like creating...

18
Experimental
109 atisharma/fvdb

Thin porcelain around the FAISS vector database.

17
Experimental
110 N2FlowJS/nbase

NBase is a high-performance vector database for efficient similarity search,...

17
Experimental
111 haja-k/mysql-to-pgvector-embeddings

vectorizing data from mysql database to vector so it can be used by LLM in...

17
Experimental
112 cmessin02-cmyk/Sentry-Vector-The-AI-Powered-Immutable-Ledger

A high-performance, C++ based Vector Database with HMAC-SHA256 blockchain...

16
Experimental
113 mingyu-hkustgz/RESQ

High-Ratio Vector Quantization

16
Experimental
114 JGalego/VektorDB

A minimal vector database for educational purposes.

16
Experimental
115 SherifSystems/PythonVectorDB

Pure Python vector database • int8 quantized • ~1100 QPS @ 50k vectors •...

15
Experimental
116 matthewwangg/vector-database

A performant in-memory vector database with an HNSW index, data persistence,...

15
Experimental
117 kanitakadusic/bsc-thesis

Vector Databases: Use Cases, Algorithms and Key Features

15
Experimental
118 hritik2002/local-vectordb

Local vector database with embeddings & semantic search. Uses HNSW for fast...

15
Experimental
119 TekilaSS/Educational-Vector-Database

📚 Learn to build and understand Vector Databases step-by-step in Arabic,...

14
Experimental
120 waynewbishop/quiver

Quiver is a Swift package that provides vector mathematics, numerical...

14
Experimental
121 Linco2749/duckdb-s22

🦆 Explore DuckDB's powerful features for efficient data analysis and easy...

14
Experimental
122 deathbeam/vectorspace

Directory file watcher for automatically creating and querying vector embeddings.

14
Experimental
123 LongmaoTeamTf/ant

Open-source vector database built for embedding similarity search

14
Experimental
124 yichunzhao/python-learning

Taking it slow and easy—Python, here I come. 🐍✨

14
Experimental
125 AWeirdDev/vdb37

A simple vector database.

14
Experimental
126 Ronakagrwal000/vector-cache-optimizer

⚡ Optimize vector searches with a hyper-efficient cache that uses machine...

14
Experimental
127 yusupwinata/Basic-VectorDB

Build vector database using LangChain, Hugging Face, Chroma and FAISS.

13
Experimental
128 JohnnyHyytiainen/glossary_db

Personal Glossary Database to help keep track on terms and theory for school...

13
Experimental
129 takurot/Pyrope

Pyrope is a high-performance, adaptive Vector Database built as an extension...

13
Experimental
130 gifton/VectorAccelerate

Swift6 GPU-accelerated vector operations using Metal4 shaders for Apple...

13
Experimental
131 thkbit-labs/vecmodel

A model-based, ORM-inspired abstraction for vector databases.

13
Experimental
132 lcj2021/mini-ivf

A cute toy of IVF (PQ).

13
Experimental
133 gtfintechlab/Universal-NFT-Vector-Database

The Universal NFT Vector Database: A Scalable Vector Database for NFT...

13
Experimental
134 starkdg/mvptree

multiple vantage point distance-based tree data structure

13
Experimental
135 jballo/vector-db-engine

A FastAPI service that lets users create, read, update, and delete document...

12
Experimental
136 RKirlew/SoraDB-A-Lightweight-Vector-Database

SoraDB is a custom-built vector storage engine designed to manage and query...

12
Experimental
137 colbertdb/colbertdb

Open source ColBERT based document database

12
Experimental
138 rosaia/vecworks

Seamlessly manage vectorized data in Python

12
Experimental
139 ocramz/vectordb

Simple vector database based on annoy and sqlite3

12
Experimental
140 B-R-P/VStore

Embedded key-value store with vector similarity search

12
Experimental
141 Maverick0351a/consciousdb

ConsciousDB – Your Vector Database Is the Model

12
Experimental
142 danilop/knn-search-algorithm-comparison

KNN Search Algorithm Comparison – This project compares the performance of...

12
Experimental
143 nathangtg/dbms-research

This is the repository for ZGQ (Zone Graph Quantization), which is now...

12
Experimental
144 Scintirete/Scintirete

Scintirete is an embedded-friendly, production-oriented vector database built on the HNSW algorithm. Scintirete is a lightweight,...

12
Experimental
145 patw/InstructorVec

Create dense vectors using the instructor-large model, running on CPU in...

11
Experimental
146 natenberenstein/deep-dive-databases

Knowledge base covering database internals -- storage engines, data models,...

11
Experimental
147 tweedge/vectordb-docker-base

Python 3.10-slim with VectorDB (vectordb2==0.1.9) and certain models...

11
Experimental
148 RasaiStewart/Vector-database-using-vectordb

My attempt to create a vector database to store the names of books I have...

11
Experimental
149 tanushachoudhary/VectorDB

A production-ready vector database system that stores document embeddings...

11
Experimental
150 Flagro/VecMetaQ

Server over Python Faiss serverless implementation to match interfaces used...

11
Experimental
151 NautilusDB-cloud/nautilusdb-cli

The simple client of NautilusDB, a Cloud-Native Vector Search Service

11
Experimental
152 1226085293/MiniVectorDB

Lightweight, self-hosted Node.js vector database using WASM-based HNSW with...

11
Experimental
153 timothyckl/iota

a minimal local embedding database.

11
Experimental
154 FoxRav/RL-astradb-

Astra Vector DB is a Python package that stores documents in DataStax...

11
Experimental
155 yezz123/vectorai

A Vector Database REST API with custom indexing algorithms

10
Experimental
156 mingyu-hkustgz/LabelANN

Label Filtering Vector Similarity Search

10
Experimental
157 Md-Emon-Hasan/Vector-Database

Designed to store and retrieve high-dimensional data, such as embeddings,...

10
Experimental

Comparisons in this category