Multimodal Image Search Vector Databases
Tools for semantic image retrieval using multimodal embeddings (text-to-image, image-to-image, or video search). Includes CLIP-based systems, vision transformers, and cross-modal ranking. Does NOT include general image classification, object detection, or single-modality text/vector search without image integration.
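As a rough illustration of what "semantic image retrieval using multimodal embeddings" means in practice, the sketch below ranks images by cosine similarity to a query vector. It is a toy example under stated assumptions: the three-dimensional vectors and file names are made up for illustration, whereas a real system would obtain much higher-dimensional embeddings from a model such as CLIP and store them in a vector database. It is not taken from any of the listed projects.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, index):
    """Return image names ordered from most to least similar to the query."""
    return sorted(index, key=lambda name: cosine(query_vec, index[name]), reverse=True)

# Hypothetical image embeddings (toy values, not real model output).
index = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "forest.jpg": [0.1, 0.9, 0.1],
    "city.jpg": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a text or image query in the same vector space.
query = [0.8, 0.2, 0.1]
ranked = rank(query, index)
```

The same ranking step serves text-to-image and image-to-image search alike, because both query types are mapped into one shared embedding space before comparison.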
There are 42 multimodal image search tools tracked. The highest-rated is soulteary/simple-image-search-engine at 47/100 with 151 stars.
Get all 42 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=vector-db&subcategory=multimodal-image-search&limit=20"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
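The same request can be made from a script. The following is a minimal sketch using only the Python standard library; the URL and query parameters are copied from the curl command above, while the function names `build_url` and `fetch_projects` are hypothetical helpers introduced here. The shape of the JSON response is not documented on this page, so the sketch returns the decoded payload as-is rather than assuming any field names. Whether `limit` accepts values above 20 is likewise not stated here.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def build_url(limit=20):
    """Build the request URL with the same query parameters as the curl example."""
    params = {
        "domain": "vector-db",
        "subcategory": "multimodal-image-search",
        "limit": limit,
    }
    return BASE_URL + "?" + urllib.parse.urlencode(params)

def fetch_projects(limit=20, timeout=30):
    """Fetch the dataset and decode the JSON body (response schema not assumed)."""
    with urllib.request.urlopen(build_url(limit), timeout=timeout) as response:
        return json.load(response)

# Example call (performs a network request, counted against the daily quota):
# data = fetch_projects()
```

Keeping URL construction separate from the network call makes it easy to check the request string before spending any of the daily request allowance.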
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | soulteary/simple-image-search-engine: An image search engine, made simple. Build your own image search engine in three steps and learn vector databases, image-to-image search, and text-to-image search. | | Emerging |
| 2 | shotit/shotit: Shotit is a screenshot-to-video search engine tailored for TV & Film,... | | Emerging |
| 3 | ob-labs/image-search: Image search application built with the vector capabilities of OceanBase | | Emerging |
| 4 | sourav4243/sift-video: Semantic video search system that indexes audio and visual content to enable... | | Emerging |
| 5 | shotit/shotit-api: The ultimate brain of Shotit, in charge of task coordination. | | Emerging |
| 6 | KarunyaChavan/Semantixel-Semantic_Image_Retrieval: Semantic Image Retrieval is a lightweight web-based platform that enables... | | Emerging |
| 7 | EricRollei/Semantic-Search: A powerful **two-stage multimodal retrieval pipeline** for ComfyUI, enabling... | | Emerging |
| 8 | Aaryan2304/visual-search-engine: An AI-powered visual search engine that finds visually similar fashion items... | | Experimental |
| 9 | AchrefHemissi/FoundIT-Computer-Vision-Powered-Lost-and-Found-Mobile-Application: The LostFound system is designed to facilitate the recovery of lost items... | | Experimental |
| 10 | akashAD98/Car_ai_multimodal_search: A multimodal car search engine powered by LanceDB vector database that... | | Experimental |
| 11 | weaviate-tutorials/next-multimodal-search-demo: A Weaviate multimodal search demo | | Experimental |
| 12 | sachink1729/intelligentgallery: Intelligent Image Gallery with Uploads, Deduplication, and Text-Based Search... | | Experimental |
| 13 | santi1602/AnyCam2Ros: 📷 Transform any camera into ROS2 image topics for seamless integration with... | | Experimental |
| 14 | shotit/shotit-media: Media broker for serving video preview for shotit | | Experimental |
| 15 | JimmyHernandez503/oceano: Facial recognition system with InsightFace and Qdrant - 100% reliable | | Experimental |
| 16 | jacobmarks/reverse-image-search-plugin: Find the images in your dataset most similar to a query image from URL or... | | Experimental |
| 17 | Abhics8/Lumina-AI: AI-powered visual commerce engine with semantic fashion search using OWLv2,... | | Experimental |
| 18 | bauerem/semantic-text2image-search: This repo implements a simple terminal-based semantic image search. | | Experimental |
| 19 | navneet83/multimodal-mountain-peak-search: Identify mountain peaks in your photos using AI: zero-shot retrieval,... | | Experimental |
| 20 | shotit/shotit-frontend: The frontend of shotit, with full documentation. | | Experimental |
| 21 | laxmanclo/pany: PostgreSQL-native semantic search engine with multi-modal capabilities. Add... | | Experimental |
| 22 | redswimmer/trail-camera-search: Multimodal vector search of images and videos taken from trail cameras. ... | | Experimental |
| 23 | dschechter27875/clip_image_text_search: Multimodal semantic image search using CLIP embeddings and natural language queries. | | Experimental |
| 24 | IlyasFardaouix/VisualIndexer: Multimodal visual search engine using CLIP, OCR, and vector similarity retrieval. | | Experimental |
| 25 | MustafaAbbasi98/brand-video-logo-detection: An application for semi-automated logo detection in brand advertisement... | | Experimental |
| 26 | BrandWill-ML-DS-DE/clip-faiss-product-search: End-to-end vision–language search system using CLIP + FAISS (HNSW/IVF) for... | | Experimental |
| 27 | hareshanmuhan/semantic-search: Search 1M+ images/videos with natural language - OpenAI CLIP + FAISS +... | | Experimental |
| 28 | ejber-ozkan/local-llm-photo-scanner: A privacy-first, self-hosted photo manager powered by local LLMs (Ollama)... | | Experimental |
| 29 | suraj95/Whatsapp-Reel-Knowledge-Base: A small AI project that extracts frames from an Instagram video to generate a... | | Experimental |
| 30 | Aniket-16-S/Semantic_Video_Search: An AI-powered video search engine with Google's SigLIP and FAISS. It allows... | | Experimental |
| 31 | EsraaMadi/similarity-search-weaviate: Text/Image search for similar products | | Experimental |
| 32 | aritro1011/QID: (Query Images by Description) - A simple pipeline to convert images to... | | Experimental |
| 33 | Sakshi3027/semantic-video-search: Production-grade semantic video search engine - search across video content... | | Experimental |
| 34 | oguzhantasimaz/image-similarity-search: Image Similarity Search with CLIP and Upstash Vector | | Experimental |
| 35 | mahadev0811/Text2ImageDescription: Text2ImageDescription retrieves relevant images from Pascal VOC 2012 dataset... | | Experimental |
| 36 | GGCIRILLO/IR-Image-Classification-System: From CNN Embeddings to Vector Search: A Deep Learning Pipeline for Thermal... | | Experimental |
| 37 | tyasemin/Data-Feature-Extraction-and-Retrieval-Pipeline: Project DART. Similarity search, SAM, CLIP, and more | | Experimental |
| 38 | 777reet/PhotoDiaries: Modern web photobooth with AI-powered image similarity search. Built with... | | Experimental |
| 39 | vaibhavhonakere/ClipQuest: Find exact moments in uploaded videos using natural-language search + timestamps. | | Experimental |
| 40 | ecmoce/ask-gallery: Ask Gallery - Semantic photo search system powered by VLM, CLIP, and vector search | | Experimental |
| 41 | anantha119/Vector-Based-Image-Retrieval-System: This project leverages Vision Transformers (ViT) to build a scalable image... | | Experimental |
| 42 | sefaburakokcu/semantic-image-search: Search for images using text and images using Milvus and OpenAI-Clip. | | Experimental |