Multimodal Search Engines ML Frameworks
Tools and applications for searching across image and text modalities using vision-language models like CLIP. Includes text-to-image search, image-to-image search, and video content search. Does NOT include general recommendation systems, dataset creation/filtering tools, or single-modality search applications.
This category tracks 40 multimodal search engine frameworks. One scores above 70 (verified tier). The highest-rated is rom1504/img2dataset at 71/100, with 4,380 stars and 88,786 monthly downloads.
Get all 40 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=multimodal-search-engines&limit=20"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
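The same request can be made from Python. The sketch below is a minimal, standard-library version of the curl call above. The function names `build_quality_url` and `fetch_quality` are hypothetical helpers introduced for illustration. The shape of the JSON response is not described on this page, so the code returns the parsed payload as-is without assuming any field names. The `limit=20` default mirrors the curl example; whether the endpoint accepts a larger value or paginates is not stated here.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_quality_url(domain="ml-frameworks",
                      subcategory="multimodal-search-engines",
                      limit=20):
    """Build the query URL with the same parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_quality(url, timeout=30):
    """GET the URL and return the decoded JSON payload unchanged.

    No response schema is assumed; inspect the result before relying on
    any particular key.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example usage (performs a network request, so it is left commented out):
# data = fetch_quality(build_quality_url())
# print(json.dumps(data, indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to check the query string without spending any of the daily request quota.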
| # | Framework | Score | Tier |
|---|---|---|---|
| 1 | rom1504/img2dataset: Easily turn large sets of image urls to an image dataset. Can download,... | 71 | Verified |
| 2 | devrimcavusoglu/pybboxes: Light weight toolkit for bounding boxes providing conversion between... | | Established |
| 3 | salesforce/LAVIS: LAVIS - A One-stop Library for Language-Vision Intelligence | | Emerging |
| 4 | PyRetri/PyRetri: Open source deep learning based unsupervised image retrieval toolbox built... | | Emerging |
| 5 | Particle1904/DatasetHelpers: Dataset Helper program to automatically select, re scale and tag Datasets... | | Emerging |
| 6 | haltakov/natural-language-image-search: Search photos on Unsplash using natural language | | Emerging |
| 7 | haltakov/natural-language-youtube-search: Search inside YouTube videos using natural language | | Emerging |
| 8 | jina-ai/example-multimodal-fashion-search: Input text or image, get back matching image fashion results, using Jina,... | | Emerging |
| 9 | RAHUL-KAD/Reverse-Image-Search-Engine: With the help of this repo you can build image search algorithm on your... | | Emerging |
| 10 | TheoCoombes/crawlingathome: A client library for LAION's effort to filter CommonCrawl with CLIP,... | | Emerging |
| 11 | lucko515/image-search-engine: End-to-end image search engine based on the Deep learning techniques. | | Emerging |
| 12 | masesk/process-google-dataset: Process Google Dataset is a tool to download and process images for neural... | | Emerging |
| 13 | bwconrad/video-content-search: Search the content of a video with a text or image query | | Emerging |
| 14 | huggingface/OBELICS: Code used for the creation of OBELICS, an open, massive and curated... | | Experimental |
| 15 | meanderinghuman/OpenLens: Open-source visual search framework inspired by Google Lens — benchmarked... | | Experimental |
| 16 | zabir-nabil/bangla-image-search: A dead-simple image search / retrieval and image-text matching system for... | | Experimental |
| 17 | TAU-VAILab/Vox-E: This repo contains the python code as well as the webpage html files for the... | | Experimental |
| 18 | Zeeshier/VistAI: VistAI is an AI-powered visual search for e-commerce, enabling users to... | | Experimental |
| 19 | sayannath/Identical-Image-Retrieval: Identical-Image-Retrieval using Deep Learning | | Experimental |
| 20 | Sagykri/NOVA: The official repository for NOVA, a deep learning framework designed for... | | Experimental |
| 21 | thatgeeman/pybx: A simple python module to generate anchor (aka default/prior) boxes for... | | Experimental |
| 22 | snehilhbtu/vectalab: 📊 Evaluate image quality and performance with Vectalab's vectorization tools... | | Experimental |
| 23 | masa-57/PIC: Hierarchical image clustering API for product catalog images. Two-level... | | Experimental |
| 24 | Subhasri-Babu/AI-Scene-Safety-Analyzer-Project: AI-powered image safety analyzer using BLIP + LLaMA 3.3 via Groq API | | Experimental |
| 25 | woctezuma/steam-image-search: Search for images on Steam using natural language queries. | | Experimental |
| 26 | Rishabh1925/scene-localization-system: Powerful CLIP-based computer vision system for natural language-driven... | | Experimental |
| 27 | Ivan-Zhou/image-search: Simple Image Search powered by Multimodal Foundation Models (OpenAI Clip and... | | Experimental |
| 28 | O-S-O-K/insight_ai_app: Explainable AI image classification with Grad-CAM visualizations, BLIP... | | Experimental |
| 29 | santoshlite/ByteDetective: The easiest way to search for images on your desktop 🔎 | | Experimental |
| 30 | ItzCrazyKns/Dataset-Converter: A Python script for converting URL-based datasets into image datasets. | | Experimental |
| 31 | rizkysaputradev/Vision-Fusion-Real-Time: a real-time retrieval multimodial AI based demo that allows a visual input... | | Experimental |
| 32 | CN-Scars/picture_sherlock: A local image search tool based on pre-trained deep learning models | | Experimental |
| 33 | kyegomez/VisionDatasets: Open source scripts to create large scale datasets with rich detail for... | | Experimental |
| 34 | koushikvikram/multimodal-image-retrieval: 📝🔍🖼️ A deep learning application for retrieving images by searching with text. | | Experimental |
| 35 | NavdeepSinghNegi999/DeepVisionIntelligence: 🧠 DeepVisionIntelligence — An end-to-end multimodal AI system that... | | Experimental |
| 36 | ajaysawandkar05/spare-part-recognition: Spare part recognition system using CLIP + DINOv2 with hybrid re-ranking... | | Experimental |
| 37 | TunggTungg/image_retrieval: An image retrieval system that utilizes deep learning ResNet for feature... | | Experimental |
| 38 | kaeldrin-gh/image-similarity-search: Image similarity search system using deep learning embeddings and FAISS indexing | | Experimental |
| 39 | RishabThapliyal/Video-Scene-Classification-System: AI-powered video analysis tool with natural language search inside video... | | Experimental |
| 40 | heydido/VisualSearchEngine: This is the methodology I worked on while developing Visual Search Engine... | | Experimental |