# CLIP Image Embeddings Transformer Models
Tools for generating and working with CLIP image-text embeddings, including implementations, fine-tuning, and lightweight variants. Does NOT include general vision-language models, text-to-image generation, or multimodal fusion frameworks.
There are 23 CLIP image-embedding models tracked. The highest-rated is OFA-Sys/Chinese-CLIP at 48/100 with 5,820 stars.
Get all 23 projects as JSON:

```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=clip-image-embeddings&limit=20"
```

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
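The same request can be made from Python with the standard library. A minimal sketch, assuming only the endpoint and query parameters shown in the curl command above; the shape of the returned JSON (a list of model records) is an assumption, not documented here:

```python
import json
import urllib.parse
import urllib.request

# Endpoint and query parameters taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"
params = {
    "domain": "transformers",
    "subcategory": "clip-image-embeddings",
    "limit": 20,
}
url = f"{BASE_URL}?{urllib.parse.urlencode(params)}"


def fetch_models(url: str):
    """Fetch the dataset as JSON. The response schema is an assumption."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


# No key is required for up to 100 requests/day:
# models = fetch_models(url)
```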
| # | Model | Description | Tier |
|---|---|---|---|
| 1 | OFA-Sys/Chinese-CLIP | Chinese version of CLIP which achieves Chinese cross-modal retrieval and... | Emerging |
| 2 | Kaushalya/medclip | A multi-modal CLIP model trained on the medical dataset ROCO | Emerging |
| 3 | kastalimohammed1965/CLIP-fine-tune-registers-gated | Vision Transformers Needs Registers. And Gated MLPs. And +20M params. Tiny... | Emerging |
| 4 | BUAADreamer/SPN4CIR | [ACM MM 2024] Improving Composed Image Retrieval via Contrastive Learning... | Emerging |
| 5 | clip-italian/clip-italian | CLIP (Contrastive Language–Image Pre-training) for Italian | Emerging |
| 6 | zer0int/CLIP-fine-tune-registers-gated | Vision Transformers Needs Registers. And Gated MLPs. And +20M params. Tiny... | Experimental |
| 7 | YUSH19883/cog-jinaai-jina-clip-v2 | 🖼️ Generate high-quality multimodal embeddings for text and images with Jina... | Experimental |
| 8 | Armaggheddon/ClipServe | 🚀 ClipServe: A fast API server for embedding text, images, and performing... | Experimental |
| 9 | kyegomez/MuonClip | This repository is an open source implementation of the MuonClip strategy... | Experimental |
| 10 | taherfattahi/MetaWorld-VLA-openai-clip-vit | A lightweight Vision-Language-Action (VLA) baseline for MetaWorld robot-arm... | Experimental |
| 11 | safinal/compositional-image-retrieval | Solution for the First Challenge of the Main Phase in the Rayan... | Experimental |
| 12 | iKrishneel/zsis | CLIP based Zero Shot Instance Segmentation | Experimental |
| 13 | FuxiaoLiu/DocumentCLIP | [ICPRAI 2024] DocumentCLIP: Linking Figures and Main Body Text in Reflowed Documents | Experimental |
| 14 | theSohamTUmbare/CLIP-model | Reimplementation of the CLIP model | Experimental |
| 15 | SuryaAnything/V-DeClip | Masked Multi-Component Gated Decomposition Architecture | Experimental |
| 16 | zsxkib/cog-jinaai-jina-clip-v2 | Jina CLIP v2 - Multimodal embedding model for text and images with... | Experimental |
| 17 | VijayPrakashReddy-k/CLIP-PACL | Contrastive Language - Image Pre-training (CLIP) and Patch Aligned... | Experimental |
| 18 | MuhammadAliS/CLIP | PyTorch implementation of OpenAI's CLIP model for image classification,... | Experimental |
| 19 | corentin-ryr/CLIP-mixer | Implementation of CLIP using a Mixer architecture | Experimental |
| 20 | ntat/Lightweight_CLIP_model | A lightweight Pytorch implementation of OpenAI's CLIP model. | Experimental |
| 21 | Rakshath66/ClipFindr | 🔍 A CLIP-powered image similarity finder built with Streamlit — upload a... | Experimental |
| 22 | seanghay/clipsort | Group images by provided labels using OpenAI/CLIP | Experimental |
| 23 | ptmorris03/CLIPEmbedding | Easy text-image embedding and similarity with pretrained CLIP in PyTorch | Experimental |
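Whichever model above produces the embeddings, the common downstream step for retrieval and similarity tools like these is cosine similarity between L2-normalized image and text vectors. A minimal NumPy sketch using small toy vectors as stand-ins for real CLIP outputs (the dimensions and values here are illustrative only, not from any model in the list):

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Cosine similarity between each row of `a` and each row of `b`."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T


# Toy stand-ins for CLIP outputs: 2 image embeddings, 3 text embeddings, dim 4.
image_emb = np.array([[1.0, 0.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0, 0.0]])
text_emb = np.array([[0.9, 0.1, 0.0, 0.0],
                     [0.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 0.0]])

sims = cosine_similarity(image_emb, text_emb)   # shape (2, 3)
best_text_per_image = sims.argmax(axis=1)       # best-matching text per image
```

Real CLIP pipelines typically apply the same normalize-then-dot-product step, sometimes scaled by a learned temperature before a softmax.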