CLIP Vision Language ML Frameworks
Implementations, adaptations, and applications of CLIP and similar vision-language models for zero-shot classification, image-text matching, and multimodal tasks. Does NOT include other vision-language models (like BLIP or LLaVA), general multimodal frameworks, or unrelated CLIPS language systems.
There are 53 CLIP vision-language frameworks tracked; one scores above 70 (Verified tier). The highest-rated is mlfoundations/open_clip at 86/100, with 13,496 stars and 2,903,706 monthly downloads. 2 of the top 10 are actively maintained.
Get the tracked projects as JSON (the `limit` query parameter caps the number returned per request):

```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=clip-vision-language&limit=20"
```

The API is open to everyone at 100 requests/day with no key; a free key raises the limit to 1,000/day.
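The same endpoint is easy to call from Python with only the standard library. A minimal sketch: the query parameters mirror the curl example above, while the response schema and the `Authorization: Bearer` header name for keyed access are assumptions, since the page does not document them.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def build_url(domain, subcategory, limit=20):
    # Compose the query string for the quality-dataset endpoint.
    params = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{API_BASE}?{params}"

def fetch_projects(domain, subcategory, limit=20, api_key=None):
    # Anonymous access is rate-limited to 100 requests/day; a free key
    # allows 1,000/day (the header name here is an assumption).
    req = urllib.request.Request(build_url(domain, subcategory, limit))
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Build the request URL for all 53 tracked projects.
url = build_url("ml-frameworks", "clip-vision-language", limit=53)
```

Raising `limit` to 53 in one request should return the full dataset, assuming the endpoint accepts limits above the default shown in the curl example.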
| # | Framework | Description | Score | Tier |
|---|---|---|---|---|
| 1 | mlfoundations/open_clip | An open source implementation of CLIP. | 86 | Verified |
| 2 | noxdafox/clipspy | Python CFFI bindings for the 'C' Language Integrated Production System CLIPS | | Established |
| 3 | openai/CLIP | CLIP (Contrastive Language-Image Pretraining), Predict the most relevant... | | Established |
| 4 | filipbasara0/simple-clip | A minimal, but effective implementation of CLIP (Contrastive Language-Image... | | Emerging |
| 5 | moein-shariatnia/OpenAI-CLIP | Simple implementation of OpenAI CLIP model in PyTorch. | | Emerging |
| 6 | BioMedIA-MBZUAI/FetalCLIP | Official repository of FetalCLIP: A Visual-Language Foundation Model for... | | Emerging |
| 7 | cliport/cliport | CLIPort: What and Where Pathways for Robotic Manipulation | | Emerging |
| 8 | WolodjaZ/MSAE | Interpreting CLIP with Hierarchical Sparse Autoencoders (ICML 2025) | | Emerging |
| 9 | Dalageo/paperclip-inspection | Analyzing Paper Clips Using Deep Learning and Computer Vision Techniques 📎 | | Emerging |
| 10 | noxdafox/iclips | CLIPS Jupyter console | | Emerging |
| 11 | SunzeY/AlphaCLIP | [CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want | | Emerging |
| 12 | kyegomez/CLIPQ | A simple implementation of a CLIP that splits up an image into quadrants... | | Emerging |
| 13 | LeapLabTHU/Cross-Modal-Adapter | [Pattern Recognition 2025] Cross-Modal Adapter for Vision-Language Retrieval | | Emerging |
| 14 | svpino/clip-container | A containerized REST API around OpenAI's CLIP model. | | Emerging |
| 15 | lakeraai/onnx_clip | An ONNX-based implementation of the CLIP model that doesn't depend on torch... | | Emerging |
| 16 | SiddhantBikram/MemeCLIP | Official Repository for the paper 'MemeCLIP: Leveraging CLIP Representations... | | Emerging |
| 17 | jaisidhsingh/CoN-CLIP | Implementation of the "Learn No to Say Yes Better" paper. | | Emerging |
| 18 | merveenoyan/siglip | Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers... | | Emerging |
| 19 | kevinzakka/clip_playground | An ever-growing playground of notebooks showcasing CLIP's impressive... | | Emerging |
| 20 | sarthaxxxxx/BATCLIP | [ICCV '25] BATCLIP: Bimodal Online Test-Time Adaptation for CLIP | | Experimental |
| 21 | UCSC-VLAA/CLIPA | [NeurIPS 2023] This repository includes the official implementation of our... | | Experimental |
| 22 | Mauville/MedCLIP | Medical image captioning using OpenAI's CLIP | | Experimental |
| 23 | sixu0/SeisCLIP | The code of Paper 'SeisCLIP: A seismology foundation model pre-trained by... | | Experimental |
| 24 | aygong/ClipMind | Code for the paper "ClipMind: A Framework for Auditing Short-Format Video... | | Experimental |
| 25 | RobertBiehl/CLIP-tf2 | OpenAI CLIP converted to Tensorflow 2/Keras | | Experimental |
| 26 | bes-dev/pytorch_clip_bbox | Pytorch based library to rank predicted bounding boxes using text/image... | | Experimental |
| 27 | bes-dev/pytorch_clip_guided_loss | A simple library that implements CLIP guided loss in PyTorch. | | Experimental |
| 28 | LAION-AI/scaling-laws-openclip | Reproducible scaling laws for contrastive language-image learning... | | Experimental |
| 29 | KeremTurgutlu/clip_art | CLIP-Art: Contrastive Pre-training for Fine-Grained Art Classification - 4th... | | Experimental |
| 30 | halixness/understanding-CLIP | Repo from the "Learning with limited labeled data" seminar @ Uni of... | | Experimental |
| 31 | Krok1/adversarial-patch-for-clip | Adversarial patch system for privacy protection against CLIP image... | | Experimental |
| 32 | CoderChen01/InterCLIP-MEP | Official repository of the paper "InterCLIP-MEP: Interactive CLIP and... | | Experimental |
| 33 | ExcelsiorCJH/CLIP | CLIP: Learning Transferable Visual Models From Natural Language Supervision | | Experimental |
| 34 | zjunlp/SPEECH | [ACL 2023] SPEECH: Structured Prediction with Energy-Based Event-Centric Hyperspheres | | Experimental |
| 35 | sbmagar13/VQGAN-CLIP-Text-to-Image | Text-to-Image Synthesis using Multimodal (VQGAN + CLIP) Architectures | | Experimental |
| 36 | A-SHOJAEI/multimodal-contrastive-captioning-with-preference-aligned-generation | Vision-language model combining CLIP-style contrastive learning with... | | Experimental |
| 37 | your-ai-solution/generation-image-caption | This application fine-tunes the CLIP model on the Flickr8k dataset to align... | | Experimental |
| 38 | Evfidiw/MoBA | [ACMMM'24] MoBA: Mixture of Bi-directional Adapter for Multi-modal Sarcasm Detection | | Experimental |
| 39 | Jaso1024/Refining-Generated-Videos | IEEE 2023 \| REGIS: Refining Generated Videos via Iterative Stylistic Remodeling | | Experimental |
| 40 | D0miH/does-clip-know-my-face | Source Code for the JAIR Paper "Does CLIP Know my Face?" (Demo:... | | Experimental |
| 41 | Komorebirumu/awe-ms-20260316-1451-01 | AI Historical Document Authenticity Checker (Local Archives) | | Experimental |
| 42 | Jeyjey123456/ReVidgen | 🎥 Rethink video generation for the embodied world with ReVidgen, leveraging... | | Experimental |
| 43 | Bijay-kumar-sethy/clip | 🔍 Solve linear programming problems efficiently with Clp, an open-source... | | Experimental |
| 44 | ImtiazShuvo/clip-lora-food101-classification | Transfer learning and parameter-efficient fine-tuning of CLIP on the... | | Experimental |
| 45 | Fr0zenCrane/Cockatiel | The official implementation of our paper "Cockatiel: Ensembling Synthetic... | | Experimental |
| 46 | MingliangLiang3/GLIP | Centered Masking for Language-Image Pre-training | | Experimental |
| 47 | buraksatar/RoME_video_retrieval | It includes our two recent papers on text-to-video retrieval along with a... | | Experimental |
| 48 | jonkahana/CLIPPR | An official PyTorch implementation for CLIPPR | | Experimental |
| 49 | rhysdg/vision-at-a-clip | Low-latency ONNX and TensorRT based zero-shot classification and detection... | | Experimental |
| 50 | nicolafan/clipper | Explore your CLIP embeddings in a bidimensional space | | Experimental |
| 51 | KeithLin724/HAR_Clip | Human Action Recognition using Clip | | Experimental |
| 52 | MaharshPatelX/qwen-clip-multimodal | Multimodal Vision-AI: CLIP eyes + Qwen2.5 brain, 155 K-step pipeline & demo. | | Experimental |
| 53 | smb-h/mqirtn | Multimodal Query Enhancement for Image Retrieval using Transformer Networks (MQIRTN) | | Experimental |
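Most CLIP implementations in this list share one core mechanism for zero-shot classification: embed the image and each candidate text prompt into a joint space, then rank prompts by scaled cosine similarity. A minimal dependency-free sketch of that scoring step, using toy hand-made embeddings (the 3-dimensional vectors and the logit scale of 100 are illustrative; real models produce 512-plus-dimensional embeddings):

```python
import math

def normalize(v):
    # Project a vector onto the unit sphere, as CLIP does
    # before computing similarities.
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def softmax(xs):
    # Numerically stable softmax over a list of logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def zero_shot_scores(image_emb, text_embs, logit_scale=100.0):
    # Rank class prompts by scaled cosine similarity between the
    # image embedding and each text embedding, then softmax.
    img = normalize(image_emb)
    logits = [
        logit_scale * sum(a * b for a, b in zip(img, normalize(t)))
        for t in text_embs
    ]
    return softmax(logits)

# Toy example: the image embedding aligns with the first prompt,
# so nearly all probability mass lands on class 0.
probs = zero_shot_scores([1.0, 0.1, 0.0],
                         [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
```

In a real framework such as open_clip, the two `normalize` calls correspond to L2-normalizing the image and text encoder outputs, and `logit_scale` is a learned temperature; the ranking logic is otherwise the same.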