CLIP Vision-Language ML Frameworks

Implementations, adaptations, and applications of CLIP and similar vision-language models for zero-shot classification, image-text matching, and multimodal tasks. Does NOT include other vision-language models (like BLIP or LLaVA), general multimodal frameworks, or unrelated CLIPS language systems.
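Zero-shot classification with a CLIP-style model reduces to comparing a normalized image embedding against normalized text embeddings of candidate class prompts, then taking a softmax over the cosine similarities. A minimal numpy sketch of that matching step; the random vectors stand in for real encoder outputs, and the logit scale of 100 approximates CLIP's typical trained value:

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(x):
    # Project embeddings onto the unit sphere, as CLIP does before matching.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def zero_shot_probs(image_emb, text_embs, logit_scale=100.0):
    # Cosine similarity between the image and each class-prompt embedding,
    # scaled by the learned logit scale, then softmax over classes.
    logits = logit_scale * (normalize(text_embs) @ normalize(image_emb))
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()

# Toy embeddings standing in for encoder outputs of an image and three
# prompts like "a photo of a dog" -- NOT real model weights.
image_emb = rng.standard_normal(512)
text_embs = rng.standard_normal((3, 512))
text_embs[1] += 0.5 * image_emb  # make class 1 the best match by construction

probs = zero_shot_probs(image_emb, text_embs)
print(probs.argmax())  # prints 1
```

In a real pipeline the embeddings would come from the model's image and text encoders; only the comparison step shown here is model-agnostic.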

This list tracks 53 CLIP vision-language frameworks. One scores above 70 (Verified tier). The highest-rated is mlfoundations/open_clip at 86/100, with 13,496 stars and 2,903,706 monthly downloads. Two of the top 10 are actively maintained.

Get all 53 projects as JSON:

```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=clip-vision-language&limit=53"
```

The API is open to everyone at 100 requests/day with no key needed; a free key raises the limit to 1,000 requests/day.
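The same endpoint can be queried from Python with only the standard library. In this sketch the `api_key` query-parameter name and the response field names (`name`, `score`, `tier`) are assumptions modeled on the table on this page, not documented API details:

```python
import json
import urllib.parse

BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def quality_url(domain, subcategory, limit=53, api_key=None):
    """Build the dataset query URL used in the curl example above."""
    params = {"domain": domain, "subcategory": subcategory, "limit": str(limit)}
    if api_key is not None:
        # NOTE: "api_key" is a guessed parameter name; check the service docs.
        params["api_key"] = api_key
    return BASE_URL + "?" + urllib.parse.urlencode(params)

url = quality_url("ml-frameworks", "clip-vision-language")
print(url)

# Parsing sketch against a made-up payload shaped like this page's table;
# the real response's field names may differ.
payload = json.loads('[{"name": "mlfoundations/open_clip", "score": 86, "tier": "Verified"}]')
verified = [p["name"] for p in payload if p["tier"] == "Verified"]
print(verified)  # ['mlfoundations/open_clip']
```

Fetching the URL (e.g. with `urllib.request.urlopen`) is left out so the sketch stays side-effect free.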

| # | Framework | Description | Score | Tier |
|---:|---|---|---:|---|
| 1 | mlfoundations/open_clip | An open source implementation of CLIP. | 86 | Verified |
| 2 | noxdafox/clipspy | Python CFFI bindings for the 'C' Language Integrated Production System CLIPS | 65 | Established |
| 3 | openai/CLIP | CLIP (Contrastive Language-Image Pretraining), Predict the most relevant... | 60 | Established |
| 4 | filipbasara0/simple-clip | A minimal, but effective implementation of CLIP (Contrastive Language-Image... | 47 | Emerging |
| 5 | moein-shariatnia/OpenAI-CLIP | Simple implementation of OpenAI CLIP model in PyTorch. | 46 | Emerging |
| 6 | BioMedIA-MBZUAI/FetalCLIP | Official repository of FetalCLIP: A Visual-Language Foundation Model for... | 45 | Emerging |
| 7 | cliport/cliport | CLIPort: What and Where Pathways for Robotic Manipulation | 42 | Emerging |
| 8 | WolodjaZ/MSAE | Interpreting CLIP with Hierarchical Sparse Autoencoders (ICML 2025) | 41 | Emerging |
| 9 | Dalageo/paperclip-inspection | Analyzing Paper Clips Using Deep Learning and Computer Vision Techniques 📎 | 37 | Emerging |
| 10 | noxdafox/iclips | CLIPS Jupyter console | 36 | Emerging |
| 11 | SunzeY/AlphaCLIP | [CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want | 36 | Emerging |
| 12 | kyegomez/CLIPQ | A simple implementation of a CLIP that splits up an image into quadrants... | 35 | Emerging |
| 13 | LeapLabTHU/Cross-Modal-Adapter | [Pattern Recognition 2025] Cross-Modal Adapter for Vision-Language Retrieval | 33 | Emerging |
| 14 | svpino/clip-container | A containerized REST API around OpenAI's CLIP model. | 33 | Emerging |
| 15 | lakeraai/onnx_clip | An ONNX-based implementation of the CLIP model that doesn't depend on torch... | 32 | Emerging |
| 16 | SiddhantBikram/MemeCLIP | Official Repository for the paper 'MemeCLIP: Leveraging CLIP Representations... | 32 | Emerging |
| 17 | jaisidhsingh/CoN-CLIP | Implementation of the "Learn No to Say Yes Better" paper. | 32 | Emerging |
| 18 | merveenoyan/siglip | Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers... | 31 | Emerging |
| 19 | kevinzakka/clip_playground | An ever-growing playground of notebooks showcasing CLIP's impressive... | 31 | Emerging |
| 20 | sarthaxxxxx/BATCLIP | [ICCV '25] BATCLIP: Bimodal Online Test-Time Adaptation for CLIP | 29 | Experimental |
| 21 | UCSC-VLAA/CLIPA | [NeurIPS 2023] This repository includes the official implementation of our... | 29 | Experimental |
| 22 | Mauville/MedCLIP | Medical image captioning using OpenAI's CLIP | 28 | Experimental |
| 23 | sixu0/SeisCLIP | The code of Paper 'SeisCLIP: A seismology foundation model pre-trained by... | 28 | Experimental |
| 24 | aygong/ClipMind | Code for the paper "ClipMind: A Framework for Auditing Short-Format Video... | 27 | Experimental |
| 25 | RobertBiehl/CLIP-tf2 | OpenAI CLIP converted to Tensorflow 2/Keras | 26 | Experimental |
| 26 | bes-dev/pytorch_clip_bbox | Pytorch based library to rank predicted bounding boxes using text/image... | 26 | Experimental |
| 27 | bes-dev/pytorch_clip_guided_loss | A simple library that implements CLIP guided loss in PyTorch. | 24 | Experimental |
| 28 | LAION-AI/scaling-laws-openclip | Reproducible scaling laws for contrastive language-image learning... | 23 | Experimental |
| 29 | KeremTurgutlu/clip_art | CLIP-Art: Contrastive Pre-training for Fine-Grained Art Classification - 4th... | 23 | Experimental |
| 30 | halixness/understanding-CLIP | Repo from the "Learning with limited labeled data" seminar @ Uni of... | 22 | Experimental |
| 31 | Krok1/adversarial-patch-for-clip | Adversarial patch system for privacy protection against CLIP image... | 22 | Experimental |
| 32 | CoderChen01/InterCLIP-MEP | Official repository of the paper "InterCLIP-MEP: Interactive CLIP and... | 21 | Experimental |
| 33 | ExcelsiorCJH/CLIP | CLIP: Learning Transferable Visual Models From Natural Language Supervision | 20 | Experimental |
| 34 | zjunlp/SPEECH | [ACL 2023] SPEECH: Structured Prediction with Energy-Based Event-Centric Hyperspheres | 20 | Experimental |
| 35 | sbmagar13/VQGAN-CLIP-Text-to-Image | Text-to-Image Synthesis using Multimodal (VQGAN + CLIP) Architectures | 19 | Experimental |
| 36 | A-SHOJAEI/multimodal-contrastive-captioning-with-preference-aligned-generation | Vision-language model combining CLIP-style contrastive learning with... | 19 | Experimental |
| 37 | your-ai-solution/generation-image-caption | This application fine-tunes the CLIP model on the Flickr8k dataset to align... | 18 | Experimental |
| 38 | Evfidiw/MoBA | [ACMMM'24] MoBA: Mixture of Bi-directional Adapter for Multi-modal Sarcasm Detection | 17 | Experimental |
| 39 | Jaso1024/Refining-Generated-Videos | IEEE 2023 \| REGIS: Refining Generated Videos via Iterative Stylistic Remodeling | 16 | Experimental |
| 40 | D0miH/does-clip-know-my-face | Source Code for the JAIR Paper "Does CLIP Know my Face?" (Demo:... | 15 | Experimental |
| 41 | Komorebirumu/awe-ms-20260316-1451-01 | AI Historical Document Authenticity Checker (Local Archives) | 14 | Experimental |
| 42 | Jeyjey123456/ReVidgen | 🎥 Rethink video generation for the embodied world with ReVidgen, leveraging... | 14 | Experimental |
| 43 | Bijay-kumar-sethy/clip | 🔍 Solve linear programming problems efficiently with Clp, an open-source... | 14 | Experimental |
| 44 | ImtiazShuvo/clip-lora-food101-classification | Transfer learning and parameter-efficient fine-tuning of CLIP on the... | 14 | Experimental |
| 45 | Fr0zenCrane/Cockatiel | The official implementation of our paper "Cockatiel: Ensembling Synthetic... | 13 | Experimental |
| 46 | MingliangLiang3/GLIP | Centered Masking for Language-Image Pre-training | 13 | Experimental |
| 47 | buraksatar/RoME_video_retrieval | It includes our two recent papers on text-to-video retrieval along with a... | 12 | Experimental |
| 48 | jonkahana/CLIPPR | An official PyTorch implementation for CLIPPR | 12 | Experimental |
| 49 | rhysdg/vision-at-a-clip | Low-latency ONNX and TensorRT based zero-shot classification and detection... | 12 | Experimental |
| 50 | nicolafan/clipper | Explore your CLIP embeddings in a bidimensional space | 11 | Experimental |
| 51 | KeithLin724/HAR_Clip | Human Action Recognition using Clip | 11 | Experimental |
| 52 | MaharshPatelX/qwen-clip-multimodal | Multimodal Vision-AI: CLIP eyes + Qwen2.5 brain, 155 K-step pipeline & demo. | 11 | Experimental |
| 53 | smb-h/mqirtn | Multimodal Query Enhancement for Image Retrieval using Transformer Networks (MQIRTN) | 10 | Experimental |
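Many of the implementations listed above (open_clip, simple-clip, OpenAI-CLIP) train with CLIP's symmetric contrastive (InfoNCE) objective: matched image/text pairs should score highest along the diagonal of the batch similarity matrix. A numpy sketch of that loss on toy embeddings, assuming the standard formulation from the CLIP paper:

```python
import numpy as np

def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    # Symmetric InfoNCE loss used by CLIP: cross-entropy over the batch
    # similarity matrix, averaged over both matching directions.
    img = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = img @ txt.T / temperature          # (batch, batch) similarities
    labels = np.arange(len(logits))             # pair i matches pair i

    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)    # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()  # diagonal = correct pairs

    # Average the image->text and text->image directions.
    return (cross_entropy(logits) + cross_entropy(logits.T)) / 2

# Toy check: well-aligned pairs should score a lower loss than shuffled ones.
rng = np.random.default_rng(0)
texts = rng.standard_normal((4, 64))
aligned = texts + 0.1 * rng.standard_normal((4, 64))  # near-perfect pairs
shuffled = texts[::-1].copy()                          # mismatched pairs
print(clip_contrastive_loss(aligned, texts) < clip_contrastive_loss(shuffled, texts))  # prints True
```

The temperature of 0.07 matches the initial value reported in the CLIP paper; real implementations learn it as a parameter.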