openai/CLIP

CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image.

Score: 60 / 100 (Established)

Trained on 400M image-text pairs using contrastive learning, CLIP jointly encodes images and text into a shared embedding space where cosine similarity enables zero-shot classification without task-specific fine-tuning. Built on Vision Transformers and text encoders in PyTorch, it integrates seamlessly with torchvision for preprocessing and supports multiple model scales (ViT-B/32, ViT-L/14, etc.) for deployment flexibility.
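A minimal sketch of the zero-shot workflow described above, following the usage pattern in the repository's README; the image path "photo.jpg" and the candidate labels are placeholders:

import torch
import clip
from PIL import Image

# Load a model scale by name; ViT-B/32 is the smallest ViT variant.
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# "photo.jpg" and the labels below are placeholders.
image = preprocess(Image.open("photo.jpg")).unsqueeze(0).to(device)
text = clip.tokenize(["a diagram", "a dog", "a cat"]).to(device)

with torch.no_grad():
    # Encode both modalities into the shared embedding space.
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)

    # Normalize so the dot product is cosine similarity, then softmax
    # over the candidate labels for zero-shot classification.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("Label probs:", probs.cpu().numpy())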

32,796 stars. Maintained, with 1 commit in the last 30 days.

No package published; no dependents tracked.

Maintenance: 13 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 21 / 25

The four subscores sum to the overall score of 60 / 100.


Stars: 32,796
Forks: 3,961
Language: Jupyter Notebook
License: MIT
Last pushed: Feb 18, 2026
Commits (30d): 1

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/openai/CLIP"

Open to everyone: 100 requests/day with no API key. Get a free key for 1,000 requests/day.
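For programmatic access, a minimal Python sketch of the same call using requests; the response schema is not documented here, so the JSON body is printed as-is:

import requests

# Public endpoint from above; 100 requests/day without a key.
url = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/openai/CLIP"

resp = requests.get(url, timeout=10)
resp.raise_for_status()

# Assuming a JSON body; the exact field names are not documented here.
print(resp.json())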