openai/CLIP
CLIP (Contrastive Language-Image Pretraining): predicts the most relevant text snippet for a given image.
Trained on 400M image-text pairs with contrastive learning, CLIP jointly encodes images and text into a shared embedding space where cosine similarity enables zero-shot classification without task-specific fine-tuning. Built in PyTorch with Vision Transformer (or ResNet) image encoders and a Transformer text encoder, it integrates with torchvision for preprocessing and ships multiple model scales (ViT-B/32, ViT-L/14, etc.) for deployment flexibility.
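The zero-shot mechanism described above can be sketched with mocked embeddings. This is a minimal illustration, not the repository's API: the random vectors stand in for outputs of CLIP's image and text encoders (in real use, `clip.load("ViT-B/32")` followed by `model.encode_image` / `model.encode_text` would produce them), and the 512-d width matches ViT-B/32's projection dimension.

```python
import numpy as np

# Hypothetical stand-ins for CLIP encoder outputs; in real use these
# come from model.encode_image / model.encode_text after clip.load().
rng = np.random.default_rng(0)
dim = 512  # ViT-B/32 projects both modalities into 512 dimensions

image_emb = rng.normal(size=dim)        # one image embedding
text_embs = rng.normal(size=(3, dim))   # one embedding per class prompt
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]

# L2-normalize so dot products equal cosine similarity
image_emb = image_emb / np.linalg.norm(image_emb)
text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)

# Cosine similarities, scaled by CLIP's learned logit scale (~100)
logits = 100.0 * text_embs @ image_emb

# Softmax over the class prompts gives zero-shot class probabilities
probs = np.exp(logits - logits.max())
probs = probs / probs.sum()

best = labels[int(np.argmax(probs))]
print(best, probs.round(3))
```

The key design point is that classification reduces to nearest-neighbor search in the shared space: new label sets need only new text prompts, no retraining.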
32,796 stars; 1 commit in the last 30 days.
Stars
32,796
Forks
3,961
Language
Jupyter Notebook
License
MIT
Category
Last pushed
Feb 18, 2026
Commits (30d)
1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/openai/CLIP"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000 requests/day.
Related frameworks
mlfoundations/open_clip
An open source implementation of CLIP.
noxdafox/clipspy
Python CFFI bindings for the 'C' Language Integrated Production System CLIPS
filipbasara0/simple-clip
A minimal, but effective implementation of CLIP (Contrastive Language-Image Pretraining) in PyTorch
moein-shariatnia/OpenAI-CLIP
Simple implementation of OpenAI CLIP model in PyTorch.
BioMedIA-MBZUAI/FetalCLIP
Official repository of FetalCLIP: A Visual-Language Foundation Model for Fetal Ultrasound Image Analysis