CLIP and simple-clip

openai/CLIP is the official OpenAI implementation, providing the reference model and weights; simple-clip is a minimal PyTorch reimplementation of it, aimed at educational and resource-constrained use.

CLIP — overall score 60, Established
Maintenance 13/25 · Adoption 10/25 · Maturity 16/25 · Community 21/25
Stars: 32,796 · Forks: 3,961 · Downloads: n/a · Commits (30d): 1
Language: Jupyter Notebook · License: MIT
No package published · No dependents

simple-clip — overall score 54, Established
Maintenance 0/25 · Adoption 12/25 · Maturity 25/25 · Community 17/25
Stars: 42 · Forks: 8 · Downloads: 39 · Commits (30d): 0
Language: Jupyter Notebook · License: MIT
Stale (no commits in 6 months)

About CLIP

openai/CLIP

CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image.

Trained on 400M image-text pairs using contrastive learning, CLIP jointly encodes images and text into a shared embedding space where cosine similarity enables zero-shot classification without task-specific fine-tuning. Built on Vision Transformers and text encoders in PyTorch, it integrates seamlessly with torchvision for preprocessing and supports multiple model scales (ViT-B/32, ViT-L/14, etc.) for deployment flexibility.
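The zero-shot classification step can be sketched without the heavy encoders: once an image and a set of candidate captions are embedded into the shared space, classification is just an argmax over cosine similarities. The sketch below uses random vectors as stand-ins for encoder outputs (real CLIP would produce these with its ViT image encoder and Transformer text encoder); all names and dimensions here are illustrative assumptions, not the library's API.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    # Project embeddings onto the unit sphere so dot products equal cosine similarity.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def zero_shot_classify(image_emb, class_text_embs):
    """Pick the class whose text embedding is most cosine-similar to the image."""
    img = l2_normalize(image_emb)
    txt = l2_normalize(class_text_embs)
    sims = txt @ img  # cosine similarities, shape (num_classes,)
    return int(np.argmax(sims)), sims

# Toy stand-ins for encoder outputs (real CLIP computes these with ViT/Transformer encoders).
rng = np.random.default_rng(0)
dim = 8
class_texts = rng.normal(size=(3, dim))              # e.g. "a photo of a {dog, cat, car}"
image = class_texts[1] + 0.1 * rng.normal(size=dim)  # an image close to class 1's caption

pred, sims = zero_shot_classify(image, class_texts)
print(pred)  # index of the most similar caption
```

Because both modalities live in one embedding space, adding a new class at inference time only requires embedding a new caption; no retraining or fine-tuning is involved.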

About simple-clip

filipbasara0/simple-clip

A minimal but effective implementation of CLIP (Contrastive Language-Image Pretraining) in PyTorch.
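The core that any minimal CLIP reimplementation must get right is the symmetric contrastive objective: in a batch of matched image-text pairs, each image must identify its own caption among all captions (and vice versa) via cross-entropy over temperature-scaled similarity logits. The sketch below is a numpy illustration of that loss under the stated assumptions; it is not code from simple-clip itself, and the function name and temperature value are illustrative.

```python
import numpy as np

def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy over image-text similarity logits.

    Matching pairs sit on the diagonal of the logit matrix: each image
    must pick out its own caption among the batch, and vice versa.
    """
    img = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # shape (batch, batch)

    def cross_entropy(lg):
        # Row-wise softmax cross-entropy with the diagonal as targets.
        lg = lg - lg.max(axis=1, keepdims=True)  # numeric stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))

# Perfectly aligned pairs drive the loss toward zero...
ident = np.eye(4)
aligned = clip_contrastive_loss(ident, ident, temperature=0.01)
# ...while mismatched (reversed) pairs are penalized heavily.
shuffled = clip_contrastive_loss(ident, ident[::-1].copy(), temperature=0.01)
print(aligned < shuffled)  # True
```

In a real training loop the temperature is typically a learned parameter and the loss is computed on encoder outputs; the batch-diagonal trick above is what lets contrastive pretraining scale without explicit negative mining.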

Scores are updated daily from GitHub, PyPI, and npm data.