SunzeY/AlphaCLIP
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
Incorporates alpha-channel (transparency/mask) conditioning into CLIP's vision encoder, enabling region-focused feature extraction by accepting binary foreground masks alongside images. Built on LoRA-based fine-tuning of standard CLIP backbones (ViT-B/16, ViT-L/14) trained on the MaskImageNet dataset. Integrates seamlessly with downstream applications like Stable Diffusion, LLaVA, and BLIP for improved performance in masked image understanding, zero-shot classification, and vision-language tasks.
869 stars. No commits in the last 6 months.
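The description above says Alpha-CLIP conditions CLIP's vision encoder on an alpha channel so features focus on the masked region. A minimal conceptual sketch of that idea (not the repo's actual code) is a parallel patch-embedding convolution over the 1-channel mask, zero-initialized so the model starts out identical to plain CLIP:

```python
import torch
import torch.nn as nn

# Conceptual sketch, assuming the standard ViT patch-embedding layout; the
# class name and dimensions here are illustrative, not taken from the repo.
class AlphaPatchEmbed(nn.Module):
    def __init__(self, dim: int = 768, patch: int = 16):
        super().__init__()
        # Standard CLIP ViT patch embedding over the RGB image.
        self.rgb_conv = nn.Conv2d(3, dim, kernel_size=patch, stride=patch, bias=False)
        # Extra branch over the binary foreground mask (the "alpha channel").
        # Zero init: with any mask, the output initially equals vanilla CLIP's.
        self.alpha_conv = nn.Conv2d(1, dim, kernel_size=patch, stride=patch, bias=False)
        nn.init.zeros_(self.alpha_conv.weight)

    def forward(self, image: torch.Tensor, alpha: torch.Tensor) -> torch.Tensor:
        # image: (B, 3, H, W); alpha: (B, 1, H, W) binary foreground mask.
        x = self.rgb_conv(image) + self.alpha_conv(alpha)
        # Flatten spatial grid into a token sequence: (B, num_patches, dim).
        return x.flatten(2).transpose(1, 2)
```

Fine-tuning then teaches the alpha branch to bias the token embeddings toward the masked region while the rest of the encoder stays CLIP-compatible.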
Stars: 869
Forks: 58
Language: Jupyter Notebook
License: Apache-2.0
Category:
Last pushed: Jul 20, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/SunzeY/AlphaCLIP"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
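The same endpoint can be called from Python. A small sketch, where only the URL comes from the page above; the helper name and the response schema are assumptions:

```python
import json
import urllib.request

# Base endpoint shown in the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL (hypothetical helper)."""
    return f"{BASE}/{category}/{owner}/{repo}"

url = quality_url("ml-frameworks", "SunzeY", "AlphaCLIP")

# Anonymous access is limited to 100 requests/day; uncomment to fetch live:
# with urllib.request.urlopen(url) as resp:
#     data = json.load(resp)  # response fields are not documented here
```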
Higher-rated alternatives

- mlfoundations/open_clip: An open-source implementation of CLIP.
- noxdafox/clipspy: Python CFFI bindings for the C Language Integrated Production System (CLIPS).
- openai/CLIP: CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image.
- filipbasara0/simple-clip: A minimal but effective implementation of CLIP (Contrastive Language-Image Pretraining) in PyTorch.
- moein-shariatnia/OpenAI-CLIP: A simple implementation of the OpenAI CLIP model in PyTorch.