SunzeY/AlphaCLIP

[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

/ 100

Emerging

Alpha-CLIP adds alpha-channel (mask) conditioning to CLIP's vision encoder: the encoder accepts a binary foreground mask alongside the image, so it can extract features focused on the marked region. Built on LoRA-based fine-tuning of standard CLIP backbones (ViT-B/16, ViT-L/14) trained on the MaskImageNet dataset. Designed for use with downstream applications such as Stable Diffusion, LLaVA, and BLIP, targeting masked image understanding, zero-shot classification, and vision-language tasks.
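To make the idea of mask conditioning concrete, here is a minimal toy sketch in plain Python. It is not code from the AlphaCLIP repository and does not use its API. It assumes the mask is given its own patch projection whose output is added to the image patch projection, and that this extra projection starts at zero so the encoder initially behaves like the unmodified one. All names (`patchify`, `project`, `embed_patches`) are hypothetical, and a single-channel image stands in for RGB.

```python
import random


def patchify(plane, patch):
    """Split an H x W plane (list of rows) into flattened, non-overlapping patches."""
    height, width = len(plane), len(plane[0])
    return [
        [plane[r][c] for r in range(top, top + patch) for c in range(left, left + patch)]
        for top in range(0, height, patch)
        for left in range(0, width, patch)
    ]


def project(vector, weight):
    """Apply a linear projection; `weight` holds one row per output dimension."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weight]


def embed_patches(image, mask, w_image, w_alpha, patch):
    """Sum the image projection and the alpha (mask) projection for every patch."""
    tokens = []
    for pixels, alpha in zip(patchify(image, patch), patchify(mask, patch)):
        image_part = project(pixels, w_image)
        alpha_part = project(alpha, w_alpha)
        tokens.append([a + b for a, b in zip(image_part, alpha_part)])
    return tokens


random.seed(0)
PATCH, DIM = 2, 3

# Hypothetical 4x4 single-channel image and a binary mask that marks
# the top-left 2x2 patch as foreground.
image = [[random.random() for _ in range(4)] for _ in range(4)]
mask = [[1.0 if r < 2 and c < 2 else 0.0 for c in range(4)] for r in range(4)]
no_mask = [[0.0] * 4 for _ in range(4)]

w_image = [[random.uniform(-1, 1) for _ in range(PATCH * PATCH)] for _ in range(DIM)]
w_alpha_zero = [[0.0] * (PATCH * PATCH) for _ in range(DIM)]      # assumed starting point
w_alpha_trained = [[0.5] * (PATCH * PATCH) for _ in range(DIM)]   # stand-in for learned weights

baseline = embed_patches(image, no_mask, w_image, w_alpha_zero, PATCH)
zero_init = embed_patches(image, mask, w_image, w_alpha_zero, PATCH)
conditioned = embed_patches(image, mask, w_image, w_alpha_trained, PATCH)
```

With the alpha weights at zero, the mask has no effect and `zero_init` matches `baseline`. With non-zero alpha weights, only the token for the masked patch changes, which is the sense in which the mask steers the encoder toward a region.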

869 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 10 / 25

Maturity 9 / 25

Community 15 / 25

Stars: 869

Forks:

Language: Jupyter Notebook

License: Apache-2.0

Higher-rated alternatives

mlfoundations/open_clip

An open source implementation of CLIP.

noxdafox/clipspy

Python CFFI bindings for the 'C' Language Integrated Production System CLIPS

openai/CLIP

CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image

filipbasara0/simple-clip

A minimal, but effective implementation of CLIP (Contrastive Language-Image Pretraining) in PyTorch

moein-shariatnia/OpenAI-CLIP

Simple implementation of OpenAI CLIP model in PyTorch.
