NVlabs/GroupViT

Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges from Text Supervision, CVPR 2022.

42 / 100 (Emerging)

Learns semantic segmentation purely from image-text pairs, without mask labels, by applying hierarchical bottom-up grouping of image regions inside a vision transformer trained with a contrastive objective. Evaluation runs through MMSegmentation on the Pascal VOC, Pascal Context, and COCO datasets, and training uses webdataset to scale to large image-caption corpora such as GCC and YFCC. A Hugging Face Space demo is available, and the code is compatible with timm backbone models and NVIDIA Apex for mixed-precision training.

783 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 16 / 25


Stars 783
Forks 56
Language Python
License
Last pushed May 10, 2022
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/NVlabs/GroupViT"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
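
The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns a JSON body. The response schema is not documented on this page, so the parsed object is returned as-is with no field names assumed. The helper names `quality_url` and `fetch_quality`, and the parameter name `category` for the `transformers` path segment, are hypothetical labels chosen for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository.

    `category` is a hypothetical name for the first path segment
    ("transformers" in the curl example).
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record and parse it, assuming the body is JSON."""
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Performs a real network call; counts against the daily request limit.
    data = fetch_quality("transformers", "NVlabs", "GroupViT")
    print(json.dumps(data, indent=2))
```

Because unauthenticated access is limited to 100 requests/day, it may be worth caching the result locally rather than calling `fetch_quality` on every run.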