BR-IDL/PaddleViT

PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

Established

Implements 15+ transformer and MLP-based vision models (ViT, DeiT, Swin, VOLO, BEiT, etc.) with modular architectures that support image classification, object detection, semantic segmentation, and GANs. Built on PaddlePaddle 2.1+, it provides pretrained weights, unified training/validation pipelines, data augmentation utilities, and distributed training via DDP with mixed-precision support. Each model is exposed as a standalone module for rapid prototyping and fine-tuning on custom datasets.

1,241 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 25 / 25

Stars: 1,241

Forks: 328

Language: Python

License: Apache-2.0

Related tools

pathak22/unsupervised-video

[CVPR 2017] Unsupervised deep learning using unlabelled videos on the web

IBM/CrossViT

Official implementation of CrossViT. https://arxiv.org/abs/2103.14899

NVlabs/GCVit

[ICML 2023] Official PyTorch implementation of Global Context Vision Transformers

ViTAE-Transformer/ViTDet

Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object...

bytedance/SPTSv2

The official implementation of SPTS v2: Single-Point Text Spotting
