mmaaz60/EdgeNeXt

[CADL'22, ECCVW] Official repository for the paper "EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications".

/ 100

Emerging

Introduces split depth-wise transpose attention (SDTA), which splits the input channels into groups and combines depth-wise convolution with self-attention across the channel dimension to encode multi-scale features efficiently. The 5.6M-parameter model reaches 79.4% top-1 accuracy on ImageNet-1K, and the architecture is evaluated on classification, detection, and segmentation tasks. The PyTorch implementation supports distributed training on multi-GPU setups and provides model variants ranging from 1.3M to 18.5M parameters.
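To make the "transpose attention" idea concrete, here is a minimal, dependency-free sketch of attention computed across channels rather than across tokens. It is an illustration under stated assumptions, not the repository's code: the function names (`transposed_attention`, `l2_normalize`, `softmax`) are hypothetical, there is a single head, the temperature is a fixed number rather than a learned parameter, and the channel-splitting and depth-wise convolution parts of SDTA are left out.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def transposed_attention(q, k, v, temperature=1.0):
    """Single-head attention over channels (illustrative sketch only).

    q, k, v: lists of N tokens, each a list of C channel values.
    Returns (out, attn), where out is N x C and attn is C x C.
    """
    n, c = len(q), len(q[0])
    # View each channel as a length-N vector and normalize it.
    q_ch = [l2_normalize([q[t][i] for t in range(n)]) for i in range(c)]
    k_ch = [l2_normalize([k[t][j] for t in range(n)]) for j in range(c)]
    # The attention map is C x C (channel to channel), not N x N,
    # so its size does not grow with the number of tokens.
    attn = [
        softmax([
            temperature * sum(a * b for a, b in zip(q_ch[i], k_ch[j]))
            for j in range(c)
        ])
        for i in range(c)
    ]
    # Each output channel is a weighted mix of the value channels.
    out = [
        [sum(attn[i][j] * v[t][j] for j in range(c)) for i in range(c)]
        for t in range(n)
    ]
    return out, attn
```

With three tokens and two channels, for instance `q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]`, calling `transposed_attention(q, q, q)` produces a 2 x 2 attention map and a 3 x 2 output. The map would stay 2 x 2 for any number of tokens, which is why the cost grows linearly with token count.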

411 stars. No commits in the last 6 months.

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 16 / 25


Stars: 411

Forks:

Language: Python

License: MIT

Higher-rated alternatives

Kohulan/DECIMER-Image_Transformer

DECIMER Image Transformer is a deep-learning-based tool designed for automated recognition of...

fcakyon/video-transformers

Easiest way of fine-tuning HuggingFace video classification models

sovit-123/vision_transformers

Vision Transformers for image classification, image segmentation, and object detection.

leaderj1001/BottleneckTransformers

Bottleneck Transformers for Visual Recognition

qubvel/transformers-notebooks

Inference and fine-tuning examples for vision models from 🤗 Transformers
