mmaaz60/EdgeNeXt
[CADL'22, ECCVW] Official repository of the paper "EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications".
Introduces split depth-wise transpose attention (SDTA), which splits channels into groups and combines depth-wise convolution with self-attention to encode multi-scale features efficiently. The 5.6M-parameter model reaches 79.4% accuracy on ImageNet-1K, and the architecture is applied to classification, detection, and segmentation tasks. The PyTorch implementation supports distributed training on multi-GPU setups and offers model variants ranging from 1.3M to 18.5M parameters.
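To make the "transpose attention" idea concrete, here is a minimal, dependency-free sketch of attention computed across channels instead of across tokens. It is an illustration only, not the repository's PyTorch code: the function names are hypothetical, it uses a single head with no learned projections or temperature, and it omits the channel splitting and depth-wise convolutions that SDTA combines with the attention step.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def softmax(row):
    """Numerically stable softmax over one row."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transposed_attention(q, k, v):
    """Channel-wise attention (hypothetical helper, single head).

    q, k, v are C x N nested lists: C channels, N tokens per channel.
    Returns (out, attn) where attn is C x C and out is C x N.
    """
    qn = [l2_normalize(channel) for channel in q]
    kn = [l2_normalize(channel) for channel in k]

    # Similarity of channel i with channel j, summed over all tokens.
    # The map is C x C, so its size does not grow with the token count N.
    attn = [
        softmax([sum(a * b for a, b in zip(qi, kj)) for kj in kn])
        for qi in qn
    ]

    # Each output channel is a weighted mix of the value channels.
    n_tokens = len(v[0])
    out = [
        [sum(attn[i][j] * v[j][t] for j in range(len(v))) for t in range(n_tokens)]
        for i in range(len(attn))
    ]
    return out, attn


if __name__ == "__main__":
    x = [[1.0, 0.0, 2.0, 1.0],   # channel 0, 4 tokens
         [0.5, 1.0, 0.0, 3.0],   # channel 1
         [2.0, 2.0, 1.0, 0.0]]   # channel 2
    out, attn = transposed_attention(x, x, x)
    print(len(attn), "x", len(attn[0]), "attention map")
    print(len(out), "x", len(out[0]), "output")
```

Because the attention map has one row and column per channel, its cost scales with the number of channels squared and only linearly with the number of tokens, which is the usual motivation for applying attention along the channel dimension on high-resolution inputs.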
411 stars. No commits in the last 6 months.
Stars: 411
Forks: 45
Language: Python
License: MIT
Category
Last pushed: Jul 25, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/mmaaz60/EdgeNeXt"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
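The same request can be made from Python with only the standard library. This is a rough sketch under two assumptions that the page does not confirm: that the endpoint returns a JSON body, and that other repositories follow the same `owner/repo` URL pattern as the curl example. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner, repo):
    """Build the endpoint URL (assumes the owner/repo pattern generalizes)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner, repo, timeout=10.0):
    """GET the endpoint and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Printing the URL needs no network access; call fetch_quality(...) to
    # actually query the service (counts against the daily request limit).
    print(quality_url("mmaaz60", "EdgeNeXt"))
```

Calling `fetch_quality("mmaaz60", "EdgeNeXt")` performs the request; the structure of the returned data is not documented here, so inspect it before relying on specific fields.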
Higher-rated alternatives
Kohulan/DECIMER-Image_Transformer
DECIMER Image Transformer is a deep-learning-based tool designed for automated recognition of...
fcakyon/video-transformers
Easiest way of fine-tuning HuggingFace video classification models
sovit-123/vision_transformers
Vision Transformers for image classification, image segmentation, and object detection.
leaderj1001/BottleneckTransformers
Bottleneck Transformers for Visual Recognition
qubvel/transformers-notebooks
Inference and fine-tuning examples for vision models from 🤗 Transformers