szagoruyko/attention-transfer
Improving Convolutional Networks via Attention Transfer (ICLR 2017)
Implements activation-based attention transfer to compress CNNs by distilling spatial attention maps from a larger teacher network into a smaller student, improving accuracy on CIFAR-10 and ImageNet at no additional inference cost. Rather than matching soft output targets, the method transfers spatial activation patterns, letting a student such as ResNet-18 close much of the gap to a larger teacher (ResNet-34), with roughly a 1.1% top-1 error reduction reported on ImageNet. Built on PyTorch with integrated torchnet utilities, it supports both single-GPU CIFAR experiments and distributed multi-GPU ImageNet training configured from the command line.
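The core idea above can be sketched in a few lines: each feature tensor is collapsed into a spatial attention map by summing squared activations over channels (the paper's sum-of-squares map), the maps are L2-normalized, and the student is penalized for the distance between its maps and the teacher's. This is a minimal NumPy sketch under those assumptions; function names are illustrative, and the repo's actual implementation uses PyTorch tensors.

```python
import numpy as np

def attention_map(activ):
    # activ: (N, C, H, W) feature tensor; spatial map = channel-wise sum of squares
    q = (activ ** 2).sum(axis=1).reshape(activ.shape[0], -1)
    # L2-normalize each flattened map so teacher and student scales are comparable
    return q / np.linalg.norm(q, axis=1, keepdims=True)

def at_loss(student_act, teacher_act):
    # L2 distance between normalized attention maps, averaged over the batch
    diff = attention_map(student_act) - attention_map(teacher_act)
    return np.linalg.norm(diff, axis=1).mean()

rng = np.random.default_rng(0)
s = rng.standard_normal((4, 16, 8, 8))   # hypothetical student activations
t = rng.standard_normal((4, 64, 8, 8))   # teacher may have more channels; maps share H x W
print(at_loss(s, t))
```

Note that the teacher and student may differ in channel count, since the attention map collapses the channel dimension; only the spatial resolution of the paired layers must match.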
1,466 stars. No commits in the last 6 months.
Stars
1,466
Forks
274
Language
Jupyter Notebook
License
—
Category
ml-frameworks
Last pushed
Jul 11, 2018
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/szagoruyko/attention-transfer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
philipperemy/keras-attention
Keras Attention Layer (Luong and Bahdanau scores).
tatp22/linformer-pytorch
My take on a practical implementation of Linformer for Pytorch.
lucidrains/fast-weight-attention
Implementation of Fast Weight Attention
thushv89/attention_keras
Keras Layer implementation of Attention for Sequential models
ematvey/hierarchical-attention-networks
Document classification with Hierarchical Attention Networks in TensorFlow. WARNING: project is...