szagoruyko/attention-transfer
Improving Convolutional Networks via Attention Transfer (ICLR 2017)
Implements activation-based attention transfer to compress CNNs by distilling spatial attention maps from a larger teacher network into a smaller student, improving accuracy on CIFAR-10 and ImageNet at no additional inference cost. Rather than matching soft output targets, the method transfers spatial activation patterns, letting a student such as ResNet-18 close much of the gap to a larger teacher (ResNet-34), with roughly a 1.1% top-1 error reduction reported on ImageNet. Built on PyTorch with integrated torchnet utilities, it supports both single-GPU CIFAR experiments and distributed multi-GPU ImageNet training configured from the command line.
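The core idea above can be sketched in a few lines: each feature tensor is collapsed into a spatial attention map by summing squared activations over channels (the paper's sum-of-squares map), the maps are L2-normalized, and the student is penalized for the distance between its maps and the teacher's. This is a minimal NumPy sketch under those assumptions; function names are illustrative, and the repo's actual implementation uses PyTorch tensors.

```python
import numpy as np

def attention_map(activ):
    # activ: (N, C, H, W) feature tensor; spatial map = channel-wise sum of squares
    q = (activ ** 2).sum(axis=1).reshape(activ.shape[0], -1)
    # L2-normalize each flattened map so teacher and student scales are comparable
    return q / np.linalg.norm(q, axis=1, keepdims=True)

def at_loss(student_act, teacher_act):
    # L2 distance between normalized attention maps, averaged over the batch
    diff = attention_map(student_act) - attention_map(teacher_act)
    return np.linalg.norm(diff, axis=1).mean()

rng = np.random.default_rng(0)
s = rng.standard_normal((4, 16, 8, 8))   # hypothetical student activations
t = rng.standard_normal((4, 64, 8, 8))   # teacher may have more channels; maps share H x W
print(at_loss(s, t))
```

Note that the teacher and student may differ in channel count, since the attention map collapses the channel dimension; only the spatial resolution of the paired layers must match.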
1,466 stars. No commits in the last 6 months.
Stars
1,466
Forks
274
Language
Jupyter Notebook
License
—
Category
ml-frameworks
Last pushed
Jul 11, 2018
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/szagoruyko/attention-transfer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
philipperemy/keras-attention
Keras Attention Layer (Luong and Bahdanau scores).
tatp22/linformer-pytorch
My take on a practical implementation of Linformer for Pytorch.
lucidrains/fast-weight-attention
Implementation of Fast Weight Attention
thushv89/attention_keras
Keras Layer implementation of Attention for Sequential models
ematvey/hierarchical-attention-networks
Document classification with Hierarchical Attention Networks in TensorFlow. WARNING: project is...