BICLab/MetaLA

Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS 2024 Oral)

Score: 21 / 100 (Experimental)

Unified framework for linear attention mechanisms that addresses three design constraints—dynamic memory, static approximation ability, and parameter efficiency—through a meta-learning approach. Implemented as a drop-in GPT-NeoX module using Flash Linear Attention and causal convolutions for efficient inference, with HuggingFace-compatible checkpoints (380M–3B parameters) trained on 300B tokens and supporting bf16/fp16 precision.
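To make the description above concrete, here is a minimal sketch of causal linear attention with a running recurrent state, which is the general mechanism MetaLA builds on. This is an illustrative toy, not the MetaLA implementation: the feature map `phi` and all array shapes are assumptions for demonstration.

```python
import numpy as np

def linear_attention(Q, K, V):
    """Causal linear attention: replaces softmax(Q K^T) V with a positive
    feature map phi, so the output can be computed recurrently in O(T).
    Minimal illustrative sketch, not the MetaLA module."""
    phi = lambda x: np.maximum(x, 0.0) + 1e-6  # simple positive feature map (assumption)
    Qf, Kf = phi(Q), phi(K)
    T, d = Q.shape
    dv = V.shape[1]
    S = np.zeros((d, dv))   # running sum of phi(k_s) v_s^T (associative state)
    z = np.zeros(d)         # running sum of phi(k_s) (normalizer)
    out = np.zeros((T, dv))
    for t in range(T):
        S += np.outer(Kf[t], V[t])
        z += Kf[t]
        out[t] = (Qf[t] @ S) / (Qf[t] @ z)
    return out

rng = np.random.default_rng(0)
Q = rng.standard_normal((5, 4))
K = rng.standard_normal((5, 4))
V = rng.standard_normal((5, 3))
O = linear_attention(Q, K, V)
print(O.shape)  # (5, 3)
```

Because the state `(S, z)` is a fixed-size summary of the prefix, inference cost per token is constant in sequence length, which is what makes the drop-in module efficient at decode time.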

No commits in the last 6 months.

No License · Stale (6 months) · No Package · No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 8 / 25
Community 6 / 25


Stars: 35
Forks: 2
Language: Python
License: None
Last pushed: Jan 18, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/BICLab/MetaLA"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.