Nemesis-12/multihead-latent-attention
Implementation of Multi-head Latent Attention (MLA) from DeepSeek-V2
Overall score: 15 / 100
Experimental · No Package · No Dependents
Maintenance: 6 / 25
Adoption: 0 / 25
Maturity: 9 / 25
Community: 0 / 25
Stars: —
Forks: —
Language: Python
License: MIT
Category: ml-frameworks
Last pushed: Nov 22, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Nemesis-12/multihead-latent-attention"
Open to everyone: 100 requests per day with no key needed; a free key raises the limit to 1,000 per day.
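For scripted access, the same endpoint can be called from Python's standard library. A minimal sketch follows; it assumes the endpoint returns JSON, since the response schema isn't documented on this page, and simply pretty-prints whatever comes back.

```python
# Minimal sketch: fetch this repo's quality data from the API above.
# Assumes a JSON response; the schema is not shown on this page, so
# the result is printed as-is rather than accessing specific fields.
import json
import urllib.request

URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "ml-frameworks/Nemesis-12/multihead-latent-attention")

with urllib.request.urlopen(URL) as resp:  # anonymous: 100 requests/day
    data = json.load(resp)

print(json.dumps(data, indent=2))
```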
Higher-rated alternatives
philipperemy/keras-attention (score 67): Keras Attention Layer (Luong and Bahdanau scores).
tatp22/linformer-pytorch (score 51): My take on a practical implementation of Linformer for PyTorch.
lucidrains/fast-weight-attention (score 48): Implementation of Fast Weight Attention.
thushv89/attention_keras (score 44): Keras Layer implementation of Attention for Sequential models.
ematvey/hierarchical-attention-networks (score 44): Document classification with Hierarchical Attention Networks in TensorFlow. WARNING: project is...