Transformer Architecture Education ML Frameworks
Educational repositories focused on implementing transformer models from scratch to understand core components and mechanisms. Includes tutorials, explanations, and hands-on implementations of attention, positional encoding, and encoder-decoder structures. Does NOT include pre-trained model usage, applications (translation, BERT fine-tuning), or production frameworks.
There are 26 transformer architecture education frameworks tracked. The highest-rated is lvapeab/nmt-keras at 44/100 with 531 stars.
Get all 26 projects as JSON (the `limit` query parameter caps the result count, so raise it to at least 26 to fetch every project):

```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=transformer-architecture-education&limit=20"
```
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
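The endpoint returns JSON that is easy to post-process. Below is a minimal Python sketch using only the standard library; note that the response field names (`data`, `name`, `score`) are assumptions about the schema, since the payload format is not documented here:

```python
import json
import urllib.request

# Endpoint from the curl example above; limit raised to cover all 26 projects.
API_URL = (
    "https://pt-edge.onrender.com/api/v1/datasets/quality"
    "?domain=ml-frameworks"
    "&subcategory=transformer-architecture-education&limit=26"
)

def fetch_quality(url=API_URL):
    """Fetch the dataset; counts against the 100 requests/day anonymous quota."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def top_frameworks(payload, n=5):
    """Rank records by score, highest first.

    The keys `data`, `name`, and `score` are assumed; adjust them to
    whatever the live response actually contains.
    """
    records = payload.get("data", [])
    ranked = sorted(records, key=lambda r: r.get("score", 0), reverse=True)
    return [(r["name"], r["score"]) for r in ranked[:n]]
```

Usage would look like `for name, score in top_frameworks(fetch_quality()): print(f"{name}: {score}/100")`, which for this dataset should put lvapeab/nmt-keras first.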
| # | Framework | Description | Score | Tier |
|---|---|---|---|---|
| 1 | lvapeab/nmt-keras | Neural Machine Translation with Keras | 44 | Emerging |
| 2 | jaketae/ensemble-transformers | Ensembling Hugging Face transformers made easy | | Emerging |
| 3 | dair-ai/Transformers-Recipe | 🧠 A study guide to learn about Transformers | | Emerging |
| 4 | SirawitC/Transformer_from_scratch_pytorch | Build a transformer model from scratch using PyTorch to understand its inner... | | Emerging |
| 5 | lof310/transformer | PyTorch implementation of the current SOTA Transformer. Configurable,... | | Emerging |
| 6 | submarat/removing-layer-norm | Transformers Don't Need LayerNorm at Inference Time | | Experimental |
| 7 | jiangtaoxie/SoT | SoT: Delving Deeper into Classification Head for Transformer | | Experimental |
| 8 | balhafni/personalized-gen | Code, models, and data for "Personalized Text Generation with Fine-Grained... | | Experimental |
| 9 | L-Zhe/FasySeq | A fast and easy implementation of Transformer with PyTorch. | | Experimental |
| 10 | anmolg1997/SLM-From-Scratch | Build small language models from scratch: BPE tokenizer, composable... | | Experimental |
| 11 | leeway0507/Transformer_from_scratch | Explanation of how to implement and train a Transformer | | Experimental |
| 12 | dianjiang75/Transformer | A decoder-only Transformer built entirely from scratch in PyTorch. Trained... | | Experimental |
| 13 | Ayush-Aditya/decoder-only-seq2seq | Minimal decoder-only seq2seq pipeline with proper causal masking, teacher... | | Experimental |
| 14 | Harsha-hue/visual-transformer-guide | A visual guide explaining how Transformers work. Tokenization... | | Experimental |
| 15 | cosimo17/transformer_notebook | Transformer tutorial | | Experimental |
| 16 | Banniesdread/decoder-only-seq2seq | Implement a decoder-only Transformer in PyTorch to reverse character... | | Experimental |
| 17 | Joe-Naz01/encoder-decoder | This PyTorch notebook implements a complete Transformer architecture from... | | Experimental |
| 18 | driellecristine/BERT-Contrastive-LoRA | Enhance BERT fine-tuning for intent classification using supervised... | | Experimental |
| 19 | msi1427/Original-Transformer-for-Bengali-Translation | A neural machine translation project for Bengali translation where the... | | Experimental |
| 20 | MyDarapy/gpt-1-from-scratch | Rewriting and pretraining GPT-1 from scratch. Implementing Multihead... | | Experimental |
| 21 | mfarisadip/Multi-X-Transformers | A neural network based on the encoder-decoder architecture the modeling... | | Experimental |
| 22 | wj-Mcat/transformer-handbook | Transformer-related blogs and code | | Experimental |
| 23 | ankushhKapoor/transformer-from-scratch | Transformer from scratch implementation in PyTorch for Neural Machine... | | Experimental |
| 24 | Xachchchch/deberta-fine-tune-comparison | Experimenting with LoRA vs head-only fine-tuning for DeBERTa on sentiment analysis | | Experimental |
| 25 | dino65-dev/Transformers | Transformers from scratch, implementing GQA, RoPE, and RMSNorm; trained on code | | Experimental |
| 26 | konodiodaaaaa1/PyTorch-Transformer-From-Scratch | A numerically stable implementation of Transformer from scratch using PyTorch.... | | Experimental |

Scores other than the top-ranked project's 44/100 were not preserved on this page; use the JSON endpoint above to retrieve the full `score` field for each project.