# Transformer Architecture Education (ML Frameworks)

Educational repositories focused on implementing transformer models from scratch to understand core components and mechanisms. Includes tutorials, explanations, and hands-on implementations of attention, positional encoding, and encoder-decoder structures. Does NOT include pre-trained model usage, applications (translation, BERT fine-tuning), or production frameworks.

There are 26 transformer architecture education frameworks tracked. The highest-rated is lvapeab/nmt-keras at 44/100 with 531 stars.

Get all 26 projects as JSON

```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=transformer-architecture-education&limit=20"
```

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
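As a minimal sketch, the same query can be assembled programmatically before fetching. The endpoint and query parameters below come from the curl example above; the helper function name is hypothetical.

```python
from urllib.parse import urlencode

# Base endpoint taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def build_query_url(domain: str, subcategory: str, limit: int = 20) -> str:
    """Build the dataset-quality query URL (hypothetical helper)."""
    params = urlencode({
        "domain": domain,
        "subcategory": subcategory,
        "limit": limit,
    })
    return f"{BASE}?{params}"

url = build_query_url("ml-frameworks", "transformer-architecture-education")
print(url)
# To actually fetch the JSON: urllib.request.urlopen(url).read()
```

Building the URL with `urlencode` keeps parameter values safely escaped if you later query domains or subcategories containing special characters.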

| # | Framework | Description | Score | Tier |
|---|-----------|-------------|-------|------|
| 1 | lvapeab/nmt-keras | Neural Machine Translation with Keras | 44 | Emerging |
| 2 | jaketae/ensemble-transformers | Ensembling Hugging Face transformers made easy | 39 | Emerging |
| 3 | dair-ai/Transformers-Recipe | 🧠 A study guide to learn about Transformers | 39 | Emerging |
| 4 | SirawitC/Transformer_from_scratch_pytorch | Build a transformer model from scratch using PyTorch to understand its inner... | 37 | Emerging |
| 5 | lof310/transformer | PyTorch implementation of the current SOTA Transformer. Configurable,... | 37 | Emerging |
| 6 | submarat/removing-layer-norm | Transformers Don't Need LayerNorm at Inference Time | 26 | Experimental |
| 7 | jiangtaoxie/SoT | SoT: Delving Deeper into Classification Head for Transformer | 22 | Experimental |
| 8 | balhafni/personalized-gen | Code, models, and data for "Personalized Text Generation with Fine-Grained... | 22 | Experimental |
| 9 | L-Zhe/FasySeq | A fast and easy implementation of Transformer with PyTorch. | 22 | Experimental |
| 10 | anmolg1997/SLM-From-Scratch | Build small language models from scratch: BPE tokenizer, composable... | 22 | Experimental |
| 11 | leeway0507/Transformer_from_scratch | Explanation of how to implement and train a Transformer | 20 | Experimental |
| 12 | dianjiang75/Transformer | A decoder-only Transformer built entirely from scratch in PyTorch. Trained... | 19 | Experimental |
| 13 | Ayush-Aditya/decoder-only-seq2seq | Minimal decoder-only seq2seq pipeline with proper causal masking, teacher... | 15 | Experimental |
| 14 | Harsha-hue/visual-transformer-guide | A visual guide explaining how Transformers work. Tokenization... | 15 | Experimental |
| 15 | cosimo17/transformer_notebook | Transformer tutorial | 15 | Experimental |
| 16 | Banniesdread/decoder-only-seq2seq | Implement a decoder-only Transformer in PyTorch to reverse character... | 14 | Experimental |
| 17 | Joe-Naz01/encoder-decoder | This PyTorch notebook implements a complete Transformer architecture from... | 14 | Experimental |
| 18 | driellecristine/BERT-Contrastive-LoRA | Enhance BERT fine-tuning for intent classification using supervised... | 14 | Experimental |
| 19 | msi1427/Original-Transformer-for-Bengali-Translation | A neural machine translation project for Bengali translation where the... | 13 | Experimental |
| 20 | MyDarapy/gpt-1-from-scratch | Rewriting and pretraining GPT-1 from scratch. Implementing Multihead... | 13 | Experimental |
| 21 | mfarisadip/Multi-X-Transformers | A neural network based on the encoder-decoder architecture the modeling... | 13 | Experimental |
| 22 | wj-Mcat/transformer-handbook | Transformer-related blogs and code | 12 | Experimental |
| 23 | ankushhKapoor/transformer-from-scratch | Transformer from scratch implementation in PyTorch for Neural Machine... | 11 | Experimental |
| 24 | Xachchchch/deberta-fine-tune-comparison | Experimenting with LoRA vs. head-only fine-tuning for DeBERTa on sentiment analysis | 11 | Experimental |
| 25 | dino65-dev/Transformers | Transformers from scratch, implementing GQA, RoPE, and RMSNorm | 11 | Experimental |
| 26 | konodiodaaaaa1/PyTorch-Transformer-From-Scratch | A numerically stable implementation of Transformer from scratch using PyTorch... | 10 | Experimental |