# Transformer Architecture Education (ML Frameworks)

Educational repositories focused on implementing transformer models from scratch to understand core components and mechanisms. Includes tutorials, explanations, and hands-on implementations of attention, positional encoding, and encoder-decoder structures. Does NOT include pre-trained model usage, applications (translation, BERT fine-tuning), or production frameworks.

There are 26 transformer architecture education frameworks tracked. The highest-rated is lvapeab/nmt-keras at 44/100 with 531 stars.

Get all 26 projects as JSON

```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=transformer-architecture-education&limit=20"
```

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
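As a minimal sketch, the same query can be assembled programmatically before fetching. The endpoint and query parameters below come from the curl example above; the helper function name is hypothetical.

```python
from urllib.parse import urlencode

# Base endpoint taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def build_query_url(domain: str, subcategory: str, limit: int = 20) -> str:
    """Build the dataset-quality query URL (hypothetical helper)."""
    params = urlencode({
        "domain": domain,
        "subcategory": subcategory,
        "limit": limit,
    })
    return f"{BASE}?{params}"

url = build_query_url("ml-frameworks", "transformer-architecture-education")
print(url)
# To actually fetch the JSON: urllib.request.urlopen(url).read()
```

Building the URL with `urlencode` keeps parameter values safely escaped if you later query domains or subcategories containing special characters.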

| # | Framework | Description | Score | Tier |
|---|-----------|-------------|-------|------|
| 1 | lvapeab/nmt-keras | Neural Machine Translation with Keras | 44 | Emerging |
| 2 | jaketae/ensemble-transformers | Ensembling Hugging Face transformers made easy | 39 | Emerging |
| 3 | dair-ai/Transformers-Recipe | 🧠 A study guide to learn about Transformers | 39 | Emerging |
| 4 | SirawitC/Transformer_from_scratch_pytorch | Build a transformer model from scratch using PyTorch to understand its inner... | 37 | Emerging |
| 5 | lof310/transformer | PyTorch implementation of the current SOTA Transformer. Configurable,... | 37 | Emerging |
| 6 | submarat/removing-layer-norm | Transformers Don't Need LayerNorm at Inference Time | 26 | Experimental |
| 7 | jiangtaoxie/SoT | SoT: Delving Deeper into Classification Head for Transformer | 22 | Experimental |
| 8 | balhafni/personalized-gen | Code, models, and data for "Personalized Text Generation with Fine-Grained... | 22 | Experimental |
| 9 | L-Zhe/FasySeq | A fast and easy implementation of Transformer with PyTorch. | 22 | Experimental |
| 10 | anmolg1997/SLM-From-Scratch | Build small language models from scratch: BPE tokenizer, composable... | 22 | Experimental |
| 11 | leeway0507/Transformer_from_scratch | Explanation of how to implement and train a Transformer | 20 | Experimental |
| 12 | dianjiang75/Transformer | A decoder-only Transformer built entirely from scratch in PyTorch. Trained... | 19 | Experimental |
| 13 | Ayush-Aditya/decoder-only-seq2seq | Minimal decoder-only seq2seq pipeline with proper causal masking, teacher... | 15 | Experimental |
| 14 | Harsha-hue/visual-transformer-guide | A visual guide explaining how Transformers work. Tokenization... | 15 | Experimental |
| 15 | cosimo17/transformer_notebook | Transformer tutorial | 15 | Experimental |
| 16 | Banniesdread/decoder-only-seq2seq | Implement a decoder-only Transformer in PyTorch to reverse character... | 14 | Experimental |
| 17 | Joe-Naz01/encoder-decoder | This PyTorch notebook implements a complete Transformer architecture from... | 14 | Experimental |
| 18 | driellecristine/BERT-Contrastive-LoRA | Enhance BERT fine-tuning for intent classification using supervised... | 14 | Experimental |
| 19 | msi1427/Original-Transformer-for-Bengali-Translation | A neural machine translation project for Bengali translation where the... | 13 | Experimental |
| 20 | MyDarapy/gpt-1-from-scratch | Rewriting and pretraining GPT-1 from scratch. Implementing Multihead... | 13 | Experimental |
| 21 | mfarisadip/Multi-X-Transformers | A neural network based on the encoder-decoder architecture the modeling... | 13 | Experimental |
| 22 | wj-Mcat/transformer-handbook | Transformer-related blogs and code | 12 | Experimental |
| 23 | ankushhKapoor/transformer-from-scratch | Transformer from scratch implementation in PyTorch for Neural Machine... | 11 | Experimental |
| 24 | Xachchchch/deberta-fine-tune-comparison | Experimenting with LoRA vs. head-only fine-tuning for DeBERTa on sentiment analysis | 11 | Experimental |
| 25 | dino65-dev/Transformers | Transformers from scratch, implementing GQA, RoPE, and RMSNorm | 11 | Experimental |
| 26 | konodiodaaaaa1/PyTorch-Transformer-From-Scratch | A numerically stable implementation of Transformer from scratch using PyTorch... | 10 | Experimental |