rasbt/LLMs-from-scratch

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

/ 100

Established

Covers the complete pipeline from tokenization and attention mechanisms through pretraining on unlabeled data and finetuning for classification and instruction-following tasks. Includes practical implementations of multi-head attention, causal masking, and parameter-efficient techniques like LoRA, alongside code for loading pretrained model weights. Organized as Jupyter notebooks and standalone Python scripts that progressively build a functional GPT architecture while explaining each component's role in modern LLM training.
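As a rough illustration of the causal masking mentioned above, here is a minimal sketch in plain Python. It is not code from the repository, which uses PyTorch. The function name `causal_attention_weights` and the example scores are hypothetical, chosen only for this illustration. Each position's attention weights are a softmax over itself and earlier positions, with later positions forced to zero.

```python
import math


def causal_attention_weights(scores):
    """Row-wise softmax over `scores` with future positions masked out.

    `scores` is a square list of lists; entry [i][j] is the raw attention
    score of query position i against key position j.
    """
    weights = []
    for i, row in enumerate(scores):
        # Position i may only attend to positions 0..i (itself and the past).
        visible = row[: i + 1]
        peak = max(visible)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in visible]
        total = sum(exps)
        # Future positions receive a weight of exactly 0.
        weights.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return weights


# Hypothetical 3-token score matrix, made up for illustration.
example_scores = [
    [0.5, 1.0, -0.3],
    [0.2, 0.2, 0.9],
    [1.5, -1.0, 0.0],
]
example_weights = causal_attention_weights(example_scores)
```

In a tensor library the same effect is usually achieved by filling the upper triangle of the score matrix with negative infinity before the softmax; the loop form here just makes the "no peeking at later tokens" rule explicit.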

87,892 stars. Actively maintained with 8 commits in the last 30 days.

No package. No dependents.

Maintenance 20 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 23 / 25


Stars: 87,892

Forks: 13,408

Language: Jupyter Notebook

License: —


Related models

datawhalechina/llms-from-scratch-cn

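Requires only basic Python to build a large language model from scratch; builds GLM4, Llama3, and RWKV6 step by step from scratch for a deep understanding of how large models work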

FareedKhan-dev/train-llm-from-scratch

A straightforward method for training your LLM, from downloading data to generating text.

facebookresearch/LayerSkip

Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024

kmeng01/rome

Locating and editing factual associations in GPT (NeurIPS 2022)

analyticalrohit/llms-from-scratch

Build a ChatGPT-like LLM from scratch in PyTorch, explained step by step.
