LLMs-from-scratch and llms-from-scratch-cn

These are complementary resources serving different language communities: the English PyTorch implementation is paired with a Chinese-language variant that extends coverage to multiple architectures (GLM4, Llama3, RWKV6), allowing practitioners to learn LLM construction in their preferred language while referencing the same foundational concepts.

LLMs-from-scratch — score 69 (Established)
Maintenance 20/25 · Adoption 10/25 · Maturity 16/25 · Community 23/25
Stars: 87,892 · Forks: 13,408 · Commits (30d): 8 · Language: Jupyter Notebook
No Package · No Dependents

llms-from-scratch-cn — score 48 (Emerging)
Maintenance 0/25 · Adoption 10/25 · Maturity 16/25 · Community 22/25
Stars: 4,010 · Forks: 552 · Commits (30d): 0 · Language: Jupyter Notebook
Stale 6m · No Package · No Dependents

About LLMs-from-scratch

rasbt/LLMs-from-scratch

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Covers the complete pipeline from tokenization and attention mechanisms through pretraining on unlabeled data and finetuning for classification and instruction-following tasks. Includes practical implementations of multi-head attention, causal masking, and parameter-efficient techniques like LoRA, alongside code for loading pretrained model weights. Organized as Jupyter notebooks and standalone Python scripts that progressively build a functional GPT architecture while explaining each component's role in modern LLM training.
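To illustrate the causal masking idea the book builds up to, here is a minimal numpy sketch of masked (causal) self-attention; it is a simplified stand-in for the repo's PyTorch implementation, not its actual code.

```python
import numpy as np

def causal_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v: (seq_len, d) arrays. Each position may only attend to
    itself and earlier positions, as in GPT-style decoders.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                      # (seq_len, seq_len)
    # Mask out future positions (strict upper triangle) with -inf
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    # Numerically stable softmax over each row
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Because the first token can see only itself, its output row is exactly `v[0]`; later rows are convex combinations of past value vectors.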

About llms-from-scratch-cn

datawhalechina/llms-from-scratch-cn

Only Python basics required: build a large language model from scratch, progressively constructing GLM4, Llama3, and RWKV6 from zero to deeply understand how large models work.

Provides hands-on implementations of LLM core components in PyTorch, covering tokenization, attention mechanisms, and GPT-style architectures through progressive Jupyter notebooks with detailed theoretical explanations. Also includes architecture dissections and implementation guides for several production models (ChatGLM3/4, Llama3, RWKV V2-V6, MiniCPM), so learners can understand both foundational transformer mechanics and the design choices that distinguish different model families.
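The tokenization chapters in both repos typically start from byte-pair encoding. A single BPE merge step can be sketched in plain Python as below; this is an illustrative simplification, not the repos' actual tokenizer code.

```python
from collections import Counter

def bpe_merge_step(tokens):
    """Perform one BPE merge: fuse the most frequent adjacent token pair."""
    pairs = Counter(zip(tokens, tokens[1:]))
    if not pairs:
        return tokens
    (a, b), _ = pairs.most_common(1)[0]
    merged, i = [], 0
    while i < len(tokens):
        # Greedily replace each (a, b) occurrence with the merged token
        if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
            merged.append(a + b)
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged
```

Repeating this step while recording the chosen pairs yields the merge table a BPE tokenizer applies at inference time.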

Scores updated daily from GitHub, PyPI, and npm data.