datawhalechina/llms-from-scratch-cn
Build a large language model from scratch with only basic Python; step by step, construct GLM4, Llama3, and RWKV6 from the ground up and gain a deep understanding of how large models work.
# Technical Summary

Provides hands-on implementation of LLM core components using PyTorch, covering tokenization, attention mechanisms, and GPT-style architectures through progressive Jupyter notebooks alongside detailed theoretical explanations. Includes architecture dissections and implementation guides for multiple production models (ChatGLM3/4, Llama3, RWKV V2-V6, MiniCPM), enabling learners to understand both foundational transformer mechanics and variant design choices across different model families.
4,010 stars. No commits in the last 6 months.
- Stars: 4,010
- Forks: 552
- Language: Jupyter Notebook
- License: —
- Category:
- Last pushed: Aug 15, 2024
- Commits (last 30 days): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/datawhalechina/llms-from-scratch-cn"
Open to everyone: 100 requests/day with no key required; a free key raises the limit to 1,000/day.
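The curl example above can also be driven from Python's standard library. This is a minimal sketch: the endpoint path is taken from the curl command, while the helper function name and the assumption that the response is JSON are illustrative, not documented API details.

```python
# Sketch: building and (optionally) calling the quality API shown above.
# The endpoint path mirrors the curl example; quality_url is a hypothetical
# helper, and the JSON response format is an assumption.
import json
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    # Mirrors the path structure of the curl example: BASE/<owner>/<repo>
    return f"{BASE}/{owner}/{repo}"

url = quality_url("datawhalechina", "llms-from-scratch-cn")
print(url)
# To actually fetch (subject to the 100 requests/day keyless limit):
# data = json.loads(urlopen(url).read())
```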
# Higher-rated alternatives

- rasbt/LLMs-from-scratch: Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
- FareedKhan-dev/train-llm-from-scratch: A straightforward method for training your LLM, from downloading data to generating text
- facebookresearch/LayerSkip: Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024
- kmeng01/rome: Locating and editing factual associations in GPT (NeurIPS 2022)
- analyticalrohit/llms-from-scratch: Build a ChatGPT-like LLM from scratch in PyTorch, explained step by step