srush/LLM-Training-Puzzles

What would you do with 1000 H100s...

Overall score: 42 / 100 (Emerging)

A hands-on puzzle collection exploring distributed training primitives across thousands of GPUs, covering memory efficiency and compute pipelining strategies critical to large-scale model training. Each puzzle presents concrete challenges in multi-GPU coordination, requiring implementation of techniques like gradient accumulation, pipeline parallelism, and communication optimization. Designed for Colab execution with progressive difficulty, building on prior puzzle series covering GPU kernels, tensors, autodiff, and transformers.
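One of the techniques the puzzles cover, gradient accumulation, can be illustrated with a minimal framework-free sketch: per-microbatch gradients are averaged before a single optimizer step, simulating a large batch under a small memory budget. This is an illustrative toy (a 1-D least-squares fit with hypothetical helper names), not code from the puzzle set.

```python
# Toy gradient accumulation: average gradients over microbatches,
# then apply one optimizer step (as if the full batch fit in memory).

def grad(w, batch):
    # Gradient of mean squared error for the model y = w * x.
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def train_step(w, microbatches, lr=0.05):
    acc = 0.0
    for mb in microbatches:
        # Accumulate, scaling so the result equals the full-batch average.
        acc += grad(w, mb) / len(microbatches)
    return w - lr * acc  # single update with the accumulated gradient

w = 0.0
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # y = 2x
microbatches = [data[:2], data[2:]]
for _ in range(100):
    w = train_step(w, microbatches)
print(round(w, 3))  # converges toward 2.0
```

Splitting the batch in two here changes nothing numerically; the point is that each microbatch's gradient can be computed (and its activations freed) before the next one is processed.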

1,157 stars. No commits in the last 6 months.

Status: Stale (6 months) · No Package · No Dependents

Score breakdown:
- Maintenance: 0 / 25
- Adoption: 10 / 25
- Maturity: 16 / 25
- Community: 16 / 25


Stars: 1,157
Forks: 72
Language: Jupyter Notebook
License: MIT
Last pushed: Jan 10, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/srush/LLM-Training-Puzzles"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
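The curl command above can be wrapped in a small client. A minimal Python sketch, assuming the endpoint returns JSON (the response field names are not documented here, so treat the parsed dict's schema as an assumption):

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-score endpoint URL for a GitHub repo."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and parse the quality record (schema assumed to be JSON)."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

print(quality_url("srush", "LLM-Training-Puzzles"))
```

Unauthenticated calls are rate-limited to 100 requests/day, so a client that polls many repos should cache responses or use a key.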