srush/LLM-Training-Puzzles

What would you do with 1000 H100s...

Overall score: 42 / 100 (Emerging)

A hands-on puzzle collection exploring distributed training primitives across thousands of GPUs, covering memory efficiency and compute pipelining strategies critical to large-scale model training. Each puzzle presents concrete challenges in multi-GPU coordination, requiring implementation of techniques like gradient accumulation, pipeline parallelism, and communication optimization. Designed for Colab execution with progressive difficulty, building on prior puzzle series covering GPU kernels, tensors, autodiff, and transformers.
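One of the techniques the puzzles cover, gradient accumulation, can be illustrated with a minimal framework-free sketch: per-microbatch gradients are averaged before a single optimizer step, simulating a large batch under a small memory budget. This is an illustrative toy (a 1-D least-squares fit with hypothetical helper names), not code from the puzzle set.

```python
# Toy gradient accumulation: average gradients over microbatches,
# then apply one optimizer step (as if the full batch fit in memory).

def grad(w, batch):
    # Gradient of mean squared error for the model y = w * x.
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def train_step(w, microbatches, lr=0.05):
    acc = 0.0
    for mb in microbatches:
        # Accumulate, scaling so the result equals the full-batch average.
        acc += grad(w, mb) / len(microbatches)
    return w - lr * acc  # single update with the accumulated gradient

w = 0.0
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # y = 2x
microbatches = [data[:2], data[2:]]
for _ in range(100):
    w = train_step(w, microbatches)
print(round(w, 3))  # converges toward 2.0
```

Splitting the batch in two here changes nothing numerically; the point is that each microbatch's gradient can be computed (and its activations freed) before the next one is processed.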

1,157 stars. No commits in the last 6 months.

Status: Stale (6 months) · No Package · No Dependents

Score breakdown:
- Maintenance: 0 / 25
- Adoption: 10 / 25
- Maturity: 16 / 25
- Community: 16 / 25


Stars: 1,157
Forks: 72
Language: Jupyter Notebook
License: MIT
Last pushed: Jan 10, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/srush/LLM-Training-Puzzles"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
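The curl command above can be wrapped in a small client. A minimal Python sketch, assuming the endpoint returns JSON (the response field names are not documented here, so treat the parsed dict's schema as an assumption):

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-score endpoint URL for a GitHub repo."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and parse the quality record (schema assumed to be JSON)."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

print(quality_url("srush", "LLM-Training-Puzzles"))
```

Unauthenticated calls are rate-limited to 100 requests/day, so a client that polls many repos should cache responses or use a key.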