NVIDIA/Megatron-LM

Ongoing research training transformer models at scale

Score: 76 / 100 (Verified)

Provides composable, GPU-optimized building blocks for transformer training, including advanced parallelism strategies (tensor, pipeline, expert, and context parallelism), mixed-precision support (FP16, BF16, FP8, FP4), and custom training-pipeline construction. Achieves up to 47% Model FLOP Utilization (MFU) on H100 clusters while scaling from 2B- to 462B-parameter models. Integrates with Hugging Face via Megatron Bridge for checkpoint conversion and works with the NVIDIA NeMo framework for production-ready recipes.
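As a rough illustration of how those parallelism strategies are wired up, the lines below initialize Megatron-Core's model-parallel state for an 8-GPU job. This is a minimal sketch, assuming a torchrun launch and the megatron.core.parallel_state API of recent releases; the parallel sizes are illustrative, not a recommendation, and keyword names may vary by version.

import torch
from megatron.core import parallel_state

# Sketch only: assumes 8 GPUs launched via torchrun; sizes are illustrative.
torch.distributed.init_process_group(backend="nccl")
parallel_state.initialize_model_parallel(
    tensor_model_parallel_size=2,    # shard each layer's weight matrices across 2 GPUs
    pipeline_model_parallel_size=2,  # split the layer stack into 2 pipeline stages
)
# The leftover factor becomes data parallelism: 8 / (2 * 2) = 2-way.
# Expert and context parallelism take analogous keyword arguments in recent releases.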

15,633 stars. Actively maintained with 226 commits in the last 30 days.

No published package · No dependents

Maintenance  25 / 25
Adoption     10 / 25
Maturity     16 / 25
Community    25 / 25

The four subscores sum to the overall score: 25 + 10 + 16 + 25 = 76 / 100.

Stars          15,633
Forks          3,689
Language       Python
License
Last pushed    Mar 13, 2026
Commits (30d)  226

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/NVIDIA/Megatron-LM"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
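The same endpoint can be called from Python. This is a minimal sketch that fetches and prints the raw JSON; the response schema is not documented on this card, so the payload is printed for inspection rather than parsed into named fields.

import requests

url = "https://pt-edge.onrender.com/api/v1/quality/transformers/NVIDIA/Megatron-LM"
resp = requests.get(url, timeout=10)
resp.raise_for_status()  # surface 4xx/5xx errors (e.g. hitting the 100 requests/day limit)
print(resp.json())       # field names are not documented here, so inspect the raw payload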