deepspeedai/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Implements ZeRO memory optimization (partitioning optimizer states, gradients, and parameters across devices), sequence parallelism, and mixture-of-experts techniques to enable training of hundred-billion-parameter models. Integrates seamlessly with PyTorch, Hugging Face Transformers, and Accelerate through a unified API, supporting both training and inference pipelines across multi-GPU and multi-node clusters.
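The ZeRO partitioning and the unified training API are easiest to see in code. Below is a minimal sketch of wrapping a PyTorch model with deepspeed.initialize using a ZeRO stage-2 config; the toy model, batch sizes, and config values are illustrative assumptions, not recommendations, and a CUDA device (plus the deepspeed launcher for multi-GPU runs) is assumed.

import torch
import deepspeed

# Toy model; any torch.nn.Module works here.
model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 1024),
)

# Illustrative config: ZeRO stage 2 partitions optimizer states and
# gradients across data-parallel ranks; stage 3 also partitions parameters.
ds_config = {
    "train_batch_size": 32,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
}

# deepspeed.initialize wraps the model in an engine that manages the
# distributed backward/step logic; launch with `deepspeed train.py`.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

inputs = torch.randn(4, 1024, device=model_engine.device, dtype=torch.half)
loss = model_engine(inputs).float().pow(2).mean()
model_engine.backward(loss)  # engine-managed backward (handles ZeRO comms)
model_engine.step()          # optimizer step plus gradient zeroing

The same engine object serves both single-GPU and multi-node runs; scaling out is a matter of launching more ranks, not changing the training loop.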
41,801 stars and 1,187,695 monthly downloads. Used by 24 other packages. Actively maintained with 32 commits in the last 30 days. Available on PyPI.
Stars: 41,801
Forks: 4,751
Language: Python
License: Apache-2.0
Category: ML frameworks
Last pushed: Mar 13, 2026
Monthly downloads: 1,187,695
Commits (30d): 32
Dependencies: 11
Reverse dependents: 24
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/deepspeedai/DeepSpeed"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
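The same stats can be pulled programmatically. Below is a minimal Python sketch assuming only the endpoint shown in the curl command above; the response schema isn't documented in this listing, so the example simply prints whatever JSON comes back.

import json
import requests

# Endpoint copied from the curl example above; no API key is
# needed for the free tier (100 requests/day per the note above).
URL = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/deepspeedai/DeepSpeed"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()

# Pretty-print the returned JSON, since the field names
# are not specified on this page.
print(json.dumps(resp.json(), indent=2))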
Related frameworks
horovod/horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
helmholtz-analytics/heat
Distributed tensors and machine learning framework with GPU and MPI acceleration in Python.
hpcaitech/ColossalAI
Making large AI models cheaper, faster, and more accessible.
bsc-wdc/dislib
A distributed computing library for Python, implemented using the PyCOMPSs programming model for HPC.
google/sedpack
Scalable and efficient data packing.