hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
Implements distributed training through tensor, pipeline, and data parallelism, with ZeRO optimizer variants that partition model state across GPUs, enabling training of 70B+ parameter models on commodity clusters. Integrates with the PyTorch and Hugging Face ecosystems, supporting mixed-precision training (including FP8) and memory-optimized inference for LLMs, video generation models, and protein folding workloads.
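A minimal sketch of the training setup described above, using ColossalAI's Booster API. The plugin arguments and launch call are assumptions that may differ across releases; run under torchrun so rank and world size are set.

import torch
import torch.nn as nn

import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import HybridParallelPlugin

# Launch with e.g.: torchrun --nproc_per_node=2 train.py
colossalai.launch_from_torch()  # reads rank/world size from torchrun env vars

model = nn.Linear(1024, 1024)  # stand-in for a real model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.MSELoss()

# Combine tensor parallelism with ZeRO stage 1 over the data-parallel group;
# the sizes here are illustrative, not recommendations.
plugin = HybridParallelPlugin(tp_size=2, pp_size=1, zero_stage=1, precision="fp16")
booster = Booster(plugin=plugin)
model, optimizer, criterion, _, _ = booster.boost(model, optimizer, criterion)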
Available on PyPI.
Stars: 41,362
Forks: 4,526
Language: Python
License: Apache-2.0
Category: ml-frameworks
Last pushed: Mar 09, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/hpcaitech/ColossalAI"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
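The same request from Python, as a sketch: parsing the body as JSON is an assumption based on the endpoint, and no key header is shown because the authentication scheme is not documented here.

import requests

url = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/hpcaitech/ColossalAI"
resp = requests.get(url, timeout=10)
resp.raise_for_status()  # raises on 4xx/5xx, e.g. if the daily rate limit is hit
print(resp.json())       # assumed JSON payload; inspect the fields before relying on them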
Related frameworks
deepspeedai/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference...
horovod/horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
helmholtz-analytics/heat
Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python
bsc-wdc/dislib
The distributed computing library for Python, implemented using the PyCOMPSs programming model for HPC.
google/sedpack
Scalable and efficient data packing.