pytorch/ao

PyTorch native quantization and sparsity for training and inference

Quality score: 74/100 (Verified)

Provides composable quantization techniques (int4/int8 weight-only, float8 dynamic, QAT) and structured sparsity methods (2:4 semi-structured, block sparsity) with optimized kernels via MSLK, enabling training speedups up to 1.5x and inference gains up to 2.37x. Integrates seamlessly with `torch.compile()`, FSDP2, and popular fine-tuning frameworks (Unsloth, Axolotl, HF Transformers), plus inference backends like vLLM and ExecuTorch for edge deployment.
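For concreteness, here is a minimal sketch of the weight-only int8 path composed with `torch.compile()`. `quantize_` and `int8_weight_only` come from `torchao.quantization`, but config names have shifted across torchao releases, so treat this as illustrative rather than the canonical recipe:

```python
import torch
import torch.nn as nn
from torchao.quantization import quantize_, int8_weight_only

# Toy model; weight-only quantization targets the nn.Linear layers.
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 1024))
model = model.to(torch.bfloat16).cuda()

# Swap each Linear weight for an int8 weight-only quantized tensor, in place.
quantize_(model, int8_weight_only())

# Compose with torch.compile() so the quantized matmuls can hit fused kernels.
model = torch.compile(model, mode="max-autotune")

x = torch.randn(16, 1024, dtype=torch.bfloat16, device="cuda")
with torch.no_grad():
    y = model(x)
```

The sparsity side is exposed through an analogous in-place entry point (`sparsify_` in `torchao.sparsity`), applying a semi-structured 2:4 layout to the same `nn.Linear` weights.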

2,729 stars. Actively maintained with 132 commits in the last 30 days.

No package published · No dependents

Score breakdown:

- Maintenance: 25/25
- Adoption: 10/25
- Maturity: 16/25
- Community: 23/25


| Metric | Value |
|--------|-------|
| Stars | 2,729 |
| Forks | 456 |
| Language | Python |
| License | |
| Last pushed | Mar 13, 2026 |
| Commits (30d) | 132 |

Get this data via API

`curl "https://pt-edge.onrender.com/api/v1/quality/transformers/pytorch/ao"`

Open to everyone: 100 requests/day with no key needed, or get a free key for 1,000 requests/day.
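
The same endpoint works from Python. A short sketch that simply pretty-prints whatever JSON comes back, since the response schema isn't documented here:

```python
import json
import requests

# Endpoint copied from the curl example above; no response field names are assumed.
url = "https://pt-edge.onrender.com/api/v1/quality/transformers/pytorch/ao"
resp = requests.get(url, timeout=10)
resp.raise_for_status()
print(json.dumps(resp.json(), indent=2))
```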