Efficient-ML/Awesome-Model-Quantization
A list of papers, docs, and code on model quantization. This repo aims to provide information for model quantization research, and the project is continuously being improved. PRs adding works (papers, repositories) that the repo is missing are welcome.
The repository organizes quantization research into benchmarks (BiBench, MQBench), comprehensive surveys of binarization and quantization methods, and chronologically indexed papers spanning 2015-2024. It covers specialized domains including binary neural networks, spiking neural networks, hardware deployment optimization, and recent advances in LLM and diffusion model quantization. The curated collection lets researchers track how methods have evolved while cross-referencing implementations, empirical comparisons, and deployment-specific solutions.
2,333 stars. Actively maintained with 6 commits in the last 30 days.
Stars: 2,333
Forks: 232
Language: —
License: —
Category:
Last pushed: Jan 29, 2026
Commits (30d): 6
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Efficient-ML/Awesome-Model-Quantization"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
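The curl call above can also be made from a script. The following is a minimal Python sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the path layout is inferred from the single curl example, and the response is assumed to be JSON, which this page does not state.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example.

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    one example URL shown on the page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return "/".join([BASE_URL, *parts])


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for one repository without an API key.

    Assumption: the endpoint returns a JSON body. If it does not,
    json.loads raises ValueError and the raw text should be inspected.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return json.loads(body)
```

Calling `fetch_quality("ml-frameworks", "Efficient-ML", "Awesome-Model-Quantization")` issues the same request as the curl command. Keyless access is limited to 100 requests/day, so a script that polls many repositories may want to cache results.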
Related frameworks
Xilinx/brevitas
Brevitas: neural network quantization in PyTorch
fastmachinelearning/qonnx
QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX
open-mmlab/mmengine
OpenMMLab Foundational Library for Training Deep Learning Models
google/qkeras
QKeras: a quantization deep learning library for TensorFlow Keras
tensorflow/model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization...