Efficient-ML/Awesome-Model-Quantization

A list of papers, docs, and code on model quantization. This repo aims to provide information for model quantization research, and we are continuously improving the project. PRs adding works (papers, repositories) that the repo has missed are welcome.

55 / 100

Established

The repository organizes quantization research into benchmarks (BiBench, MQBench), comprehensive surveys of binarization and quantization methods, and chronologically indexed papers spanning 2015-2024. It covers specialized domains including binary neural networks, spiking neural networks, hardware deployment optimization, and recent advances in LLM and diffusion model quantization. The curated collection lets researchers track how methods have evolved while cross-referencing implementations, empirical comparisons, and deployment-specific solutions.

2,333 stars. Actively maintained with 6 commits in the last 30 days.

No License · No Package · No Dependents

Maintenance 17 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 20 / 25


Stars: 2,333

Forks: 232

Language: —

License: —

Related frameworks

Xilinx/brevitas

Brevitas: neural network quantization in PyTorch

fastmachinelearning/qonnx

QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX

open-mmlab/mmengine

OpenMMLab Foundational Library for Training Deep Learning Models

google/qkeras

QKeras: a quantization deep learning library for Tensorflow Keras

tensorflow/model-optimization

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization...
