OpenPPL/ppq
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
PPQ supports extensible quantization through 27 composable optimization passes and a custom execution engine that natively handles 99 ONNX operators, so bit width and granularity can be controlled per operator and per tensor. It integrates with more than 10 inference frameworks, including TensorRT, OpenPPL, OpenVINO, NCNN, and MNN, and provides hardware-specific quantization strategies and quantization-aware training (QAT). It also features FP8 quantization (E4M3 and E5M2 formats), graph fusion, pattern matching, and bias correction for low-latency edge deployment.
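To make the idea of quantization granularity concrete, here is a minimal sketch in plain Python. It does not use PPQ and is not PPQ's implementation; the function names (`symmetric_scale`, `fake_quantize`, `quantize_per_tensor`, `quantize_per_channel`) are hypothetical and chosen only for illustration. It assumes a symmetric linear quantizer and compares one shared scale for a whole weight matrix against one scale per row (channel).

```python
def symmetric_scale(values, num_bits=8):
    """Scale that maps the largest magnitude onto the largest signed integer."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    return max_abs / qmax if max_abs > 0 else 1.0


def fake_quantize(values, scale, num_bits=8):
    """Quantize to integers, clamp to the signed range, then dequantize."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def quantize_per_tensor(weights, num_bits=8):
    """One scale shared by every element of the matrix."""
    flat = [v for row in weights for v in row]
    scale = symmetric_scale(flat, num_bits)
    return [fake_quantize(row, scale, num_bits) for row in weights]


def quantize_per_channel(weights, num_bits=8):
    """One scale per row, so small-valued rows keep their resolution."""
    return [fake_quantize(row, symmetric_scale(row, num_bits), num_bits)
            for row in weights]


def max_abs_error(original, quantized):
    """Largest absolute difference between two equally long lists."""
    return max(abs(a - b) for a, b in zip(original, quantized))


# Two channels whose value ranges differ by a factor of 100.
weights = [[0.01, -0.02, 0.015], [1.0, -2.0, 1.5]]

# Error on the small-valued channel under each granularity.
tensor_err = max_abs_error(weights[0], quantize_per_tensor(weights)[0])
channel_err = max_abs_error(weights[0], quantize_per_channel(weights)[0])
print(tensor_err, channel_err)
```

With a shared scale, the small-valued channel is rounded on a grid sized for the large channel, so its error is much larger than with a per-channel scale. This is the kind of trade-off that per-operator and per-tensor control over granularity is meant to expose.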
1,788 stars. No commits in the last 6 months.
Stars: 1,788
Forks: 274
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 28, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/OpenPPL/ppq"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
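The same request can be sketched in Python using only the standard library. This is a hedged example: the helper names `quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path pattern is inferred from the single curl example above, and the assumption that the response body is JSON is not confirmed by this page.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; everything after it is assumed
# to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the endpoint and decode the body (JSON is an assumption)."""
    url = quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "OpenPPL", "ppq")` would issue the same GET request as the curl command. Unauthenticated calls count against the 100 requests/day limit, so caching the result locally is a sensible default.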
Higher-rated alternatives
Xilinx/brevitas
Brevitas: neural network quantization in PyTorch
fastmachinelearning/qonnx
QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX
open-mmlab/mmengine
OpenMMLab Foundational Library for Training Deep Learning Models
google/qkeras
QKeras: a quantization deep learning library for Tensorflow Keras
tensorflow/model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization...