Model Compression Optimization ML Frameworks
Tools and techniques for reducing neural network size and computational requirements through quantization, pruning, and compression. Does NOT include training acceleration, architecture search, or general model deployment frameworks.
74 model compression optimization frameworks are tracked. Three score above 70 (Verified tier). The highest-rated is Xilinx/brevitas at 82/100, with 1,500 stars and 33,208 monthly downloads. Two of the top 10 are actively maintained.
Get all 74 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=model-compression-optimization&limit=20"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
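The same request can be made from a script. The following is a minimal Python sketch using only the standard library. The endpoint and query parameters are copied from the curl command above; the helper names `build_url` and `fetch_projects` are hypothetical, the shape of the JSON response is not documented on this page, and whether the API accepts a `limit` larger than 20 is an assumption that would need checking.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20):
    """Build the query URL for the model-compression-optimization subcategory.

    `limit` defaults to 20 because that is the value in the curl example;
    larger values are an untested assumption.
    """
    params = {
        "domain": "ml-frameworks",
        "subcategory": "model-compression-optimization",
        "limit": limit,
    }
    return f"{BASE_URL}?{urllib.parse.urlencode(params)}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and decode the JSON payload.

    The payload structure is not described here, so it is returned as-is
    (whatever json.load produces) rather than unpacked into fields.
    """
    with urllib.request.urlopen(build_url(limit), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_projects()` performs one request against the daily quota, so it is worth inspecting the returned object once (for example with `json.dumps(data, indent=2)`) before writing code that depends on particular keys.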
| # | Framework | Description | Score | Tier |
|---|---|---|---|---|
| 1 | Xilinx/brevitas | Brevitas: neural network quantization in PyTorch | | Verified |
| 2 | fastmachinelearning/qonnx | QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX | | Verified |
| 3 | open-mmlab/mmengine | OpenMMLab Foundational Library for Training Deep Learning Models | | Verified |
| 4 | google/qkeras | QKeras: a quantization deep learning library for Tensorflow Keras | | Established |
| 5 | tensorflow/model-optimization | A toolkit to optimize ML models for deployment for Keras and TensorFlow,... | | Established |
| 6 | FasterAI-Labs/fasterai | FasterAI: Prune and Distill your models with FastAI and PyTorch | | Established |
| 7 | SonySemiconductorSolutions/mct-model-optimization | Model Compression Toolkit (MCT) is an open source project for neural network... | | Established |
| 8 | lucidrains/vector-quantize-pytorch | Vector (and Scalar) Quantization, in Pytorch | | Established |
| 9 | krasserm/perceiver-io | A PyTorch implementation of Perceiver, Perceiver IO and Perceiver AR with... | | Established |
| 10 | QPT-Family/QPT | [In closed beta] QPT: a Python to EXE tool (Python packaging) dedicated to helping open-source projects better reach the internet world. | | Established |
| 11 | Efficient-ML/Awesome-Model-Quantization | A list of papers, docs, codes about model quantization. This repo is aimed... | | Established |
| 12 | happynear/AMSoftmax | A simple yet effective loss function for face verification. | | Established |
| 13 | Eric-mingjie/network-slimming | Network Slimming (Pytorch) (ICCV 2017) | | Established |
| 14 | Eric-mingjie/rethinking-network-pruning | Rethinking the Value of Network Pruning (Pytorch) (ICLR 2019) | | Established |
| 15 | OpenPPL/ppq | PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool. | | Emerging |
| 16 | MingSun-Tse/Efficient-Deep-Learning | Collection of recent methods on (deep) neural network compression and acceleration. | | Emerging |
| 17 | foolwood/pytorch-slimming | Learning Efficient Convolutional Networks through Network Slimming, In ICCV 2017. | | Emerging |
| 18 | jack-willturner/deep-compression | Learning both Weights and Connections for Efficient Neural Networks... | | Emerging |
| 19 | onnx/neural-compressor | Model compression for ONNX | | Emerging |
| 20 | tianyic/only_train_once_personal_footprint | OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured... | | Emerging |
| 21 | liuzhuang13/slimming | Learning Efficient Convolutional Networks through Network Slimming, In ICCV 2017. | | Emerging |
| 22 | google-research/rigl | End-to-end training of sparse deep neural networks with little-to-no... | | Emerging |
| 23 | cedrickchee/awesome-ml-model-compression | Awesome machine learning model compression research papers, quantization,... | | Emerging |
| 24 | megvii-research/Sparsebit | A model compression and acceleration toolbox based on pytorch. | | Emerging |
| 25 | lucaslie/torchprune | A research library for pytorch-based neural network pruning, compression, and more. | | Emerging |
| 26 | jacobgil/pytorch-pruning | PyTorch Implementation of [1611.06440] Pruning Convolutional Neural Networks... | | Emerging |
| 27 | chester256/Model-Compression-Papers | Papers for deep neural network compression and acceleration | | Emerging |
| 28 | snap-research/F8Net | [ICLR 2022 Oral] F8Net: Fixed-Point 8-bit Only Multiplication for Network... | | Emerging |
| 29 | apple/ml-upscale | Export utility for unconstrained channel pruned models | | Emerging |
| 30 | GoGoDuck912/pytorch-vector-quantization | A Pytorch Implementations for Various Vector Quantization Methods | | Emerging |
| 31 | hkproj/quantization-notes | Notes on quantization in neural networks | | Emerging |
| 32 | skolai/fewbit | Compression schema for gradients of activations in backward pass | | Emerging |
| 33 | jeshraghian/QSNNs | Quantization-aware training with spiking neural networks | | Emerging |
| 34 | harvard-edge/QuaRL | QuaRL is an open-source framework for systematically studying the effect of... | | Emerging |
| 35 | LeapLabTHU/EfficientTrain | 1.5−3.0× lossless training or pre-training speedup. An off-the-shelf,... | | Emerging |
| 36 | fdbtrs/ElasticFace | Official repository of CVPRW2022 paper, ElasticFace: Elastic Margin Loss for... | | Emerging |
| 37 | StijnVerdenius/SNIP-it | This repository is the official implementation of the paper Pruning via... | | Emerging |
| 38 | EIDOSLAB/simplify | Simplification of pruned models for accelerated inference \| SoftwareX... | | Emerging |
| 39 | ciodar/deep-compression | PyTorch Lightning implementation of the paper Deep Compression: Compressing... | | Emerging |
| 40 | cybertronai/SutroYaro | Sutro Group: Energy-Efficient AI Training Research. Sparse parity... | | Emerging |
| 41 | 523333333/quantile_pooling | Stacking Deep Set Networks and Pooling by Quantiles | | Experimental |
| 42 | Eclipsess/CHIP_NeurIPS2021 | Code for CHIP: CHannel Independence-based Pruning for Compact Neural... | | Experimental |
| 43 | mcmahon-lab/ONN-device-control | Device control modules for an optical matrix-vector multiplier with a low... | | Experimental |
| 44 | oluwafemidiakhoa/adaptive-sparse-training | Adaptive Sparse Training (AST): 92.1% ImageNet-100 accuracy with 61% energy... | | Experimental |
| 45 | Chenqing-Lin/FAIR-Pruner | Research-ready and production-friendly neural network pruning for... | | Experimental |
| 46 | QiaozheZhang/Global-One-shot-Pruning | An official implementation of the paper "How Sparse Can We Prune A Deep... | | Experimental |
| 47 | archinetai/bitcodes-pytorch | A vector quantization method with binary codes, in PyTorch. | | Experimental |
| 48 | iurada/px-ntk-pruning | Official repository of our work "Finding Lottery Tickets in Vision Models... | | Experimental |
| 49 | kklemon/FlashPerceiver | Fast and memory efficient PyTorch implementation of the Perceiver with... | | Experimental |
| 50 | Firmamento-Technologies/TurboQuant | TurboQuant: Near-Optimal Vector Quantization for AI, Pure Python/NumPy... | | Experimental |
| 51 | mcmahon-lab/ONN-QAT-SQL | Scripts for training neural networks resistant to photon shot noise with... | | Experimental |
| 52 | AkliluYirgalem/live-quantization | real-time model quantization directly in the browser | | Experimental |
| 53 | shinymonitor/qmtik | Quantized Model Training and Inference Kit | | Experimental |
| 54 | GenauraApp/TurboQuant | Near-optimal vector quantization with zero metadata overhead, PyTorch SDK... | | Experimental |
| 55 | Intelligent-Microsystems-Lab/SNNQuantPrune | Code for the ISCAS23 paper "The Hardware Impact of Quantization and Pruning... | | Experimental |
| 56 | m4urin/quantized-liquid-state-machines | A Liquid State Machine using quantized neurons that are operating on... | | Experimental |
| 57 | ksm26/Quantization-in-Depth | Dive into advanced quantization techniques. Learn to implement and customize... | | Experimental |
| 58 | approx-ml/approx | Automatic quantization library | | Experimental |
| 59 | Nikolai10/FSQ | TensorFlow implementation of "Finite Scalar Quantization: VQ-VAE Made... | | Experimental |
| 60 | ZIB-IOL/SMS | Code to reproduce the experiments of the ICLR24-paper: "Sparse Model Soups:... | | Experimental |
| 61 | iurada/talos-task-arithmetic | Official repository of our work "Efficient Model Editing with Task-Localized... | | Experimental |
| 62 | camail-official/LinearAttentionPruning | This is the official repository for the pre-print "The Key to State... | | Experimental |
| 63 | zanvari/resnet50-quantization | Resnet50 Quantization for Inference Speedup in PyTorch | | Experimental |
| 64 | yzamari/turboQuantPlayground | TurboQuant (ICLR 2026) ported to Apple Silicon, KV cache compression with... | | Experimental |
| 65 | DataDarling/AI-Proposal-Model-Compression-for-Low-Carbon-Ecological-Image-Classification-on-Edge-Devices | This paper proposes evaluating pruning and quantization techniques to reduce... | | Experimental |
| 66 | upunaprosk/Awesome-LLM-Compression-Safety | A curated list of papers, docs, and code on the undesired effects of model... | | Experimental |
| 67 | jianhayes/NESTQUANT | NestQuant: Post-Training Integer-Nesting Quantization for On-Device DNN... | | Experimental |
| 68 | erectbranch/Awesome-Activation-Sparsification | A curated list of neural network activation sparsification resources. | | Experimental |
| 69 | medoidai/model-quantization-blog-notebooks | Notebook from "A Hands-On Walkthrough on Model Quantization" blog post. | | Experimental |
| 70 | priyanshujiiii/awesome-Quantization | In this repo you will understand .The process of reducing the precision of a... | | Experimental |
| 71 | priyankkalgaonkar/CondenseNeXt | An Ultra-Efficient Deep Neural Network for Embedded Systems | | Experimental |
| 72 | julianscher/gpt-adaprune | An integrated PyTorch pipeline for pretraining GPT-2 on linear regression... | | Experimental |
| 73 | chadHGY/awesome-deep-model-compression | Awesome Deep Model Compression | | Experimental |
| 74 | Mainframework/Quanta | Convert and quantize llm models | | Experimental |