Model Compression Optimization ML Frameworks

Tools and techniques for reducing neural network size and computational requirements through quantization, pruning, and compression. Does NOT include training acceleration, architecture search, or general model deployment frameworks.

74 model compression optimization frameworks are tracked. 3 score above 70 (Verified tier). The highest-rated is Xilinx/brevitas at 82/100, with 1,500 stars and 33,208 monthly downloads. 2 of the top 10 are actively maintained.
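The relationship between score and tier can be sketched as a simple threshold lookup. The cut-offs below (70, 50, 30) are assumptions inferred from the scores and tiers shown in this listing, not documented rules, and `tier_for` is a hypothetical helper name. No listed project scores exactly 70 or 50, so the behavior at those two boundaries is a guess.

```python
def tier_for(score: int) -> str:
    """Map a 0-100 quality score to a tier name.

    Thresholds are ASSUMED from the listing: the lowest Verified score
    shown is 71, Established spans 51-68, Emerging spans 30-49, and
    Experimental is 29 and below.
    """
    if score >= 70:   # assumed boundary; the page only says "above 70"
        return "Verified"
    if score >= 50:   # assumed boundary; no project scores exactly 50
        return "Established"
    if score >= 30:   # a score of 30 is listed as Emerging
        return "Emerging"
    return "Experimental"


# Spot checks against rows that appear in the listing.
examples = {82: "Verified", 68: "Established", 49: "Emerging", 29: "Experimental"}
for score, expected in examples.items():
    assert tier_for(score) == expected
```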

Get all 74 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=model-compression-optimization&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
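For scripted use, the same request can be made from Python with only the standard library. This is a minimal sketch: the endpoint and query parameters are copied from the curl command above, while `build_url` and `fetch_projects` are hypothetical helper names. The response schema is not documented here, so the parsed JSON is returned unchanged. Whether a `limit` larger than 20 returns all 74 projects in one call is also an assumption.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="ml-frameworks",
              subcategory="model-compression-optimization",
              limit=20):
    """Build the query URL with the same parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and parse the JSON payload; its structure is not assumed."""
    with urllib.request.urlopen(build_url(limit=limit), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_projects()` performs one request, which counts against the 100 requests/day keyless quota, so caching the result locally is a reasonable choice when iterating on analysis code.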

# | Framework | Score | Tier
1 | Xilinx/brevitas | 82 | Verified
    Brevitas: neural network quantization in PyTorch
2 | fastmachinelearning/qonnx | 71 | Verified
    QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX
3 | open-mmlab/mmengine | 71 | Verified
    OpenMMLab Foundational Library for Training Deep Learning Models
4 | google/qkeras | 68 | Established
    QKeras: a quantization deep learning library for Tensorflow Keras
5 | tensorflow/model-optimization | 67 | Established
    A toolkit to optimize ML models for deployment for Keras and TensorFlow,...
6 | FasterAI-Labs/fasterai | 64 | Established
    FasterAI: Prune and Distill your models with FastAI and PyTorch
7 | SonySemiconductorSolutions/mct-model-optimization | 61 | Established
    Model Compression Toolkit (MCT) is an open source project for neural network...
8 | lucidrains/vector-quantize-pytorch | 61 | Established
    Vector (and Scalar) Quantization, in Pytorch
9 | krasserm/perceiver-io | 57 | Established
    A PyTorch implementation of Perceiver, Perceiver IO and Perceiver AR with...
10 | QPT-Family/QPT | 56 | Established
    [In closed beta] QPT - a Python-to-EXE tool (Python packaging) dedicated to helping open-source projects better reach the internet world.
11 | Efficient-ML/Awesome-Model-Quantization | 55 | Established
    A list of papers, docs, codes about model quantization. This repo is aimed...
12 | happynear/AMSoftmax | 51 | Established
    A simple yet effective loss function for face verification.
13 | Eric-mingjie/network-slimming | 51 | Established
    Network Slimming (Pytorch) (ICCV 2017)
14 | Eric-mingjie/rethinking-network-pruning | 51 | Established
    Rethinking the Value of Network Pruning (Pytorch) (ICLR 2019)
15 | OpenPPL/ppq | 49 | Emerging
    PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
16 | MingSun-Tse/Efficient-Deep-Learning | 48 | Emerging
    Collection of recent methods on (deep) neural network compression and acceleration.
17 | foolwood/pytorch-slimming | 48 | Emerging
    Learning Efficient Convolutional Networks through Network Slimming, In ICCV 2017.
18 | jack-willturner/deep-compression | 47 | Emerging
    Learning both Weights and Connections for Efficient Neural Networks...
19 | onnx/neural-compressor | 47 | Emerging
    Model compression for ONNX
20 | tianyic/only_train_once_personal_footprint | 46 | Emerging
    OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured...
21 | liuzhuang13/slimming | 46 | Emerging
    Learning Efficient Convolutional Networks through Network Slimming, In ICCV 2017.
22 | google-research/rigl | 45 | Emerging
    End-to-end training of sparse deep neural networks with little-to-no...
23 | cedrickchee/awesome-ml-model-compression | 44 | Emerging
    Awesome machine learning model compression research papers, quantization,...
24 | megvii-research/Sparsebit | 43 | Emerging
    A model compression and acceleration toolbox based on pytorch.
25 | lucaslie/torchprune | 43 | Emerging
    A research library for pytorch-based neural network pruning, compression, and more.
26 | jacobgil/pytorch-pruning | 43 | Emerging
    PyTorch Implementation of [1611.06440] Pruning Convolutional Neural Networks...
27 | chester256/Model-Compression-Papers | 41 | Emerging
    Papers for deep neural network compression and acceleration
28 | snap-research/F8Net | 41 | Emerging
    [ICLR 2022 Oral] F8Net: Fixed-Point 8-bit Only Multiplication for Network...
29 | apple/ml-upscale | 40 | Emerging
    Export utility for unconstrained channel pruned models
30 | GoGoDuck912/pytorch-vector-quantization | 39 | Emerging
    A Pytorch Implementations for Various Vector Quantization Methods
31 | hkproj/quantization-notes | 38 | Emerging
    Notes on quantization in neural networks
32 | skolai/fewbit | 37 | Emerging
    Compression schema for gradients of activations in backward pass
33 | jeshraghian/QSNNs | 37 | Emerging
    Quantization-aware training with spiking neural networks
34 | harvard-edge/QuaRL | 36 | Emerging
    QuaRL is an open-source framework for systematically studying the effect of...
35 | LeapLabTHU/EfficientTrain | 35 | Emerging
    1.5−3.0× lossless training or pre-training speedup. An off-the-shelf,...
36 | fdbtrs/ElasticFace | 35 | Emerging
    Official repository of CVPRW2022 paper, ElasticFace: Elastic Margin Loss for...
37 | StijnVerdenius/SNIP-it | 34 | Emerging
    This repository is the official implementation of the paper Pruning via...
38 | EIDOSLAB/simplify | 31 | Emerging
    Simplification of pruned models for accelerated inference | SoftwareX...
39 | ciodar/deep-compression | 31 | Emerging
    PyTorch Lightning implementation of the paper Deep Compression: Compressing...
40 | cybertronai/SutroYaro | 30 | Emerging
    Sutro Group - Energy-Efficient AI Training Research. Sparse parity...
41 | 523333333/quantile_pooling | 29 | Experimental
    Stacking Deep Set Networks and Pooling by Quantiles
42 | Eclipsess/CHIP_NeurIPS2021 | 29 | Experimental
    Code for CHIP: CHannel Independence-based Pruning for Compact Neural...
43 | mcmahon-lab/ONN-device-control | 28 | Experimental
    Device control modules for an optical matrix-vector multiplier with a low...
44 | oluwafemidiakhoa/adaptive-sparse-training | 28 | Experimental
    Adaptive Sparse Training (AST): 92.1% ImageNet-100 accuracy with 61% energy...
45 | Chenqing-Lin/FAIR-Pruner | 27 | Experimental
    Research-ready and production-friendly neural network pruning for...
46 | QiaozheZhang/Global-One-shot-Pruning | 25 | Experimental
    An official implementation of the paper "How Sparse Can We Prune A Deep...
47 | archinetai/bitcodes-pytorch | 25 | Experimental
    A vector quantization method with binary codes, in PyTorch.
48 | iurada/px-ntk-pruning | 24 | Experimental
    Official repository of our work "Finding Lottery Tickets in Vision Models...
49 | kklemon/FlashPerceiver | 24 | Experimental
    Fast and memory efficient PyTorch implementation of the Perceiver with...
50 | Firmamento-Technologies/TurboQuant | 23 | Experimental
    TurboQuant: Near-Optimal Vector Quantization for AI - Pure Python/NumPy...
51 | mcmahon-lab/ONN-QAT-SQL | 23 | Experimental
    Scripts for training neural networks resistant to photon shot noise with...
52 | AkliluYirgalem/live-quantization | 22 | Experimental
    real-time model quantization directly in the browser
53 | shinymonitor/qmtik | 22 | Experimental
    Quantized Model Training and Inference Kit
54 | GenauraApp/TurboQuant | 22 | Experimental
    Near-optimal vector quantization with zero metadata overhead - PyTorch SDK...
55 | Intelligent-Microsystems-Lab/SNNQuantPrune | 21 | Experimental
    Code for the ISCAS23 paper "The Hardware Impact of Quantization and Pruning...
56 | m4urin/quantized-liquid-state-machines | 20 | Experimental
    A Liquid State Machine using quantized neurons that are operating on...
57 | ksm26/Quantization-in-Depth | 20 | Experimental
    Dive into advanced quantization techniques. Learn to implement and customize...
58 | approx-ml/approx | 20 | Experimental
    Automatic quantization library
59 | Nikolai10/FSQ | 19 | Experimental
    TensorFlow implementation of "Finite Scalar Quantization: VQ-VAE Made...
60 | ZIB-IOL/SMS | 18 | Experimental
    Code to reproduce the experiments of the ICLR24-paper: "Sparse Model Soups:...
61 | iurada/talos-task-arithmetic | 17 | Experimental
    Official repository of our work "Efficient Model Editing with Task-Localized...
62 | camail-official/LinearAttentionPruning | 16 | Experimental
    This is the official repository for the pre-print "The Key to State...
63 | zanvari/resnet50-quantization | 15 | Experimental
    Resnet50 Quantization for Inference Speedup in PyTorch
64 | yzamari/turboQuantPlayground | 15 | Experimental
    TurboQuant (ICLR 2026) ported to Apple Silicon - KV cache compression with...
65 | DataDarling/AI-Proposal-Model-Compression-for-Low-Carbon-Ecological-Image-Classification-on-Edge-Devices | 15 | Experimental
    This paper proposes evaluating pruning and quantization techniques to reduce...
66 | upunaprosk/Awesome-LLM-Compression-Safety | 13 | Experimental
    A curated list of papers, docs, and code on the undesired effects of model...
67 | jianhayes/NESTQUANT | 12 | Experimental
    NestQuant: Post-Training Integer-Nesting Quantization for On-Device DNN...
68 | erectbranch/Awesome-Activation-Sparsification | 12 | Experimental
    A curated list of neural network activation sparsification resources.
69 | medoidai/model-quantization-blog-notebooks | 12 | Experimental
    Notebook from "A Hands-On Walkthrough on Model Quantization" blog post.
70 | priyanshujiiii/awesome-Quantization | 11 | Experimental
    In this repo you will understand the process of reducing the precision of a...
71 | priyankkalgaonkar/CondenseNeXt | 11 | Experimental
    An Ultra-Efficient Deep Neural Network for Embedded Systems
72 | julianscher/gpt-adaprune | 11 | Experimental
    An integrated PyTorch pipeline for pretraining GPT-2 on linear regression...
73 | chadHGY/awesome-deep-model-compression | 11 | Experimental
    Awesome Deep Model Compression
74 | Mainframework/Quanta | 10 | Experimental
    Convert and quantize LLM models