Model Compression Optimization ML Frameworks

Tools and techniques for reducing neural network size and computational requirements through quantization, pruning, and compression. Does NOT include training acceleration, architecture search, or general model deployment frameworks.

This category tracks 74 model compression optimization frameworks, 3 of which score above 70 (Verified tier). The highest-rated is Xilinx/brevitas at 82/100, with 1,500 stars and 33,208 monthly downloads. 2 of the top 10 are actively maintained.

Get all 74 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=model-compression-optimization&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
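The same request can be made from a script. The sketch below rebuilds the URL from the curl command above and fetches it with Python's standard library. The helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented here, so the code only decodes it and makes no assumption about its fields. Note that the command passes `limit=20`; whether a larger value returns all 74 projects in one call is an assumption that would need checking against the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command in this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="ml-frameworks",
              subcategory="model-compression-optimization",
              limit=20):
    """Return the request URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL and decode the body as JSON.

    The response schema is not specified on this page, so the decoded
    object is returned unchanged for the caller to inspect.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_projects(build_url())` performs one request, which counts against the daily limit described above.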

#	Framework	Score	Tier	Stars	Language
1	Xilinx/brevitas Brevitas: neural network quantization in PyTorch	82	Verified	1,500	Python
2	fastmachinelearning/qonnx QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX	71	Verified	179	Python
3	open-mmlab/mmengine OpenMMLab Foundational Library for Training Deep Learning Models	71	Verified	1,456	Python
4	google/qkeras QKeras: a quantization deep learning library for Tensorflow Keras	68	Established	578	Python
5	tensorflow/model-optimization A toolkit to optimize ML models for deployment for Keras and TensorFlow,...	67	Established	1,565	Python
6	FasterAI-Labs/fasterai FasterAI: Prune and Distill your models with FastAI and PyTorch	64	Established	253	Jupyter Notebook
7	SonySemiconductorSolutions/mct-model-optimization Model Compression Toolkit (MCT) is an open source project for neural network...	61	Established	431	Python
8	lucidrains/vector-quantize-pytorch Vector (and Scalar) Quantization, in Pytorch	61	Established	3,878	Python
9	krasserm/perceiver-io A PyTorch implementation of Perceiver, Perceiver IO and Perceiver AR with...	57	Established	518	Python
10	QPT-Family/QPT [In closed beta] QPT - a Python-to-EXE tool dedicated to helping open-source projects reach the wider internet (Python packaging).	56	Established	795	Python
11	Efficient-ML/Awesome-Model-Quantization A list of papers, docs, codes about model quantization. This repo is aimed...	55	Established	2,333	—
12	happynear/AMSoftmax A simple yet effective loss function for face verification.	51	Established	491	Matlab
13	Eric-mingjie/network-slimming Network Slimming (Pytorch) (ICCV 2017)	51	Established	919	Python
14	Eric-mingjie/rethinking-network-pruning Rethinking the Value of Network Pruning (Pytorch) (ICLR 2019)	51	Established	1,516	Python
15	OpenPPL/ppq PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.	49	Emerging	1,788	Python
16	MingSun-Tse/Efficient-Deep-Learning Collection of recent methods on (deep) neural network compression and acceleration.	48	Emerging	954	—
17	foolwood/pytorch-slimming Learning Efficient Convolutional Networks through Network Slimming, In ICCV 2017.	48	Emerging	577	Python
18	jack-willturner/deep-compression Learning both Weights and Connections for Efficient Neural Networks...	47	Emerging	181	Jupyter Notebook
19	onnx/neural-compressor Model compression for ONNX	47	Emerging	99	Python
20	tianyic/only_train_once_personal_footprint OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured...	46	Emerging	310	Python
21	liuzhuang13/slimming Learning Efficient Convolutional Networks through Network Slimming, In ICCV 2017.	46	Emerging	576	Lua
22	google-research/rigl End-to-end training of sparse deep neural networks with little-to-no...	45	Emerging	335	Python
23	cedrickchee/awesome-ml-model-compression Awesome machine learning model compression research papers, quantization,...	44	Emerging	539	—
24	megvii-research/Sparsebit A model compression and acceleration toolbox based on pytorch.	43	Emerging	332	Python
25	lucaslie/torchprune A research library for pytorch-based neural network pruning, compression, and more.	43	Emerging	163	Shell
26	jacobgil/pytorch-pruning PyTorch Implementation of [1611.06440] Pruning Convolutional Neural Networks...	43	Emerging	887	Python
27	chester256/Model-Compression-Papers Papers for deep neural network compression and acceleration	41	Emerging	401	—
28	snap-research/F8Net [ICLR 2022 Oral] F8Net: Fixed-Point 8-bit Only Multiplication for Network...	41	Emerging	93	Python
29	apple/ml-upscale Export utility for unconstrained channel pruned models	40	Emerging	71	Jupyter Notebook
30	GoGoDuck912/pytorch-vector-quantization A Pytorch Implementations for Various Vector Quantization Methods	39	Emerging	36	Python
31	hkproj/quantization-notes Notes on quantization in neural networks	38	Emerging	121	Jupyter Notebook
32	skolai/fewbit Compression schema for gradients of activations in backward pass	37	Emerging	45	Python
33	jeshraghian/QSNNs Quantization-aware training with spiking neural networks	37	Emerging	53	Python
34	harvard-edge/QuaRL QuaRL is an open-source framework for systematically studying the effect of...	36	Emerging	82	Python
35	LeapLabTHU/EfficientTrain 1.5−3.0× lossless training or pre-training speedup. An off-the-shelf,...	35	Emerging	226	Python
36	fdbtrs/ElasticFace Official repository of CVPRW2022 paper, ElasticFace: Elastic Margin Loss for...	35	Emerging	175	Python
37	StijnVerdenius/SNIP-it This repository is the official implementation of the paper Pruning via...	34	Emerging	32	Python
38	EIDOSLAB/simplify Simplification of pruned models for accelerated inference | SoftwareX...	31	Emerging	36	Python
39	ciodar/deep-compression PyTorch Lightning implementation of the paper Deep Compression: Compressing...	31	Emerging	35	Jupyter Notebook
40	cybertronai/SutroYaro Sutro Group — Energy-Efficient AI Training Research. Sparse parity...	30	Emerging	5	Python
41	523333333/quantile_pooling Stacking Deep Set Networks and Pooling by Quantiles	29	Experimental	4	Jupyter Notebook
42	Eclipsess/CHIP_NeurIPS2021 Code for CHIP: CHannel Independence-based Pruning for Compact Neural...	29	Experimental	39	Python
43	mcmahon-lab/ONN-device-control Device control modules for an optical matrix-vector multiplier with a low...	28	Experimental	13	Jupyter Notebook
44	oluwafemidiakhoa/adaptive-sparse-training Adaptive Sparse Training (AST): 92.1% ImageNet-100 accuracy with 61% energy...	28	Experimental	5	Python
45	Chenqing-Lin/FAIR-Pruner Research-ready and production-friendly neural network pruning for...	27	Experimental	5	Python
46	QiaozheZhang/Global-One-shot-Pruning An official implementation of the paper "How Sparse Can We Prune A Deep...	25	Experimental	29	Python
47	archinetai/bitcodes-pytorch A vector quantization method with binary codes, in PyTorch.	25	Experimental	6	Python
48	iurada/px-ntk-pruning Official repository of our work "Finding Lottery Tickets in Vision Models...	24	Experimental	26	Python
49	kklemon/FlashPerceiver Fast and memory efficient PyTorch implementation of the Perceiver with...	24	Experimental	32	Python
50	Firmamento-Technologies/TurboQuant TurboQuant: Near-Optimal Vector Quantization for AI — Pure Python/NumPy...	23	Experimental	1	Python
51	mcmahon-lab/ONN-QAT-SQL Scripts for training neural networks resistant to photon shot noise with...	23	Experimental	19	Jupyter Notebook
52	AkliluYirgalem/live-quantization real-time model quantization directly in the browser	22	Experimental	29	CSS
53	shinymonitor/qmtik Quantized Model Training and Inference Kit	22	Experimental	3	C
54	GenauraApp/TurboQuant Near-optimal vector quantization with zero metadata overhead — PyTorch SDK...	22	Experimental	—	Python
55	Intelligent-Microsystems-Lab/SNNQuantPrune Code for the ISCAS23 paper "The Hardware Impact of Quantization and Pruning...	21	Experimental	11	Python
56	m4urin/quantized-liquid-state-machines A Liquid State Machine using quantized neurons that are operating on...	20	Experimental	15	Jupyter Notebook
57	ksm26/Quantization-in-Depth Dive into advanced quantization techniques. Learn to implement and customize...	20	Experimental	6	Jupyter Notebook
58	approx-ml/approx Automatic quantization library	20	Experimental	12	Python
59	Nikolai10/FSQ TensorFlow implementation of "Finite Scalar Quantization: VQ-VAE Made...	19	Experimental	21	Python
60	ZIB-IOL/SMS Code to reproduce the experiments of the ICLR24-paper: "Sparse Model Soups:...	18	Experimental	12	Python
61	iurada/talos-task-arithmetic Official repository of our work "Efficient Model Editing with Task-Localized...	17	Experimental	5	Python
62	camail-official/LinearAttentionPruning This is the official repository for the pre-print "The Key to State...	16	Experimental	9	Python
63	zanvari/resnet50-quantization Resnet50 Quantization for Inference Speedup in PyTorch	15	Experimental	22	Jupyter Notebook
64	yzamari/turboQuantPlayground TurboQuant (ICLR 2026) ported to Apple Silicon — KV cache compression with...	15	Experimental	1	Python
65	DataDarling/AI-Proposal-Model-Compression-for-Low-Carbon-Ecological-Image-Classification-on-Edge-Devices This paper proposes evaluating pruning and quantization techniques to reduce...	15	Experimental	1	—
66	upunaprosk/Awesome-LLM-Compression-Safety A curated list of papers, docs, and code on the undesired effects of model...	13	Experimental	2	—
67	jianhayes/NESTQUANT NestQuant: Post-Training Integer-Nesting Quantization for On-Device DNN...	12	Experimental	9	Python
68	erectbranch/Awesome-Activation-Sparsification A curated list of neural network activation sparsification resources.	12	Experimental	1	—
69	medoidai/model-quantization-blog-notebooks Notebook from "A Hands-On Walkthrough on Model Quantization" blog post.	12	Experimental	4	Jupyter Notebook
70	priyanshujiiii/awesome-Quantization In this repo you will understand .The process of reducing the precision of a...	11	Experimental	—	—
71	priyankkalgaonkar/CondenseNeXt An Ultra-Efficient Deep Neural Network for Embedded Systems	11	Experimental	—	Python
72	julianscher/gpt-adaprune An integrated PyTorch pipeline for pretraining GPT-2 on linear regression...	11	Experimental	—	Python
73	chadHGY/awesome-deep-model-compression Awesome Deep Model Compression	11	Experimental	2	—
74	Mainframework/Quanta Convert and quantize llm models	10	Experimental	3	Python
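
The Tier column in the table above appears to follow the score. The sketch below shows one mapping that is consistent with every row listed. Only the "above 70" cutoff for Verified is stated on this page; the cutoffs of 50 and 30 are assumptions inferred from the rows (the table has no score of exactly 50 or 70, so the true boundaries could differ), and the function name `tier_for_score` is hypothetical.

```python
def tier_for_score(score: int) -> str:
    """Map a 0-100 quality score to a tier label.

    Cutoffs: "above 70" for Verified comes from the page text.
    The 50 and 30 cutoffs are ASSUMED from the listed rows,
    not taken from any documented scoring rule.
    """
    if score > 70:
        return "Verified"
    if score >= 50:   # assumed boundary
        return "Established"
    if score >= 30:   # assumed boundary
        return "Emerging"
    return "Experimental"
```

For example, `tier_for_score(82)` gives "Verified" (Xilinx/brevitas) and `tier_for_score(49)` gives "Emerging" (OpenPPL/ppq), matching the table.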

Comparisons in this category

model-optimization and mct-model-optimization (67 vs 61)
model-optimization and neural-compressor (67 vs 47)
network-slimming and pytorch-slimming (51 vs 48)