LLM Quantization Techniques for Transformer Models

This dataset tracks 13 LLM quantization projects. 2 score above 70 (the verified tier). The highest-rated is intel/neural-compressor at 90/100, with 2,597 stars and 27,671 monthly downloads. 2 of the top 10 are actively maintained.

Get all 13 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=llm-quantization-techniques&limit=20"

The endpoint is open to everyone at 100 requests/day with no key; a free key raises the limit to 1,000 requests/day.
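The same query can be issued from Python. This is a minimal sketch using only the standard library; the URL parameters mirror the curl example above, while the `fetch_projects` helper and the assumption that the endpoint returns a JSON body are illustrative, not a documented client API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def build_quality_url(domain: str, subcategory: str, limit: int = 20) -> str:
    """Build the dataset query URL from the documented parameters."""
    params = {"domain": domain, "subcategory": subcategory, "limit": limit}
    return f"{BASE_URL}?{urlencode(params)}"

def fetch_projects(domain: str, subcategory: str, limit: int = 20):
    """Fetch and decode the JSON response (response shape is an assumption)."""
    with urlopen(build_quality_url(domain, subcategory, limit)) as resp:
        return json.load(resp)
```

For example, `build_quality_url("transformers", "llm-quantization-techniques")` reproduces the curl URL above exactly.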

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | intel/neural-compressor | SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity;... | 90 | Verified |
| 2 | bitsandbytes-foundation/bitsandbytes | Accessible large language models via k-bit quantization for PyTorch. | 90 | Verified |
| 3 | dropbox/hqq | Official implementation of Half-Quadratic Quantization (HQQ) | 47 | Emerging |
| 4 | OpenGVLab/OmniQuant | [ICLR2024 spotlight] OmniQuant is a simple and powerful quantization... | 42 | Emerging |
| 5 | VITA-Group/Q-GaLore | Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank... | 33 | Emerging |
| 6 | Hsu1023/DuQuant | [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation... | 33 | Emerging |
| 7 | taishan1994/LLM-Quantization | Notes summarizing LLM quantization. | 32 | Emerging |
| 8 | Aaronhuang-778/BiLLM | [ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs | 32 | Emerging |
| 9 | actypedef/ARCQuant | Code for the paper "ARCQuant: Boosting NVFP4 Quantization with Augmented... | 29 | Experimental |
| 10 | IST-DASLab/Quartet-II | Quartet II Official Code | 28 | Experimental |
| 11 | snu-mllab/GuidedQuant | Official PyTorch implementation of "GuidedQuant: Large Language Model... | 24 | Experimental |
| 12 | xvyaward/owq | Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization... | 22 | Experimental |
| 13 | NoakLiu/LLMEasyQuant | A Serving System for Distributed and Parallel LLM Quantization [Efficient ML System] | 14 | Experimental |
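To ground the INT8/INT4 terminology used throughout the table: at its core, k-bit quantization maps floating-point weights onto a small integer range and stores a scale factor to map them back. The sketch below shows symmetric absmax INT8 quantization in pure Python; it is a toy illustration of the general idea, not the actual implementation used by any of the libraries listed above.

```python
def quantize_absmax_int8(weights):
    """Symmetric absmax quantization: scale weights so the largest
    magnitude maps to 127, then round each weight to an integer."""
    absmax = max(abs(w) for w in weights)
    if absmax == 0.0:  # all-zero tensor: nothing to scale
        return [0] * len(weights), 1.0
    scale = absmax / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from integers and the scale."""
    return [v * scale for v in q]
```

The reconstruction error is bounded by half a quantization step (scale / 2) per weight; lower bit widths like INT4 shrink the integer range to [-7, 7], trading accuracy for memory, which is why the outlier-handling tricks in projects like DuQuant and OWQ matter.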