LLM Quantization Techniques for Transformer Models

This dataset tracks 13 LLM quantization projects. 2 score above 70 (the verified tier). The highest-rated is intel/neural-compressor at 90/100, with 2,597 stars and 27,671 monthly downloads. 2 of the top 10 are actively maintained.

Get all 13 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=llm-quantization-techniques&limit=20"

The endpoint is open to everyone at 100 requests/day with no key; a free key raises the limit to 1,000 requests/day.
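The same query can be issued from Python. This is a minimal sketch using only the standard library; the URL parameters mirror the curl example above, while the `fetch_projects` helper and the assumption that the endpoint returns a JSON body are illustrative, not a documented client API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def build_quality_url(domain: str, subcategory: str, limit: int = 20) -> str:
    """Build the dataset query URL from the documented parameters."""
    params = {"domain": domain, "subcategory": subcategory, "limit": limit}
    return f"{BASE_URL}?{urlencode(params)}"

def fetch_projects(domain: str, subcategory: str, limit: int = 20):
    """Fetch and decode the JSON response (response shape is an assumption)."""
    with urlopen(build_quality_url(domain, subcategory, limit)) as resp:
        return json.load(resp)
```

For example, `build_quality_url("transformers", "llm-quantization-techniques")` reproduces the curl URL above exactly.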

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | intel/neural-compressor | SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity;... | 90 | Verified |
| 2 | bitsandbytes-foundation/bitsandbytes | Accessible large language models via k-bit quantization for PyTorch. | 90 | Verified |
| 3 | dropbox/hqq | Official implementation of Half-Quadratic Quantization (HQQ) | 47 | Emerging |
| 4 | OpenGVLab/OmniQuant | [ICLR2024 spotlight] OmniQuant is a simple and powerful quantization... | 42 | Emerging |
| 5 | VITA-Group/Q-GaLore | Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank... | 33 | Emerging |
| 6 | Hsu1023/DuQuant | [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation... | 33 | Emerging |
| 7 | taishan1994/LLM-Quantization | Notes summarizing LLM quantization. | 32 | Emerging |
| 8 | Aaronhuang-778/BiLLM | [ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs | 32 | Emerging |
| 9 | actypedef/ARCQuant | Code for the paper "ARCQuant: Boosting NVFP4 Quantization with Augmented... | 29 | Experimental |
| 10 | IST-DASLab/Quartet-II | Quartet II Official Code | 28 | Experimental |
| 11 | snu-mllab/GuidedQuant | Official PyTorch implementation of "GuidedQuant: Large Language Model... | 24 | Experimental |
| 12 | xvyaward/owq | Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization... | 22 | Experimental |
| 13 | NoakLiu/LLMEasyQuant | A Serving System for Distributed and Parallel LLM Quantization [Efficient ML System] | 14 | Experimental |
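To ground the INT8/INT4 terminology used throughout the table: at its core, k-bit quantization maps floating-point weights onto a small integer range and stores a scale factor to map them back. The sketch below shows symmetric absmax INT8 quantization in pure Python; it is a toy illustration of the general idea, not the actual implementation used by any of the libraries listed above.

```python
def quantize_absmax_int8(weights):
    """Symmetric absmax quantization: scale weights so the largest
    magnitude maps to 127, then round each weight to an integer."""
    absmax = max(abs(w) for w in weights)
    if absmax == 0.0:  # all-zero tensor: nothing to scale
        return [0] * len(weights), 1.0
    scale = absmax / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from integers and the scale."""
    return [v * scale for v in q]
```

The reconstruction error is bounded by half a quantization step (scale / 2) per weight; lower bit widths like INT4 shrink the integer range to [-7, 7], trading accuracy for memory, which is why the outlier-handling tricks in projects like DuQuant and OWQ matter.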