LLM Quantization Techniques for Transformer Models
This dataset tracks 13 LLM quantization projects for transformer models. Two score above 70 (the verified tier). The highest-rated is intel/neural-compressor at 90/100, with 2,597 stars and 27,671 monthly downloads. Two of the top 10 are actively maintained.
Get all 13 projects as JSON:

```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=llm-quantization-techniques&limit=20"
```
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
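The same query can be composed from Python using only the standard library. This is a minimal sketch: the endpoint and query parameters come from the curl example above, while the fetch itself is left commented out so the snippet stays offline-safe, and the response schema is not assumed here.

```python
from urllib.parse import urlencode

# Endpoint from the curl example above
BASE = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def build_url(domain: str, subcategory: str, limit: int = 20) -> str:
    """Compose the query string exactly as in the curl example."""
    params = {"domain": domain, "subcategory": subcategory, "limit": limit}
    return f"{BASE}?{urlencode(params)}"

url = build_url("transformers", "llm-quantization-techniques")
print(url)

# To actually fetch (requires network access; response fields are
# whatever the API returns -- inspect the payload before relying on keys):
# import json
# from urllib.request import urlopen
# with urlopen(url) as resp:
#     data = json.load(resp)
```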
| # | Model | Description | Tier |
|---|---|---|---|
| 1 | intel/neural-compressor | SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity;... | Verified |
| 2 | bitsandbytes-foundation/bitsandbytes | Accessible large language models via k-bit quantization for PyTorch. | Verified |
| 3 | dropbox/hqq | Official implementation of Half-Quadratic Quantization (HQQ) | Emerging |
| 4 | OpenGVLab/OmniQuant | [ICLR2024 spotlight] OmniQuant is a simple and powerful quantization... | Emerging |
| 5 | VITA-Group/Q-GaLore | Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank... | Emerging |
| 6 | Hsu1023/DuQuant | [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation... | Emerging |
| 7 | taishan1994/LLM-Quantization | Notes summarizing LLM quantization. | Emerging |
| 8 | Aaronhuang-778/BiLLM | [ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs | Emerging |
| 9 | actypedef/ARCQuant | Code for the paper "ARCQuant: Boosting NVFP4 Quantization with Augmented... | Experimental |
| 10 | IST-DASLab/Quartet-II | Quartet II Official Code | Experimental |
| 11 | snu-mllab/GuidedQuant | Official PyTorch implementation of "GuidedQuant: Large Language Model... | Experimental |
| 12 | xvyaward/owq | Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization... | Experimental |
| 13 | NoakLiu/LLMEasyQuant | A Serving System for Distributed and Parallel LLM Quantization [Efficient ML System] | Experimental |