bitsandbytes-foundation/bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
Implements vector-wise and block-wise quantization strategies with specialized handling for outliers, enabling 8-bit inference without performance loss and 4-bit training via low-rank adaptation (LoRA). Provides drop-in `Linear8bitLt` and `Linear4bit` modules alongside 8-bit optimizers, integrating directly with Hugging Face Transformers, Diffusers, and PEFT. Supports NVIDIA/AMD/Intel GPUs, CPUs with AVX2+, and Apple Silicon across Linux, Windows, and macOS.
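To give a feel for the block-wise quantization idea the library is built around, here is a minimal pure-Python sketch of per-block absmax int8 quantization. This is an illustrative toy, not the library's actual CUDA kernels or API; function names and the block size are invented for the example.

```python
def quantize_blockwise(values, block_size=4):
    """Illustrative sketch: quantize floats to int8 per block via absmax scaling.

    Each block stores its own scale (absmax / 127), so a single large
    outlier only hurts precision within its own block, not the whole tensor.
    """
    quantized, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        absmax = max(abs(v) for v in block) or 1.0  # guard against all-zero blocks
        scale = absmax / 127.0
        scales.append(scale)
        quantized.extend(round(v / scale) for v in block)
    return quantized, scales


def dequantize_blockwise(quantized, scales, block_size=4):
    """Invert the sketch above: multiply each int8 value by its block's scale."""
    out = []
    for start in range(0, len(quantized), block_size):
        scale = scales[start // block_size]
        out.extend(q * scale for q in quantized[start:start + block_size])
    return out


vals = [0.1, -0.5, 2.0, 0.02, 10.0, -3.0, 0.7, 0.0]
q, s = quantize_blockwise(vals)
restored = dequantize_blockwise(q, s)
max_err = max(abs(a - b) for a, b in zip(vals, restored))
```

Because the second block contains the outlier 10.0, its scale is coarser than the first block's, but the error stays bounded by half a quantization step per block, which is the intuition behind the outlier handling described above.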
8,033 stars and 6,225,728 monthly downloads. Used by 73 other packages. Actively maintained with 17 commits in the last 30 days. Available on PyPI.
Stars: 8,033
Forks: 831
Language: Python
License: MIT
Last pushed: Mar 10, 2026
Monthly downloads: 6,225,728
Commits (30d): 17
Dependencies: 3
Reverse dependents: 73
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/bitsandbytes-foundation/bitsandbytes"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related repositories
intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model...
dropbox/hqq
Official implementation of Half-Quadratic Quantization (HQQ)
OpenGVLab/OmniQuant
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
VITA-Group/Q-GaLore
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
Hsu1023/DuQuant
[NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger...