SRafi007/Quantization-for-LLMs-An-Intuitive-Introduction
A beginner-friendly note explaining why and how quantization is used in large language models, covering the FP32, FP16, and INT8/INT4 formats, symmetric vs. asymmetric quantization, and basic scaling concepts in a simple, intuitive way.
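To make the symmetric vs. asymmetric distinction concrete, here is a minimal sketch of per-tensor INT8 quantization. All function names and the toy weight values are illustrative assumptions, not taken from the notebook itself:

```python
# Illustrative sketch of symmetric vs. asymmetric INT8 quantization
# (hypothetical helper names and toy values; not from the notebook).

def quantize_symmetric(xs, bits=8):
    """Map floats to signed ints with one scale; zero-point is fixed at 0."""
    qmax = 2 ** (bits - 1) - 1              # 127 for INT8
    scale = max(abs(x) for x in xs) / qmax  # one scale for the whole tensor
    q = [round(x / scale) for x in xs]
    return q, scale

def quantize_asymmetric(xs, bits=8):
    """Map the full [min, max] range onto unsigned ints via scale + zero-point."""
    qmax = 2 ** bits - 1                    # 255 for UINT8
    lo, hi = min(xs), max(xs)
    scale = (hi - lo) / qmax
    zero_point = round(-lo / scale)         # the integer code that represents 0.0
    q = [round(x / scale) + zero_point for x in xs]
    return q, scale, zero_point

def dequantize(q, scale, zero_point=0):
    """Recover approximate floats from the integer codes."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.2, -0.3, 0.0, 0.7, 2.4]       # toy FP32 weights
q_sym, s_sym = quantize_symmetric(weights)
q_asym, s_asym, zp = quantize_asymmetric(weights)
```

The symmetric scheme wastes range when the data is lopsided (here the negative half only reaches -1.2 while the grid extends to -2.4), which is exactly the gap the asymmetric zero-point closes.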
Stars: —
Forks: —
Language: Jupyter Notebook
License: —
Category: —
Last pushed: Feb 01, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/SRafi007/Quantization-for-LLMs-An-Intuitive-Introduction"
Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
huawei-csl/SINQ
Welcome to the official repository of SINQ! A novel, fast and high-quality quantization method...
SILX-LABS/QUASAR-SUBNET
QUASAR is a long-context foundation model and decentralized evaluation subnet built on Bittensor...
stackblogger/bitnet.js
BitNet.js - A Node.js implementation of the Microsoft bitnet.cpp inference framework.
AnswerDotAI/cold-compress
Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking...
FMInference/H2O
[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.