LLM Compression Optimization Tools

Tools and techniques for reducing LLM size, memory footprint, and inference latency through compression, pruning, quantization, and architectural optimization. Does NOT include general model training, fine-tuning frameworks, or inference serving infrastructure.

30 LLM compression optimization tools are tracked. One scores above 70 (Verified tier). The highest-rated is Tencent/AngelSlim at 79/100, with 536 stars and 5,117 monthly downloads. One of the top 10 is actively maintained.

Get all 30 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-compression-optimization&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
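The same request can be made from a script. The following is a minimal Python sketch, assuming only what the curl command shows: the endpoint URL and the `domain`, `subcategory`, and `limit` query parameters. The helper names `build_url` and `fetch_projects` are hypothetical, the shape of the JSON response is not documented here and so is returned unparsed beyond decoding, and the `limit=20` default is copied from the command above (whether a larger value returns all 30 projects is not stated).

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools",
              subcategory="llm-compression-optimization",
              limit=20):
    """Build the request URL with the same query parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL and decode the body as JSON.

    The response schema is an unknown here, so the decoded object
    is returned as-is for the caller to inspect.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_projects(build_url())` performs the network request; since the unauthenticated quota is 100 requests/day, it may be worth caching the result locally rather than re-fetching on every run.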

# | Tool | Description | Score | Tier
1 | Tencent/AngelSlim | Model compression toolkit engineered for enhanced usability,... | 79 | Verified
2 | nebuly-ai/optimate | A collection of libraries to optimise AI model performances | 45 | Emerging
3 | kyo-takano/chinchilla | A toolkit for scaling law research ⚖ | 38 | Emerging
4 | liyucheng09/Selective_Context | Compress your input to ChatGPT or other LLMs, to let them process 2x more... | 38 | Emerging
5 | antgroup/glake | GLake: optimizing GPU memory management and IO transmission. | 35 | Emerging
6 | TsingmaoAI/MI-optimize | mi-optimize is a versatile tool designed for the quantization and evaluation... | 31 | Emerging
7 | microsoft/only_train_once | OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured... | 29 | Experimental
8 | robtacconelli/Nacrith-GPU | Nacrith: Lossless text compression via ensemble neural arithmetic coding.... | 28 | Experimental
9 | amazon-science/llm-rank-pruning | LLM-Rank: A graph theoretical approach to structured pruning of large... | 27 | Experimental
10 | naskio/mergeui | All-in-one UI for merged LLMs in Hugging Face | 26 | Experimental
11 | AndyyyYuuu/lm-is-compressor | An accurate language model is a high-compression, lossless data compressor | 25 | Experimental
12 | LINs-lab/DeFT | [ICLR 2025] DeFT: Decoding with Flash Tree-attention for Efficient... | 24 | Experimental
13 | deadlykitten4/ERC-SVD | ERC-SVD: Error-Controlled SVD for Large Language Model Compression | 23 | Experimental
14 | oliviersaidi/PACF_LLM | Pattern-aware optimization framework achieving 93.8% complexity reduction in... | 23 | Experimental
15 | M9rth/heretic | 🛠 Remove censorship from language models instantly using advanced... | 23 | Experimental
16 | friendshipkim/overfill | Code for OverFill: Two-Stage Models for Efficient Language Model Decoding | 19 | Experimental
17 | Pro-GenAI/ShortLang | Compressed Text for efficient LLMs | 18 | Experimental
18 | talkking/PrunerGPT | [ICASSP2024] One-Shot Sensitivity-Aware Mixed Sparsity Pruning for Large... | 18 | Experimental
19 | Yvancg/optimizers | A collection of minimal, dependency-free, performance-focused utilities for... | 16 | Experimental
20 | louisbrulenaudet/mergeKit | Tools for merging pretrained Large Language Models and create Mixture of... | 15 | Experimental
21 | Mikola78/trinity-large-tech-report | 🚀 Explore advanced sparse Mixture-of-Experts models with up to 400B... | 14 | Experimental
22 | simocolo/nnDrain | A PyTorch implementation for structural pruning applied to neural networks... | 13 | Experimental
23 | plandes/lmtask | Inferencing and Training Large Language Model Tasks | 12 | Experimental
24 | burcgokden/LLM-from-Power-Law-Decoder-Representations | Implementation of PLDR-LLM: Large Language Model from Power Law Decoder... | 11 | Experimental
25 | 0xnu/multicollinearity_llm | A multicollinearity-based compression C program, identifies and removes... | 11 | Experimental
26 | chandan11248/deepseek-innovations-from-scratch | Reverse-engineering how DeepSeek achieved frontier LLM performance at a... | 11 | Experimental
27 | arrmansa/Temporal-Neuron-Variance-Pruning-Demo | An implementation of Variance Pruning: Pruning Language Models via Temporal... | 10 | Experimental
28 | burcgokden/PLDR-LLM-with-KVG-cache | Implementation of PLDR-LLM with KV-cache and G-cache in Pytorch for the... | 10 | Experimental
29 | Exthalpy/GenLang | Self-Decoding Compression Architecture | 10 | Experimental
30 | louisbrulenaudet/mergekit-assistant | Mergekit Assistant is a cutting-edge toolkit designed for the seamless... | 10 | Experimental