KV Cache Optimization for Transformer Models

There are 7 KV cache optimization projects tracked, 1 of which scores above 70 (verified tier). The highest-rated is LMCache/LMCache at 92/100, with 7,664 stars and 170,335 monthly downloads. 1 of the 7 is actively maintained.

Get all 7 projects as JSON:

```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=kv-cache-optimization&limit=20"
```

Open to everyone: 100 requests/day with no key needed, or get a free key for 1,000/day.
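For scripted access, the same query can be built and its result filtered in a few lines of Python. This is a minimal sketch: the response schema (a list of objects with `name`, `score`, and `tier` fields) is an assumption for illustration, so check the live API for the actual field names before relying on it.

```python
# Sketch of querying the quality dataset endpoint and filtering by score.
# The JSON schema used below is assumed, not confirmed by the API docs.
import json
from urllib.parse import urlencode

BASE = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def build_url(domain: str, subcategory: str, limit: int = 20) -> str:
    """Assemble the query URL shown in the curl example above."""
    params = {"domain": domain, "subcategory": subcategory, "limit": limit}
    return f"{BASE}?{urlencode(params)}"

url = build_url("transformers", "kv-cache-optimization")

# Hypothetical response snippet mirroring two rows of the table below;
# a real client would fetch `url` and parse the body instead.
sample = json.loads(
    '[{"name": "LMCache/LMCache", "score": 92, "tier": "Verified"},'
    ' {"name": "DRSY/EasyKV", "score": 25, "tier": "Experimental"}]'
)

# Keep only verified-tier projects (score above 70).
verified = [p["name"] for p in sample if p["score"] > 70]
print(url)
print(verified)
```

The URL builder just centralizes the query-string construction, so swapping in a different subcategory or limit is a one-argument change.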

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | LMCache/LMCache | Supercharge Your LLM with the Fastest KV Cache Layer | 92 | Verified |
| 2 | Zefan-Cai/KVCache-Factory | Unified KV Cache Compression Methods for Auto-Regressive Models | 47 | Emerging |
| 3 | dataflowr/llm_efficiency | KV Cache & LoRA for minGPT | 41 | Emerging |
| 4 | itsnamgyu/block-transformer | Block Transformer: Global-to-Local Language Modeling for Fast Inference... | 38 | Emerging |
| 5 | OnlyTerp/turboquant | First open-source implementation of Google TurboQuant (ICLR 2026) --... | 35 | Emerging |
| 6 | codepawl/turboquant-torch | Unofficial PyTorch implementation of TurboQuant (Google Research, ICLR... | 27 | Experimental |
| 7 | DRSY/EasyKV | Easy control for Key-Value Constrained Generative LLM... | 25 | Experimental |