KV Cache Optimization for Transformer Models

There are 7 KV cache optimization projects tracked, 1 of which scores above 70 (verified tier). The highest-rated is LMCache/LMCache at 92/100, with 7,664 stars and 170,335 monthly downloads. 1 of the 7 is actively maintained.

Get all 7 projects as JSON:

```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=kv-cache-optimization&limit=20"
```

Open to everyone: 100 requests/day with no key needed, or get a free key for 1,000/day.
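For scripted access, the same query can be built and its result filtered in a few lines of Python. This is a minimal sketch: the response schema (a list of objects with `name`, `score`, and `tier` fields) is an assumption for illustration, so check the live API for the actual field names before relying on it.

```python
# Sketch of querying the quality dataset endpoint and filtering by score.
# The JSON schema used below is assumed, not confirmed by the API docs.
import json
from urllib.parse import urlencode

BASE = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def build_url(domain: str, subcategory: str, limit: int = 20) -> str:
    """Assemble the query URL shown in the curl example above."""
    params = {"domain": domain, "subcategory": subcategory, "limit": limit}
    return f"{BASE}?{urlencode(params)}"

url = build_url("transformers", "kv-cache-optimization")

# Hypothetical response snippet mirroring two rows of the table below;
# a real client would fetch `url` and parse the body instead.
sample = json.loads(
    '[{"name": "LMCache/LMCache", "score": 92, "tier": "Verified"},'
    ' {"name": "DRSY/EasyKV", "score": 25, "tier": "Experimental"}]'
)

# Keep only verified-tier projects (score above 70).
verified = [p["name"] for p in sample if p["score"] > 70]
print(url)
print(verified)
```

The URL builder just centralizes the query-string construction, so swapping in a different subcategory or limit is a one-argument change.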

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | LMCache/LMCache | Supercharge Your LLM with the Fastest KV Cache Layer | 92 | Verified |
| 2 | Zefan-Cai/KVCache-Factory | Unified KV Cache Compression Methods for Auto-Regressive Models | 47 | Emerging |
| 3 | dataflowr/llm_efficiency | KV Cache & LoRA for minGPT | 41 | Emerging |
| 4 | itsnamgyu/block-transformer | Block Transformer: Global-to-Local Language Modeling for Fast Inference... | 38 | Emerging |
| 5 | OnlyTerp/turboquant | First open-source implementation of Google TurboQuant (ICLR 2026) --... | 35 | Emerging |
| 6 | codepawl/turboquant-torch | Unofficial PyTorch implementation of TurboQuant (Google Research, ICLR... | 27 | Experimental |
| 7 | DRSY/EasyKV | Easy control for Key-Value Constrained Generative LLM... | 25 | Experimental |