kvcache-ai/Mooncake

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

72 / 100 (Verified)

Implements a disaggregated KVCache architecture that decouples the prefill and decode stages and moves KV cache between them through efficient cross-device and cross-machine transfer. Its components include the Transfer Engine (RDMA-optimized KV cache movement) and Mooncake Store (a hierarchical distributed cache pool). It integrates with vLLM, SGLang, TensorRT-LLM, and LMDeploy as a backend connector for multi-node inference pipelines, enabling zero-copy embedding sharing and dynamic KV cache offloading across GPU, host, and remote storage tiers.
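
To make the idea of offloading KV cache across GPU, host, and remote tiers more concrete, here is a toy sketch in Python. It is not Mooncake's API: the class `TieredKVCache`, its methods, and the least-recently-used eviction policy are all assumptions chosen for illustration, and plain in-memory dictionaries stand in for real device memory and RDMA transfers.

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy three-tier cache: gpu -> host -> remote (hypothetical, not Mooncake's API)."""

    TIERS = ("gpu", "host", "remote")

    def __init__(self, gpu_capacity, host_capacity):
        # Assumption: the remote tier is unbounded in this sketch.
        self.capacity = {"gpu": gpu_capacity, "host": host_capacity, "remote": None}
        self.tiers = {name: OrderedDict() for name in self.TIERS}

    def put(self, key, kv_block, tier="gpu"):
        """Store a block in `tier`; on overflow, offload the LRU block one tier down."""
        # Drop any stale copy so each key lives in exactly one tier.
        for store in self.tiers.values():
            store.pop(key, None)
        store = self.tiers[tier]
        store[key] = kv_block
        cap = self.capacity[tier]
        if cap is not None and len(store) > cap:
            # popitem(last=False) removes the oldest (least recently used) entry.
            old_key, old_block = store.popitem(last=False)
            next_tier = self.TIERS[self.TIERS.index(tier) + 1]
            self.put(old_key, old_block, next_tier)

    def get(self, key):
        """Return (block, tier_found) and promote the block to the gpu tier."""
        for name in self.TIERS:
            if key in self.tiers[name]:
                block = self.tiers[name].pop(key)
                self.put(key, block, "gpu")
                return block, name
        return None, None
```

In this sketch a prefill worker would call `put` after computing a request's KV cache, and a decode worker would call `get` to pull it back into the fastest tier. A real system would replace the dictionary moves with device-to-device or network transfers.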

4,911 stars. Actively maintained with 119 commits in the last 30 days.

No Package, No Dependents
Maintenance 25 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 21 / 25
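
The listing does not state its scoring formula here, but the four category scores above add up to the overall 72 / 100, which suggests a plain sum of four 25-point categories. The short check below works under that assumption, and the variable names are illustrative only.

```python
# Category scores copied from the listing; each category is out of 25.
category_scores = {
    "Maintenance": 25,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 21,
}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(category_scores.values())
maximum = MAX_PER_CATEGORY * len(category_scores)

print(f"{total} / {maximum}")  # prints "72 / 100"
```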

Stars: 4,911
Forks: 600
Language: C++
License: Apache-2.0
Last pushed: Mar 13, 2026
Commits (30d): 119

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/kvcache-ai/Mooncake"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
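
The same request can be made from a script. The sketch below uses only the Python standard library and rests on two assumptions not confirmed by the listing: that the endpoint returns JSON, and that the path follows a category/owner/repo pattern as in the curl example. The function names are hypothetical, and because the listing does not say how an API key is supplied, the sketch covers only the keyless case.

```python
import json
import urllib.request

# Base path taken from the curl example in the listing.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL, assuming a category/owner/repo path layout."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality record; assumes the response body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("llm-tools", "kvcache-ai", "Mooncake")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, it is worth caching the result rather than polling.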