hhhuang/CAG

Cache-Augmented Generation: A Simple, Efficient Alternative to RAG

Score: 43 / 100 (Emerging)

Preloads knowledge documents into the model's KV-cache during initialization, enabling inference without real-time retrieval steps. Supports comparative evaluation against RAG pipelines using BM25 and OpenAI retrievers on SQuAD and HotpotQA datasets, with configurable context lengths and document counts to measure performance tradeoffs. Works with Hugging Face models (tested on Llama-3.1-8B-Instruct) and includes Docker support for reproducible experimentation.
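The preloading idea described above can be illustrated with a toy cost model. This is only a bookkeeping sketch, not the repository's code and not a real KV-cache: whitespace splitting stands in for a tokenizer, and `PreloadedContext`, `ask`, and `reencode_per_query` are hypothetical names introduced for illustration. It shows only that documents are encoded once up front, so each later query pays for its own tokens alone. The baseline re-encodes all documents per query and does not model a retriever.

```python
def count_tokens(text: str) -> int:
    # Assumption: whitespace splitting stands in for a real tokenizer.
    return len(text.split())


class PreloadedContext:
    """Toy stand-in for a model whose cache already holds the documents."""

    def __init__(self, documents):
        # One-time cost: every document token is encoded once, up front.
        self.cached_tokens = sum(count_tokens(d) for d in documents)
        self.tokens_encoded = self.cached_tokens

    def ask(self, query: str) -> int:
        # Per-query cost: only the new query tokens are encoded;
        # the cached document tokens are reused as-is.
        new_tokens = count_tokens(query)
        self.tokens_encoded += new_tokens
        return new_tokens


def reencode_per_query(documents, query: str) -> int:
    """Baseline: documents plus query are encoded from scratch each time."""
    return sum(count_tokens(d) for d in documents) + count_tokens(query)


docs = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
]
queries = [
    "What is the capital of France?",
    "What is the capital of Germany?",
]

ctx = PreloadedContext(docs)
cag_per_query = [ctx.ask(q) for q in queries]
baseline_per_query = [reencode_per_query(docs, q) for q in queries]

print("tokens encoded per query with preloading:", cag_per_query)
print("tokens encoded per query without it:", baseline_per_query)
```

In a real model the saving comes from reusing attention keys and values computed during the preload pass. The counts here only mirror which tokens would need a forward pass.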

1,471 stars. No commits in the last 6 months.

Stale (6 months), no package, no dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 9 / 25
Community 22 / 25

Stars: 1,471
Forks: 217
Language: Python
License: MIT
Last pushed: May 26, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/hhhuang/CAG"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
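For scripted access, the same request can be sketched in Python using only the standard library. The base URL comes from the curl command above. Two things are assumptions: that other repositories follow the same `/{category}/{owner}/{repo}` path pattern, and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    # Assumption: the path pattern generalizes beyond the one example shown.
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    # Assumption: the response body is JSON; this page does not document it.
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the URL needs no network access.
print(quality_url("rag", "hhhuang", "CAG"))
```

Calling `fetch_quality("rag", "hhhuang", "CAG")` performs the network request and counts against the daily request limit.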