hhhuang/CAG
Cache-Augmented Generation: A Simple, Efficient Alternative to RAG
Preloads knowledge documents into the model's KV-cache during initialization, enabling inference without real-time retrieval steps. Supports comparative evaluation against RAG pipelines using BM25 and OpenAI retrievers on SQuAD and HotpotQA datasets, with configurable context lengths and document counts to measure performance tradeoffs. Works with Hugging Face models (tested on Llama-3.1-8B-Instruct) and includes Docker support for reproducible experimentation.
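A toy sketch of the preloading idea, in pure Python. It is not the repository's code: the single attention head, the fixed sine embeddings, and every name here (`KVCache`, `preload`, `answer_with_cache`) are hypothetical stand-ins. A real run would use the key/value cache of a Hugging Face model. The sketch shows only that keys and values computed once for the documents can be reused for each query without changing the result.

```python
import math

DIM = 4  # toy hidden size (assumption, chosen only to keep the example small)

def embed(token_id):
    # Deterministic toy embedding; stands in for a learned embedding table.
    return [math.sin(token_id * (i + 1)) for i in range(DIM)]

def attend(query, keys, values):
    # Scaled dot-product attention of one query over all cached positions.
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(DIM) for key in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(DIM)]

class KVCache:
    """Keys and values of every token processed so far."""

    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, token_id):
        x = embed(token_id)
        self.keys.append([0.5 * v for v in x])    # toy key projection
        self.values.append([2.0 * v for v in x])  # toy value projection

    def copy(self):
        clone = KVCache()
        clone.keys = list(self.keys)
        clone.values = list(self.values)
        return clone

def run(tokens, cache):
    # Causal pass: each token attends to itself and everything before it.
    # Extends the cache in place and returns the output for the last token.
    out = None
    for token in tokens:
        cache.append(token)
        out = attend(embed(token), cache.keys, cache.values)
    return out

def preload(document_tokens):
    # Done once at initialization: process the knowledge documents.
    cache = KVCache()
    run(document_tokens, cache)
    return cache

def answer_with_cache(preloaded, query_tokens):
    # Per query: reuse the document keys/values, process only the query tokens.
    # Working on a copy leaves the preloaded cache untouched for the next query.
    return run(query_tokens, preloaded.copy())

def answer_from_scratch(document_tokens, query_tokens):
    # Baseline: reprocess documents and query together on every call.
    return run(document_tokens + query_tokens, KVCache())
```

With `cache = preload(doc_tokens)` computed once, each call to `answer_with_cache(cache, query_tokens)` processes only the query tokens and returns the same vector as `answer_from_scratch(doc_tokens, query_tokens)`.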
1,471 stars. No commits in the last 6 months.
Stars: 1,471
Forks: 217
Language: Python
License: MIT
Category:
Last pushed: May 26, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/hhhuang/CAG"
Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
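The same request can be made from Python with the standard library. This is a minimal sketch: the helper names are hypothetical, and reading the path segments as category, owner, and repository is an assumption taken from the single example URL above. The response format is not documented on this page, so the body is returned as raw text rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    # Assumed layout: <base>/<category>/<owner>/<repo>, inferred from the curl example.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)

def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    # Returns the raw response body; no response schema is assumed.
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_quality("rag", "hhhuang", "CAG")` issues the same GET request as the curl command and counts against the same daily limit.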
Higher-rated alternatives
ictnlp/FlexRAG
FlexRAG: A RAG Framework for Information Retrieval and Generation.
VectorInstitute/fed-rag
A framework for fine-tuning retrieval-augmented generation (RAG) systems.
NirDiamant/RAG_Techniques
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG)...
RUC-NLPIR/FlashRAG
⚡FlashRAG: A Python Toolkit for Efficient RAG Research (WWW2025 Resource)
gomate-community/TrustRAG
TrustRAG: The RAG Framework within Reliable input, Trusted output