ronantakizawa/cacheaugmentedgeneration
A Demo of Cache-Augmented Generation (CAG) in an LLM
Leverages Hugging Face Transformers' `DynamicCache` API to preload domain knowledge into Mistral-7B's context window, eliminating runtime document retrieval overhead. The implementation achieves 76% token reduction compared to traditional RAG by persisting cached context to disk for reuse across multiple chat sessions. Demonstrated through a Jupyter notebook that shows knowledge preloading, query answering against cached context, and cache serialization workflows.
123 stars. No commits in the last 6 months.
Stars: 123
Forks: 21
Language: Jupyter Notebook
License: —
Category: —
Last pushed: Jun 10, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/ronantakizawa/cacheaugmentedgeneration"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
nicolaric/rahmenabkommen-gpt
"Ask your question about the new framework agreement between Switzerland and the EU." Answers...
talkdai/dialog
RAG LLM Ops App for easy deployment and testing
ARUNAGIRINATHAN-K/pdf-RAG-question-answering
Upload PDFs → ask questions → get grounded answers.
Kesara03/knowledge-graph-rag-pipeline
Knowledge graph RAG system with LLM-powered entity extraction, Neo4j graph traversal, and hybrid...
fabao2024/Rag-doc-assistant
Production-ready Retrieval-Augmented Generation (RAG) system for PDF question-answering. Built...