ronantakizawa/cacheaugmentedgeneration
A Demo of Cache-Augmented Generation (CAG) in an LLM
Leverages Hugging Face Transformers' `DynamicCache` API to preload domain knowledge into Mistral-7B's context window, eliminating runtime document retrieval overhead. The implementation achieves 76% token reduction compared to traditional RAG by persisting cached context to disk for reuse across multiple chat sessions. Demonstrated through a Jupyter notebook that shows knowledge preloading, query answering against cached context, and cache serialization workflows.
123 stars. No commits in the last 6 months.
Stars: 123
Forks: 21
Language: Jupyter Notebook
License: —
Category: —
Last pushed: Jun 10, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/ronantakizawa/cacheaugmentedgeneration"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
nicolaric/rahmenabkommen-gpt
"Ask your question about the new framework agreement between Switzerland and the EU." Answers...
talkdai/dialog
RAG LLM Ops App for easy deployment and testing
ARUNAGIRINATHAN-K/pdf-RAG-question-answering
Upload PDFs → ask questions → get grounded answers.
Kesara03/knowledge-graph-rag-pipeline
Knowledge graph RAG system with LLM-powered entity extraction, Neo4j graph traversal, and hybrid...
fabao2024/Rag-doc-assistant
Production-ready Retrieval-Augmented Generation (RAG) system for PDF question-answering. Built...