Bbs1412/rag-with-gemma3
This project is a modular Retrieval-Augmented Generation (RAG) system built with Google DeepMind's Gemma 3, served locally via Ollama.
It implements end-to-end document processing with a FAISS vector index for embeddings, history summarization for context-aware retrieval, and streaming responses via FastAPI Server-Sent Events. The modular, LangChain-based system handles multi-file ingestion and per-user document storage with SQLite-backed authentication, and supports "thinking" models for transparent reasoning. It is fully containerized with Docker for deployment on Hugging Face Spaces, and uses locally served mxbai embeddings and Gemma 3 via Ollama to maintain privacy and low latency.
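The retrieve-then-generate flow the description implies can be sketched without any external dependencies. This is a toy illustration, not the project's code: bag-of-words cosine similarity stands in for the mxbai embeddings and FAISS index, and the prompt-building step stands in for the call to Gemma 3.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; the real project uses mxbai via Ollama.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse token-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    # Rank documents by similarity to the query; FAISS does this at scale.
    q = embed(query)
    scored = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return scored[:k]

def build_prompt(query, context):
    # Retrieved chunks are prepended as context for the LLM (Gemma 3 here).
    joined = "\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {query}"

docs = [
    "FAISS stores vector embeddings for fast similarity search.",
    "FastAPI can stream responses using Server-Sent Events.",
    "SQLite handles user authentication in this project.",
]
ctx = retrieve("how are embeddings searched?", docs)
print(build_prompt("how are embeddings searched?", ctx))
```

In the actual system, the retrieved chunks would be combined with a summarized chat history before being streamed to the model.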
No commits in the last 6 months.
Stars: 11
Forks: 2
Language: Python
License: GPL-3.0
Category:
Last pushed: Jul 04, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/Bbs1412/rag-with-gemma3"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
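The same request can be made from Python with only the standard library. The URL below comes from the curl example above; the shape of the JSON response is not documented here, so the decoded result is returned as-is rather than assuming specific fields.

```python
import json
import urllib.request

# Endpoint taken verbatim from the curl example above.
url = "https://pt-edge.onrender.com/api/v1/quality/rag/Bbs1412/rag-with-gemma3"

def fetch_quality(endpoint, timeout=10):
    # GET the endpoint and decode its JSON body; no API key is required
    # under the free tier (100 requests/day).
    with urllib.request.urlopen(endpoint, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# data = fetch_quality(url)  # uncomment to hit the live endpoint
```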
Related tools
ImadSaddik/RAG_With_Gemini
Provides useful context to Gemini using Retrieval-Augmented Generation (RAG)
falconlee236/rag-from-scratch-with-gemini
A Google Gemini version of rag-from-scratch, built with LangChain
spashx/abyss.site
website for abyss
ImadSaddik/DoCamp
RAG (Retrieval Augmented Generation) on Android
Grashopr-888/API_AutoTag
Audio Processing and Indexing - RAG and Transfer Learning