sakshamVerma08/MultiModal-RAG-Practice-

Multi-Modal RAG: Retrieval-Augmented Generation over Text and Visual PDFs

A multi-modal RAG system capable of understanding and reasoning over PDFs containing both text and images. Combines LangChain, CLIP, and FAISS to extract textual content, encode visual features, and enable unified semantic retrieval for context-aware responses.
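
The unified retrieval idea in the description can be illustrated with a minimal sketch. It is not the repository's code: the `UnifiedIndex` class, the three-dimensional vectors, and the page references are all hypothetical. The toy vectors stand in for CLIP embeddings of text chunks and images, and a brute-force cosine search stands in for a FAISS index. The point is only that text and image vectors live in one index and are ranked against the same query vector.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class UnifiedIndex:
    """Hypothetical brute-force stand-in for a vector index such as FAISS.

    Text and image embeddings are stored side by side, so a single
    query can return hits of either modality.
    """

    def __init__(self):
        self.items = []  # list of (vector, metadata) pairs

    def add(self, vector, kind, ref):
        self.items.append((vector, {"kind": kind, "ref": ref}))

    def search(self, query, k=2):
        scored = [(cosine(query, vec), meta) for vec, meta in self.items]
        # Sort on the score only, highest similarity first.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


# Toy vectors standing in for real embeddings (assumed values).
index = UnifiedIndex()
index.add([0.9, 0.1, 0.0], "text", "page 1, paragraph 2")
index.add([0.8, 0.2, 0.1], "image", "page 1, figure 1")
index.add([0.0, 0.1, 0.9], "text", "page 4, paragraph 1")

# In a real pipeline the query would be embedded by the same model.
query_vector = [1.0, 0.0, 0.0]
results = index.search(query_vector, k=2)
for score, meta in results:
    print(f"{score:.3f}  {meta['kind']:5}  {meta['ref']}")
```

In a real system the retrieved text chunks and images would then be passed to a language model as context. That step is omitted here.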

/ 100

Experimental

No Package · No Dependents

Maintenance 6 / 25

Adoption 0 / 25

Maturity 9 / 25

Community 0 / 25


Stars

—

Forks

—

Language

—

License

MIT

Category

multimodal-rag-systems

Last pushed

Oct 27, 2025

Commits (30d)

GitHub

Multimodal RAG Systems · 98 tools

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/sakshamVerma08/MultiModal-RAG-Practice-"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
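
For scripted access, the same request can be sketched in Python using only the standard library. Two things here are assumptions and are not confirmed by this page: that the endpoint follows an owner/repo path pattern for other repositories (inferred from the single URL above), and that the response body is JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/rag"


def quality_url(owner, repo):
    """Build the request URL (assumed pattern: BASE_URL/owner/repo)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner, repo, timeout=10):
    """Fetch the quality record; assumes the response body is JSON."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("sakshamVerma08", "MultiModal-RAG-Practice-")` performs the network request. With the keyless limit of 100 requests/day, it may be worth caching the result locally rather than fetching on every run.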

Higher-rated alternatives

AnswerDotAI/byaldi

Use late-interaction multi-modal models such as ColPali in just a few lines of code.

illuin-tech/colpali

The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.

jolibrain/colette

Multimodal RAG to search and interact locally with technical documents of any kind

nannib/nbmultirag

A framework in Italian and English that lets you chat with your own documents using RAG,...

OpenBMB/VisRAG

Parsing-free RAG supported by VLMs
