Multimodal-RAG-Survey and UniversalRAG
These are ecosystem siblings: the survey provides a comprehensive taxonomy and analysis of multimodal RAG approaches, while UniversalRAG is a practical implementation of that paradigm, handling retrieval over corpora of diverse modalities and granularities.
About Multimodal-RAG-Survey
llm-lab-org/Multimodal-RAG-Survey
A Survey on Multimodal Retrieval-Augmented Generation
Organizes papers on multimodal RAG systems into a taxonomy spanning retrieval strategies (text-, vision-, video-, and audio-centric), fusion mechanisms, augmentation techniques, and generation approaches. Provides comprehensive dataset benchmarks across image-text, video, audio, medical, and fashion domains, along with evaluation metrics and training methodologies. A continuously updated resource tracking advances in cross-modal alignment, agentic interaction, and robustness for systems that ground LLM outputs in multimodal external knowledge bases.
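The retrieve-fuse-generate decomposition the survey uses can be illustrated with a minimal sketch. Everything below is an illustrative assumption rather than code from either repository: the toy embeddings stand in for a cross-modal encoder (e.g. CLIP-style), fusion is plain concatenation with modality tags, and the generator is a stub for an LLM call.

```python
from dataclasses import dataclass

@dataclass
class Document:
    modality: str            # e.g. "text", "image-caption", "video-transcript"
    content: str
    embedding: list[float]   # would come from a shared cross-modal encoder

def cosine(a, b):
    # Cosine similarity between two vectors; 0.0 if either is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_emb, corpus, k=2):
    # Retrieval stage: rank documents of all modalities in one shared
    # embedding space and keep the top-k.
    return sorted(corpus, key=lambda d: cosine(query_emb, d.embedding),
                  reverse=True)[:k]

def fuse(docs):
    # Fusion stage: simple concatenation with modality tags; real systems
    # may instead use score fusion or cross-attention over modalities.
    return "\n".join(f"[{d.modality}] {d.content}" for d in docs)

def generate(query, context):
    # Generation stage: stand-in for an LLM call grounded in fused context.
    return f"Answer to '{query}' using:\n{context}"

corpus = [
    Document("text", "RAG augments LLMs with retrieved evidence.", [1.0, 0.1, 0.0]),
    Document("image-caption", "Diagram of a retrieval pipeline.", [0.9, 0.2, 0.1]),
    Document("video-transcript", "Lecture on audio retrieval.", [0.1, 1.0, 0.3]),
]

hits = retrieve([1.0, 0.0, 0.0], corpus, k=2)
print(generate("What is RAG?", fuse(hits)))
```

The key property the sketch shows is that retrieval is modality-agnostic once everything lives in one embedding space; the taxonomy's fusion and generation stages then decide how heterogeneous evidence is combined and consumed.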
About UniversalRAG
wgcyeo/UniversalRAG
UniversalRAG: Retrieval-Augmented Generation over Corpora of Diverse Modalities and Granularities
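UniversalRAG's central idea is to route each query to the corpus whose modality and granularity best fits it (e.g. a text paragraph vs. a full document, a video clip vs. an entire video) rather than forcing all knowledge into a single embedding space. A minimal sketch of such a router follows; the keyword rules and corpus names are hypothetical illustrations, not the paper's actual trained or LLM-based router:

```python
# Hypothetical modality/granularity router in the spirit of UniversalRAG's
# routing stage. The real system learns this decision; here, keyword rules
# stand in so the control flow is visible.
CORPORA = {
    "text-paragraph": "short factual questions",
    "text-document": "questions needing long-form context",
    "image": "visual-appearance questions",
    "video-clip": "questions about a specific moment",
    "video-full": "questions about an entire event or procedure",
}

def route(query: str) -> str:
    """Pick the corpus (modality + granularity) to retrieve from."""
    q = query.lower()
    if any(w in q for w in ("look like", "color", "photo", "image")):
        return "image"
    if any(w in q for w in ("moment", "scene", "clip")):
        return "video-clip"
    if any(w in q for w in ("entire", "whole video", "step by step")):
        return "video-full"
    if any(w in q for w in ("explain in detail", "history of", "summarize")):
        return "text-document"
    return "text-paragraph"  # default: fine-grained text retrieval

print(route("What does a quokka look like?"))  # routes to the image corpus
```

Routing first and retrieving second keeps each corpus indexed with an encoder suited to its modality, at the cost of making the router itself a potential failure point.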