BradyFU/Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles:Latest Advances on Multimodal Large Language Models
A comprehensive curated repository of research papers, datasets, and benchmarks covering multimodal-LLM advances in instruction tuning, hallucination mitigation, and reasoning tasks. Hosts the maintainers' own evaluation benchmarks (MME, Video-MME, MME-RealWorld) and the VITA series of omni-modal models supporting real-time vision-speech interaction and embodied reasoning. Serves the broader MLLM research community with extensive documentation of 750+ references and curated resources for model development and evaluation.
17,448 stars. Actively maintained with 7 commits in the last 30 days.
Stars: 17,448
Forks: 1,112
Language: —
License: —
Category: —
Last pushed: Mar 12, 2026
Commits (30d): 7
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/BradyFU/Awesome-Multimodal-Large-Language-Models"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
Related models
FoundationVision/Liquid
(Accepted by IJCV) Liquid: Language Models are Scalable and Unified Multi-modal Generators
Paranioar/Awesome_Matching_Pretraining_Transfering
The Paper List of Large Multi-Modality Model (Perception, Generation, Unification),...
Yangyi-Chen/Multimodal-AND-Large-Language-Models
Paper list about multimodal and large language models, only used to record papers I read in the...
thuml/AutoTimes
Official implementation for "AutoTimes: Autoregressive Time Series Forecasters via Large Language Models"
flixpar/med-ts-llm
MedTsLLM: Leveraging LLMs for Multimodal Medical Time Series Analysis