Awesome-Multimodal-Large-Language-Models and Awesome-VLA

Awesome-Multimodal-Large-Language-Models is a comprehensive collection of resources on multimodal large language models, including Vision-Language-Action (VLA) models. Awesome-VLA focuses specifically on VLA advancements, making it a specialized subset of, or a more focused alternative to, the former within the broader multimodal LLM ecosystem.

Awesome-Multimodal-Large-Language-Models
Maintenance 20/25 · Adoption 10/25 · Maturity 8/25 · Community 18/25
Stars: 17,448 · Forks: 1,112 · Commits (30d): 7
Downloads: · Language: · License:
No License · No Package · No Dependents

Awesome-VLA
Maintenance 10/25 · Adoption 9/25 · Maturity 7/25 · Community 7/25
Stars: 109 · Forks: 4 · Commits (30d): 0
Downloads: · Language: · License:
No License · No Package · No Dependents

About Awesome-Multimodal-Large-Language-Models

BradyFU/Awesome-Multimodal-Large-Language-Models

:sparkles::sparkles:Latest Advances on Multimodal Large Language Models

A comprehensive curated repository of research papers, datasets, and benchmarks covering multimodal LLM advances across instruction tuning, hallucination mitigation, and reasoning tasks. It features the maintainers' own evaluation frameworks (MME, Video-MME, MME-RealWorld) and the VITA series of omni-modal models supporting real-time vision-speech interaction and embodied reasoning. It targets the broader MLLM research ecosystem with extensive documentation spanning 750+ references and curated resources for model development and evaluation.

About Awesome-VLA

Orlando-CS/Awesome-VLA

✨✨ Latest advancements in VLA (Vision-Language-Action) models

Scores are updated daily from GitHub, PyPI, and npm data.