Multimodal Vision Language Models LLM Tools

Comprehensive surveys, benchmarks, and research collections on vision-language models, multimodal learning architectures, and their domain-specific applications (remote sensing, transportation, urban computing, weather). Does NOT include individual model implementations, fine-tuning techniques, or tools for building applications with these models.

There are 43 multimodal vision-language model tools tracked. 1 scores 50 or above (Established tier). The highest-rated is hijkzzz/Awesome-LLM-Strawberry at 50/100 with 6,896 stars. 1 of the top 10 is actively maintained.

Get the projects as JSON (the example request below sets limit=20 of the 43)

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=multimodal-vision-language-models&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
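
The same request can be made from a script. The following is a minimal Python sketch using only the standard library. The endpoint and the three query parameters are copied from the curl example above; the helper names `build_url` and `fetch_projects`, the `timeout` value, and the idea of varying `limit` are assumptions for illustration. The shape of the JSON response is not documented on this page, so the sketch returns the decoded body without interpreting it.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20):
    """Build the request URL with the query parameters from the curl example."""
    params = {
        "domain": "llm-tools",
        "subcategory": "multimodal-vision-language-models",
        "limit": limit,  # assumption: larger values may or may not be accepted
    }
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def fetch_projects(limit=20, timeout=30):
    """Send a keyless GET request and return the decoded JSON body as-is."""
    with urllib.request.urlopen(build_url(limit), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_projects()` counts against the keyless daily quota, so results are worth caching locally when experimenting.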

#	Tool	Score	Tier	Stars	Language
1	hijkzzz/Awesome-LLM-Strawberry A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓...	50	Established	6,896	—
2	chrisliu298/awesome-llm-unlearning A resource repository for machine unlearning in large language models	45	Emerging	551	—
3	worldbench/awesome-spatial-intelligence 🌐 Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training...	44	Emerging	142	HTML
4	worldbench/awesome-vla-for-ad 🌐 Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future	44	Emerging	331	HTML
5	zjukg/KG-MM-Survey Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey	42	Emerging	479	—
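6	sou350121/VLA-Handbook This project aims to provide algorithm engineers who want to enter the VLA (Vision-Language-Action) field with an all-Chinese, practice-oriented study and interview handbook. Unlike general-purpose...	39	Emerging	73	HTML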
7	RManLuo/Awesome-LLM-KG Awesome papers about unifying LLMs and KGs	38	Emerging	2,572	—
8	worldbench/DriveBench [ICCV 2025] Are VLMs Ready for Autonomous Driving? An Empirical Study from...	36	Emerging	232	Python
9	PeterGriffinJin/Awesome-Language-Model-on-Graphs A curated list of papers and resources based on "Large Language Models on...	35	Emerging	981	—
10	he-h/rhythm [NeurIPS 2025] RHYTHM: Reasoning with Hierarchical Temporal Tokenization for...	33	Emerging	8	Python
11	EmulationAI/awesome-large-audio-models Collection of resources on the applications of Large Language Models (LLMs)...	32	Emerging	726	—
12	MIT-SPARK/LP2 Long-term Human Trajectory Prediction using 3D DSGs	31	Emerging	43	Python
13	llmbev/talk2bev Talk2BEV: Language-Enhanced Bird's Eye View Maps (ICRA'24)	31	Emerging	119	Python
14	PJLab-ADG/awesome-knowledge-driven-AD A curated list of awesome knowledge-driven autonomous driving (continually updated)	31	Emerging	496	—
15	THUMNLab/awesome-large-graph-model Papers about large graph models.	31	Emerging	291	—
16	WLiK/LLM4Rec-Awesome-Papers A list of awesome papers and resources of recommender system on large...	30	Emerging	2,229	—
17	SuperBruceJia/Awesome-Large-Vision-Language-Model Awesome Large Vision-Language Model: A Curated List of Large Vision-Language Model	29	Experimental	42	—
18	LJungang/Awesome-Video-Reasoning-Landscape 🔥An open-source survey of the latest video reasoning tasks, paradigms, and...	29	Experimental	153	—
19	tongnie/awesome-llm4tr Exploring the Roles of Large Language Models in Reshaping Transportation...	28	Experimental	71	—
20	NotYuSheng/Multimodal-Large-Language-Model Localized Multimodal Large Language Model (MLLM) integrated with Streamlit...	28	Experimental	5	Python
21	vincentlux/Awesome-Multimodal-LLM Reading list for Multimodal Large Language Models	28	Experimental	69	—
22	basiclab/TTSG Traffic Scene Generation from Natural Language Description for Autonomous...	27	Experimental	55	Python
23	Xiaohao-Liu/Awesome-Multi-Token-Prediction A curated list of papers, tools, and resources on Multi-Token Prediction...	26	Experimental	54	—
24	cocacola-lab/Awesome-Transformer-in-Transportation Papers & resources linked to Transformer-based research mainly for...	26	Experimental	6	—
25	archersama/awesome-recommend-system-pretraining-papers Paper List for Recommend-system PreTrained Models	25	Experimental	347	—
26	OpenTSLab/TimeOmni [ICLR 2026] Official implementation of SciTS: Scientific Time Series...	24	Experimental	10	Python
27	Atomic-man007/Awesome_Multimodel_LLM Awesome_Multimodel is a curated GitHub repository that provides a...	24	Experimental	364	—
28	YingqingHe/Awesome-LLMs-meet-Multimodal-Generation 🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image,...	24	Experimental	540	HTML
29	westlake-repl/MicroLens A Large Short-video Recommendation Dataset with Raw Text/Audio/Image/Videos...	24	Experimental	257	Python
30	Nehs6xy3hgdguzjs/Awesome-Video-Reasoning 🎥 Explore cutting-edge research focused on reasoning with video models,...	23	Experimental	1	—
31	edujbarrios/awesome-vision-ai-stack A curated, builder-first list of Vision Language Models (VLMs), local...	22	Experimental	—	Markdown
32	davendw49/Awesome-Long-Context-Language-Modeling Papers of Long Context Language Model	21	Experimental	10	—
33	tongnie/IMPEL TRE'25: Joint Estimation and Prediction of City-wide Delivery Demand: A...	21	Experimental	7	Python
34	AdityaLab/MM4TSA A professional list on Multi-Modalities For Time Series Analysis (MM4TSA)...	20	Experimental	77	—
35	thetuantrinh/Radar-Language-Models-Survey Survey of Radar–Language Models for semantic radar perception and reasoning.	20	Experimental	1	—
36	ThomasVonWu/Awesome-VLMs-Strawberry A collection of VLMs papers, blogs, and projects, with a focus on VLMs in...	18	Experimental	11	—
37	Tangkfan/Awesome-Temporal-Video-Grounding paper list on Video Moment Retrieval (VMR), or Temporal Video Grounding...	17	Experimental	36	—
38	bailynlove/Awesome-OCR-Vision-Based-Context-Compression Awesome list of paper on vision-based context compression	14	Experimental	3	—
39	showlab/Awesome-Long-Context A curated list of resources about long-context in large-language models and...	14	Experimental	32	—
40	leo038/robot_manipulation_survey A survey summarizing work on robotic arm grasping.	13	Experimental	6	—
41	chrisliu298/awesome-sparse-autoencoders A resource repository of sparse autoencoders for large language models	13	Experimental	8	—
42	xiexukang/awesome-speech-resources Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis,...	13	Experimental	2	—
43	HKUDS/Awesome-LLM4Urban-Papers [ACM TIST] "LLM4Urban: Urban Computing in the Era of Large Language Models"	12	Experimental	46	—
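
For readers who copy the table rather than call the API, the rows can be parsed into records. The sketch below is a rough illustration, assuming the columns are tab-separated, that the repository name is the first space-delimited token of the Tool column, and that "—" marks a missing value. The function name `parse_row` and the field names are hypothetical. The three sample rows are copied from the table above.

```python
from collections import Counter

# Three rows copied from the table: rank, tool + description, score, tier, stars, language.
ROWS = """\
1\thijkzzz/Awesome-LLM-Strawberry A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓...\t50\tEstablished\t6,896\t—
7\tRManLuo/Awesome-LLM-KG Awesome papers about unifying LLMs and KGs\t38\tEmerging\t2,572\t—
31\tedujbarrios/awesome-vision-ai-stack A curated, builder-first list of Vision Language Models (VLMs), local...\t22\tExperimental\t—\tMarkdown
"""


def parse_row(line):
    """Split one tab-separated table row into a dictionary of typed fields."""
    rank, tool, score, tier, stars, language = line.split("\t")
    # The Tool column fuses "owner/repo" and the description, separated by a space.
    repo, _, description = tool.partition(" ")
    return {
        "rank": int(rank),
        "repo": repo,
        "description": description,
        "score": int(score),
        "tier": tier,
        # "—" is treated as a missing value; star counts use thousands separators.
        "stars": None if stars == "—" else int(stars.replace(",", "")),
        "language": None if language == "—" else language,
    }


projects = [parse_row(line) for line in ROWS.splitlines()]
tier_counts = Counter(p["tier"] for p in projects)
```

Applied to all 43 rows, `tier_counts` gives the number of projects per tier, which can be checked against the summary at the top of the page.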

Comparisons in this category

Awesome-Language-Model-on-Graphs and awesome-large-graph-model (35 vs 31)