luisrui/Modality-Interference-in-MLLMs
The source code for the paper "Diagnosing and Mitigating Modality Interference in Multimodal Large Language Models"
This project helps AI researchers and machine learning engineers improve the reliability of their Multimodal Large Language Models (MLLMs). It provides tools to diagnose why MLLMs are sometimes misled by irrelevant information from one modality (for example, a distracting image paired with a text-only question) and offers training methods for more robust models. The input is an existing MLLM and training data; the output is a fine-tuned MLLM that performs better on tasks that should rely on a single modality.
No commits in the last 6 months.
Use this if you are developing or deploying MLLMs and need to ensure they can accurately distinguish between relevant and irrelevant information across different modalities, especially for tasks that should rely on a single input type.
Not ideal if you are not working with Multimodal Large Language Models, or if robustness against irrelevant modality inputs is not a concern for your use case.
Stars
7
Forks
—
Language
Python
License
—
Category
Last pushed
Sep 24, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/luisrui/Modality-Interference-in-MLLMs"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
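A minimal Python sketch of calling the endpoint above, using only the standard library. The response schema is not documented here, so the example simply decodes and returns the JSON; the helper names (`build_url`, `fetch_quality`) are illustrative, not part of the API.

```python
import json
import urllib.request

# Base path of the quality endpoint shown in the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_url(owner: str, repo: str) -> str:
    """Construct the quality-data URL for a given GitHub repository."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch the JSON quality record (no key needed; 100 requests/day)."""
    with urllib.request.urlopen(build_url(owner, repo)) as resp:
        return json.load(resp)
```

Usage: `fetch_quality("luisrui", "Modality-Interference-in-MLLMs")` returns the same JSON document as the curl command above.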
Higher-rated alternatives
jingyaogong/minimind-v
🚀 Train a 26M-parameter vision-language model (VLM) from scratch in just 1 hour! 🌏
SkyworkAI/Skywork-R1V
Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI, specializing in...
roboflow/vision-ai-checkup
Take your LLM to the optometrist.
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o, an open-source multimodal chat model approaching GPT-4o performance
InternLM/InternLM-XComposer
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video...