luisrui/Modality-Interference-in-MLLMs

The source code for the paper "Diagnosing and Mitigating Modality Interference in Multimodal Large Language Models"

Score: 13 / 100 (Experimental)

This project helps AI researchers and machine learning engineers improve the reliability of Multimodal Large Language Models (MLLMs). It provides tools to diagnose why MLLMs are sometimes misled by irrelevant information from one input type (for example, a distracting image attached to a text-only question) and offers training methods that make models more robust. The input is an existing MLLM plus training data; the output is a fine-tuned MLLM that performs better on tasks that should rely on only one input type.

No commits in the last 6 months.

Use this if you are developing or deploying MLLMs and need to ensure they can accurately distinguish between relevant and irrelevant information across different modalities, especially for tasks that should rely on a single input type.

Not ideal if you are not working with Multimodal Large Language Models, or if robustness against irrelevant modal inputs is not a primary concern.

AI Research · Machine Learning Engineering · Multimodal AI · Model Robustness · Natural Language Processing
No License · Stale 6m · No Package · No Dependents
Maintenance: 2 / 25
Adoption: 4 / 25
Maturity: 7 / 25
Community: 0 / 25


Stars: 7
Forks:
Language: Python
License: None
Last pushed: Sep 24, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/luisrui/Modality-Interference-in-MLLMs"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
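
For programmatic access, here is a minimal Python sketch of the same request using the requests library. It assumes only the endpoint shown in the curl example above; the JSON field names referenced in the final print statement are illustrative guesses, not a documented schema.

import requests

# Quality endpoint for this repo (same URL as the curl example above).
URL = ("https://pt-edge.onrender.com/api/v1/quality/llm-tools/"
       "luisrui/Modality-Interference-in-MLLMs")

def fetch_quality(url: str = URL) -> dict:
    """Fetch the quality record as JSON; raises on HTTP errors."""
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    data = fetch_quality()
    # Field names below are assumptions; inspect `data` for the real schema.
    print(data.get("score"), data.get("maintenance"))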