itsqyh/Awesome-LMMs-Mechanistic-Interpretability

A curated collection of resources focused on the Mechanistic Interpretability (MI) of Large Multimodal Models (LMMs). This repository aggregates surveys, blog posts, and research papers that explore how LMMs represent, transform, and align multimodal information internally.

/ 100

Emerging

192 stars.

No License · No Package · No Dependents

Maintenance 10 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 6 / 25

Stars 192

Forks

Language —

License —

Category llm-interpretability-explainability

Last pushed Mar 04, 2026

Commits (30d)

LLM Interpretability Explainability · 29 models

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/itsqyh/Awesome-LMMs-Mechanistic-Interpretability"

Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
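
The same request can be made from a script. The following is a minimal sketch in Python using only the standard library. It assumes the endpoint returns a JSON body and that other repositories follow the same owner/name path pattern as the curl URL above; neither is confirmed on this page. The helper names `quality_url` and `fetch_quality` are hypothetical, chosen for illustration.

```python
import json
import urllib.parse
import urllib.request

# Base path copied from the curl example; the trailing owner/name pattern
# for other repositories is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(repo: str) -> str:
    """Build the endpoint URL for a repository given as 'owner/name'."""
    owner, name = repo.split("/", 1)
    # Quote each path segment separately so the slash between them is kept.
    owner_q = urllib.parse.quote(owner, safe="")
    name_q = urllib.parse.quote(name, safe="")
    return f"{BASE_URL}/{owner_q}/{name_q}"


def fetch_quality(repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(quality_url(repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    data = fetch_quality("itsqyh/Awesome-LMMs-Mechanistic-Interpretability")
    print(json.dumps(data, indent=2))
```

With the keyless limit of 100 requests/day, a script that polls many repositories may want to cache responses locally rather than refetch on every run.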

Higher-rated alternatives

MadryLab/context-cite

Attribute (or cite) statements generated by LLMs back to in-context information.

microsoft/augmented-interpretable-models

Interpretable and efficient predictors using pre-trained language models. Scikit-learn compatible.

Trustworthy-ML-Lab/CB-LLMs

[ICLR 25] A novel framework for building intrinsically interpretable LLMs with...

poloclub/LLM-Attributor

LLM Attributor: Attribute LLM's Generated Text to Training Data

THUDM/LongCite

LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
