Transformer Interpretability Mechanistic Transformer Models

Tools for understanding transformer internals through visualization, attribution analysis, and mechanistic reverse-engineering of learned circuits and representations. This category does NOT include general explainability frameworks, dataset analysis tools, or applications built on transformers.

This category tracks 57 mechanistic transformer interpretability models. Three score 50 or higher (Established tier). The highest-rated is inseq-team/inseq at 67/100 with 462 stars and 739 monthly downloads.

Get all 57 projects as JSON (note that the example request below sets limit=20):

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=transformer-interpretability-mechanistic&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
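The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the returned JSON is not documented on this page, so the code returns the decoded payload as-is without assuming any field names. Only the `domain`, `subcategory`, and `limit` parameters shown in the curl command are used.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain: str, subcategory: str, limit: int = 20) -> str:
    """Build the query URL from the three parameters seen in the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url: str, timeout: float = 30.0):
    """Fetch the URL and decode the body as JSON.

    The response schema is an unknown here, so the decoded object is
    returned unchanged for the caller to inspect.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url("transformers", "transformer-interpretability-mechanistic"))` issues one request, which counts against the daily quota described above. Whether a `limit` larger than 20 is accepted is not stated on this page, so treat any other value as something to verify.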

#	Model (repository and description)	Score (/100)	Tier	Stars	Language
1	inseq-team/inseq Interpretability for sequence generation models 🐛 🔍	67	Established	462	Python
2	jessevig/bertviz BertViz: Visualize Attention in Transformer Models	65	Established	7,945	Python
3	EleutherAI/knowledge-neurons A library for finding knowledge neurons in pretrained transformer models.	50	Established	159	Python
4	hila-chefer/Transformer-MM-Explainability [ICCV 2021- Oral] Official PyTorch implementation for Generic...	46	Emerging	903	Jupyter Notebook
5	cdpierse/transformers-interpret Model explainability that works seamlessly with 🤗 transformers. Explain your...	44	Emerging	1,413	Jupyter Notebook
6	taufeeque9/codebook-features Sparse and discrete interpretability tool for neural networks	42	Emerging	64	Python
7	icon-lab/BolT Fused Window Transformers for fMRI Time Series Analysis...	38	Emerging	34	Python
8	DFKI-NLP/thermostat Collection of NLP model explanations and accompanying analysis tools	36	Emerging	144	Jsonnet
9	tongnie/ImputeFormer [KDD 2024] "ImputeFormer: Low Rankness-Induced Transformers for...	31	Emerging	51	Python
10	xmed-lab/TAM [ICCV25 Oral] Token Activation Map to Visually Explain Multimodal LLMs	31	Emerging	180	Python
11	Sandipan99/IndMask IndMask: Inductive Explanation for Multivariate Time Series Black-box Model	31	Emerging	5	Python
12	bvanaken/visbert VisBERT: Demo web app for "How Does BERT Answer Questions?"	29	Experimental	11	JavaScript
13	jakobtroidl/neuron-shape-reasoning PyTorch Implementation of Global Neuron Shape Reasoning with Point Affinity...	27	Experimental	13	Jupyter Notebook
14	andreped/vit-explainer 🔥 Demonstrating Explainable AI with Vision Transformer in web app	26	Experimental	3	Python
15	ApocryphalEditor/SRM-mapping-framework A framework for mapping the internal geometry of transformer representations...	25	Experimental	2	Python
16	gsarti/lcl23-xnlm-lab Materials for the Lab "Explaining Neural Language Models from Internal...	25	Experimental	13	Jupyter Notebook
17	Lumi-node/model-garage Open the hood on neural networks. Component-level model surgery, analysis,...	25	Experimental	3	Python
18	munnabhaiiii981/llm-attention-visualizer 🔍 Visualize attention patterns in transformer models to better understand...	24	Experimental	—	Python
19	ayaka14732/TrAVis TrAVis: Visualise BERT attention in your browser	24	Experimental	58	Python
20	s4um1l/aya-cross-lingual-probe Mechanistic interpretability of cross-lingual concept representations in...	23	Experimental	5	Python
21	mims-harvard/TimeX Time series explainability via self-supervised model behavior consistency	23	Experimental	54	Python
22	ovshake/rat Reverse Attention Tracer: A lightweight API to visualize which words...	22	Experimental	4	Python
23	rashomon-gh/attention-visualiser a module to visualise attention layer activations from transformer based...	22	Experimental	3	Python
24	poppingtonic/transformer-visualization Mechanistic Interpretability Tutorials, Results and research log as I learn...	22	Experimental	9	Jupyter Notebook
25	rubencart/LIIR-TextGraphs-14 Code for KU Leuven LIIR lab's submission to the TextGraphs-14 shared task on...	22	Experimental	1	Python
26	designer-coderajay/logit-lens-explorer Mechanistic interpretability tool visualizing GPT-2's layer-by-layer...	21	Experimental	2	Python
27	khairulislam/Timeseries-Explained Interpreting Deep Learning timeseries models using Local Interpretation methods	20	Experimental	12	Jupyter Notebook
28	skyline-GTRr32/OKI-TRACE OKI TRACE: Local LLM observability. See step-by-step, layer-by-layer what...	20	Experimental	1	Python
29	mytechnotalent/mechanistic_interpretability Mechanistic Interpretability (MI) is a subfield of AI alignment and safety...	20	Experimental	1	Jupyter Notebook
30	MaxwellCalkin/interpretability-toolkit Practical mechanistic interpretability tools — activation caching, linear...	19	Experimental	—	Python
31	JihoonJeong/Neural-MRI Model Resonance Imaging — visualize LLM internals like a brain MRI	19	Experimental	—	TypeScript
32	Benjoyo/next-token-visualization 🧠 Visualize token-by-token sampling with chat templates, nucleus filtering,...	19	Experimental	—	HTML
33	Alvoradozerouno/ORION-MIT-Interpretability-Bridge ORION MIT Interpretability Bridge — MIT research + consciousness...	19	Experimental	—	—
34	designer-coderajay/induction-head-detector Mechanistic interpretability tool to detect induction heads in GPT-2 using...	16	Experimental	1	Python
35	davor10105/relative-absolute-magnitude-propagation Explain the outputs of your Vision Transformers, Residual Networks and...	16	Experimental	4	Python
36	sandipan211/LoCATe-GAT Official PyTorch implementation of the IEEE TETCI 2024 paper LoCATe-GAT	15	Experimental	7	Python
37	tegridydev/mechamap MechaMap - Toolkit for Mechanistic Interpretability (MI) Research	15	Experimental	6	Python
38	luckyspaceOK/llm-attention-visualizer 🔍 Visualize attention patterns in transformer models to better understand...	15	Experimental	—	Python
39	sinaabbasi1/NormXLogit The official repo for the EMNLP 2025 paper "NormXLogit: The Head-on-Top Never Lies"	15	Experimental	—	Jupyter Notebook
40	erfanashams/steve Speech Self-Attention Exploratory Visual Environment	14	Experimental	4	Python
41	DFKI-NLP/SMV Code and data for the ACL 2023 NLReasoning Workshop paper "Saliency Map...	14	Experimental	9	Python
42	zzak00/nlp_with_transformers_visualizations Visualize NLP	14	Experimental	9	—
43	Shravani018/interpreting-transformer-hallucinations Mechanistic interpretability of transformer hallucinations via attention...	14	Experimental	—	HTML
44	garimamittal13/csai_S26 Neuroimaging preprocessing, brain decoding, and visual brain encoding using...	14	Experimental	—	Jupyter Notebook
45	fracapuano/brainformer A transformer-based approach to predicting MEG readings from EEG sensory...	13	Experimental	5	Python
46	amrohendawi/unraveling-bert-article In this article, the factors affecting BERT's transferability is explained...	12	Experimental	3	HTML
47	chizkidd/bert-masked-attention-visualizer Visualizing and analyzing BERT self-attention heads during masked language modeling.	12	Experimental	1	Python
48	dedely/XAI4EO Towards Explainable AI4EO: an explainable DL approach for crop type mapping...	12	Experimental	4	Python
49	Krasnomakov/openMaze_XAI Explainable AI, attention visualization in LLM	11	Experimental	—	HTML
50	rey-reypixel/NeuroWeave A client-side simulation of NLP Transformer models. Visualizes...	11	Experimental	—	TypeScript
51	Param-Uttarwar/neural-network-visualizer Easy-to-use UI based tool that visualizes the internal layers and...	11	Experimental	—	Python
52	HillaryDanan/relativistic-interpretability A geometric framework for understanding neural network reasoning through...	11	Experimental	—	Python
53	jha-lab/dini [Nature-SR'22] DINI: Data Imputation using Neural Inversion	11	Experimental	2	Python
54	jacoboromerodiaz/context-mixing-audio-text Attribution framework for analyzing audio–text context mixing in...	11	Experimental	—	Jupyter Notebook
55	gszfwsb/AutoGnothi Official PyTorch code for ICLR 2025 paper "Gnothi Seauton: Empowering...	11	Experimental	24	Python
56	VDuchauffour/transformers-visualizer Explain your 🤗 transformers without effort! Plot the internal behavior of your model.	10	Experimental	1	Python
57	alejoacelas/bayesian-transformers Interpretability on 1-layer Transformer models that converge on the...	10	Experimental	1	Jupyter Notebook
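For readers who want to work with the table above directly, the rows can be parsed into records. The following is a rough sketch under stated assumptions: that each row is tab-separated exactly as it appears here, that the repository name is the first space-delimited token of the model column, and that a dash placeholder means a missing value. The `Project` class and `parse_row` function are hypothetical names introduced for illustration, and the two sample rows are copied from the table.

```python
from dataclasses import dataclass
from typing import Optional

MISSING = "\u2014"  # dash placeholder the table uses for absent values


@dataclass
class Project:
    rank: int
    repo: str
    description: str
    score: int
    tier: str
    stars: Optional[int]
    language: Optional[str]


def parse_row(line: str) -> Project:
    """Split one tab-separated table row into typed fields."""
    rank, model, score, tier, stars, language = line.rstrip("\n").split("\t")
    # The model column fuses "owner/repo" with a free-text description.
    repo, _, description = model.partition(" ")
    return Project(
        rank=int(rank),
        repo=repo,
        description=description,
        score=int(score),
        tier=tier,
        # Star counts use thousands separators, e.g. "7,945".
        stars=None if stars == MISSING else int(stars.replace(",", "")),
        language=None if language == MISSING else language,
    )


SAMPLE_ROWS = [
    "2\tjessevig/bertviz BertViz: Visualize Attention in Transformer Models"
    "\t65\tEstablished\t7,945\tPython",
    "49\tKrasnomakov/openMaze_XAI Explainable AI, attention visualization in LLM"
    "\t11\tExperimental\t\u2014\tHTML",
]

projects = [parse_row(row) for row in SAMPLE_ROWS]
```

Keeping missing stars as `None` rather than 0 avoids ranking a repository with no recorded star count below one that genuinely has zero stars.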