NLP Model Interpretability Tools

Tools and frameworks for explaining, visualizing, and understanding the decisions of NLP and ML models through techniques such as feature attribution, concept activation vectors, attention analysis, and model-agnostic explanations. Does NOT include general model evaluation, performance metrics, or bias detection frameworks that lack an interpretability focus.

28 NLP model interpretability tools are tracked. Four score above 50 (the Established tier). The highest-rated is rmovva/HypotheSAEs at 65/100, with 77 stars and 194 monthly downloads.
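The tier labels in the listing can be sketched as a simple score-to-tier mapping. Only the "above 50 is Established" rule is stated on this page; the cutoff of 30 between Emerging and Experimental is an assumption inferred from the listed scores (31 is Emerging, 29 is Experimental), and `tier_for` is a hypothetical helper, not part of any published API.

```python
from collections import Counter

# Scores copied from the listing on this page, in rank order.
SCORES = [65, 53, 52, 51, 36, 34, 33, 32, 31, 29, 26, 25, 23, 23,
          23, 22, 21, 20, 19, 19, 16, 13, 12, 12, 12, 11, 11, 10]


def tier_for(score: int) -> str:
    """Map a 0-100 score to a tier label.

    "Above 50 is Established" comes from the page text.
    The 30-point Emerging cutoff is an ASSUMPTION inferred from the data.
    """
    if score > 50:
        return "Established"
    if score >= 30:  # assumed boundary, not documented
        return "Emerging"
    return "Experimental"


# Tally how many tracked tools fall into each tier.
tier_counts = Counter(tier_for(s) for s in SCORES)
```

Under these assumed cutoffs, the tally reproduces the labels shown in the listing, including the four Established tools mentioned above.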

Get all 28 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=nlp-model-interpretability&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
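The same request can be made from Python. This is a minimal sketch using only the standard library; the URL and query parameters are taken from the curl command above, while `build_url` and `fetch_projects` are hypothetical helper names. The response schema is not documented here, so the sketch returns the parsed JSON as-is. Note that the example query uses `limit=20` while 28 projects are tracked; whether a higher `limit` is accepted is an assumption to verify against the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="nlp-model-interpretability", limit=20):
    """Build the dataset query URL with the same parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and parse the JSON response.

    The response structure is not documented on this page, so the parsed
    object is returned unchanged. Passing limit=28 to retrieve every
    project is an ASSUMPTION about the API, not confirmed behavior.
    """
    with urlopen(build_url(limit=limit), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects()` performs a live request and counts against the daily quota described above, so it is worth caching the result locally when experimenting.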

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | rmovva/HypotheSAEs | HypotheSAEs: hypothesizing interpretable relationships in text datasets... | 65 | Established |
| 2 | interpretml/interpret-text | A library that incorporates state-of-the-art explainers for text-based... | 53 | Established |
| 3 | jalammar/ecco | Explain, analyze, and visualize NLP language models. Ecco creates... | 52 | Established |
| 4 | fdalvi/NeuroX | A Python library that encapsulates various methods for neuron interpretation... | 51 | Established |
| 5 | MultiplEYE-COST/wg1-experiment-implementation | In this repository we keep the code for the implementation of the... | 36 | Emerging |
| 6 | alexdyysp/ESIM-pytorch | China Collegiate Computing Contest: Big Data Challenge | 34 | Emerging |
| 7 | NeuroLIAA/reading-et | Eye-tracking during reading of short stories | 33 | Emerging |
| 8 | adaamko/POTATO | XAI based human-in-the-loop framework for automatic rule-learning. | 32 | Emerging |
| 9 | RiccardoSpolaor/Verbal-Explanations-of-Spatio-Temporal-Graph-Neural-Networks-for-Traffic-Forecasting | An eXplainable AI system to elucidate short-term speed forecasts in traffic... | 31 | Emerging |
| 10 | octanove/expats | EXPATS: A Toolkit for Explainable Automated Text Scoring | 29 | Experimental |
| 11 | ymcui/mrc-model-analysis | Multilingual Multi-Aspect Explainability Analyses on Machine Reading... | 26 | Experimental |
| 12 | mohsenfayyaz/DecompX | DecompX: Explaining Transformers Decisions by Propagating Token... | 25 | Experimental |
| 13 | robinvanschaik/interpret-flair | A small repository to test Captum Explainable AI with a trained Flair... | 23 | Experimental |
| 14 | jwliao1209/Explainable-NLP | 2022 AI CUP Explainable Information Tagging Competition for Natural Language... | 23 | Experimental |
| 15 | ravipatelxyz/nlp-ethics | In depth evaluation of the ETHICS utilitarianism task dataset. An assessment... | 23 | Experimental |
| 16 | synapticore-io/ethics-model | A modern, modular PyTorch framework for ethical text analysis, manipulation... | 22 | Experimental |
| 17 | hint-lab/doctrack | Dataset for EMNLP'23 Paper "DocTrack: A Visually-Rich Document Dataset... | 21 | Experimental |
| 18 | MichiganNLP/micromodels | Micromodels -- A framework for accurate, explainable, data efficient, and... | 20 | Experimental |
| 19 | fursovia/tcav_nlp | "Interpretability Beyond Feature Attribution: Quantitative Testing with... | 19 | Experimental |
| 20 | mainlp/gaze-guided-text-generation | Code and data for the paper "Controlling Reading Ease with Gaze-Guided Text... | 19 | Experimental |
| 21 | avijit-thawani/numeracy-literacy | Code for paper: "Numeracy Enhances the Literacy of Language Models" | 16 | Experimental |
| 22 | christianwarmuth/explainable-predictive-process-monitoring-with-text | On the Potential of Textual Data for Explainable Predictive Process... | 13 | Experimental |
| 23 | amcrisan/interactive-model-cards | An experimental project that examines whether interactivity can augment... | 12 | Experimental |
| 24 | shresthasingh1501/legal_document_analysis | Legal document analysis using BERT and FlanT5 | 12 | Experimental |
| 25 | MortadhaMannai/XAI_ConstrainedAttentionVerifier | Code for the NLDB 2023 paper. Work partially funded by grant... | 12 | Experimental |
| 26 | DIME-XAI/dime-xai | Implementation of Dual Interpretable Model-agnostic Explanations for Rasa... | 11 | Experimental |
| 27 | tayyab-nlp/flaubert-gender-attribution-analysis | Attribution-based analysis of French grammatical gender encoding in FlauBERT... | 11 | Experimental |
| 28 | christophsk/classifier-lit | PAIR Code's Language Interpretability Tool (LIT) for Text Classification | 10 | Experimental |