Bias Measurement Evaluation NLP Tools
Tools and datasets for detecting, measuring, and quantifying bias in NLP models and language systems. Includes benchmarks, metrics, and evaluation methods for assessing fairness across different demographic groups and intersectional categories. Does NOT include general bias mitigation techniques, debiasing methods without evaluation focus, or application-specific bias detection (e.g., hate speech or toxic comment detection).
There are 37 bias measurement evaluation tools tracked. The highest-rated is dccuchile/wefe at 46/100 with 183 stars.
Get all 37 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=bias-measurement-evaluation&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
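The same request can be made from a script. The following is a minimal Python sketch, assuming the endpoint returns a JSON body; the response schema is not documented here, so the code does not assume any field names. The helper names `build_url` and `fetch_projects` are hypothetical. Note that the example request uses `limit=20`; whether a larger `limit` returns all 37 projects in one call is an assumption that should be checked against the API.

```python
import json
import urllib.parse
import urllib.request

# Base endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="bias-measurement-evaluation", limit=20):
    """Build the query URL with the same parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and decode the JSON response (hypothetical helper).

    The structure of the decoded object is not assumed here;
    inspect it before relying on specific keys.
    """
    with urllib.request.urlopen(build_url(limit=limit), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_projects()` performs one request, which counts against the 100 requests/day keyless quota mentioned above.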
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | dccuchile/wefe - WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework... | | Emerging |
| 2 | dreji18/Fairness-in-AI - Detecting Bias and ensuring Fairness in AI solutions | | Emerging |
| 3 | amazon-science/bold - Dataset associated with "BOLD: Dataset and Metrics for Measuring Biases in... | | Emerging |
| 4 | dhfbk/variationist - Variationist: Exploring Multifaceted Variation and Bias in Written Language... | | Emerging |
| 5 | soarsmu/BiasFinder - BiasFinder \| IEEE TSE \| Metamorphic Test Generation to Uncover Bias for... | | Emerging |
| 6 | microsoft/SafeNLP - Safety Score for Pre-Trained Language Models | | Experimental |
| 7 | grecosalvatore/nlpguard - NLPGuard: A Framework for Mitigating the use of Protected Attributes in NLP | | Experimental |
| 8 | darenr/gender-bias - Real-time JavaScript gender bias detector | | Experimental |
| 9 | jasonshaoshun/SAL - Code for "Spectral Removal of Guarded Attribute Information" | | Experimental |
| 10 | princeton-nlp/MABEL - EMNLP 2022: "MABEL: Attenuating Gender Bias using Textual Entailment Data"... | | Experimental |
| 11 | kidologi/AI_lForge - 🤖 Detect and mitigate bias in machine learning with the AI_lForge toolkit,... | | Experimental |
| 12 | CAMeL-Lab/gender-rewriting-shared-task - Evaluation code and data for the gender rewriting shared task | | Experimental |
| 13 | krangelie/bias-in-german-nlg - Master thesis: Exploring bias in German NLG (GPT-3 & GerPT-2). Applies... | | Experimental |
| 14 | feyzaakyurek/bbnli - Bias Benchmark for Natural Language Inference. Code repo for the Findings of... | | Experimental |
| 15 | tinotavingeyi-droid/ubuntu-xai - An open-source research platform for evaluating AI bias, fairness, and... | | Experimental |
| 16 | candacelax/bias-in-vision-and-language - Code for paper "Measuring Social Biases in Grounded Vision and Language Embeddings" | | Experimental |
| 17 | erica-dessi/Modelli-linguistici-e-discriminazione-nascosta-il-bias-di-genere-nelle-professioni - This thesis explores the phenomenon of gender bias in Large Language... | | Experimental |
| 18 | cs329yangzhong/WIKIBIAS - Code and data for EMNLP2021 paper: WIKIBIAS: Detecting Multi-Span Subjective... | | Experimental |
| 19 | yipenglai/Wikipedia-Gender-Bias - Measure gender bias in English Wikipedia biographies through text analysis in R | | Experimental |
| 20 | sathvikn/word_embedding_bias - Companion to my blog post: How Biases in Language get Perpetuated by Technology | | Experimental |
| 21 | minnesotanlp/Quantifying-Annotation-Disagreement - Official implementation of Wan et al.'s paper "Everyone's Voice Matters:... | | Experimental |
| 22 | VSteinborn/s_jsd-multilingual-bias - Code and data for the paper "An Information-Theoretic Approach and Dataset... | | Experimental |
| 23 | google-research-datasets/nlp-fairness-for-india - Contains data resources to replicate results from the paper... | | Experimental |
| 24 | iampeti/Thesis_Gender_Bias - 📊 Investigate gender bias in clinical research through statistical analysis... | | Experimental |
| 25 | PieTempesti98/biases_in_hiring_decisions - Review of the most studied biases in the hiring process made by Pietro... | | Experimental |
| 26 | groovychoons/GlobalBias - The official repo for the GlobalBias dataset and associated paper: 'Who is... | | Experimental |
| 27 | jasonshaoshun/AMSAL - Code for "Erasure of Unaligned Attributes from Neural Representations" | | Experimental |
| 28 | hyoungjo/lipstick-on-a-pig - Debiasing methods on contextualised embeddings are ineffective - CS475 | | Experimental |
| 29 | martinsjaavik/llm-bias-norwegian - Master thesis on subtler biases | | Experimental |
| 30 | feyzaakyurek/bias-textgen - Code for the paper "Challenges in Measuring Bias in Open-Ended Language... | | Experimental |
| 31 | CAMeL-Lab/gender-rewriting - Code, models, and data for "User-Centric Gender Rewriting". NAACL 2022. | | Experimental |
| 32 | venkatasg/interpersonal-bias - Code and data for the paper 'How people talk about each other: Modeling... | | Experimental |
| 33 | Ahmad-AlSubaie/CS499-DL-debaising - Repository for research done into the methods used to debias ML models.... | | Experimental |
| 34 | B-VARUN-REDDY/FairwAI-Bias-Detection - Submission for the FairwAI Hospitality Intern Challenge. This project... | | Experimental |
| 35 | asimokby/formality-bias-analysis - This repo contains the annotations and other artifacts of the paper titled:... | | Experimental |
| 36 | VSteinborn/politeness-attacks - Code and data for the paper "Politeness Stereotypes and Attack Vectors:... | | Experimental |
| 37 | iamshnoo/soc_bias - Reproduction for NAACL paper on Socially Aware Bias Measurements for Hindi | | Experimental |