LLM Bias Evaluation Transformer Models

This category tracks 7 LLM bias evaluation models. The highest-rated is google-deepmind/long-form-factuality, which scores 48/100 and has 672 stars.

Get all 7 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=llm-bias-evaluation&limit=20"

The API is open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
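The curl request above can also be issued from a script. The following is a minimal Python sketch using only the standard library. The endpoint and query parameters are copied from the curl command; the helper names `build_url` and `fetch_projects` are hypothetical, and because the response schema is not described here, the code returns the parsed JSON without assuming any field names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers", subcategory="llm-bias-evaluation", limit=20):
    """Assemble the request URL with the same query parameters as the curl call."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(timeout=10.0):
    """Perform the unauthenticated GET and return the decoded JSON payload.

    The payload structure is not documented in this listing, so it is
    returned as-is rather than unpacked into assumed fields.
    """
    with urlopen(build_url(), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_projects()` makes one request against the keyless quota; printing the result with `json.dumps(..., indent=2)` is a reasonable first step for inspecting what the payload actually contains.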

#  Model                                 Score  Tier
1  google-deepmind/long-form-factuality  48     Emerging
   Benchmarking long-form factuality in large language models. Original code...
2  sandylaker/ib-edl                     30     Emerging
   Calibrating LLMs with Information-Theoretic Evidential Deep Learning (ICLR 2025)
3  nightdessert/Retrieval_Head           26     Experimental
   open-source code for paper: Retrieval Head Mechanistically Explains...
4  EternityYW/BiasEval-LLM-MentalHealth  25     Experimental
   Unveiling and Mitigating Bias in Mental Health Analysis with Large Language Models
5  aigc-apps/PertEval                    24     Experimental
   [NeurIPS '24 Spotlight] PertEval: Unveiling Real Knowledge Capacity of LLMs...
6  bowen-upenn/llm_token_bias            23     Experimental
   [EMNLP 2024] A Peek into Token Bias: Large Language Models Are Not Yet...
7  fannie1208/FactTest                   16     Experimental
   [ICML2025] "FactTest: Factuality Testing in Large Language Models with...