jsbaan/calibration-on-disagreement-data
Code accompanying the EMNLP 2022 paper "Stop Measuring Calibration When Humans Disagree", in which we show that popular calibration metrics such as ECE break down in settings where more than one answer is acceptable, and argue for metrics that take the full distribution of human judgements into account.
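For context on the metric being critiqued: ECE bins predictions by confidence and averages the gap between each bin's mean confidence and its accuracy, which presumes a single correct label per example. A minimal sketch of that standard binned estimator (an illustration, not code from this repository):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted average of |mean confidence - accuracy| per bin.

    confidences: predicted probability of the chosen label, in (0, 1].
    correct: 1 if the chosen label matched the (single) gold label, else 0.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap  # bin weight = fraction of examples
    return ece
```

The `correct` indicator is exactly where the paper's critique bites: when annotators disagree, there is no single 0/1 "correct" per example, so the quantity ECE compares confidence against is ill-defined.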
No commits in the last 6 months.
Stars: 5
Forks: 1
Language: Jupyter Notebook
License: —
Category:
Last pushed: Nov 18, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/jsbaan/calibration-on-disagreement-data"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
dholzmueller/probmetrics
Post-hoc calibration methods and metrics for classification
facebookincubator/MCGrad
MCGrad is a scalable and easy-to-use tool for multicalibration. It ensures your ML model...
gpleiss/temperature_scaling
A simple way to calibrate your neural network.
yfzhang114/Generalization-Causality
On domain generalization, domain adaptation, causality, robustness, prompting, optimization, generative...
hollance/reliability-diagrams
Reliability diagrams visualize whether a classifier model needs calibration