GustyCube/ERR-EVAL
Benchmark for evaluating AI epistemic reliability - testing how well LLMs handle uncertainty, avoid hallucinations, and acknowledge what they don't know.
Stars: 9
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Jan 02, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/GustyCube/ERR-EVAL"
Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
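The same endpoint can be called from Python's standard library. The sketch below only builds the URL and shows how a fetch would look; the response schema is not documented on this page, so the JSON-handling part is commented out and any field names should be checked against a real response first.

```python
import json
from urllib.request import urlopen

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-data endpoint URL for one repository."""
    return f"{API_BASE}/{category}/{owner}/{repo}"

url = quality_url("ml-frameworks", "GustyCube", "ERR-EVAL")
print(url)

# Uncomment to fetch live data (subject to the 100 requests/day limit;
# inspect the JSON before relying on any particular key):
# with urlopen(url) as resp:
#     data = json.load(resp)
#     print(json.dumps(data, indent=2))
```

Passing a key for the higher rate limit is not shown, since the page does not say how the key is supplied (header vs. query parameter).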
Higher-rated alternatives
Cloud-CV/EvalAI
Evaluating state of the art in AI
fireindark707/Python-Schema-Matching
A python tool using XGboost and sentence-transformers to perform schema matching task on tables.
graphbookai/graphbook
Visual AI development framework for training and inference of ML models, scaling pipelines, and...
Alir3z4/tb-query
A CLI tool and MCP (Model Context Protocol) server for querying and analyzing TensorBoard event...
RAILethicsHub/rail-score
Python SDK