g4ix/advLab1-HITS

Advanced lab project investigating LLM benchmarks from an information retrieval (IR) perspective. Instead of focusing on model performance, we evaluated benchmark robustness: which questions truly differentiate models, and whether leaderboard rankings reflect real differences or are dominated by easy, high-hubness items.
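To make the idea concrete, here is a minimal sketch of HITS-style power iteration on a models-by-questions correctness matrix. It is not taken from the repository: the function name `hits_scores`, the toy matrix, and the choice to score questions by the models that solve them are all assumptions for illustration, and the project's own definition of "hubness" may differ.

```python
import math


def hits_scores(correct, iterations=50):
    """HITS-style power iteration on a 0/1 models-by-questions matrix.

    correct[m][q] is 1 if model m answered question q correctly, else 0.
    Returns (model_scores, question_scores), each L2-normalized.
    This is an illustrative sketch, not the repository's implementation.
    """
    n_models = len(correct)
    n_questions = len(correct[0])
    model_scores = [1.0] * n_models
    question_scores = [1.0] * n_questions

    for _ in range(iterations):
        # A question's score sums the scores of the models that solved it.
        question_scores = [
            sum(correct[m][q] * model_scores[m] for m in range(n_models))
            for q in range(n_questions)
        ]
        # A model's score sums the scores of the questions it solved.
        model_scores = [
            sum(correct[m][q] * question_scores[q] for q in range(n_questions))
            for m in range(n_models)
        ]
        # L2-normalize both vectors so the values stay bounded.
        for scores in (question_scores, model_scores):
            norm = math.sqrt(sum(s * s for s in scores)) or 1.0
            scores[:] = [s / norm for s in scores]

    return model_scores, question_scores


# Hypothetical toy data: 3 models (rows) and 4 questions (columns).
toy = [
    [1, 1, 1, 0],  # model A
    [1, 1, 0, 0],  # model B
    [1, 0, 0, 0],  # model C
]
model_scores, question_scores = hits_scores(toy)
```

In this toy matrix, the question every model solves receives the largest score and the question no model solves receives zero, which illustrates how easy items can dominate a link-analysis ranking while contributing little to telling models apart.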

10 / 100 (Experimental)

No commits in the last 6 months.

No License, Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 1 / 25
Maturity 7 / 25
Community 0 / 25


Stars: 1
Forks: (not shown)
Language: Jupyter Notebook
License: none
Last pushed: Oct 10, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/g4ix/advLab1-HITS"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
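The same request can be issued from Python with only the standard library. The sketch below is hedged: the helper names `quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single URL shown above, and the JSON response format is an assumption since the page does not document it.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; everything after it is assumed
# to follow a category/owner/repo layout.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL for one repository (hypothetical helper)."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the quality record, assuming the endpoint returns JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("transformers", "g4ix", "advLab1-HITS")` would request the URL from the curl example. Inspect the returned object before relying on any field names, since they are not listed here.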