logpai/bughub

A collection of free-text bug reports for duplicate issue identification

38 / 100 (Emerging)

Provides curated bug-report datasets from 12+ major open-source projects (including Mozilla, Eclipse, and the Apache ecosystem), with chronologically split train/test partitions designed for NLP-based duplicate detection research. Also includes bug localization datasets that link reports to source files, along with statistics on duplicate prevalence (6.8% to 38.4% across projects) and resolution timelines. Benchmarks against established approaches from two decades of peer-reviewed research in automated bug triage and information retrieval.
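As a rough illustration of what a chronological train/test split and a duplicate-prevalence figure mean in practice, here is a minimal Python sketch. The record layout (`id`, `created`, `dup_of`), the sample rows, and the 80/20 cutoff are all assumptions made for this example; they are not the repository's actual schema or split procedure.

```python
from datetime import datetime

# Hypothetical records: the field names ("id", "created", "dup_of") and the
# values are assumptions for illustration, not the dataset's actual schema.
reports = [
    {"id": 1, "created": "2010-01-05", "dup_of": None},
    {"id": 2, "created": "2010-03-12", "dup_of": 1},
    {"id": 3, "created": "2010-02-20", "dup_of": None},
    {"id": 4, "created": "2010-06-01", "dup_of": 3},
    {"id": 5, "created": "2010-04-18", "dup_of": None},
]


def chronological_split(records, train_fraction=0.8):
    """Sort by creation date; the earliest records become the training set."""
    ordered = sorted(
        records, key=lambda r: datetime.strptime(r["created"], "%Y-%m-%d")
    )
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def duplicate_rate(records):
    """Fraction of records that are marked as a duplicate of another report."""
    if not records:
        return 0.0
    return sum(r["dup_of"] is not None for r in records) / len(records)


train, test = chronological_split(reports, train_fraction=0.8)
print([r["id"] for r in train])           # earliest reports
print([r["id"] for r in test])            # latest reports
print(f"{duplicate_rate(reports):.1%}")   # share of duplicates in the sample
```

Splitting by time rather than at random keeps every test report newer than every training report, which mirrors how a duplicate detector would meet new reports after deployment.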

123 stars. No commits in the last 6 months.

No License · Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 20 / 25

Stars: 123
Forks: 27
Language: not listed
License: none
Last pushed: Mar 18, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/logpai/bughub"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
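The same request can be made from Python using only the standard library. This is a sketch under assumptions: the reading of the path segments as category, owner, and repository is inferred from the URL above, the helper names `build_url` and `fetch_quality` are hypothetical, and the response is assumed to be JSON, which the page does not confirm.

```python
import json
import urllib.request

# Base path taken from the curl example; the meaning of the remaining path
# segments (category / owner / repo) is an assumption inferred from that URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category, owner, repo):
    """Assemble the endpoint URL for one repository (hypothetical helper)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Send a GET request and parse the body.

    Assumes the endpoint returns JSON; this is not confirmed by the page.
    """
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("nlp", "logpai", "bughub")
# print(data)
```

With the keyless limit of 100 requests/day, caching the parsed result locally is a sensible default for anything that polls more than occasionally.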