logpai/bughub
A collection of free-text bug reports for duplicate issue identification
Provides curated datasets from more than 12 major open-source projects (including Mozilla, Eclipse, and the Apache ecosystem), with chronologically split train/test partitions designed for NLP-based duplicate detection research. Also includes bug localization datasets that link reports to source files, plus per-project statistics on duplicate prevalence (6.8% to 38.4% across projects) and resolution timelines. Benchmarks against established approaches drawn from two decades of peer-reviewed research in automated bug triage and information retrieval.
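To illustrate what a chronological split and a duplicate-prevalence figure mean in practice, here is a minimal Python sketch. The field names (`id`, `created`, `dup_of`), the sample records, and the 80/20 cut are assumptions made for illustration; they are not the repository's actual schema or split ratio.

```python
from datetime import datetime


def chronological_split(reports, train_fraction=0.8):
    """Sort reports by creation time and cut once, so that every
    test report is at least as recent as every training report."""
    ordered = sorted(reports, key=lambda r: r["created"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def duplicate_rate(reports):
    """Fraction of reports marked as a duplicate of another report."""
    if not reports:
        return 0.0
    duplicates = sum(1 for r in reports if r["dup_of"] is not None)
    return duplicates / len(reports)


# Hypothetical records; `dup_of` holds the id of the original report, if any.
reports = [
    {"id": 3, "created": datetime(2012, 5, 1), "dup_of": 1},
    {"id": 1, "created": datetime(2010, 1, 15), "dup_of": None},
    {"id": 4, "created": datetime(2013, 9, 30), "dup_of": None},
    {"id": 2, "created": datetime(2011, 7, 4), "dup_of": None},
    {"id": 5, "created": datetime(2014, 2, 11), "dup_of": 4},
]

train, test = chronological_split(reports, train_fraction=0.8)
print([r["id"] for r in train], [r["id"] for r in test])
print(f"duplicate rate: {duplicate_rate(reports):.1%}")
```

Splitting by time rather than at random keeps future reports out of the training set, which matches how a duplicate detector would be used on incoming reports.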
123 stars. No commits in the last 6 months.
Stars: 123
Forks: 27
Language: not listed
License: not listed
Category:
Last pushed: Mar 18, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/logpai/bughub"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
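The same request can be sketched in Python using only the standard library. The `category/owner/repo` path layout is inferred from the single URL above, the helper names are hypothetical, and the response is returned as raw text because its format is not documented here.

```python
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (path layout assumed from the one example)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record and return the response body as text.
    The body is not parsed because its schema is not documented here."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Uncomment to perform the network request (counts against the daily limit):
# print(fetch_quality("nlp", "logpai", "bughub"))
```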
Related tools
MinLee0210/Smart-Evaluation-Solution
A project for AngelHack competition - h4ckhcmc 2024
shadmehr-salehi/AI-Hackathon-2023
Solution for Hackathon Problem Sets (NLP)
chartes/masterHN-hackathons2026
Results from the week of competitions and hackathons of the Digital Humanities master's program, 2026
knmlprz/BITEHack
2nd place at BITEHack 2022 in the AI category
s1ri1337/SIH2K22
Our entry for NDRF's problem statement GS900 in the Smart India Hackathon 2022 where we finished...