styfeng/DataAug4NLP

Collection of papers and resources for data augmentation for NLP.

Score: 29 / 100 (Experimental)

Organized into 15+ NLP task categories (text classification, translation, QA, sequence tagging, parsing, dialogue, multimodal, and more), this curated repository maps augmentation techniques to specific tasks, with linked implementations and benchmark datasets. It is grounded in an ACL 2021 survey paper and accepts community contributions via pull requests. Practitioners can use it to find task-appropriate augmentation strategies, from backtranslation and synonym replacement to more recent LLM-based approaches such as GPT3Mix and automated augmentation methods.
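As a rough illustration of one technique named above, synonym replacement can be sketched in a few lines of Python. This is a minimal sketch, not code from the repository: the function name `synonym_replace` and the tiny `TOY_SYNONYMS` table are made up for this example, and a real pipeline would draw synonyms from a proper thesaurus or embedding model.

```python
import random

# Toy synonym table, invented for illustration only.
TOY_SYNONYMS = {
    "quick": ["fast", "speedy"],
    "happy": ["glad", "cheerful"],
    "movie": ["film"],
}


def synonym_replace(sentence, n=1, synonyms=TOY_SYNONYMS, rng=None):
    """Return a copy of `sentence` with up to `n` words swapped for a synonym.

    Only whitespace-separated tokens that appear in `synonyms` (matched in
    lowercase) are eligible; all other tokens are left untouched.
    """
    rng = rng or random.Random()
    tokens = sentence.split()
    # Positions of tokens that have at least one known synonym.
    candidates = [i for i, tok in enumerate(tokens) if tok.lower() in synonyms]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        tokens[i] = rng.choice(synonyms[tokens[i].lower()])
    return " ".join(tokens)


# Example: produce one augmented variant of a training sentence.
augmented = synonym_replace("a quick and happy movie", n=2, rng=random.Random(0))
print(augmented)
```

Passing a seeded `random.Random` keeps the augmentation reproducible, which helps when comparing models trained on the same augmented set.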

831 stars. No commits in the last 6 months.

No License · Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 1 / 25
Community 18 / 25


Stars: 831
Forks: 76
Language: (not listed)
License: (not listed)
Last pushed: Aug 12, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/styfeng/DataAug4NLP"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
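The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the `category/owner/repo` path layout is inferred from the single URL shown above, the response is assumed to be JSON, and the helper names `quality_url` and `fetch_quality` are hypothetical. The response schema is not documented here, so the result is returned as-is.

```python
import json
import urllib.request

# Base path taken from the curl example; the path layout beyond that single
# example is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the quality record and decode it, assuming a JSON body."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Usage (performs a network request, so it is left commented out):
# data = fetch_quality("transformers", "styfeng", "DataAug4NLP")
# print(data)
```

With the keyless limit of 100 requests/day, caching the decoded result locally is a sensible default for repeated lookups.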