styfeng/DataAug4NLP
Collection of papers and resources for data augmentation for NLP.
This curated repository is organized into 15+ NLP task categories (text classification, translation, QA, sequence tagging, parsing, dialogue, multimodal, and more) and maps augmentation techniques to specific problem domains, with linked implementations and benchmark datasets. It is grounded in an ACL 2021 survey paper and accepts community contributions via pull requests. Practitioners can use it to identify task-appropriate augmentation strategies, ranging from backtranslation and synonym replacement to more recent LLM-based approaches such as GPT3Mix and automated augmentation methods.
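As a rough illustration of synonym replacement, one of the simpler techniques named above, the following is a minimal sketch and not code from the repository or from any paper it lists. The `TOY_SYNONYMS` table and the `synonym_replace` function are hypothetical names introduced here; a real pipeline would more likely draw synonyms from a thesaurus such as WordNet.

```python
import random

# Hypothetical toy synonym table, for illustration only.
TOY_SYNONYMS = {
    "quick": ["fast", "rapid"],
    "happy": ["glad", "cheerful"],
    "movie": ["film"],
}


def synonym_replace(sentence, n_replacements=1, synonyms=TOY_SYNONYMS, seed=None):
    """Return a copy of `sentence` with up to `n_replacements` words swapped for synonyms."""
    rng = random.Random(seed)
    tokens = sentence.split()
    # Positions of tokens that have an entry in the synonym table.
    candidates = [i for i, tok in enumerate(tokens) if tok.lower() in synonyms]
    rng.shuffle(candidates)
    for i in candidates[:n_replacements]:
        tokens[i] = rng.choice(synonyms[tokens[i].lower()])
    return " ".join(tokens)
```

Calling `synonym_replace("the quick movie made me happy", n_replacements=2, seed=0)` yields a variant of the sentence with two of the three replaceable words swapped, which can then be added to the training set with the original label.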
831 stars. No commits in the last 6 months.
Stars: 831
Forks: 76
Language: —
License: —
Category:
Last pushed: Aug 12, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/styfeng/DataAug4NLP"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
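The same request can be sketched in Python using only the standard library. This is a minimal sketch under assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, the reading of the `transformers` path segment as a category is a guess based on the URL above, and the response format is not documented here, so the body is returned as raw text rather than parsed.

```python
from urllib.request import urlopen

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL. Treating the first segment as a category is an assumption."""
    return f"{API_BASE}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the endpoint and return the response body as text (format not assumed)."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For this repository, `fetch_quality("transformers", "styfeng", "DataAug4NLP")` issues the same GET request as the curl command, subject to the daily request limit.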
Higher-rated alternatives
varunkumar-dev/TransformersDataAugmentation
Code associated with the "Data Augmentation using Pre-trained Transformer Models" paper
Akshint0407/Automated-Answer-Checker
AI-powered grading system for educators 🔹 Streamlit web app that automates answer sheet...
Anjum48/commonlitreadabilityprize
4th Place solution for the Kaggle CommonLit Readability Prize
yuchen0515/2022-Competition-CUDAOutOfMemory
Our team placed 6th out of 119 teams in E.SUN AI Open Competition Summer 2022 - ASR...
omerfarooq223/AutoGrader-Agent
AI agent that grades student assignments from a ZIP file using LLMs — generates rubrics, detects...