Data Augmentation NLP Tools
Tools and frameworks for generating synthetic training data, augmenting existing datasets, and applying transformation techniques to improve NLP model performance. Does NOT include general data preprocessing, cleaning, or annotation tools.
This list tracks 25 NLP data augmentation tools. Two score above 50, which places them in the Established tier. The highest-rated is dsfsi/textaugment at 65/100, with 433 stars and 2,436 monthly downloads.
Get all 25 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=data-augmentation-nlp&limit=20"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
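
The same request can be made from Python. The sketch below uses only the standard library and rebuilds the URL from the curl command above. The function names `build_url` and `fetch_projects` are illustrative, not part of any published client, and the shape of the returned JSON is not documented on this page, so the code returns the parsed payload without assuming a schema.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken verbatim from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="data-augmentation-nlp", limit=20):
    """Build the query URL; defaults mirror the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL and return the parsed JSON payload as-is.

    The response schema is an unknown here, so no fields are accessed.
    """
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_projects(build_url())` performs the same request as the curl command. Note that the example uses `limit=20`; whether `build_url(limit=25)` returns all 25 listed projects is an assumption that should be checked against the live API.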
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | dsfsi/textaugment - TextAugment: Text Augmentation Library | 65 | Established |
| 2 | 425776024/nlpcda - One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda | | Established |
| 3 | searchableai/KitanaQA - KitanaQA: Adversarial training and data augmentation for neural... | | Emerging |
| 4 | google-research/uda - Unsupervised Data Augmentation (UDA) | | Emerging |
| 5 | SanghunYun/UDA_pytorch - UDA(Unsupervised Data Augmentation) implemented by pytorch | | Emerging |
| 6 | toriving/KoEDA - Korean Easy Data Augmentation | | Emerging |
| 7 | AlexKay28/zarnitsa - :cloud_with_lightning: Zarnitsa package for data augmentation ops | | Emerging |
| 8 | KennethEnevoldsen/augmenty - Augmenty is an augmentation library based on spaCy for augmenting texts. | | Emerging |
| 9 | zhanlaoban/EDA_NLP_for_Chinese - An implementation of the EDA paper for Chinese corpora. EDA data augmentation tool for Chinese corpora. NLP data augmentation. Paper reading notes. | | Emerging |
| 10 | lancopku/text-autoaugment - [EMNLP 2021] Text AutoAugment: Learning Compositional Augmentation Policy... | | Emerging |
| 11 | quincyliang/nlp-data-augmentation - Data Augmentation for NLP. | | Experimental |
| 12 | patrick-batman/Unsupervised-Hypothesis-Creation - unsupervised creation of contradictory, entailing sentences from a given... | | Experimental |
| 13 | chck/AugLy-jp - Data Augmentation for Japanese Text on AugLy | | Experimental |
| 14 | remydecoupes/GeoNLPlify - :earth_africa: :book: A NLP library for data augmentation focusing on... | | Experimental |
| 15 | k4black/fast-aug - Fast Augmentation library for NLP | | Experimental |
| 16 | zhaominyiz/EPiDA - Official Code for 'EPiDA: An Easy Plug-in Data Augmentation Framework for... | | Experimental |
| 17 | kajyuuen/daaja - This repository has implementations of data augmentation for NLP for Japanese. | | Experimental |
| 18 | ChetanMJ/NL2SQL-Data-Augmentation - Data augmentation techniques help improve performance by generating data of... | | Experimental |
| 19 | pemagrg1/nlp-data-augmentation - Augmentating Textual Data Using NLP Libraries. | | Experimental |
| 20 | Ritvik19/Text-Data-Augmentation - State of the Art Text Data Augmentation for Natural Language Processing Applications | | Experimental |
| 21 | aryashah2k/NLP-Data-Augmentation - Implementing 5 Different Approaches To Augmenting Data For Natural Language... | | Experimental |
| 22 | dheeraj7596/CONDA - Generate synthetic training data using small LMs. | | Experimental |
| 23 | masoudMZB/Text-Wizard-Fatsapi-NLP-project - NLP Visualization/Augmentation techniques using fast api to implement. | | Experimental |
| 24 | dextergui/NLarge - NLarge - Dataset Augmentation Tool | | Experimental |
| 25 | sminerport/TextAugmentor - This repo offers a Python script using NLPAug library & RTT to augment text... | | Experimental |
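
Several entries above (KoEDA, EDA_NLP_for_Chinese, nlpcda) are built around EDA, Easy Data Augmentation, which perturbs a sentence with four token-level operations: synonym replacement, random insertion, random swap, and random deletion. As a rough illustration of what such tools do, here is a minimal sketch of the two operations that need no synonym dictionary. It is not taken from any of the listed libraries; the function names and the whitespace tokenization are assumptions made for the example.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps random position swaps applied."""
    tokens = list(tokens)
    if len(tokens) < 2:
        return tokens
    for _ in range(n_swaps):
        # Pick two distinct positions and exchange their tokens.
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens, p, rng):
    """Drop each token independently with probability p.

    If every token is dropped, keep one at random so that a
    non-empty input never produces an empty output.
    """
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


# Illustrative usage with naive whitespace tokenization (an assumption;
# real tools use language-specific tokenizers, especially for Chinese,
# Korean, and Japanese).
rng = random.Random(0)
sentence = "data augmentation helps small datasets".split()
swapped = random_swap(sentence, n_swaps=1, rng=rng)
shortened = random_deletion(sentence, p=0.3, rng=rng)
```

Passing a seeded `random.Random` instance keeps augmentation reproducible, which is useful when comparing model runs trained on augmented data.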