nlpcda and EDA_NLP_for_Chinese

These are ecosystem siblings: EDA_NLP_for_Chinese is an implementation of the EDA (Easy Data Augmentation) paper for Chinese, and it inspired nlpcda, a more polished, production-ready package that incorporates EDA as one of several augmentation techniques.

nlpcda: 62 (Established)
Maintenance 0/25 | Adoption 17/25 | Maturity 25/25 | Community 20/25
Stars: 1,878 | Forks: 172 | Downloads: 405 | Commits (30d): 0 | Language: Python | License: Apache-2.0
Flags: Stale 6m

EDA_NLP_for_Chinese: 42 (Emerging)
Maintenance 0/25 | Adoption 10/25 | Maturity 8/25 | Community 24/25
Stars: 1,385 | Forks: 236 | Downloads: n/a | Commits (30d): 0 | Language: Python | License: none
Flags: No License, Stale 6m, No Package, No Dependents

About nlpcda

425776024/nlpcda

One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda

Provides nine augmentation strategies for Chinese NLP, including entity, synonym, and homophone replacement, character deletion and transposition, and generative methods via SimBERT. Targeted filtering preserves semantic meaning (e.g., dates and numbers remain unchanged). Also supports NER data in BIO format, back-translation augmentation via Baidu/Google APIs, and custom lexicon injection through the jieba tokenizer. Designed to improve model generalization and robustness across classification, NER, and retrieval tasks without sacrificing label integrity.
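The character-level strategies and the date/number filtering described above can be sketched in plain Python. This is only an illustration under stated assumptions, not nlpcda's own code: the `PROTECTED` pattern and the function names `protected_mask`, `random_delete_chars`, and `swap_adjacent_chars` are hypothetical and chosen for this example.

```python
import random
import re

# Hypothetical protection rule: digit runs, optionally followed by a
# Chinese date unit (year/month/day), are never edited.
PROTECTED = re.compile(r"\d+[年月日号]?")


def protected_mask(text):
    """Return one boolean per character; True means 'do not edit'."""
    mask = [False] * len(text)
    for match in PROTECTED.finditer(text):
        for i in range(match.start(), match.end()):
            mask[i] = True
    return mask


def random_delete_chars(text, rate=0.1, rng=None):
    """Drop each unprotected character with probability `rate`."""
    rng = rng or random.Random()
    mask = protected_mask(text)
    kept = [ch for ch, locked in zip(text, mask)
            if locked or rng.random() >= rate]
    return "".join(kept)


def swap_adjacent_chars(text, rate=0.1, rng=None):
    """Swap neighbouring unprotected characters with probability `rate`."""
    rng = rng or random.Random()
    mask = protected_mask(text)
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if not mask[i] and not mask[i + 1] and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return "".join(chars)


if __name__ == "__main__":
    sentence = "会议定于2023年5月1日在北京举行"
    rng = random.Random(42)
    print(random_delete_chars(sentence, rate=0.3, rng=rng))
    print(swap_adjacent_chars(sentence, rate=0.3, rng=rng))
```

Passing a seeded `random.Random` keeps augmentation reproducible, which helps when comparing models trained on augmented data.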

About EDA_NLP_for_Chinese

zhanlaoban/EDA_NLP_for_Chinese

An implementation of the EDA paper for Chinese corpora. An EDA data augmentation tool for Chinese corpora. NLP data augmentation. Paper reading notes.
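For orientation, the EDA paper describes four token-level operations: synonym replacement, random insertion, random swap, and random deletion. The sketch below illustrates them on a pre-tokenized sentence. It is not the repository's code: the `SYNONYMS` table is a toy stand-in for a real Chinese synonym lexicon, the function names are hypothetical, and stop-word handling is omitted for brevity.

```python
import random

# Toy synonym table for illustration only; a real setup would load a
# Chinese synonym lexicon and tokenize with a segmenter such as jieba.
SYNONYMS = {"喜欢": ["喜爱", "爱好"], "电影": ["影片"]}


def synonym_replacement(tokens, n, rng):
    """Replace up to n tokens that have an entry in SYNONYMS."""
    out = list(tokens)
    candidates = [i for i, tok in enumerate(out) if tok in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out


def random_insertion(tokens, n, rng):
    """Insert n synonyms of random tokens at random positions."""
    out = list(tokens)
    pool = [tok for tok in tokens if tok in SYNONYMS]
    for _ in range(n):
        if not pool:
            break
        synonym = rng.choice(SYNONYMS[rng.choice(pool)])
        out.insert(rng.randint(0, len(out)), synonym)
    return out


def random_swap(tokens, n, rng):
    """Swap two randomly chosen positions, n times."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p, rng):
    """Delete each token with probability p, always keeping at least one."""
    if len(tokens) <= 1:
        return list(tokens)
    kept = [tok for tok in tokens if rng.random() > p]
    return kept or [rng.choice(tokens)]


if __name__ == "__main__":
    rng = random.Random(7)
    words = ["我", "喜欢", "看", "电影"]
    print(synonym_replacement(words, 1, rng))
    print(random_insertion(words, 1, rng))
    print(random_swap(words, 1, rng))
    print(random_deletion(words, 0.2, rng))
```

Each function returns a new list and leaves its input untouched, so several augmented variants can be generated from the same tokenized sentence.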

Scores updated daily from GitHub, PyPI, and npm data.