nlpcda and EDA_NLP_for_Chinese

These are ecosystem siblings: EDA_NLP_for_Chinese is an implementation of the EDA (Easy Data Augmentation) paper for Chinese, and it inspired nlpcda, a more polished, production-ready package that incorporates EDA as one of several augmentation techniques.

nlpcda: Established

Scores: Maintenance 0/25, Adoption 17/25, Maturity 25/25, Community 20/25

Stars: 1,878; Forks: 172; Downloads: 405; Commits (30d): 0; Language: Python; License: Apache-2.0

Flags: Stale 6m

EDA_NLP_for_Chinese: Emerging

Scores: Maintenance 0/25, Adoption 10/25, Maturity 8/25, Community 24/25

Stars: 1,385; Forks: 236; Downloads: not listed; Commits (30d): 0; Language: Python; License: none listed

Flags: No License, Stale 6m, No Package, No Dependents

About nlpcda

425776024/nlpcda

One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda

Provides nine distinct augmentation strategies for Chinese NLP, including entity, synonym, and homophone replacement, character deletion and transposition, and generative methods via SimBERT. Targeted filtering preserves semantic meaning (for example, dates and numbers remain unchanged). It also offers specialized support for NER tasks in BIO format, back-translation augmentation via the Baidu and Google APIs, and custom lexicon injection through the jieba tokenizer. The package is designed to improve model generalization and robustness across classification, NER, and retrieval tasks without sacrificing label integrity.
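The character-level strategies described above (deletion and transposition with numbers left untouched) can be illustrated with a short standalone sketch. This is not nlpcda's implementation or API: the function names, the `rate` and `n_swaps` parameters, and the digit-only `is_protected` filter are all assumptions made for illustration.

```python
import random


def is_protected(ch):
    # Assumption: ASCII/Unicode digits stand in for "dates/numbers".
    # A real filter would likely be richer than this.
    return ch.isdigit()


def random_delete_chars(text, rate, rng):
    """Drop each unprotected character with probability `rate`."""
    kept = [ch for ch in text if is_protected(ch) or rng.random() >= rate]
    # Fall back to the original text rather than returning an empty string.
    return "".join(kept) if kept else text


def swap_adjacent_chars(text, n_swaps, rng):
    """Swap up to `n_swaps` adjacent pairs, never moving a protected character."""
    chars = list(text)
    # Only positions where both neighbours are unprotected may be swapped.
    candidates = [
        i for i in range(len(chars) - 1)
        if not is_protected(chars[i]) and not is_protected(chars[i + 1])
    ]
    for _ in range(n_swaps):
        if not candidates:
            break
        i = rng.choice(candidates)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so runs are repeatable
    sentence = "今天是2023年5月1日天气很好"
    print(random_delete_chars(sentence, 0.3, rng))
    print(swap_adjacent_chars(sentence, 2, rng))
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible, which helps when comparing models trained on augmented data.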

About EDA_NLP_for_Chinese

zhanlaoban/EDA_NLP_for_Chinese

An implementation of the EDA paper for Chinese corpora. An EDA data augmentation tool for Chinese corpora. NLP data augmentation. Paper reading notes.
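For readers unfamiliar with EDA, the paper defines four token-level operations: synonym replacement, random insertion, random swap, and random deletion. The sketch below shows one way they could look on pre-segmented Chinese tokens. It is not the code from this repository: the function names and the tiny `SYNONYMS` table are hypothetical, and in practice the tokens would come from a word segmenter and the synonyms from a real lexicon.

```python
import random

# Hypothetical toy synonym table, for illustration only.
SYNONYMS = {"喜欢": ["喜爱", "爱好"], "电影": ["影片"]}


def synonym_replacement(tokens, n, rng):
    """Replace up to n tokens that have an entry in SYNONYMS."""
    out = list(tokens)
    idxs = [i for i, t in enumerate(out) if t in SYNONYMS]
    rng.shuffle(idxs)
    for i in idxs[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out


def random_insertion(tokens, n, rng):
    """Insert a synonym of a random token at a random position, n times."""
    out = list(tokens)
    pool = [t for t in tokens if t in SYNONYMS]
    for _ in range(n):
        if not pool:
            break
        synonym = rng.choice(SYNONYMS[rng.choice(pool)])
        out.insert(rng.randint(0, len(out)), synonym)
    return out


def random_swap(tokens, n, rng):
    """Swap two randomly chosen positions, n times."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p, rng):
    """Delete each token with probability p, always keeping at least one."""
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


if __name__ == "__main__":
    rng = random.Random(42)
    tokens = ["我", "喜欢", "看", "电影"]
    print(synonym_replacement(tokens, 1, rng))
    print(random_insertion(tokens, 1, rng))
    print(random_swap(tokens, 1, rng))
    print(random_deletion(tokens, 0.2, rng))
```

Each function returns a new list and leaves its input unchanged, so several augmented variants can be generated from the same sentence.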

Scores are updated daily from GitHub, PyPI, and npm data.