nlpcda and EDA_NLP_for_Chinese
These two projects are ecosystem siblings. EDA_NLP_for_Chinese is an implementation of the EDA (Easy Data Augmentation) paper for Chinese, and it inspired nlpcda, a more polished, production-ready package that incorporates EDA as one of several augmentation techniques.
About nlpcda
425776024/nlpcda
One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda
Provides nine augmentation strategies for Chinese NLP, including entity, synonym, and homophone replacement, character deletion and transposition, and generative methods via SimBERT, while preserving semantic meaning through targeted filtering (e.g., dates and numbers remain unchanged). It also supports NER data in BIO format, back-translation via the Baidu and Google APIs, and custom lexicons loaded through the jieba tokenizer. The package is designed to improve model generalization and robustness across classification, NER, and retrieval tasks without sacrificing label integrity.
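To make the idea of character deletion and transposition with targeted filtering concrete, here is a minimal sketch in plain Python. It is not nlpcda's implementation or API: the function names (`is_protected`, `random_delete_chars`, `swap_adjacent_chars`, `augment`) and the rule "digits are protected" are assumptions chosen only to illustrate how dates and numbers can be left unchanged while the surrounding text is perturbed.

```python
import random


def is_protected(ch: str) -> bool:
    """Assumed filter rule: digits are never altered, so numbers in dates survive."""
    return ch.isdigit()


def random_delete_chars(text: str, rate: float, rng: random.Random) -> str:
    """Drop each unprotected character with probability `rate`."""
    kept = [ch for ch in text if is_protected(ch) or rng.random() >= rate]
    # Never return an empty string; fall back to the original text.
    return "".join(kept) if kept else text


def swap_adjacent_chars(text: str, rate: float, rng: random.Random) -> str:
    """Swap neighbouring unprotected characters with probability `rate`."""
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        both_free = not is_protected(chars[i]) and not is_protected(chars[i + 1])
        if both_free and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the pair that was just swapped
        else:
            i += 1
    return "".join(chars)


def augment(text: str, n: int = 3, rate: float = 0.2, seed: int = 0) -> list:
    """Produce `n` noisy variants of `text` by deleting, then swapping, characters."""
    rng = random.Random(seed)
    return [
        swap_adjacent_chars(random_delete_chars(text, rate, rng), rate, rng)
        for _ in range(n)
    ]
```

Calling `augment("会议定于2023年5月1日举行")` returns three perturbed variants in which the digits `2023`, `5`, and `1` appear in their original order. A real package would likely protect more than digits (for example, whole date expressions or named entities), so treat the filter here as a placeholder.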
About EDA_NLP_for_Chinese
zhanlaoban/EDA_NLP_for_Chinese
An implementation of the EDA paper for Chinese corpora. An EDA data augmentation tool for Chinese corpora. NLP data augmentation. Paper reading notes.
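For readers unfamiliar with EDA, the paper describes four token-level operations: synonym replacement, random insertion, random swap, and random deletion. The sketch below illustrates them on a pre-tokenized Chinese sentence. It is not the code from EDA_NLP_for_Chinese: the tiny `SYNONYMS` table is a hypothetical stand-in for a real synonym lexicon, the function names are chosen for illustration, and the input is assumed to be already segmented into words (for example by a tokenizer such as jieba).

```python
import random

# Hypothetical toy synonym table; a real tool would load a Chinese synonym lexicon.
SYNONYMS = {"喜欢": ["喜爱"], "电影": ["影片"]}


def synonym_replacement(tokens, n, rng):
    """Replace up to `n` tokens that have an entry in SYNONYMS."""
    out = list(tokens)
    candidates = [i for i, t in enumerate(out) if t in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out


def random_insertion(tokens, n, rng):
    """Insert a synonym of a random token at a random position, `n` times."""
    out = list(tokens)
    with_synonyms = [t for t in tokens if t in SYNONYMS]
    if not with_synonyms:
        return out
    for _ in range(n):
        word = rng.choice(with_synonyms)
        out.insert(rng.randint(0, len(out)), rng.choice(SYNONYMS[word]))
    return out


def random_swap(tokens, n, rng):
    """Swap two randomly chosen distinct positions, `n` times."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p, rng):
    """Delete each token with probability `p`, always keeping at least one."""
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() >= p]
    return kept or [rng.choice(tokens)]
```

For example, with `tokens = ["我", "喜欢", "这部", "电影"]` and `rng = random.Random(0)`, each function returns a new list and leaves `tokens` untouched, so the four operations can be applied independently to generate several variants of one sentence.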