ymcui/Chinese-BERT-wwm
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm series of models)
Implements whole word masking for Chinese at the word level rather than the character level: the Harbin LTP segmentation tool identifies word boundaries so that all characters of a word are masked together during pretraining, instead of individual characters being masked at random. Provides multiple model variants, including BERT-wwm, RoBERTa-wwm-ext, and compressed versions (RBT3-6, RBTL3), trained on 5.4B tokens of extended Chinese corpora, all compatible with Hugging Face Transformers and PaddleHub for direct integration into downstream NLP tasks.
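The whole-word masking idea described above can be sketched in a few lines, assuming the sentence has already been segmented into words (as LTP would do); the function name, mask ratio, and sample sentence are illustrative, not the repository's actual pretraining code:

```python
import random

def whole_word_mask(words, mask_ratio=0.15, rng=None):
    """Mask whole segmented words: every character of a chosen word
    becomes [MASK], unlike character-level random masking."""
    rng = rng or random.Random(0)
    n_to_mask = max(1, int(round(len(words) * mask_ratio)))
    mask_idx = set(rng.sample(range(len(words)), n_to_mask))
    tokens, labels = [], []
    for i, word in enumerate(words):
        for ch in word:
            if i in mask_idx:
                tokens.append("[MASK]")
                labels.append(ch)   # prediction target for the masked position
            else:
                tokens.append(ch)
                labels.append(None)
    return tokens, labels

# Each list element is one LTP-segmented word; with wwm, a chosen
# word like "模型" has both of its characters masked together.
words = ["使用", "语言", "模型", "来", "预测"]
tokens, labels = whole_word_mask(words, mask_ratio=0.2, rng=random.Random(42))
```

The key invariant is that masked positions always cover complete words, which is what distinguishes wwm from the original character-level masking.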
10,184 stars. No commits in the last 6 months.
Stars: 10,184
Forks: 1,393
Language: Python
License: Apache-2.0
Category:
Last pushed: Jul 15, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/ymcui/Chinese-BERT-wwm"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
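The curl call above can also be made from Python with only the standard library; the path layout is taken from the example URL, but the JSON field names returned by the endpoint are not documented here, so inspect the raw response:

```python
import json
import urllib.request

API_ROOT = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category, owner, repo):
    # Mirrors the example endpoint shown above; other categories
    # are assumed to follow the same /category/owner/repo layout.
    return f"{API_ROOT}/{category}/{owner}/{repo}"

def fetch_quality(category, owner, repo, timeout=10):
    # No API key is needed for up to 100 requests/day; the shape of
    # the returned JSON is not specified, so callers should inspect it.
    with urllib.request.urlopen(quality_url(category, owner, repo),
                                timeout=timeout) as resp:
        return json.load(resp)

url = quality_url("nlp", "ymcui", "Chinese-BERT-wwm")
```

Calling `fetch_quality("nlp", "ymcui", "Chinese-BERT-wwm")` would perform the same request as the curl command.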
Related tools
sileod/tasknet
Easy modernBERT fine-tuning and multi-task learning
codertimo/BERT-pytorch
PyTorch implementation of Google AI's 2018 BERT
920232796/bert_seq2seq
A PyTorch implementation of BERT for seq2seq tasks using the UniLM scheme; now also handles automatic summarization, text classification, sentiment analysis, NER, and POS tagging, with support for T5 and for article continuation with GPT-2.
JayYip/m3tl
BERT for Multitask Learning
graykode/toeicbert
TOEIC(Test of English for International Communication) solving using pytorch-pretrained-BERT model.