qiangsiwei/bert_distill
BERT distillation (BERT-based distillation experiments)
Implements knowledge distillation from BERT into lightweight student models (TextCNN, BiLSTM/GRU) in both Keras and PyTorch, following the approach in "Distilling Task-Specific Knowledge from BERT into Simple Neural Networks." The data is split 1:8:1 into labeled, unlabeled, and test sets, and data augmentation (masking, n-gram sampling) is supported to improve the student models on sentiment classification. On the clothing dataset, the students reach ~87-88% accuracy compared to ~90-91% for BERT.
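The augmentation and distillation steps described above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the repository's code: the function names, the `[MASK]` placeholder, the default probabilities, and the `alpha` weighting are all assumptions chosen for the example. The loss follows the general shape used in the cited paper (mean squared error between student and teacher logits, optionally mixed with cross-entropy on a hard label).

```python
import math
import random

MASK = "[MASK]"  # assumed placeholder; the repo's actual mask symbol may differ


def mask_tokens(tokens, p_mask=0.1, rng=random):
    """Replace each token with MASK independently with probability p_mask."""
    return [MASK if rng.random() < p_mask else t for t in tokens]


def ngram_sample(tokens, max_n=5, rng=random):
    """Return a random contiguous n-gram, with n drawn uniformly from 1..max_n."""
    if not tokens:
        return []
    n = rng.randint(1, min(max_n, len(tokens)))
    start = rng.randint(0, len(tokens) - n)
    return tokens[start:start + n]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distill_loss(student_logits, teacher_logits, label=None, alpha=0.5):
    """MSE between logits, optionally mixed with cross-entropy on a hard label."""
    mse = sum((s - t) ** 2 for s, t in zip(student_logits, teacher_logits))
    mse /= len(student_logits)
    if label is None:
        # Unlabeled (e.g. augmented) example: only the teacher's soft target is used.
        return mse
    ce = -math.log(softmax(student_logits)[label])
    return alpha * ce + (1 - alpha) * mse
```

In a setup like the 1:8:1 split, the unlabeled portion (and any augmented sentences) would be scored by the fine-tuned BERT teacher, and the student would be trained with `label=None` on those examples; the labeled portion can use the mixed loss.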
314 stars. No commits in the last 6 months.
Stars: 314
Forks: 82
Language: Python
License: none listed
Category:
Last pushed: Jul 30, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/qiangsiwei/bert_distill"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
airaria/TextBrewer
A PyTorch-based knowledge distillation toolkit for natural language processing
sunyilgdx/NSP-BERT
The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original...
princeton-nlp/CoFiPruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
kssteven418/LTP
[KDD'22] Learned Token Pruning for Transformers
georgian-io/Transformers-Domain-Adaptation
[DEPRECATED] Adapt Transformer-based language models to new text domains