qiangsiwei/bert_distill

BERT distillation (distillation experiments based on BERT)

/ 100

Emerging

Implements knowledge distillation from BERT into lightweight student models (TextCNN, BiLSTM/GRU) in both Keras and PyTorch, following the approach of "Distilling Task-Specific Knowledge from BERT into Simple Neural Networks." The data is split 1:8:1 into labeled, unlabeled, and test portions, and data augmentation (masking, n-gram sampling) is supported to improve the student models on sentiment classification. On the clothing dataset, the students reach ~87-88% accuracy, compared to ~90-91% for BERT.
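To make the description above more concrete, here is a minimal sketch of the two augmentation techniques it names (masking and n-gram sampling), together with a logit-matching loss of the kind used in the cited paper, where the student is trained to match the teacher's logits with a mean squared error. This is not the repository's code: the function names, the `[MASK]` token, and the probabilities `p_mask` and `p_ngram` are all assumptions chosen for illustration.

```python
import random

def mask_tokens(tokens, p_mask=0.1, mask_token="[MASK]", rng=random):
    """Replace each token with mask_token independently with probability p_mask.
    (p_mask and mask_token are illustrative assumptions.)"""
    return [mask_token if rng.random() < p_mask else t for t in tokens]

def ngram_sample(tokens, max_n=5, rng=random):
    """Return a random contiguous n-gram, with n drawn uniformly from 1..max_n."""
    if not tokens:
        return []
    n = rng.randint(1, min(max_n, len(tokens)))
    start = rng.randint(0, len(tokens) - n)
    return tokens[start:start + n]

def augment(tokens, n_samples=4, p_ngram=0.25, rng=random):
    """Generate n_samples synthetic variants of one sentence.
    Each variant is an n-gram sample with probability p_ngram,
    otherwise a masked copy. The variants would be labeled by the teacher."""
    variants = []
    for _ in range(n_samples):
        if rng.random() < p_ngram:
            variants.append(ngram_sample(tokens, rng=rng))
        else:
            variants.append(mask_tokens(tokens, rng=rng))
    return variants

def logit_mse(student_logits, teacher_logits):
    """Mean squared error between student and teacher logits for one example."""
    if len(student_logits) != len(teacher_logits):
        raise ValueError("logit vectors must have the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_logits, teacher_logits)]
    return sum(squared) / len(squared)

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    sentence = "the fabric is soft and fits well".split()
    for variant in augment(sentence, rng=rng):
        print(variant)
    print(logit_mse([1.0, 2.0], [1.0, 4.0]))
```

In a setup like the one described, the unlabeled portion of the data (and any augmented variants) would be scored by the fine-tuned BERT teacher, and the student would be trained to reproduce those logits. A framework loss such as a built-in MSE would normally replace the hand-written `logit_mse` here.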

314 stars. No commits in the last 6 months.

No License · Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 23 / 25


Stars

314

Forks

Language

Python

License

—

Higher-rated alternatives

airaria/TextBrewer

A PyTorch-based knowledge distillation toolkit for natural language processing

sunyilgdx/NSP-BERT

The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original...

princeton-nlp/CoFiPruning

[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408

kssteven418/LTP

[KDD'22] Learned Token Pruning for Transformers

georgian-io/Transformers-Domain-Adaptation

[DEPRECATED] Adapt Transformer-based language models to new text domains
