Knowledge Distillation and Compression Tools for NLP
Tools and methods for distilling large NLP models into smaller, faster ones through knowledge transfer, model compression, and pruning. This category does NOT include general model optimization, quantization-only approaches, or unrelated NLP applications.
41 knowledge distillation and compression tools are tracked here. The highest-rated is airaria/TextBrewer, scoring 41/100 with 1,697 stars.
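Most of the toolkits below implement some flavor of teacher-student training. As a reference point, here is a minimal sketch of the classic soft-target distillation loss in PyTorch; the function name, temperature, and mixing weight are illustrative choices, not values taken from any tool in this list.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-softened teacher
    # and student distributions. The T^2 factor rescales gradients to
    # the same magnitude as the hard-label term (Hinton et al., 2015).
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Hard targets: ordinary cross-entropy on the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    # alpha balances imitating the teacher against fitting the labels.
    return alpha * soft + (1.0 - alpha) * hard
```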
Get all 41 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=knowledge-distillation-compression&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
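To consume the endpoint from Python instead of curl, a minimal sketch follows. The query parameters mirror the command above; the response field names used here ("items", "name", "score") are assumptions about the payload shape, not documented behavior.

```python
import requests

resp = requests.get(
    "https://pt-edge.onrender.com/api/v1/datasets/quality",
    params={
        "domain": "nlp",
        "subcategory": "knowledge-distillation-compression",
        "limit": 41,  # request the full list rather than a single page
    },
    timeout=30,
)
resp.raise_for_status()
payload = resp.json()

# Assumed payload shape: {"items": [{"name": ..., "score": ...}, ...]}.
# Inspect the real response before relying on these keys.
for project in payload.get("items", []):
    print(project.get("name"), project.get("score"))
```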
| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | airaria/TextBrewer | A PyTorch-based knowledge distillation toolkit for natural language processing | 41/100 | Emerging |
| 2 | sunyilgdx/NSP-BERT | The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through... | | Emerging |
| 3 | kssteven418/LTP | [KDD'22] Learned Token Pruning for Transformers | | Emerging |
| 4 | princeton-nlp/CoFiPruning | [ACL 2022] Structured Pruning Learns Compact and Accurate Models... | | Emerging |
| 5 | qiangsiwei/bert_distill | BERT distillation (BERT-based distillation experiments) | | Emerging |
| 6 | georgian-io/Transformers-Domain-Adaptation | :no_entry: [DEPRECATED] Adapt Transformer-based language models to new text domains | | Emerging |
| 7 | microsoft/LiST | Lite Self-Training | | Emerging |
| 8 | Alibaba-NLP/MultilangStructureKD | [ACL 2020] Structure-Level Knowledge Distillation For Multilingual Sequence Labeling | | Emerging |
| 9 | KarineAyrs/knowledge-distillation-semantic-search | KDSS is the framework for knowledge distillation from LLMs | | Emerging |
| 10 | LiyuanLucasLiu/LD-Net | Language Model Pruning for Sequence Labeling | | Emerging |
| 11 | mit-han-lab/neurips-micronet | [JMLR'20] NeurIPS 2019 MicroNet Challenge Efficient Language Modeling, Champion | | Emerging |
| 12 | cambridgeltl/mirror-bert | [EMNLP'21] Mirror-BERT: Converting Pretrained Language Models to universal... | | Emerging |
| 13 | lancopku/DynamicKD | Code for EMNLP 2021 main conference paper "Dynamic Knowledge Distillation... | | Experimental |
| 14 | cambridgeltl/MirrorWiC | [CoNLL'21] MirrorWiC: On Eliciting Word-in-Context Representations from... | | Experimental |
| 15 | PrithivirajDamodaran/Alt-ZSC | Alternate Implementation for Zero Shot Text Classification: Instead of... | | Experimental |
| 16 | INK-USC/sparse-distillation | Code for "Sparse Distillation: Speeding Up Text Classification by Using... | | Experimental |
| 17 | elephantmipt/bert-distillation | Distillation of BERT model with the Catalyst framework | | Experimental |
| 18 | TheLucasSchwarz/zeroshotENGINE | zeroshot-engine: Zero-Shot Text Classification with LLMs in Python | | Experimental |
| 19 | JingqingZ/KG4ZeroShotText | Source code of the paper 'Integrating Semantic Knowledge to Tackle Zero-shot... | | Experimental |
| 20 | alinlab/MASKER | MASKER: Masked Keyword Regularization for Reliable Text Classification (AAAI 2021) | | Experimental |
| 21 | wmkouw/ssa-nlp | Sequential subspace alignment for temporal domain adaptation in natural... | | Experimental |
| 22 | gyunggyung/DistilKoBiLSTM | Distilling Task-Specific Knowledge from Teacher Model into BiLSTM | | Experimental |
| 23 | kiankd/corel2019 | Code for AAAI 2019 Network Interpretability workshop paper | | Experimental |
| 24 | albertan017/HICL | The official implementation of the paper HICL: Hashtag-Driven In-Context... | | Experimental |
| 25 | foxar124/distillery | Like Homebrew but with less fizz. Install binaries as fast and as easy as... | | Experimental |
| 26 | xv44586/Knowledge-Distillation-NLP | Some demos of Knowledge Distillation in NLP | | Experimental |
| 27 | sunprinceS/Hierarchical-Attention-Model | :page_facing_up: HierAttModel for Question Answering | | Experimental |
| 28 | ritaranx/NeST | [AAAI 2023] This is the code for our paper `Neighborhood-Regularized... | | Experimental |
| 29 | cheneydon/efficient-bert | This repository contains the code for the paper in Findings of EMNLP 2021:... | | Experimental |
| 30 | roeeaharoni/unsupervised-domain-clusters | Code and data accompanying our ACL 2020 paper, "Unsupervised Domain Clusters... | | Experimental |
| 31 | alexandra-chron/hierarchical-domain-adaptation | Code of NAACL 2022 "Efficient Hierarchical Domain Adaptation for Pretrained... | | Experimental |
| 32 | cheneydon/hrkd | This repository contains the code for the paper in EMNLP 2021: "HRKD:... | | Experimental |
| 33 | AdrianBZG/SFAVEL | [ICLR 2024] Unsupervised Pretraining for Fact Verification by Language Model... | | Experimental |
| 34 | amazon-science/wqa-cerberus | [EMNLP 2022 (Long, Findings)] CERBERUS: Multi-head Student Model to distill... | | Experimental |
| 35 | yzhan238/PIEClass | The source code used for paper "PIEClass: Weakly-Supervised Text... | | Experimental |
| 36 | Shawn-Guo-CN/Multiple-Generation-Based-Knowledge-Distillation | Multiple Generation Based Knowledge Distillation: A Roadmap | | Experimental |
| 37 | yashmanne/intra-distillation | Repository aiming to reproduce EMNLP 2022 paper "The Importance of Being... | | Experimental |
| 38 | leszkolukasz/training-1.58bit-llms-via-distillation | Repository for mini-paper "Training 1.58bit LLMs via Distillation" | | Experimental |
| 39 | domiwk/didots | This is the repository for the paper "DiDOTS: Knowledge Distillation from... | | Experimental |
| 40 | Md-Emon-Hasan/DistilBERT-model-with-HF-Transformer | 📝 DistilBERT, a lightweight Transformer model from Hugging Face, for various... | | Experimental |
| 41 | taissirboukrouba/Structured-Information-Retrieval-with-LLMs | Academic Sequence Labelling Between DistillBERT & Encoder-only Transformer | | Experimental |