Knowledge Distillation and Compression Tools for NLP

Tools and methods for distilling large NLP models into smaller, faster models through knowledge transfer, model compression, and pruning techniques. Does NOT include general model optimization, quantization-only approaches, or unrelated NLP applications.
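Most of the toolkits below build on the same core objective: train a small student model to match the temperature-softened output distribution of a large teacher, alongside the usual hard-label loss. A minimal PyTorch sketch of that objective (the hyperparameters `T` and `alpha` and the toy shapes are illustrative, not taken from any listed tool):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Soft-label knowledge distillation (Hinton et al., 2015)."""
    # KL divergence between temperature-softened teacher and student
    # distributions; scaled by T^2 so its gradient magnitude stays
    # comparable to the hard-label term as T varies.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Standard cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage: a batch of 4 examples over 3 classes.
student_logits = torch.randn(4, 3, requires_grad=True)
teacher_logits = torch.randn(4, 3)  # would come from the frozen teacher
labels = torch.tensor([0, 2, 1, 0])
distillation_loss(student_logits, teacher_logits, labels).backward()
```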

This list tracks 41 knowledge distillation and compression tools. The highest-rated is airaria/TextBrewer, scoring 41/100 with 1,697 stars.

Get the projects as JSON (the example request below returns the top 20; raise `limit` to fetch all 41):

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=knowledge-distillation-compression&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
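For programmatic use, the same endpoint can be queried from Python. A minimal sketch with `requests`, assuming the endpoint accepts a larger `limit` and returns a JSON object with a `projects` array; the field names below are assumptions, not a documented schema:

```python
import requests

resp = requests.get(
    "https://pt-edge.onrender.com/api/v1/datasets/quality",
    params={
        "domain": "nlp",
        "subcategory": "knowledge-distillation-compression",
        "limit": 41,  # assumption: the API allows limits above 20
    },
    timeout=10,
)
resp.raise_for_status()

# "projects", "name", "score", and "tier" are hypothetical field names.
for project in resp.json().get("projects", []):
    print(project.get("name"), project.get("score"), project.get("tier"))
```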

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | airaria/TextBrewer | A PyTorch-based knowledge distillation toolkit for natural language processing | 41 | Emerging |
| 2 | sunyilgdx/NSP-BERT | The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through... | 38 | Emerging |
| 3 | kssteven418/LTP | [KDD'22] Learned Token Pruning for Transformers | 37 | Emerging |
| 4 | princeton-nlp/CoFiPruning | [ACL 2022] Structured Pruning Learns Compact and Accurate Models... | 37 | Emerging |
| 5 | qiangsiwei/bert_distill | BERT distillation (distillation experiments based on BERT) | 34 | Emerging |
| 6 | georgian-io/Transformers-Domain-Adaptation | :no_entry: [DEPRECATED] Adapt Transformer-based language models to new text domains | 34 | Emerging |
| 7 | microsoft/LiST | Lite Self-Training | 32 | Emerging |
| 8 | Alibaba-NLP/MultilangStructureKD | [ACL 2020] Structure-Level Knowledge Distillation For Multilingual Sequence Labeling | 31 | Emerging |
| 9 | KarineAyrs/knowledge-distillation-semantic-search | KDSS is the framework for knowledge distillation from LLMs | 31 | Emerging |
| 10 | LiyuanLucasLiu/LD-Net | Language Model Pruning for Sequence Labeling | 31 | Emerging |
| 11 | mit-han-lab/neurips-micronet | [JMLR'20] NeurIPS 2019 MicroNet Challenge Efficient Language Modeling, Champion | 31 | Emerging |
| 12 | cambridgeltl/mirror-bert | [EMNLP'21] Mirror-BERT: Converting Pretrained Language Models to universal... | 30 | Emerging |
| 13 | lancopku/DynamicKD | Code for EMNLP 2021 main conference paper "Dynamic Knowledge Distillation... | 29 | Experimental |
| 14 | cambridgeltl/MirrorWiC | [CoNLL'21] MirrorWiC: On Eliciting Word-in-Context Representations from... | 29 | Experimental |
| 15 | PrithivirajDamodaran/Alt-ZSC | Alternate Implementation for Zero Shot Text Classification: Instead of... | 28 | Experimental |
| 16 | INK-USC/sparse-distillation | Code for "Sparse Distillation: Speeding Up Text Classification by Using... | 28 | Experimental |
| 17 | elephantmipt/bert-distillation | Distillation of BERT model with catalyst framework | 28 | Experimental |
| 18 | TheLucasSchwarz/zeroshotENGINE | zeroshot-engine: Zero-Shot Text Classification with LLMs in Python | 26 | Experimental |
| 19 | JingqingZ/KG4ZeroShotText | Source code of the paper 'Integrating Semantic Knowledge to Tackle Zero-shot... | 26 | Experimental |
| 20 | alinlab/MASKER | MASKER: Masked Keyword Regularization for Reliable Text Classification (AAAI 2021) | 26 | Experimental |
| 21 | wmkouw/ssa-nlp | Sequential subspace alignment for temporal domain adaptation in natural... | 25 | Experimental |
| 22 | gyunggyung/DistilKoBiLSTM | Distilling Task-Specific Knowledge from Teacher Model into BiLSTM | 25 | Experimental |
| 23 | kiankd/corel2019 | Code for AAAI 2019 Network Interpretability workshop paper | 24 | Experimental |
| 24 | albertan017/HICL | The official implementation of the paper HICL: Hashtag-Driven In-Context... | 24 | Experimental |
| 25 | foxar124/distillery | like homebrew but with less fizz. install binaries as fast and as easy as... | 23 | Experimental |
| 26 | xv44586/Knowledge-Distillation-NLP | some demos of Knowledge Distillation in NLP | 23 | Experimental |
| 27 | sunprinceS/Hierarchical-Attention-Model | :page_facing_up: HierAttModel for Question Answering | 20 | Experimental |
| 28 | ritaranx/NeST | [AAAI 2023] This is the code for our paper `Neighborhood-Regularized... | 20 | Experimental |
| 29 | cheneydon/efficient-bert | This repository contains the code for the paper in Findings of EMNLP 2021:... | 19 | Experimental |
| 30 | roeeaharoni/unsupervised-domain-clusters | Code and data accompanying our ACL 2020 paper, "Unsupervised Domain Clusters... | 17 | Experimental |
| 31 | alexandra-chron/hierarchical-domain-adaptation | Code of NAACL 2022 "Efficient Hierarchical Domain Adaptation for Pretrained... | 17 | Experimental |
| 32 | cheneydon/hrkd | This repository contains the code for the paper in EMNLP 2021: "HRKD:... | 17 | Experimental |
| 33 | AdrianBZG/SFAVEL | [ICLR 2024] Unsupervised Pretraining for Fact Verification by Language Model... | 13 | Experimental |
| 34 | amazon-science/wqa-cerberus | [EMNLP 2022 (Long, Findings)] CERBERUS: Multi-head Student Model to distill... | 13 | Experimental |
| 35 | yzhan238/PIEClass | The source code used for paper "PIEClass: Weakly-Supervised Text... | 12 | Experimental |
| 36 | Shawn-Guo-CN/Multiple-Generation-Based-Knowledge-Distillation | Multiple Generation Based Knowledge Distillation: A Roadmap | 11 | Experimental |
| 37 | yashmanne/intra-distillation | Repository aiming to reproduce EMNLP 2022 paper "The Importance of Being... | 11 | Experimental |
| 38 | leszkolukasz/training-1.58bit-llms-via-distillation | Repository for mini-paper "Training 1.58bit LLMs via Distillation" | 11 | Experimental |
| 39 | domiwk/didots | This is the repository for the paper "DiDOTS: Knowledge Distillation from... | 10 | Experimental |
| 40 | Md-Emon-Hasan/DistilBERT-model-with-HF-Transformer | 📝 DistilBERT, a lightweight Transformer model from Hugging Face, for various... | 10 | Experimental |
| 41 | taissirboukrouba/Structured-Information-Retrieval-with-LLMs | Academic Sequence Labelling Between DistillBERT & Encoder-only Transformer | 10 | Experimental |