LLM Knowledge Distillation Tools
Tools and frameworks for compressing large language models into smaller, efficient student models through knowledge distillation techniques. Includes distillation algorithms, teacher-student training pipelines, and methods for knowledge transfer. Does NOT include general model pruning, quantization, or fine-tuning without a teacher model.
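The teacher-student transfer these tools implement can be illustrated with the classic soft-label objective: soften both models' logits with a temperature, then minimize the KL divergence from the teacher's distribution to the student's. This is a minimal dependency-free sketch of that loss, not code from any repository listed below; the temperature value and T² scaling follow the common formulation.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher T yields a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions,
    scaled by T^2 so gradients stay comparable across temperatures."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl
```

In a full pipeline this term is usually mixed with the ordinary cross-entropy on ground-truth labels; the mixing weight and temperature are hyperparameters each framework exposes differently.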
There are 15 LLM knowledge distillation tools tracked. The highest-rated is LLM-Tuning-Safety/LLMs-Finetuning-Safety at 35/100 with 344 stars.
Get all 15 projects as JSON
```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-knowledge-distillation&limit=20"
```
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
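For programmatic access from Python, the same endpoint can be queried with the standard library. This is a sketch built from the curl example above: the query parameters are taken from it, but the shape of the JSON response is an assumption, so decode and inspect it before relying on any field names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def build_url(domain, subcategory, limit=20):
    """Assemble the query URL with the parameters shown in the curl example."""
    params = {"domain": domain, "subcategory": subcategory, "limit": limit}
    return f"{BASE}?{urlencode(params)}"

def fetch_tools(domain="llm-tools", subcategory="llm-knowledge-distillation", limit=20):
    """Fetch and decode the listing as JSON (response schema not documented here)."""
    with urlopen(build_url(domain, subcategory, limit)) as resp:
        return json.load(resp)
```

Without an API key this counts against the 100 requests/day anonymous quota, so cache the response rather than re-fetching per lookup.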
| # | Tool | Description | Score | Tier |
|---|---|---|---|---|
| 1 | LLM-Tuning-Safety/LLMs-Finetuning-Safety | We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10... | 35 | Emerging |
| 2 | kyegomez/Sophia | Effortless plug-and-play optimizer to cut model training costs by 50%. ... | | Emerging |
| 3 | uthmandevsec/Self-Distillation | 🤖 Enable continual learning by reproducing the On-Policy Self-Distillation... | | Experimental |
| 4 | appier-research/robust-llm-finetunes | Accepted to NeurIPS 2025 | | Experimental |
| 5 | jmcentire/apprentice | Train cheap models on expensive ones. Automatically. With receipts. | | Experimental |
| 6 | phonism/LLMNotes | LLM study notes: Transformer architecture, reinforcement learning (RLHF/DPO/PPO), distributed training, inference optimization. Includes complete mathematical derivations and slides. | | Experimental |
| 7 | hemantjuyal/LLM-Distillation-Lab | An experiment demonstrating instruction-following distillation, enabling the... | | Experimental |
| 8 | waelantar/ATTS_Complete_Free_Package | ATTS: Adaptive Test-Time Scaling - a validated framework for optimizing LLM... | | Experimental |
| 9 | kyj93790/VILA | [COLM 2025] Improving Fisher Information Estimation and Efficiency for... | | Experimental |
| 10 | Hong-Lab-UMN-ECE/RoSTE | [ICML 2025] Official code for the paper "RoSTE: An Efficient... | | Experimental |
| 11 | liuyz0/DepthScaling | Inverse Depth Scaling From Most Layers Being Similar | | Experimental |
| 12 | ikun-llm/ikun-Distill | Knowledge distillation from a teacher model 🎓 | | Experimental |
| 13 | EM7m4/Distill-R1 | Combine reinforcement learning with online teacher-student distillation to... | | Experimental |
| 14 | XelfXendr/peft_unlearning | Repository exploring the use of parameter-efficient finetuning methods for... | | Experimental |
| 15 | amazon-science/mada_optimizer_search | Code for the ICML 2024 paper: "MADA: Meta-Adaptive Optimizers through... | | Experimental |