Instruction Tuning Datasets Transformer Models

There are 23 instruction tuning dataset projects tracked. The highest-rated is DaoD/INTERS, with a score of 40/100 and 207 stars.

Get all 23 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=instruction-tuning-datasets&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
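The same request can be made from a script. The following is a minimal Python sketch using only the standard library; the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented on this page, so the code simply returns whatever the endpoint sends back. Note that the curl command uses `limit=20`; whether a larger value returns all 23 projects is an assumption that has not been verified here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers",
              subcategory="instruction-tuning-datasets",
              limit=20):
    """Build the query URL with the same parameters as the curl command.

    The defaults mirror the curl example; `limit` is exposed so a caller
    can try other values (unverified against the API).
    """
    query = urlencode({
        "domain": domain,
        "subcategory": subcategory,
        "limit": limit,
    })
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL and decode the body as JSON.

    The response schema is an unknown here, so the decoded object is
    returned as-is without assuming any field names.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` performs one request, which counts against the daily quota described above.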

# | Model | Description | Score | Tier
1 | DaoD/INTERS | This is the repository for our paper "INTERS: Unlocking the Power of Large... | 40 | Emerging
2 | Haiyang-W/TokenFormer | [ICLR2025 Spotlight🔥] Official Implementation of TokenFormer: Rethinking... | 34 | Emerging
3 | declare-lab/instruct-eval | This repository contains code to quantitatively evaluate instruction-tuned... | 34 | Emerging
4 | hkust-nlp/deita | Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024] | 33 | Emerging
5 | kehanlu/DeSTA2 | Code and model for ICASSP 2025 Paper "Developing Instruction-Following... | 32 | Emerging
6 | TIGER-AI-Lab/VisualWebInstruct | The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction... | 29 | Experimental
7 | UCSC-REAL/TokenCleaning | [ICML 2025] Official implementation of paper "Token Cleaning: Fine-Grained... | 29 | Experimental
8 | FengheTan9/LLM4Seg | [MICCAI 2025] Official code for "Pre-Trained LLM is a Semantic-Aware and... | 29 | Experimental
9 | zhilizju/Awesome-instruction-tuning | A curated list of awesome instruction tuning datasets, models, papers and... | 29 | Experimental
10 | declare-lab/Auto-Scaling | [Arxiv 2024] Official Implementation of the paper: "Towards Robust... | 28 | Experimental
11 | TamSiuhin/P2P | source code for "Instant Personalized Large Language Model Adaptation via... | 28 | Experimental
12 | RenzeLou/Muffin | MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following | 28 | Experimental
13 | cxcscmu/Montessori-Instruct | Official repository for Montessori-Instruct: Generate Influential Training... | 27 | Experimental
14 | hplt-project/monolingual-multilingual-instruction-tuning | Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca | 24 | Experimental
15 | SinclairCoder/Instruction-Tuning-Papers | Reading list of Instruction-tuning. A trend starts from Natural-Instruction... | 23 | Experimental
16 | gentaiscool/few-shot-lm | The source code of "Language Models are Few-shot Multilingual Learners" (MRL... | 22 | Experimental
17 | OSU-NLP-Group/QA4RE | [ACL'23 Findings] "Aligning Instruction Tasks Unlocks Large Language Models... | 21 | Experimental
18 | liziniu/GEM | Code for Paper (Preserving Diversity in Supervised Fine-tuning of Large... | 21 | Experimental
19 | ZifanL/TSDS | Implementation of TSDS: Data Selection for Task-Specific Model Finetuning.... | 20 | Experimental
20 | zhuang-li/SCAR | [ACL 2025 main] SCAR: Data Selection via Style Consistency-Aware Response... | 20 | Experimental
21 | yeyimilk/llm-zero-shot-classifiers | Large Language Models are zero-shot text classifiers; Smart Expert System:... | 19 | Experimental
22 | OFA-Sys/DiverseEvol | Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning | 17 | Experimental
23 | dkopi/Bitune | Implementation of Bitune: Bidirectional Instruction-Tuning | 17 | Experimental