VITA-Group/Data-Efficient-Scaling

[ICML 2023] "Data Efficient Neural Scaling Law via Model Reusing" by Peihao Wang, Rameswar Panda, Zhangyang Wang

Score: 13 / 100 (Experimental)

This project helps machine learning researchers and practitioners train very large language models efficiently when training data is scarce. It reuses smaller, pre-trained models to kickstart the training of much larger ones, producing a large transformer that performs well despite limited data.
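
The core idea of reusing a smaller pre-trained model to initialize a larger one can be illustrated with a minimal PyTorch sketch. This is only a conceptual illustration, not the repository's actual algorithm; the function name grow_linear and the block-copy initialization scheme are assumptions made here for demonstration.

# Hypothetical sketch of "model reusing": initialize a larger layer by
# copying a smaller pre-trained layer's weights into its leading block and
# leaving the rest randomly initialized. Not the repository's actual method.
import torch
import torch.nn as nn

def grow_linear(small: nn.Linear, in_features: int, out_features: int) -> nn.Linear:
    """Create a larger nn.Linear whose top-left block reuses `small`'s weights."""
    big = nn.Linear(in_features, out_features)
    with torch.no_grad():
        big.weight[: small.out_features, : small.in_features] = small.weight
        big.bias[: small.out_features] = small.bias
    return big

# Example: reuse a 256->256 layer (standing in for a small pre-trained model)
# inside a 512->512 layer of a larger model.
small = nn.Linear(256, 256)
big = grow_linear(small, 512, 512)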

No commits in the last 6 months.

Use this if you are developing or training large language models (such as BERT or RoBERTa) and are concerned about their performance when training data is limited.

Not ideal if you are working with smaller models or have abundant training data for your large model, as the primary benefit is addressing data scarcity for gigantic models.

large-language-models model-training natural-language-processing data-efficiency deep-learning-research
No License · Stale (6 months) · No Package · No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 8 / 25
Community 0 / 25


Stars: 14
Forks:
Language: Python
License: None
Last pushed: Jan 04, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/VITA-Group/Data-Efficient-Scaling"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
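
For scripted access, a small Python sketch using the requests library is shown below. It assumes the endpoint returns a JSON document; the exact response schema is not documented here, so inspect the output before relying on specific fields.

# Fetch the same quality data from a script.
import requests

URL = "https://pt-edge.onrender.com/api/v1/quality/transformers/VITA-Group/Data-Efficient-Scaling"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()
data = resp.json()  # assumed JSON response
print(data)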