Synthetic Data Generation Transformer Models

Three synthetic data generation models are tracked. The highest-rated is VikParuchuri/textbook_quality, with a score of 36/100 and 509 stars.

Get all 3 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=synthetic-data-generation&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
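For scripted use, the same request can be sketched in Python with only the standard library. The endpoint and query parameters are taken from the curl command above; the helper names `build_url` and `fetch_json` are hypothetical. The response schema is not documented here, so the sketch returns the decoded JSON as-is and assumes no field names.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Endpoint copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers",
              subcategory="synthetic-data-generation",
              limit=20):
    """Build the query URL (hypothetical helper, not part of the API)."""
    query = urlencode({
        "domain": domain,
        "subcategory": subcategory,
        "limit": limit,
    })
    return f"{BASE_URL}?{query}"


def fetch_json(url, timeout=10):
    """Fetch the URL and decode the body as JSON.

    The shape of the payload is an unknown here, so it is returned
    unmodified rather than parsed into specific fields.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_json(build_url())` performs one request, which counts against the daily limit, so caching the result locally is a reasonable choice when iterating on a script.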

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | VikParuchuri/textbook_quality | Generate textbook-quality synthetic LLM pretraining data | 36 | Emerging |
| 2 | BhabhaAI/dataformer | Solving data for LLMs - Create quality synthetic datasets! | 31 | Emerging |
| 3 | iiis-ai/TemplateMath | [ICLR 2025 DATA-FM] Training and Evaluating Language Models with... | 18 | Experimental |