Glavin001/Data2AITextbook
🚀 Automatically convert unstructured data into a high-quality 'textbook' format, optimized for fine-tuning Large Language Models (LLMs)
No commits in the last 6 months.
Stars
25
Forks
3
Language
Jupyter Notebook
License
MIT
Category
Last pushed
Oct 15, 2023
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Glavin001/Data2AITextbook"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
InternScience/GraphGen
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation
rasinmuhammed/misata
High-performance open-source synthetic data engine. Uses LLMs for schema design and vectorized...
timothepearce/synda
A CLI for generating synthetic data
ziegler-ingo/CRAFT
[TACL, EMNLP 2025 Oral] Code, datasets, and checkpoints for the paper "CRAFT Your Dataset:...
ZhuLinsen/FastDatasets
A powerful tool for creating high-quality training datasets for Large Language Models...