datawhalechina/tiny-universe
A White-Box Guide to Building Large Models: a fully hand-built Tiny-Universe
Implements core LLM components from first principles in PyTorch, including Tiny Diffusion for image generation, Tiny Llama3 for pretraining, and a Tiny Transformer architecture, alongside practical systems for RAG, agent orchestration, and evaluation. Focuses on minimal, interpretable implementations with detailed code comments, so that learners do not depend on high-level frameworks like Hugging Face and can modify each system on their own. Covers the complete pipeline from tokenizer training through inference, GraphRAG construction, and domain-specific evaluation metrics.
Stars: 4,598
Forks: 450
Language: Jupyter Notebook
License: none listed
Category: none listed
Last pushed: Feb 12, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/datawhalechina/tiny-universe"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
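The same request can be made from a script. The following is a minimal Python sketch using only the standard library. It assumes the endpoint returns a JSON body, which the page does not state; the helper names `build_url` and `fetch_quality` are hypothetical, and no API key handling is shown because the key mechanism is not documented here.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/rag"


def build_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for a given owner/repo pair (hypothetical helper)."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for a repository.

    Assumption: the response body is JSON. If it is not,
    json.loads raises a ValueError (json.JSONDecodeError).
    """
    request = urllib.request.Request(
        build_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("datawhalechina", "tiny-universe")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, it is worth caching the result rather than refetching on every run.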
Higher-rated alternatives
WangRongsheng/awesome-LLM-resources
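🧑🚀 The world's best summary of LLM resources (multimodal generation, agents, coding assistance, AI paper review, data processing, model training, model inference, o1 models, MCP, small language models, vision-language models) | Summary of the...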
katanaml/sparrow
Structured data extraction and instruction calling with ML, LLM and Vision LLM
luhengshiwo/LLMForEverybody
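LLM knowledge sharing that everyone can understand; a must-read before LLM interviews in the spring and autumn campus recruiting seasons, so you can talk confidently with interviewers.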
LazyAGI/LazyLLM
Easiest and laziest way for building multi-agent LLMs applications.
SylphAI-Inc/AdalFlow
AdalFlow: The library to build & auto-optimize LLM applications.