LLM Training Experimentation Transformer Models

Repositories for training, fine-tuning, and experimenting with large language models including tutorials, frameworks, and custom implementations. Does NOT include deployment tools, specific downstream applications (chatbots, summarization), or model evaluation/analysis.

This category tracks 151 LLM training and experimentation projects. Two score above 70 (Verified tier). The highest-rated is PaddlePaddle/PaddleNLP at 79/100, with 12,929 stars and 41,348 monthly downloads. Two of the top 10 are actively maintained.
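The tier labels in the table appear to follow score bands. The sketch below assumes cut-offs of 70, 50, and 30, inferred from the listed rows. The page itself only states that scores above 70 are Verified, so the other boundaries and the treatment of a score of exactly 70 are assumptions. `tier_for_score` is a hypothetical helper, not part of the site's API.

```python
def tier_for_score(score: int) -> str:
    """Map a 0-100 quality score to a tier label.

    Cut-offs are inferred from the rows listed on this page and are
    assumptions, not documented thresholds.
    """
    if score >= 70:   # assumed; the page says "above 70"
        return "Verified"
    if score >= 50:   # lowest Established row in the table scores 50
        return "Established"
    if score >= 30:   # lowest Emerging row in the table scores 30
        return "Emerging"
    return "Experimental"
```

Under these assumed bands, every row in the table maps to the tier it is listed with.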

Get all 151 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=llm-training-experimentation&limit=20"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
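The same request can be made from Python. This is a minimal sketch using only the standard library. The endpoint and the `domain`, `subcategory`, and `limit` query parameters are copied from the curl command above. The helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the returned JSON is not documented here, so the code only parses it without assuming any fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers",
              subcategory="llm-training-experimentation",
              limit=20):
    """Build the query URL; parameter names mirror the curl example."""
    query = urlencode({"domain": domain,
                       "subcategory": subcategory,
                       "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and parse the body as JSON (response schema not assumed)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_projects(build_url())` performs the request. Whether a `limit` larger than 20 is accepted is not stated on this page, and unauthenticated use counts against the 100 requests/day quota.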

#	Model	Score	Tier	Stars	Language
1	PaddlePaddle/PaddleNLP Easy-to-use and powerful LLM and SLM library with awesome model zoo.	79	Verified	12,929	Python
2	meta-llama/llama-cookbook Welcome to the Llama Cookbook! This is your go to guide for Building with...	73	Verified	18,252	Jupyter Notebook
3	arcee-ai/mergekit Tools for merging pretrained large language models.	59	Established	6,857	Python
4	changyeyu/LLM-RL-Visualized 🌟 100+ original LLM / RL principle diagrams 📚, presented by the author of the book "Large Model Algorithms"! 💥 (100+ LLM/RL Algorithm Maps)	58	Established	3,766	Python
5	mindspore-lab/step_into_llm MindSpore online courses: Step into LLM	57	Established	484	Jupyter Notebook
6	kyegomez/LFM2 A simple and minimal open source implementation of "Introducing LFM2: The...	56	Established	23	Python
7	kyegomez/LFM An open source implementation of LFMs from Liquid AI: Liquid Foundation Models	56	Established	207	Python
8	BeastByteAI/scikit-llm Seamlessly integrate LLMs into scikit-learn.	55	Established	3,490	Python
9	ghimiresunil/LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing LLM-PowerHouse: Unleash LLMs' potential through curated tutorials, best...	54	Established	720	Jupyter Notebook
10	IbrahimSobh/llms Large Language Models: In this repository Language models are introduced...	52	Established	394	Jupyter Notebook
11	bobazooba/xllm 🦖 X—LLM: Cutting Edge & Easy LLM Finetuning	52	Established	408	Python
12	Leeroo-AI/mergoo A library for easily merging multiple LLM experts, and efficiently train the...	52	Established	507	Python
13	r2d4/rellm Exact structure out of any language model completion.	50	Established	514	Python
14	iusztinpaul/hands-on-llms 🦖 Learn about LLMs, LLMOps, and vector DBs for free by designing, training,...	49	Emerging	3,401	Jupyter Notebook
15	socialfoundations/folktexts Evaluate uncertainty, calibration, accuracy, and fairness of LLMs on...	48	Emerging	25	Jupyter Notebook
16	datawhalechina/base-llm A full-stack algorithm tutorial from NLP to LLMs; read online at https://datawhalechina.github.io/base-llm/	46	Emerging	421	Jupyter Notebook
17	young-geng/EasyLM Large language models (LLMs) made easy, EasyLM is a one stop solution for...	46	Emerging	2,522	Python
18	Tzohar/PassLLM World's most accurate password guessing AI tool. A PyTorch implementation of...	45	Emerging	85	Python
19	HamedBabaei/LLMs4OM LLMs4OM: Matching Ontologies with Large Language Models	42	Emerging	42	Python
20	EvilFreelancer/impruver A set of scripts and configurations for pretraining of Large Language Models (LLM)	42	Emerging	36	Python
21	HamedBabaei/LLMs4OL LLMs4OL: Large Language Models for Ontology Learning	41	Emerging	150	Python
22	gjbex/Deploying-LLMs-locally Material for a training on AI tools	41	Emerging	18	Jupyter Notebook
23	johnmai-dev/NotebookMLX 📋 NotebookMLX - An Open Source version of NotebookLM (Ported NotebookLlama)	40	Emerging	339	Jupyter Notebook
24	souzatharsis/tamingLLMs Taming LLMs: A Practical Guide to LLM Pitfalls with Open Source Software	40	Emerging	340	Jupyter Notebook
25	declare-lab/red-instruct Codes and datasets of the paper Red-Teaming Large Language Models using...	39	Emerging	108	Python
26	hitz-zentroa/GoLLIE Guideline following Large Language Model for Information Extraction	39	Emerging	431	Python
27	SolomonB14D3/knowledge-fidelity Behavioral auditing & repair toolkit for LLMs. Measures 8 dimensions via...	39	Emerging	3	Python
28	janelu9/EasyLLM Running Large Language Model easily.	39	Emerging	13	Python
29	kyaiooiayk/Awesome-LLM-Large-Language-Models-Notes What can I do with a LLM model?	38	Emerging	157	Jupyter Notebook
30	Curated-Awesome-Lists/awesome-llms-fine-tuning Explore a comprehensive collection of resources, tutorials, papers, tools,...	38	Emerging	505	—
31	WhereIsAI/BiLLM Tool for converting LLMs from uni-directional to bi-directional by removing...	38	Emerging	65	Python
32	stylellm/stylellm_models StyleLLM writing-style models: a text style transfer project based on large language models. Text style transfer based on Large Language...	38	Emerging	352	—
33	coderonion/awesome-llm-and-aigc 🚀🚀🚀A collection of some awesome public projects about Large Language...	37	Emerging	804	—
34	nrimsky/LM-exp LLM experiments done during SERI MATS - focusing on activation steering /...	37	Emerging	103	Jupyter Notebook
35	virtualramblas/Domain-Specific-Small-Language-Models Repository for the companion Colab notebook of the Domain-Specific Small...	37	Emerging	29	Jupyter Notebook
36	chanind/linear-relational Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs)...	36	Emerging	10	Python
37	PaddlePaddle/PALM a Fast, Flexible, Extensible and Easy-to-use NLP Large-scale Pretraining and...	36	Emerging	185	Python
38	dobriban/Principles-of-AI-LLMs Materials for the course Principles of AI: LLMs at UPenn (Stat 9911, Spring...	36	Emerging	44	—
39	JayZhang42/SLED SLED: Self Logits Evolution Decoding for Improving Factuality in Large...	35	Emerging	119	Python
40	LISA-ITMO/LLM-resume-moderator Automates the moderation of Russian-language résumés using an LLM. For...	35	Emerging	5	Jupyter Notebook
41	ausboss/Local-LLM-Langchain Load local LLMs effortlessly in a Jupyter notebook for testing purposes...	34	Emerging	213	Jupyter Notebook
42	ictnlp/TruthX Code for ACL 2024 paper "TruthX: Alleviating Hallucinations by Editing Large...	34	Emerging	143	Python
43	Jackksonns/CoVALend CoVALend: a compliance-aware micro-lending default prediction pipeline with...	33	Emerging	2	Python
44	JinXins/Awesome-Token-Merge-for-MLLMs A paper list about Token Merge, Reduce, Resample, Drop for MLLMs.	31	Emerging	86	—
45	cahlen/conversation-dataset-generator Craft conversational datasets (JSONL format with rich metadata) using LLMs....	30	Emerging	12	Python
46	danielsobrado/llm_notebooks Concepts and examples on using and training LLMs	30	Emerging	48	Jupyter Notebook
47	rickiepark/the-lm-book Code repository for the book <대규모 언어 모델, 핵심만 빠르게!> ("Large Language Models: Just the Essentials, Fast!"; Insight, 2025)	30	Emerging	5	Jupyter Notebook
48	zwhe99/X-SIR [ACL 2024] Can Watermarks Survive Translation? On the Cross-lingual...	29	Experimental	42	Python
49	wschella/llm-reliability Code for the paper "Larger and more instructable language models become less...	29	Experimental	31	Jupyter Notebook
50	lfunderburk/automate-tech-post LLM application: fine tuned model to generate social media posts from...	28	Experimental	13	Jupyter Notebook
51	Furyton/awesome-language-model-analysis This paper list focuses on the theoretical and empirical analysis of...	28	Experimental	98	Python
52	apanariello4/merge-and-rebase Model merging, task-vector rebasin, and fine-tuning for vision and LLM models.	28	Experimental	18	Python
53	RobinSmits/Dutch-LLMs Various training, inference and validation code and results related to Open...	27	Experimental	33	Jupyter Notebook
54	CLDiego/SPE_GeoHackathon_2025 Foundational bootcamp on LLM usage (prompting & inference) → tooling &...	27	Experimental	8	Jupyter Notebook
55	CristiVlad25/ai-papers Tracing the evolution of AI and large language models from early neural...	27	Experimental	10	—
56	an-yongqi/systematic-outliers [ICLR 2025] Systematic Outliers in Large Language Models.	27	Experimental	9	Python
57	kvignesh1420/cot-icl-lab [ACL 2025] Official implementation of the "CoT-ICL Lab" framework	27	Experimental	11	Python
58	crux82/u-deppllama Dependency parsing with Large Language Models	26	Experimental	5	Python
59	North-Shore-AI/tinkex_cookbook Elixir port of tinker-cookbook: training and evaluation recipes for the...	26	Experimental	3	Elixir
60	yubainu/sibainu-engine Real-time hallucination detection for LLMs via Geometric Drift Analysis in...	25	Experimental	3	Python
61	jacksonchen1998/LLaMA-Paper-List Collection of papers using LLaMA as backbone model	25	Experimental	46	—
62	Basel-anaya/LoreWeaver LoreWeaver is a Novel Generation Multimodal LLM based on Mistral 7B LLM	24	Experimental	3	Jupyter Notebook
63	piratheon/LiquidBunny-llm A bunch of script to train your own offsec LLM	24	Experimental	2	Python
64	piratheon/LB-llm_training_scripts A bunch of script to train your own offsec LLM	24	Experimental	2	Python
65	Koziev/LM-pretrain Char-level language model pretraining code and scripts	24	Experimental	3	Python
66	tripathiarpan20/self-improvement-4all Private self-improvement coaching with open-source LLMs	24	Experimental	16	Python
67	phonism/llm4cp Large Language Model for Competitive Programming	23	Experimental	2	Python
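68	GovOn-Org/GovOn On-device AI system for processing and analyzing civil complaints | LLM compression & fine-tuning | Field-mirroring linked project: strengthening hands-on practical skills based on industry demand	23	Experimental	1	Python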
69	bosszii2709/ai-dataset-generator 🤖 Generate tailored AI training datasets quickly and easily, transforming...	23	Experimental	1	Python
70	mickymultani/LLM-Architecture Visualize some important concepts related to LLM architectures.	23	Experimental	6	Jupyter Notebook
71	christopherdanie/GovOn Develop an on-device AI system that processes and analyzes complaints using...	22	Experimental	—	Python
72	LaxmanNandi/MCH-Research Conservation law for LLM context sensitivity: ΔRCI × Var_Ratio ≈ K(domain)....	22	Experimental	—	Python
73	Betswish/Cross-Lingual-Consistency Easy-to-use framework for evaluating cross-lingual consistency of factual...	22	Experimental	27	Python
74	mantzaris/KeemenaLM.jl Language Models in Julia lang (transformers/GPT/decoders/chat etc)	22	Experimental	—	Julia
75	tehw0lf/writing-style-analyzer Analyze and profile writing styles in German and English text using local...	22	Experimental	—	Python
76	shuhulx/MergeLens Pre-merge diagnostic framework for LLM model merging — analyze...	22	Experimental	—	Python
77	j341nono/llemb Unified embedding extraction for decoder-only LLMs with support for pooling...	21	Experimental	—	Python
78	igorbenav/practical-language-models An open book that teaches language models starting from the learning problem...	21	Experimental	2	Python
79	JianxXiong/AAPO Implementation of AAPO (Arxiv: 2505.14264v2) paper	21	Experimental	16	Python
80	ChanLiang/CONNER [EMNLP 2023] Beyond Factuality: A Comprehensive Evaluation of Large Language...	21	Experimental	33	Python
81	ictnlp/LSG The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers...	20	Experimental	15	Python
82	SolomonB14D3/confidence-cartography-toolkit Teacher-forced confidence analysis for language models. pip install...	20	Experimental	1	Python
83	hitz-zentroa/This-is-not-a-Dataset We introduce a large semi-automatically generated dataset of ~400,000...	20	Experimental	13	Python
84	twitter-research/lmsoc Code for reproducing our paper: LMSOC: An Approach for Socially Sensitive Pretraining	20	Experimental	13	Jupyter Notebook
85	isaacus-dev/terge An easy-to-use Python library for merging PyTorch models.	20	Experimental	12	Python
86	ExplainableML/in-context-impersonation [NeurIPS 2023 Spotlight] In-Context Impersonation Reveals Large Language...	19	Experimental	22	Python
87	U4RASD/dalla-model-training Dalla training recipe using Huggingface SFT trainer	19	Experimental	8	Python
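88	hquzhuguofeng/LLM-RoadMap ⭐️⭐️⭐️ LLMs RoadMap: helps you understand engineering topics such as traditional NLP tasks, parameter-efficient fine-tuning, low-precision fine-tuning, and distributed model training from the perspective of the transformers repository	19	Experimental	5	Jupyter Notebook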
89	HROlive/Deep-Learning-Week This 5 day online course was co-organised by LRZ and NVIDIA Deep Learning...	19	Experimental	4	Jupyter Notebook
90	kyegomez/ai-reading-list This collection brings together the highest-signal research papers in modern...	19	Experimental	6	—
91	mirulili/3Ch-Jamo-Watermark Capstone Project 2025 (Yonsei Univ.)	19	Experimental	—	Python
92	VARUN3WARE/pplm-watermark A research implementation of statistical text watermarking for large...	19	Experimental	—	Python
93	julienbrasseur/llm-hallucination-detector A lightweight library for extracting and analysing LLM internal representations	19	Experimental	—	Jupyter Notebook
94	machinelearningzuu/experiments-on-large-language-models This Repository Contains Different Experiments on LLMs with Hugging Face,...	18	Experimental	3	Jupyter Notebook
95	HKUNLP/multilingual-transfer Code for paper ”Language Versatilists vs. Specialists: An Empirical...	17	Experimental	15	Python
96	h3nock/ai-deep-dive An open-source interactive learning platform for understanding LLMs through...	17	Experimental	3	Python
97	Yash-Kavaiya/30-Days-LLM-Mastery-Course 30-Days-LLM-Mastery-Course: A comprehensive, hands-on course diving deep...	17	Experimental	4	TypeScript
98	juancmacias/Small_Lenguage_Model Short training module on SLMs (Small Language Models)	17	Experimental	1	Python
99	NLPForUA/ZNO Structured test tasks and model tuning scripts for multiple subjects from...	16	Experimental	11	Python
100	augstentatious/TRuCAL TRuCAL: Truth-Recursive universal Correction Attention Layer An open-source...	16	Experimental	1	Jupyter Notebook
101	Aminbcf/LLM-Polished-Version This is a lighter version of the LLM I built as part of my internship at expert...	16	Experimental	1	Python
102	chazciii/rd-net Inference-time drift experiment demonstrating reduced repetition collapse in...	16	Experimental	1	Python
103	rraghavkaushik/NLP-Reading-List A curated collection of NLP and LLM resources. Covers essential papers and...	16	Experimental	10	—
104	AlinaMustaqeem/open-LLM Kickstart with LLMs	16	Experimental	3	Jupyter Notebook
105	jwliao1209/TWLLM-Tutor 📘 Taiwan-LLM Tutor: Large Language Models for Taiwanese Secondary Education	15	Experimental	20	Python
106	nexageapps/LLM Hands-on notebooks to understand and build Large Language Models (LLMs) from...	15	Experimental	1	Jupyter Notebook
107	ivangabriele-playground/Trump-0.0-minus42B A really dumb and opinionated LLM — exclusively trained on Donald J. Trump's...	15	Experimental	—	Python
108	dettinjo/LLM-Fact-Auditor A post-processing pipeline to fact-check, entity-link, and verify answers...	15	Experimental	—	Python
109	S1LV3RJ1NX/mal-code This repository contains the code for all the book that I am writing `My...	15	Experimental	—	Python
110	NJUxlj/llm-hub Popular Large Language Model's modeling file and finetune+pretrain scripts,...	15	Experimental	2	Python
111	maximkha/The_Race_for_Intelligent_AI An article that describes the current state of AI and the next steps to...	15	Experimental	2	—
112	one-some/lazy-transformers-merge Merge transformers without using like a bajillion GB of RAM	14	Experimental	10	Python
113	samratrajsharma/LLMs Experimental implementations of core Large Language Model components...	14	Experimental	—	Jupyter Notebook
114	NLPForUA/UA-LLM The entry point for adapting, training, evaluating, and leveraging various...	14	Experimental	13	Python
115	HEMANGANI/LLM-Recommendation-Systems This project fine-tunes large language models (LLMs) for text-based...	14	Experimental	7	Jupyter Notebook
116	ewdlop/LMNotes Language model	13	Experimental	—	Python
117	ekunnii/adversarial-feedback-chatbot EMNLP 2020 finding paper "Learning Improvised Chatbots from Adversarial...	13	Experimental	5	Jupyter Notebook
118	tph-kds/vqa-llm A Based Large Language Model (LLM) for VQA based on a custom model applying...	13	Experimental	5	Jupyter Notebook
119	CyberMaryVer/llm-notebooks All the tutorials related to LLM	12	Experimental	3	Jupyter Notebook
120	crux82/advances-in-ai-2024 Materials used during the Lecture about LLMs held in the Summer School...	12	Experimental	4	Jupyter Notebook
121	anakin87/llama2-haystack Using Llama2 with Haystack, the NLP/LLM framework.	12	Experimental	16	Jupyter Notebook
122	raideno/awesome-motion A curated list of motion related resources.	12	Experimental	1	—
123	Itadori91/best-of-ai-open-source Curated collection of 150+ exceptional open-source AI projects with a...	12	Experimental	1	—
124	NotShrirang/LLM-Garden Implementing different LLM architectures in single repo	12	Experimental	1	Python
125	SolomonB14D3/confidence-cartography Teacher-forced confidence as a false-belief sensor for language models.	12	Experimental	1	Python
126	Da9TH5e/PyPilot A Mini-AI Assistant, but in a Python package for now (⚠︎ still in early development)	12	Experimental	1	Python
127	FawwazAhmd/msc-group-project MSc group project evaluating instruction-tuned LLMs for legal clause...	12	Experimental	1	Python
128	nath54/ChunkedDiffusion_LLM Chunked Diffusion LLM is an innovative machine learning project exploring a...	12	Experimental	1	Python
129	kaustpradalab/LLM-sycophancy [AAAI'26 Main🎉] Official code of "When Truth Is Overridden: Uncovering the...	11	Experimental	5	Python
130	Anonym0usWork1221/JaraConverse-TransformersBased This JaraConverse model is a cutting-edge Transformer-based supervised...	11	Experimental	2	Python
131	mattzzz/shakeLLM Exploration of LLMs using complete works of Shakespeare	11	Experimental	—	Python
132	wahab-cide/african_languages_llm_project Training multilingual language models on African languages including...	11	Experimental	—	Python
133	avirupc/nlp A curated collection of my learning path in NLP and LLMs. Contains my notes,...	11	Experimental	—	Jupyter Notebook
134	Blue-No1/open-weight-collection Tracking open-weight LLMs for research, experiments, and inference comparisons.	11	Experimental	—	—
135	Adityaram0001/LLM-DeepLearning A deep dive into the theory and practice of Large Language Models. This...	11	Experimental	—	—
136	priyanshujiiii/awesome_LLM A curated list of papers, datasets, and resources on Large Language Models (LLMs)	11	Experimental	—	—
137	maris205/DNAHL DNAHL Model- DNA sequence and Human Language mixed large language model	11	Experimental	2	Jupyter Notebook
138	CAI991108/Machine-Learning-and-Language-Model This project explores GPT-2 and Llama models through pre-training,...	11	Experimental	2	Python
139	gokhaneraslan/llm-dataset-generator Custom dataset generator from text and pdf	11	Experimental	—	Python
140	Blue-No1/llm-research-notes Notes & experiments on LLMs, open-weight models, multimodal systems, and...	11	Experimental	—	Python
141	Shehrozkashif/AI-For-Organizations Mitigating Hallucination in Private LLMs	11	Experimental	—	HTML
142	minorprojects/Stable-CAT Stable Causal Attention Transformer(StableCAT) is a tiny, minimal modern ...	11	Experimental	2	Python
143	Skwert001/hlft-legality-engine Legality-gated evaluation for LLMs, a structural fix for hallucinations that...	11	Experimental	—	Python
144	Alvaro8gb/Pheno-LLM Step-forward structuring disease phenotypic entities with LLMs for disease...	10	Experimental	1	Jupyter Notebook
145	Francesco-Sovrano/llms_for_vulnerability_detection_are_lost_in_the_end Replication package of the paper 'Large Language Models for In-File...	10	Experimental	1	—
146	mukeshmithrakumar/LLM-POC-2024 Popular Large Language Models from scratch - 2024	10	Experimental	1	Jupyter Notebook
147	BjornMelin/nlp-engineering-hub 📚 Enterprise NLP systems and LLM applications. Features custom language...	10	Experimental	1	—
148	priyanka387/LangChain-Vector-Databases-in-Production LLMs are deep learning models with billions of parameters that excel at a...	10	Experimental	1	Python
149	TimKoornstra/learn-like-an-llm Learn Like An LLM is an interactive tool that helps users understand...	10	Experimental	1	Python
150	2006coder/LLMs-words-defs-vs-dictionaries-defs evaluate AI's integrity	10	Experimental	1	Python
151	thanoskaravangelis/llm-experimentation Large Language Model local chat in a Docker container, plus some NLP and...	10	Experimental	1	Jupyter Notebook

Comparisons in this category

PaddleNLP and EasyLLM (79 vs 39)
step_into_llm and base-llm (57 vs 46)
LFM2 and LFM (56 vs 56)