LLM Reasoning Research LLM Tools

Research frameworks, benchmarks, and methods for evaluating and advancing LLM reasoning capabilities across domains. Includes reasoning architectures, inference-time scaling, RL-based reasoning training, and reasoning-specific datasets. Does NOT include general LLM fine-tuning, application tools that use reasoning, or non-LLM reasoning systems.

There are 79 LLM reasoning research tools tracked. 1 scores above 70 (verified tier). The highest-rated is open-thought/reasoning-gym at 73/100 with 1,367 stars and 47,629 monthly downloads. 1 of the top 10 is actively maintained.

Get all 79 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-reasoning-research&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
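The same request can be made from Python. The sketch below is a minimal example using only the standard library; it reuses the endpoint and query parameters from the curl command above. The helper names `build_url` and `fetch_projects` are hypothetical, the shape of the returned JSON is not documented here and so is not assumed, and whether raising `limit` to 79 returns every project in one call is also an assumption to verify against the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools", subcategory="llm-reasoning-research", limit=20):
    """Build the request URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and decode the JSON response (hypothetical helper).

    The structure of the decoded object is not assumed; inspect it before use.
    """
    with urlopen(build_url(limit=limit), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    data = fetch_projects()
    # Print only the start of the payload to see how it is structured.
    print(json.dumps(data, indent=2)[:500])
```

Unauthenticated calls count against the 100 requests/day limit, so it is worth caching the response locally while exploring the data.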

#	Tool	Score	Tier	Stars	Language
1	open-thought/reasoning-gym [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning...	73	Verified	1,367	Python
2	Hmbown/Hegelion Dialectical reasoning architecture for LLMs (Thesis → Antithesis → Synthesis)	50	Established	137	Python
3	princeton-nlp/tree-of-thought-llm [NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large...	46	Emerging	5,873	Python
4	ZichengXu/Decoding-Tree-Sketching Decoding Tree Sketching (DTS): a training-free & model-agnostic & plug-in...	45	Emerging	67	Python
5	LLM360/Reasoning360 A repo for open research on building large reasoning models	44	Emerging	140	Python
6	bowang-lab/BioReason BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM...	44	Emerging	374	Jupyter Notebook
7	TsinghuaC3I/Awesome-RL-for-LRMs A Survey of Reinforcement Learning for Large Reasoning Models	43	Emerging	2,368	TeX
8	manglu097/Thoth [ICLR 2026] Unleashing Scientific Reasoning for Bio-experimental Protocol...	41	Emerging	65	Python
9	Peiyang-Song/Awesome-LLM-Reasoning-Failures Repo for "Large Language Model Reasoning Failures"	41	Emerging	165	—
10	WeiboAI/VibeThinker Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model...	40	Emerging	575	Python
11	PPPP-kaqiu/Awesome-Parallel-Reasoning Awesome-Parallel-Reasoning: Unlocking the reasoning potential of LLMs. ...	39	Emerging	49	HTML
12	jieyilong/tree-of-thought-puzzle-solver The Tree of Thoughts (ToT) framework for solving complex reasoning tasks using LLMs	38	Emerging	371	Python
13	Agent-RL/ReCall ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning...	37	Emerging	1,343	Python
14	Wang-ML-Lab/TokUR [ICLR 2026] TokUR: Token-Level Uncertainty Estimation for Large Language...	37	Emerging	4	Python
15	mohammad-gh009/DrugReasoner Predicting drug approval with reasoning.	36	Emerging	11	Python
16	PRIME-RL/PRIME Scalable RL solution for advanced reasoning of language models	36	Emerging	1,813	Python
17	MiniMax-AI/SynLogic [NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable...	36	Emerging	198	Python
18	TiMEM-AI/timem Temporal-Hierarchical Memory Consolidation for Long-Horizon Conversational Agents	35	Emerging	82	Python
19	sileod/reasoning_core Procedural symbolic reasoning data generators suite for synthetic pretraining	35	Emerging	34	Python
20	sileod/reasoning-core Procedural symbolic reasoning data generators suite for synthetic pretraining	35	Emerging	35	Python
21	Strong-AI-Lab/Logical-and-abstract-reasoning Evaluation on Logical Reasoning and Abstract Reasoning Challenges	34	Emerging	29	Python
22	diagram-of-thought/diagram-of-thought Official implementation of paper "On the Diagram of Thought"...	32	Emerging	193	—
23	The-Martyr/Awesome-Multimodal-Reasoning Latest Advances on (RL based) Multimodal Reasoning and Generation in...	32	Emerging	48	—
24	amazon-science/TISER [ACL 2025] Learning to Reason Over Time: Timeline Self-Reflection for...	31	Emerging	9	—
25	madaan/llm-reasoning-tutorial Resources for few-shot reasoning tutorial	29	Experimental	15	Jupyter Notebook
26	intuit-ai-research/SPUQ SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models	29	Experimental	15	Python
27	satori-reasoning/Satori [ICML 2025] Satori: Reinforcement Learning with Chain-of-Action-Thought...	29	Experimental	109	Python
28	sdpkjc/SATQuest 🏞 A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs	28	Experimental	5	Python
29	geon0325/TimeCAP Source code for the AAAI 2025 paper "TimeCAP: Learning to Contextualize,...	28	Experimental	50	Python
30	Osilly/Awesome-Interleaving-Reasoning Interleaving Reasoning: Next-Generation Reasoning Systems for AGI	26	Experimental	260	—
31	luban-agi/Awesome-LLM-reasoning A curated paper list on LLM reasoning.	26	Experimental	90	—
32	Alsace08/Meta-Reasoning Code and Data Repo for ACL'24 Paper "Meta-Reasoning: Semantics-Symbol...	26	Experimental	7	—
33	plusnli/MITS [PAKDD 2026 oral] MITS: Enhanced Tree Search Reasoning for LLMs via...	25	Experimental	3	Python
34	Yinghao-Li/Minesweeper-for-LLM Code for paper: Assessing Logical Puzzle Solving in Large Language Models:...	25	Experimental	5	Python
35	LAMDASZ-ML/Awesome-LLM-Reasoning-with-NeSy ✨✨Latest Advances on Neuro-Symbolic Learning in the era of Large Language Models	25	Experimental	274	—
36	klietus/SignalZero Local first symbolic reasoning stack for large language models. Inference...	25	Experimental	3	Python
37	JunyiYe/FaultyMathProblem From Blind Solvers to Logical Thinkers: Benchmarking LLMs’ Logical Integrity...	24	Experimental	4	—
38	multimodal-art-projection/LatentCoT-Horizon 📖 This is a repository for organizing papers, codes, and other resources...	24	Experimental	367	—
39	OSU-NLP-Group/cobalt Code and data for the paper "Bridging Online and Offline RL: Contextual...	24	Experimental	9	Python
40	cui-shaobo/defeasibility-in-causality exploring the defeasibility inside causality	24	Experimental	4	Python
41	atfortes/LLMSymbolicReasoningBench Synthetic data generation for evaluating LLM symbolic and logic reasoning	24	Experimental	22	Python
42	Taishi-N324/Awesome-RL-Reasoning Awesome-RL-Reasoning	24	Experimental	14	—
43	sail-sg/CLoT CVPR'24, Official Codebase of our Paper: "Let's Think Outside the Box:...	23	Experimental	322	Python
44	Simula-COMPLEX/tbrullm Technical Briefing on LLM-Assisted Uncertainty Analysis	22	Experimental	—	HTML
45	Skrapma4872/S3Q-Reasoning 📝 Enhance large language model outputs by revealing assumptions with a...	22	Experimental	—	Python
46	BDML-lab/llm-inductive-reasoning-survey This is the repository for the paper ‘A Survey of Inductive Reasoning for...	22	Experimental	46	—
47	fblgit/tree-of-knowledge ToK aka Tree of Knowledge for Large Language Models LLM. It's a novel...	22	Experimental	56	—
48	krystalan/DRT Deep Reasoning Translation (DRT) Project	22	Experimental	240	—
49	Mihir3009/GridPuzzle An evaluation dataset comprising 274 grid-based puzzles with different...	21	Experimental	8	—
50	zhiyuanhubj/UoT [NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances...	21	Experimental	106	Python
51	141forever/inductive-reasoning-papers The Paper Collection of Inductive Reasoning from 2015 to 2025	21	Experimental	22	—
52	Pomilon-Intelligence-Lab/CRSM CRSM (Continuous Reasoning State Model): An asynchronous "System 2"...	20	Experimental	1	Python
53	centre-for-humanities-computing/thoughtminers Graphical and probabilistic reasoning in embedding spaces	19	Experimental	—	Python
54	kang-ml/LogicTree [EMNLP 2025 Main] LogicTree: Structured Proof Exploration for Coherent and...	19	Experimental	6	Python
55	Ruiyang-061X/Uncertainty-o ✨ Official code for our paper: "Uncertainty-o: One Model-agnostic Framework...	19	Experimental	18	Python
56	Pro-GenAI/S3Q-Reasoning Scratchpad 3Q Reasoning: Improving Truthfulness and Reducing Hallucination...	18	Experimental	4	Python
57	PurCL/ProSec Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment"	17	Experimental	17	Python
58	msmrexe/neurosymbolic-vqa-program-generator A comprehensive implementation of a Neurosymbolic framework for Visual...	17	Experimental	2	Python
59	sylvain-wei/24-Game-Reasoning A very simple reproduction of Deepseek-R1-Zero and Deepseek-R1, using the "24 Game" as an example. Uses zero-RL, SFT, and SFT+RL to elicit the LLM's autonomous...	16	Experimental	34	Python
60	JIA-Lab-research/MoTCoder This is the official code repository of MoTCoder: Elevating Large Language...	16	Experimental	85	Python
61	OSU-NLP-Group/llm-planning-eval [ACL'24] Code and data of paper "When is Tree Search Useful for LLM...	16	Experimental	54	Python
62	naivoder/MCTSr Monte Carlo Tree Search Self-Refine (MCTSr)	15	Experimental	22	Python
63	SagnikMukherjee/PARC Premise-Augmented Reasoning Chains Improve Error Identification in Math...	15	Experimental	5	Python
64	jxhuang0508/Awesome-LLM-Reasoning-OpenAI-o1 Awesome LLM papers, news and projects about learning to reason with LLM,...	15	Experimental	27	—
65	Siesher/Generator_for_reasoning 🧠 Reasoning data generator for LLM training	15	Experimental	1	Jupyter Notebook
66	ihasq/OpenReasoning Turn Ultralight Bogo Model Into SOTA Reasoning Expert	15	Experimental	—	—
67	prodesk98/SQL-LLM-Distillation-GRPO Inspired by mathematical reasoning models like DeepSeekMath, this framework...	15	Experimental	6	Jupyter Notebook
68	Letian2003/C-VQA Counterfactual Reasoning VQA Dataset	15	Experimental	28	Python
69	THUNLP-MT/symbol2language Speak It Out: Solving Symbol-Related Problems with Symbol-to-Language...	15	Experimental	6	—
70	VITA-Group/o1-planning [NeurIPS'24 LanGame workshop] On The Planning Abilities of OpenAI's o1...	14	Experimental	42	Python
71	mukhal/GRACE [EMNLP '23] Discriminator-Guided Chain-of-Thought Reasoning	14	Experimental	50	Python
72	zzcnewly/ContPhy-Gen Codebase and tutorial of ContPhy dataset generation for ICML 2024 paper...	14	Experimental	10	C#
73	safouaneelg/zeroshot-reasoning Ollama structured output for visual zeroshot reasoning	14	Experimental	4	HTML
74	231sm/Eval_Multi-Step_Reasoning Comprehensive Evaluation On Answer Calibration For Multi-Step Reasoning	12	Experimental	4	Python
75	nourdesoukizz/Reasoning-Rationalizing We investigate whether models can maintain correct reasoning when exposed to...	11	Experimental	—	Jupyter Notebook
76	OthoXIII/theoreme-innommables Theorem of the Unnameable [⧉/⧉ₛ] — Epistemological framework for binary...	11	Experimental	—	Python
77	ParthaPRay/neuro-symbolic_abductive_reasoning_ollama_fault_diagnosis This repo presents code that allows users to run localized Ollama-based...	11	Experimental	—	Python
78	tirthankar95/grid-puzzle-reasoning Tries to improve the reasoning capabilities of LLMs on datasets like GridPuzzle & AIME	11	Experimental	—	Jupyter Notebook
79	jpordoy/-Dynamic-Multi-Chain-Multi-Path-Reasoning-with-Consensus Multi-path reasoning with dynamic chains and consensus scoring for improved...	10	Experimental	1	Jupyter Notebook