LLM Reasoning Research (LLM Tools)
Research frameworks, benchmarks, and methods for evaluating and advancing LLM reasoning capabilities across domains. Includes reasoning architectures, inference-time scaling, RL-based reasoning training, and reasoning-specific datasets. Does NOT include general LLM fine-tuning, application tools that use reasoning, or non-LLM reasoning systems.
There are 79 LLM reasoning research tools tracked. One scores above 70 (Verified tier). The highest-rated is open-thought/reasoning-gym at 73/100, with 1,367 stars and 47,629 monthly downloads. 1 of the top 10 is actively maintained.
Get all 79 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-reasoning-research&limit=20"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
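The same request can be issued from a script. The following is a minimal Python sketch using only the standard library. The endpoint and the `domain`, `subcategory`, and `limit` query parameters are taken from the curl command above; the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the returned JSON is not documented here, so the sketch returns whatever the server sends without assuming any field names. Whether `limit` accepts values above 20 is also an assumption to verify.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit: int = 20) -> str:
    """Build the query URL with the same parameters as the curl example."""
    params = {
        "domain": "llm-tools",
        "subcategory": "llm-reasoning-research",
        "limit": limit,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_projects(limit: int = 20, timeout: float = 30.0):
    """Fetch the listing and decode it as JSON.

    The response schema is not specified on this page, so the decoded
    object is returned unchanged rather than indexed by guessed keys.
    """
    with urlopen(build_url(limit), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_projects()` performs one unauthenticated request, which counts against the 100 requests/day allowance mentioned above.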
| # | Tool | Description | Score | Tier |
|---|---|---|---|---|
| 1 | open-thought/reasoning-gym | [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning... | 73/100 | Verified |
| 2 | Hmbown/Hegelion | Dialectical reasoning architecture for LLMs (Thesis → Antithesis → Synthesis) | | Established |
| 3 | princeton-nlp/tree-of-thought-llm | [NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large... | | Emerging |
| 4 | ZichengXu/Decoding-Tree-Sketching | Decoding Tree Sketching (DTS): a training-free & model-agnostic & plug-in... | | Emerging |
| 5 | LLM360/Reasoning360 | A repo for open research on building large reasoning models | | Emerging |
| 6 | bowang-lab/BioReason | BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM... | | Emerging |
| 7 | TsinghuaC3I/Awesome-RL-for-LRMs | A Survey of Reinforcement Learning for Large Reasoning Models | | Emerging |
| 8 | manglu097/Thoth | [ICLR 2026] Unleashing Scientific Reasoning for Bio-experimental Protocol... | | Emerging |
| 9 | Peiyang-Song/Awesome-LLM-Reasoning-Failures | Repo for "Large Language Model Reasoning Failures" | | Emerging |
| 10 | WeiboAI/VibeThinker | Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model... | | Emerging |
| 11 | PPPP-kaqiu/Awesome-Parallel-Reasoning | Awesome-Parallel-Reasoning: Unlocking the reasoning potential of LLMs. ... | | Emerging |
| 12 | jieyilong/tree-of-thought-puzzle-solver | The Tree of Thoughts (ToT) framework for solving complex reasoning tasks using LLMs | | Emerging |
| 13 | Agent-RL/ReCall | ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning... | | Emerging |
| 14 | Wang-ML-Lab/TokUR | [ICLR 2026] TokUR: Token-Level Uncertainty Estimation for Large Language... | | Emerging |
| 15 | mohammad-gh009/DrugReasoner | Predicting drug approval with reasoning. | | Emerging |
| 16 | PRIME-RL/PRIME | Scalable RL solution for advanced reasoning of language models | | Emerging |
| 17 | MiniMax-AI/SynLogic | [NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable... | | Emerging |
| 18 | TiMEM-AI/timem | Temporal-Hierarchical Memory Consolidation for Long-Horizon Conversational Agents | | Emerging |
| 19 | sileod/reasoning_core | Procedural symbolic reasoning data generators suite for synthetic pretraining | | Emerging |
| 20 | sileod/reasoning-core | Procedural symbolic reasoning data generators suite for synthetic pretraining | | Emerging |
| 21 | Strong-AI-Lab/Logical-and-abstract-reasoning | Evaluation on Logical Reasoning and Abstract Reasoning Challenges | | Emerging |
| 22 | diagram-of-thought/diagram-of-thought | Official implementation of paper "On the Diagram of Thought"... | | Emerging |
| 23 | The-Martyr/Awesome-Multimodal-Reasoning | Latest Advances on (RL based) Multimodal Reasoning and Generation in... | | Emerging |
| 24 | amazon-science/TISER | [ACL 2025] Learning to Reason Over Time: Timeline Self-Reflection for... | | Emerging |
| 25 | madaan/llm-reasoning-tutorial | Resources for few-shot reasoning tutorial | | Experimental |
| 26 | intuit-ai-research/SPUQ | SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models | | Experimental |
| 27 | satori-reasoning/Satori | [ICML 2025] Satori: Reinforcement Learning with Chain-of-Action-Thought... | | Experimental |
| 28 | sdpkjc/SATQuest | 🏞 A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs | | Experimental |
| 29 | geon0325/TimeCAP | Source code for the AAAI 2025 paper "TimeCAP: Learning to Contextualize,... | | Experimental |
| 30 | Osilly/Awesome-Interleaving-Reasoning | Interleaving Reasoning: Next-Generation Reasoning Systems for AGI | | Experimental |
| 31 | luban-agi/Awesome-LLM-reasoning | A curated paper list on LLM reasoning. | | Experimental |
| 32 | Alsace08/Meta-Reasoning | Code and Data Repo for ACL'24 Paper "Meta-Reasoning: Semantics-Symbol... | | Experimental |
| 33 | plusnli/MITS | [PAKDD 2026 oral] MITS: Enhanced Tree Search Reasoning for LLMs via... | | Experimental |
| 34 | Yinghao-Li/Minesweeper-for-LLM | Code for paper: Assessing Logical Puzzle Solving in Large Language Models:... | | Experimental |
| 35 | LAMDASZ-ML/Awesome-LLM-Reasoning-with-NeSy | ✨✨Latest Advances on Neuro-Symbolic Learning in the era of Large Language Models | | Experimental |
| 36 | klietus/SignalZero | Local first symbolic reasoning stack for large language models. Inference... | | Experimental |
| 37 | JunyiYe/FaultyMathProblem | From Blind Solvers to Logical Thinkers: Benchmarking LLMs’ Logical Integrity... | | Experimental |
| 38 | multimodal-art-projection/LatentCoT-Horizon | 📖 This is a repository for organizing papers, codes, and other resources... | | Experimental |
| 39 | OSU-NLP-Group/cobalt | Code and data for the paper "Bridging Online and Offline RL: Contextual... | | Experimental |
| 40 | cui-shaobo/defeasibility-in-causality | Exploring the defeasibility inside causality | | Experimental |
| 41 | atfortes/LLMSymbolicReasoningBench | Synthetic data generation for evaluating LLM symbolic and logic reasoning | | Experimental |
| 42 | Taishi-N324/Awesome-RL-Reasoning | Awesome-RL-Reasoning | | Experimental |
| 43 | sail-sg/CLoT | CVPR'24, Official Codebase of our Paper: "Let's Think Outside the Box:... | | Experimental |
| 44 | Simula-COMPLEX/tbrullm | Technical Briefing on LLM-Assisted Uncertainty Analysis | | Experimental |
| 45 | Skrapma4872/S3Q-Reasoning | 📝 Enhance large language model outputs by revealing assumptions with a... | | Experimental |
| 46 | BDML-lab/llm-inductive-reasoning-survey | This is the repository for the paper ‘A Survey of Inductive Reasoning for... | | Experimental |
| 47 | fblgit/tree-of-knowledge | ToK aka Tree of Knowledge for Large Language Models (LLMs). It's a novel... | | Experimental |
| 48 | krystalan/DRT | Deep Reasoning Translation (DRT) Project | | Experimental |
| 49 | Mihir3009/GridPuzzle | An evaluation dataset comprising 274 grid-based puzzles with different... | | Experimental |
| 50 | zhiyuanhubj/UoT | [NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances... | | Experimental |
| 51 | 141forever/inductive-reasoning-papers | The Paper Collection of Inductive Reasoning from 2015 to 2025 | | Experimental |
| 52 | Pomilon-Intelligence-Lab/CRSM | CRSM (Continuous Reasoning State Model): An asynchronous "System 2"... | | Experimental |
| 53 | centre-for-humanities-computing/thoughtminers | Graphical and probabilistic reasoning in embedding spaces | | Experimental |
| 54 | kang-ml/LogicTree | [EMNLP 2025 Main] LogicTree: Structured Proof Exploration for Coherent and... | | Experimental |
| 55 | Ruiyang-061X/Uncertainty-o | ✨ Official code for our paper: "Uncertainty-o: One Model-agnostic Framework... | | Experimental |
| 56 | Pro-GenAI/S3Q-Reasoning | Scratchpad 3Q Reasoning: Improving Truthfulness and Reducing Hallucination... | | Experimental |
| 57 | PurCL/ProSec | Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment" | | Experimental |
| 58 | msmrexe/neurosymbolic-vqa-program-generator | A comprehensive implementation of a Neurosymbolic framework for Visual... | | Experimental |
| 59 | sylvain-wei/24-Game-Reasoning | A very simple reproduction of Deepseek-R1-Zero and Deepseek-R1, using the 24 Game as an example. Uses zero-RL, SFT, and SFT+RL to elicit the LLM's autonomous... | | Experimental |
| 60 | JIA-Lab-research/MoTCoder | This is the official code repository of MoTCoder: Elevating Large Language... | | Experimental |
| 61 | OSU-NLP-Group/llm-planning-eval | [ACL'24] Code and data of paper "When is Tree Search Useful for LLM... | | Experimental |
| 62 | naivoder/MCTSr | Monte Carlo Tree Search Self-Refine (MCTSr) | | Experimental |
| 63 | SagnikMukherjee/PARC | Premise-Augmented Reasoning Chains Improve Error Identification in Math... | | Experimental |
| 64 | jxhuang0508/Awesome-LLM-Reasoning-OpenAI-o1 | Awesome LLM papers, news and projects about learning to reason with LLM,... | | Experimental |
| 65 | Siesher/Generator_for_reasoning | 🧠 Reasoning data generator for LLM training | | Experimental |
| 66 | ihasq/OpenReasoning | Turn Ultralight Bogo Model Into SOTA Reasoning Expert | | Experimental |
| 67 | prodesk98/SQL-LLM-Distillation-GRPO | Inspired by mathematical reasoning models like DeepSeekMath, this framework... | | Experimental |
| 68 | Letian2003/C-VQA | Counterfactual Reasoning VQA Dataset | | Experimental |
| 69 | THUNLP-MT/symbol2language | Speak It Out: Solving Symbol-Related Problems with Symbol-to-Language... | | Experimental |
| 70 | VITA-Group/o1-planning | [NeurIPS'24 LanGame workshop] On The Planning Abilities of OpenAI's o1... | | Experimental |
| 71 | mukhal/GRACE | [EMNLP '23] Discriminator-Guided Chain-of-Thought Reasoning | | Experimental |
| 72 | zzcnewly/ContPhy-Gen | Codebase and tutorial of ContPhy dataset generation for ICML 2024 paper... | | Experimental |
| 73 | safouaneelg/zeroshot-reasoning | Ollama structured output for visual zeroshot reasoning | | Experimental |
| 74 | 231sm/Eval_Multi-Step_Reasoning | Comprehensive Evaluation On Answer Calibration For Multi-Step Reasoning | | Experimental |
| 75 | nourdesoukizz/Reasoning-Rationalizing | We investigate whether models can maintain correct reasoning when exposed to... | | Experimental |
| 76 | OthoXIII/theoreme-innommables | Theorem of the Unnameable [⧉/⧉ₛ]: Epistemological framework for binary... | | Experimental |
| 77 | ParthaPRay/neuro-symbolic_abductive_reasoning_ollama_fault_diagnosis | This repo presents code that allows users to run localized Ollama-based... | | Experimental |
| 78 | tirthankar95/grid-puzzle-reasoning | Tries to improve the reasoning capabilities of LLMs on datasets like GridPuzzle & AIME | | Experimental |
| 79 | jpordoy/-Dynamic-Multi-Chain-Multi-Path-Reasoning-with-Consensus | Multi-path reasoning with dynamic chains and consensus scoring for improved... | | Experimental |