TsinghuaC3I/Awesome-RL-for-LRMs
A Survey of Reinforcement Learning for Large Reasoning Models
Comprehensive taxonomy of reinforcement learning techniques applied to reasoning models, organizing papers across reward design (generative, dense, unsupervised), policy optimization algorithms (policy gradient, critic-based, off-policy), and sampling strategies. Covers training resources spanning static corpora (code, math, STEM) and dynamic environments (rule-based, code-based, game-based), plus RL infrastructure and implementation frameworks. Extends to diverse applications including coding agents, browser automation, multimodal understanding, robotics, and scientific tasks with curated paper collections and linked implementations.
Stars: 2,368
Forks: 127
Language: TeX
License: MIT
Category:
Last pushed: Nov 09, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/TsinghuaC3I/Awesome-RL-for-LRMs"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
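The same request can be made from a script. The following is a minimal sketch using only the Python standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the response format is not documented on this page, so the body is returned as plain text rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_quality_url(owner: str, repo: str) -> str:
    """Return the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay valid.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body as text.

    Assumption: the body is UTF-8 text. Its structure is not described
    on this page, so no parsing is attempted here.
    """
    with urlopen(build_quality_url(owner, repo), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch_quality("TsinghuaC3I", "Awesome-RL-for-LRMs")` issues the same GET request as the curl command. Unauthenticated calls count against the 100 requests/day limit.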
Related tools
open-thought/reasoning-gym
[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards
Hmbown/Hegelion
Dialectical reasoning architecture for LLMs (Thesis → Antithesis → Synthesis)
LLM360/Reasoning360
A repo for open research on building large reasoning models
bowang-lab/BioReason
BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model | NeurIPS '25
ZichengXu/Decoding-Tree-Sketching
Decoding Tree Sketching (DTS): a training-free, model-agnostic, plug-in framework for LLM...