TsinghuaC3I/Awesome-RL-for-LRMs

A Survey of Reinforcement Learning for Large Reasoning Models

/ 100

Established

A comprehensive taxonomy of reinforcement learning techniques applied to reasoning models. It organizes papers by reward design (generative, dense, unsupervised), policy optimization algorithm (policy gradient, critic-based, off-policy), and sampling strategy. It covers training resources spanning static corpora (code, math, STEM) and dynamic environments (rule-based, code-based, game-based), as well as RL infrastructure and implementation frameworks. It also surveys applications including coding agents, browser automation, multimodal understanding, robotics, and scientific tasks, with curated paper collections and linked implementations.

2,368 stars.

No Package

No Dependents

Maintenance 6 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 18 / 25

Stars

2,368

Forks

127

Language

TeX

License

MIT

Related tools

open-thought/reasoning-gym

[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Hmbown/Hegelion

Dialectical reasoning architecture for LLMs (Thesis → Antithesis → Synthesis)

LLM360/Reasoning360

A repo for open research on building large reasoning models

bowang-lab/BioReason

BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model | NeurIPS '25

ZichengXu/Decoding-Tree-Sketching

Decoding Tree Sketching (DTS): a training-free, model-agnostic, plug-in framework for LLM...
