TsinghuaC3I/Awesome-RL-for-LRMs

A Survey of Reinforcement Learning for Large Reasoning Models

Score: 50 / 100 (Established)

Comprehensive taxonomy of reinforcement learning techniques applied to reasoning models, organizing papers across reward design (generative, dense, unsupervised), policy optimization algorithms (policy gradient, critic-based, off-policy), and sampling strategies. Covers training resources spanning static corpora (code, math, STEM) and dynamic environments (rule-based, code-based, game-based), plus RL infrastructure and implementation frameworks. Extends to diverse applications including coding agents, browser automation, multimodal understanding, robotics, and scientific tasks with curated paper collections and linked implementations.

2,368 stars.

No package, no dependents
Maintenance 6 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 18 / 25
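
The page does not state how the headline score is derived, but the four category scores listed above add up to it. A minimal sketch of that arithmetic, assuming the total is a plain unweighted sum (the variable names are illustrative, not part of the site):

```python
# Category scores as listed on the page, each out of 25.
subscores = {"Maintenance": 6, "Adoption": 10, "Maturity": 16, "Community": 18}
MAX_PER_CATEGORY = 25

# Assumption: the headline score is the unweighted sum of the categories.
total = sum(subscores.values())
max_total = MAX_PER_CATEGORY * len(subscores)

print(f"{total} / {max_total}")  # prints "50 / 100"
```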


Stars: 2,368
Forks: 127
Language: TeX
License: MIT
Last pushed: Nov 09, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/TsinghuaC3I/Awesome-RL-for-LRMs"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
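
The same request can be made from a script. The sketch below uses only the Python standard library and the endpoint shown in the curl command above. It assumes the endpoint returns a JSON body, which the page does not confirm, and the function names `build_quality_url` and `fetch_quality` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the quality-data URL for a repository (hypothetical helper)."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data for a repository.

    Assumption: the response body is JSON; the page does not document
    the response format, so inspect the result before relying on fields.
    """
    url = build_quality_url(owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("TsinghuaC3I", "Awesome-RL-for-LRMs")
# print(data)
```

Unauthenticated access is limited to 100 requests per day, so a script that polls many repositories may want to cache results locally.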