LLM Agent Training Gyms (LLM Tools)
Gymnasium-style environments and frameworks for training LLM agents through reinforcement learning, multi-turn decision-making, and self-play. Does NOT include general RL frameworks, agent orchestration platforms, or applications using pre-trained agents.
50 LLM agent training gym tools are tracked. Three score above 50 (Established tier). The highest-rated is Gen-Verse/LatentMAS at 53/100 with 800 stars. One of the top 10 is actively maintained.
Get the projects as JSON (this example request sets limit=20):
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-agent-training-gyms&limit=20"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
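The same request can be issued from a script. The following is a minimal Python sketch using only the standard library. The helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not described on this page, so the sketch simply returns whatever the endpoint decodes to. Whether the endpoint accepts a `limit` larger than 20 is also an assumption to verify.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools", subcategory="llm-agent-training-gyms", limit=20):
    """Build the dataset URL with the same query parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Fetch the listing and decode it as JSON.

    The response schema is an unknown here, so the decoded payload is
    returned unchanged for the caller to inspect.
    """
    with urllib.request.urlopen(build_url(limit=limit), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_projects()` performs one HTTP request, which counts against the 100 requests/day keyless allowance, so it is worth caching the result locally rather than re-fetching in a loop.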
| # | Tool | Description | Score | Tier |
|---|---|---|---|---|
| 1 | Gen-Verse/LatentMAS | Latent Collaboration in Multi-Agent Systems | 53/100 | Established |
| 2 | ai4co/reevo | [NeurIPS 2024] ReEvo: Large Language Models as Hyper-Heuristics with... | | Established |
| 3 | SALT-NLP/collaborative-gym | Framework and toolkits for building and evaluating collaborative agents that... | | Established |
| 4 | lean-dojo/LeanCopilot | LLMs as Copilots for Theorem Proving in Lean | | Emerging |
| 5 | sethkarten/LLM-Economist | Official repository of the 2025 paper, LLM Economist: Large Population... | | Emerging |
| 6 | WooooDyy/AgentGym-RL | Code and implementations for the paper "AgentGym-RL: Training LLM Agents for... | | Emerging |
| 7 | datphamvn/HSEvo | [AAAI-25] HSEvo: Elevating Automatic Heuristic Design with Diversity-Driven... | | Emerging |
| 8 | FusionBrainLab/gigaevo-core | Evolutionary algorithm that uses Large Language Models (LLMs) to... | | Emerging |
| 9 | WooooDyy/AgentGym | Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large... | | Emerging |
| 10 | GeminiLight/gen-mentor | [WWW '25 Oral - GenMentor] Official code of our paper "LLM-powered... | | Emerging |
| 11 | proger/haloop | Agent toolkit for 100 hours of speech and 10 GiB of text | | Emerging |
| 12 | axon-rl/gem | A Gym for Agentic LLMs | | Emerging |
| 13 | Alibaba-Quark/SSP | Search Self-Play: Pushing the Frontier of Agent Capability without Supervision | | Emerging |
| 14 | zju-vipa/Odyssey | Odyssey: Empowering Minecraft Agents with Open-World Skills | | Emerging |
| 15 | spiral-rl/spiral | SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent... | | Emerging |
| 16 | wellecks/llmstep | llmstep: [L]LM proofstep suggestions in Lean 4. | | Emerging |
| 17 | zjunlp/MachineSoM | [ACL 2024] Exploring Collaboration Mechanisms for LLM Agents: A Social... | | Emerging |
| 18 | moment-timeseries-foundation-model/TimeSeriesGym | Official code for TimeSeriesGym: A Scalable Benchmark for (Time Series)... | | Emerging |
| 19 | thu-nics/MARSHAL | MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs | | Emerging |
| 20 | codezakh/DataEnvGym | A testbed for agents and environments that can automatically improve models... | | Emerging |
| 21 | atasoglu/toolsgen | A modular Python library for synthesizing tool-calling datasets from JSON... | | Emerging |
| 22 | bin123apple/InfantAgent | [NeurIPS 2025] A multimodal agent that can interact with its own PC in a... | | Experimental |
| 23 | wshi83/MedAgentGym | [ICLR'26] MedAgentGYM: Training LLM Agents for Code-Based Medical Reasoning at Scale | | Experimental |
| 24 | OpenMLRL/LLM_Collab_Code_Generation | LLM Collaboration for Code Generation | | Experimental |
| 25 | Human-Oriented-ATP/motivated-proof-facilitator | A graphical interface that makes it convenient to construct "motivated... | | Experimental |
| 26 | hg0428/Mar-PS | A Multi-Agent Reasoning Problem Solver. You build teams and they work... | | Experimental |
| 27 | Reason-Wang/ToolGen | [ICLR 2025] The official implementation of paper "ToolGen: Unified Tool... | | Experimental |
| 28 | blyhm/AgentGym-RL | 🤖 Train LLM agents for multi-turn decision-making with AgentGym-RL,... | | Experimental |
| 29 | MichaelvanLaar/proof-of-thought | TypeScript port of https://github.com/DebarghaG/proofofthought by DebarghaG. | | Experimental |
| 30 | NKAI-Decision-Team/HEP-LLM-play-StarCraftII | Hierarchical Expert Prompt for Large-Language-Models: An Approch Defeat... | | Experimental |
| 31 | OpenDFM/ibsen | [ACL 2024] Official code for "IBSEN: Director-Actor Agent Collaboration for... | | Experimental |
| 32 | KevinHaylett/CorpusAncora | Geofinitism: The Geometry of Language and Thought | | Experimental |
| 33 | TobyYang7/TwinMarket | [NeurIPS 2025] A multi-agent framework that leverages LLMs to simulate... | | Experimental |
| 34 | zjunlp/predict-before-execute | Can We Predict Before Executing Machine Learning Agents? | | Experimental |
| 35 | JLanghamLopez/prisoners-dilemma | The Iterated Prisoners Dilemma for LLM Agents | | Experimental |
| 36 | HATS-ICT/PersonaEvolve | [EMNLP 2025 Main] Official Repo for Paper: "Implicit Behavioral Alignment of... | | Experimental |
| 37 | fannie1208/W4S | [COLM2025] "Weak-for-Strong: Training Weak Meta-Agent to Harness Strong Executors" | | Experimental |
| 38 | liuxiaotong/knowlyr-gym | Gymnasium-style RL framework for LLM agent training: MDP environments,... | | Experimental |
| 39 | iphysresearch/evo-mcts | Official implementation of "Automated Algorithmic Discovery for... | | Experimental |
| 40 | iainjclark/synthetic-anthropology-cognition-lab | Research lab notebook and code for synthetic anthropology experiments using... | | Experimental |
| 41 | papachristoumarios/llm-network-formation | Supplementary Code and Data for "Network Formation and Dynamics among Multi-LLMs" | | Experimental |
| 42 | Tsumugii24/HAMLET | [ICLR 2026] Official code implementation for paper HAMLET: A Hierarchical... | | Experimental |
| 43 | chirindaopensource/bias_adjusted_LLM_agents_human_like_decision_making | End-to-End Python framework implementing bias-adjusted LLM agents for... | | Experimental |
| 44 | yuliu625/Simulate-the-Prisoners-Dilemma-with-Agents | An AutoGen-based simulation framework for the Prisoner's Dilemma. Explore... | | Experimental |
| 45 | Seldre99/HeRoN | Python code to implement HeRoN, a mediated RL–LLM framework to create NPCs... | | Experimental |
| 46 | opendilab/OpenPaL | Building open-ended embodied agent in battle royale FPS game | | Experimental |
| 47 | reveurmichael/space_mining | SpaceMining: a novel RL environment beyond LLM priors | | Experimental |
| 48 | lgy0404/LearnAct | Official code repo for the paper "LearnAct: Few-Shot Mobile GUI Agent with a... | | Experimental |
| 49 | AntonioSabbatellaUni/LLM-Multi-Agent-Optimization-Framework | Official implementation of MALBO (arXiv:2511.11788). Optimizes Multi-Agent... | | Experimental |
| 50 | SachinVarghese/telma | Toolkit Evaluator for Language Model Agents | | Experimental |