wshi83/MedAgentGym

[ICLR'26] MedAgentGYM: Training LLM Agents for Code-Based Medical Reasoning at Scale

/ 100

Emerging

This training environment is designed to improve how well large language models (LLMs) reason about and generate code for medical tasks. It supplies anonymized electronic health record (EHR) data and medical task descriptions, then evaluates whether the LLM produces correct, executable code for medical reasoning problems. Researchers and AI developers focused on medical AI can use it to build more capable AI assistants for healthcare.
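To make the evaluation idea concrete, here is a minimal Python sketch of scoring model-generated code by executing it and comparing its output with a reference answer. It is an illustration under stated assumptions, not MedAgentGym's actual API: the names `Task`, `run_generated_code`, and `score` are hypothetical, the toy task stands in for a real EHR task, and a real environment would need proper sandboxing rather than a plain subprocess.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task record: a description plus the reference answer."""
    description: str
    expected_output: str


def run_generated_code(code: str, timeout_s: float = 10.0) -> tuple[bool, str]:
    """Run candidate code in a separate Python process and capture its stdout.

    Returns (succeeded, stdout). A crash or timeout counts as failure.
    Assumption: a subprocess is enough isolation for this toy example.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False, ""
    return proc.returncode == 0, proc.stdout.strip()


def score(task: Task, code: str) -> float:
    """Return 1.0 if the code runs cleanly and prints the expected answer, else 0.0."""
    ok, output = run_generated_code(code)
    return 1.0 if ok and output == task.expected_output else 0.0


if __name__ == "__main__":
    # Toy stand-in for a medical reasoning task (not real EHR data).
    task = Task(
        description="Compute the mean heart rate for readings [72, 80, 88].",
        expected_output="80.0",
    )
    good_code = "readings = [72, 80, 88]\nprint(sum(readings) / len(readings))"
    bad_code = "print(undefined_name)"  # raises NameError when executed

    print(score(task, good_code))
    print(score(task, bad_code))
```

The binary score keeps the sketch simple; a training environment could just as well return partial credit or feed execution errors back to the model, but nothing in the listing says which approach this project takes.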

Use this if you are a researcher or AI developer working on training or fine-tuning large language models to perform complex, code-based medical reasoning tasks.

Not ideal if you are a clinician looking for a ready-to-use medical diagnostic tool, as this is an environment for developing and evaluating AI, not a clinical application.

medical AI development, healthcare natural language processing, LLM training, electronic health record analysis, medical reasoning

No License, No Package, No Dependents

Maintenance 10 / 25

Adoption 9 / 25

Maturity 7 / 25

Community 8 / 25

Language: Python

License: None

Higher-rated alternatives

ai4co/reevo

[NeurIPS 2024] ReEvo: Large Language Models as Hyper-Heuristics with Reflective Evolution

lean-dojo/LeanCopilot

LLMs as Copilots for Theorem Proving in Lean

WooooDyy/AgentGym-RL

Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon...

sethkarten/LLM-Economist

Official repository of the 2025 paper, LLM Economist: Large Population Models and Mechanism...

WooooDyy/AgentGym

Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based...