AISmithLab/HumanStudy-Bench

HumanStudy-Bench: Towards AI Agent Design for Participant Simulation

41 / 100 (Emerging)

Combines an Execution Engine, which reconstructs full experimental protocols from published studies, with standardized evaluation metrics (Probability Alignment Score, Effect Consistency Score) to measure whether LLM agents reach the same scientific conclusions as human participants. Supports modular agent design through customizable persona and prompt presets, enabling systematic comparison of configuration choices independently of base-model capabilities. Includes 12 foundational studies spanning cognition and social psychology, with over 6,000 trials, plus automated tooling for adding new studies from research PDFs.
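The listing does not define how the Probability Alignment Score is computed, so the following is only an illustrative sketch under one plausible assumption: that the score compares the distribution of responses human participants gave with the distribution an LLM agent produces, via total variation distance. The function name and formula are hypothetical, not taken from the benchmark.

```python
def probability_alignment(human: dict, agent: dict) -> float:
    """Hypothetical alignment score: 1 minus the total variation
    distance between human and agent response-option distributions.
    Returns 1.0 for identical distributions, 0.0 for disjoint ones."""
    options = set(human) | set(agent)
    tv = 0.5 * sum(abs(human.get(o, 0.0) - agent.get(o, 0.0)) for o in options)
    return 1.0 - tv

# Identical response distributions score 1.0; fully disjoint ones score 0.0.
print(probability_alignment({"A": 0.7, "B": 0.3}, {"A": 0.7, "B": 0.3}))
print(probability_alignment({"A": 1.0}, {"B": 1.0}))
```

Consult the repository itself for the actual metric definitions; this sketch only conveys the general idea of distribution-level comparison.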

No package; no dependents
Maintenance: 13 / 25
Adoption: 5 / 25
Maturity: 9 / 25
Community: 14 / 25


Stars: 12
Forks: 3
Language: Python
License: MIT
Last pushed: Mar 08, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/agents/AISmithLab/HumanStudy-Bench"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
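The same endpoint can be called from Python. A minimal sketch using only the standard library is below; the URL pattern is taken from the curl example above, but the JSON field names in the response are not documented here, so the fetch helper returns the raw parsed payload rather than assuming a schema.

```python
import json
from urllib.request import urlopen

# Base path taken from the curl example on this page.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/agents"

def build_url(owner: str, repo: str) -> str:
    """Build the quality-report endpoint URL for a given repository."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_scores(owner: str, repo: str) -> dict:
    """Fetch the quality report as parsed JSON.
    Field names depend on the API and are not assumed here."""
    with urlopen(build_url(owner, repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(build_url("AISmithLab", "HumanStudy-Bench"))
```

Without an API key this counts against the 100 requests/day limit, so cache responses if polling multiple repositories.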