PromptBench vs Modelbench

These are competitors: PromptBench offers a mature, widely adopted, unified framework for systematically evaluating LLM robustness across adversarial prompts and datasets, while Modelbench appears to be an early-stage alternative aiming to provide similar benchmarking capabilities for prompts and models, though with negligible adoption so far.
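The core idea behind this kind of robustness benchmarking is simple: run a model on clean prompts, run it again on adversarially perturbed versions of the same prompts, and report the accuracy drop. A minimal self-contained sketch of that loop is below; note that `swap_attack` and the keyword-based `stub_model` are illustrative stand-ins, not PromptBench APIs.

```python
def swap_attack(prompt):
    """Toy character-level adversarial perturbation: swap the first
    two characters of every word longer than three characters."""
    def swap(w):
        return w[1] + w[0] + w[2:] if len(w) > 3 else w
    return " ".join(swap(w) for w in prompt.split())

def robustness_drop(model, prompts, labels, attack):
    """Accuracy on clean prompts minus accuracy on perturbed prompts.
    A larger drop means the model is less robust to this attack."""
    clean = sum(model(p) == y for p, y in zip(prompts, labels)) / len(prompts)
    adv = sum(model(attack(p)) == y for p, y in zip(prompts, labels)) / len(prompts)
    return clean - adv

# Illustrative stand-in for an LLM: a trivial keyword "classifier".
def stub_model(prompt):
    return "pos" if "good" in prompt else "neg"

prompts = ["this film is good", "this film is bad"]
labels = ["pos", "neg"]
print(robustness_drop(stub_model, prompts, labels, swap_attack))  # → 0.5
```

A real harness like PromptBench replaces the stub with an actual LLM and draws perturbations from a library of character-, word-, and sentence-level attacks, but the clean-minus-adversarial accuracy comparison is the same.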

                   promptbench            Modelbench
Overall score      70 (Verified)          23 (Experimental)
Maintenance        10/25                  13/25
Adoption           16/25                  1/25
Maturity           25/25                  9/25
Community          19/25                  0/25
Stars              2,785                  1
Forks              219                    n/a
Downloads          288                    n/a
Commits (30d)      0                      0
Language           Python                 Python
License            MIT                    Apache-2.0
Risk flags         None                   No package, no dependents

About promptbench

microsoft/promptbench

A unified evaluation framework for large language models

About Modelbench

joshualamerton/Modelbench

Concept: benchmarking harness for prompts, models, and agent strategies


Scores updated daily from GitHub, PyPI, and npm data.