PromptBench vs Modelbench

These are competitors: PromptBench offers a mature, widely adopted, unified framework for systematically evaluating LLM robustness across adversarial prompts and datasets, while Modelbench appears to be an early-stage alternative aiming to provide similar benchmarking capabilities for prompts and models, though with negligible adoption so far.
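The core idea behind this kind of robustness benchmarking is simple: run a model on clean prompts, run it again on adversarially perturbed versions of the same prompts, and report the accuracy drop. A minimal self-contained sketch of that loop is below; note that `swap_attack` and the keyword-based `stub_model` are illustrative stand-ins, not PromptBench APIs.

```python
def swap_attack(prompt):
    """Toy character-level adversarial perturbation: swap the first
    two characters of every word longer than three characters."""
    def swap(w):
        return w[1] + w[0] + w[2:] if len(w) > 3 else w
    return " ".join(swap(w) for w in prompt.split())

def robustness_drop(model, prompts, labels, attack):
    """Accuracy on clean prompts minus accuracy on perturbed prompts.
    A larger drop means the model is less robust to this attack."""
    clean = sum(model(p) == y for p, y in zip(prompts, labels)) / len(prompts)
    adv = sum(model(attack(p)) == y for p, y in zip(prompts, labels)) / len(prompts)
    return clean - adv

# Illustrative stand-in for an LLM: a trivial keyword "classifier".
def stub_model(prompt):
    return "pos" if "good" in prompt else "neg"

prompts = ["this film is good", "this film is bad"]
labels = ["pos", "neg"]
print(robustness_drop(stub_model, prompts, labels, swap_attack))  # → 0.5
```

A real harness like PromptBench replaces the stub with an actual LLM and draws perturbations from a library of character-, word-, and sentence-level attacks, but the clean-minus-adversarial accuracy comparison is the same.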

                   promptbench            Modelbench
Overall score      70 (Verified)          23 (Experimental)
Maintenance        10/25                  13/25
Adoption           16/25                  1/25
Maturity           25/25                  9/25
Community          19/25                  0/25
Stars              2,785                  1
Forks              219                    n/a
Downloads          288                    n/a
Commits (30d)      0                      0
Language           Python                 Python
License            MIT                    Apache-2.0
Risk flags         None                   No package, no dependents

About promptbench

microsoft/promptbench

A unified evaluation framework for large language models

About Modelbench

joshualamerton/Modelbench

Concept: benchmarking harness for prompts, models, and agent strategies


Scores updated daily from GitHub, PyPI, and npm data.