LLMeBench and LLF-Bench
The two projects overlap as benchmarks for evaluating large language models but differ in focus: LLMeBench measures general LLM performance across a wide range of tasks, while LLF-Bench specializes in evaluating learning agents guided by language feedback.
Metric          LLMeBench              LLF-Bench
Maintenance     2/25                   2/25
Adoption        12/25                  9/25
Maturity        17/25                  16/25
Community       19/25                  18/25
Stars           105                    95
Forks           21                     18
Downloads       17                     —
Commits (30d)   0                      0
Language        Python                 Python
License         —                      MIT
Flags           No license; stale 6m   No package; no dependents; stale 6m
About LLMeBench (qcri/LLMeBench)
Benchmarking Large Language Models
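LLMeBench is organized around declarative benchmark "assets": small Python modules that wire together a dataset, a task, and a model, plus hooks for building prompts and post-processing responses. Below is a minimal sketch of that layout, assuming the config/prompt/post_process structure described in the project's materials; every identifier here is an illustrative placeholder, not verified package API.

```python
# Hypothetical LLMeBench-style benchmark asset. All names below are
# illustrative placeholders, not verified against the llmebench package.

def config():
    # Declares which dataset, task, and model the benchmark runs.
    return {
        "dataset": "SentimentDataset",   # placeholder dataset identifier
        "task": "SentimentTask",         # placeholder task identifier
        "model": "OpenAIModel",          # placeholder model wrapper
        "model_args": {"max_tries": 3},  # assumed retry setting
    }

def prompt(input_sample):
    # Maps one dataset sample to the request sent to the model.
    return [{"role": "user",
             "content": f"Classify the sentiment of: {input_sample}"}]

def post_process(response):
    # Extracts the final label from the raw model response
    # (assumes an OpenAI-style response structure).
    return response["choices"][0]["message"]["content"].strip().lower()
```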
About LLF-Bench (microsoft/LLF-Bench)
A benchmark for evaluating learning agents based on just language feedback
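LLF-Bench exposes its tasks through a gymnasium-style interface, so an agent interacts via reset()/step() and receives feedback as natural language rather than only a scalar reward. The sketch below illustrates that loop; the environment id, the action string, and the dict-style observation keys are assumptions drawn from the project's description, so treat them as placeholders and check the repo for the actual registered environments.

```python
# Sketch of an LLF-Bench-style interaction loop (gymnasium 5-tuple API).
# The env id and observation keys are assumptions, not verified API.
import llfbench as gym  # assumes llfbench exposes a gym-like make()

env = gym.make("llf-poem-Haiku-v0")  # hypothetical environment id
observation, info = env.reset()

done = False
while not done:
    # A real agent would read the natural-language instruction and
    # feedback here; we send a placeholder text action instead.
    action = "my haiku attempt"
    observation, reward, terminated, truncated, info = env.step(action)
    # Feedback arrives as language alongside the scalar reward.
    print(observation["feedback"])  # assumed observation dict key
    done = terminated or truncated
```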
Scores are updated daily from GitHub, PyPI, and npm data.