lmms-eval and evaluation-guidebook
The two projects are complementary: lmms-eval (A) is a practical, comprehensive toolkit implementing evaluation for a broad range of multimodal tasks, while evaluation-guidebook (B) collects theoretical knowledge and practical insights on large language model evaluation. The guidebook (B) can inform how tool (A) is applied, and how its results are interpreted, for text-based tasks.
Scores (out of 25; first block on the source page attributed to lmms-eval, second to evaluation-guidebook):

               lmms-eval   evaluation-guidebook
Maintenance    23/25       6/25
Adoption       20/25       10/25
Maturity       25/25       16/25
Community      22/25       17/25
Repository stats:

               lmms-eval   evaluation-guidebook
Stars          3,883       2,075
Forks          539         121
Downloads      9,061       —
Commits (30d)  30          0
Language       Python      Jupyter Notebook
License        —           —
Risk flags: none. Published package: none. Dependents: none.
About lmms-eval
EvolvingLMMs-Lab/lmms-eval
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
About evaluation-guidebook
huggingface/evaluation-guidebook
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
Scores updated daily from GitHub, PyPI, and npm data.