lmms-eval and evaluation-guidebook

The two projects are complementary: lmms-eval (A) is a comprehensive multimodal evaluation toolkit that provides the practical implementation for a broad range of multimodal tasks, while evaluation-guidebook (B) offers practical insights and theoretical background specific to large language model evaluation, which can inform how (A) is used and how its results are interpreted for text-based tasks.

                 lmms-eval          evaluation-guidebook
Score            90 (Verified)      49 (Emerging)
Maintenance      23/25              6/25
Adoption         20/25              10/25
Maturity         25/25              16/25
Community        22/25              17/25
Stars            3,883              2,075
Forks            539                121
Downloads        9,061              —
Commits (30d)    30                 0
Language         Python             Jupyter Notebook
License          —                  —
Flags            No risk flags      No package, no dependents
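The four subscores above sum exactly to each project's overall score (23 + 20 + 25 + 22 = 90 for lmms-eval; 6 + 10 + 16 + 17 = 49 for evaluation-guidebook), which suggests the composite is a simple sum of four 25-point dimensions. A minimal sketch of that arithmetic — the dictionary layout is illustrative, not the comparison site's actual data model:

```python
# Hedged sketch: recompute the composite scores shown in the comparison
# table as the sum of the four 25-point subscores. The dict structure is
# illustrative; the scoring pipeline behind the page is not documented here.
SUBSCORES = {
    "lmms-eval":            {"maintenance": 23, "adoption": 20, "maturity": 25, "community": 22},
    "evaluation-guidebook": {"maintenance": 6,  "adoption": 10, "maturity": 16, "community": 17},
}

def composite(subscores: dict) -> int:
    """Sum the four 25-point dimensions into a 100-point score."""
    return sum(subscores.values())

for name, parts in SUBSCORES.items():
    print(f"{name}: {composite(parts)}/100")
# → lmms-eval: 90/100
# → evaluation-guidebook: 49/100
```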

About lmms-eval

EvolvingLMMs-Lab/lmms-eval

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
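The toolkit is driven from the command line in the style of lm-evaluation-harness, on which it is modeled. A hedged invocation sketch — the model name, checkpoint, and task below are placeholders, and flag names should be checked against the project README for the installed version:

```shell
# Install the toolkit (the project also supports installing from source).
pip install lmms-eval

# Evaluate a vision-language checkpoint on a benchmark task.
# "llava", the pretrained path, and "mme" are illustrative placeholders;
# consult `python -m lmms_eval --help` for the flags your version supports.
python -m lmms_eval \
    --model llava \
    --model_args pretrained="liuhaotian/llava-v1.5-7b" \
    --tasks mme \
    --batch_size 1 \
    --output_path ./logs/
```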

About evaluation-guidebook

huggingface/evaluation-guidebook

Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!

Scores updated daily from GitHub, PyPI, and npm data.