DAMO-NLP-SG/M3Exam
Data and code for paper "M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Examining Large Language Models"
103 stars. No commits in the last 6 months.
Stars
103
Forks
13
Language
Python
License
—
Category
Last pushed
Jun 15, 2023
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/DAMO-NLP-SG/M3Exam"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
MMMU-Benchmark/MMMU
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal...
pat-jj/DeepRetrieval
[COLM’25] DeepRetrieval — 🔥 Training Search Agent by RLVR with Retrieval Outcome
lupantech/MathVista
MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts
ise-uiuc/magicoder
[ICML'24] Magicoder: Empowering Code Generation with OSS-Instruct
x66ccff/liveideabench
[𝐍𝐚𝐭𝐮𝐫𝐞 𝐂𝐨𝐦𝐦𝐮𝐧𝐢𝐜𝐚𝐭𝐢𝐨𝐧𝐬] 🤖💡 LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea...