VascoSch92/bench-lab
The goal is to develop a unified framework for evaluating LLMs, agents, and RAG systems across well-known and custom benchmarks, while providing users with statistical tools to understand and improve their systems.
Stars: 3
Forks: —
Language: Python
License: —
Category:
Last pushed: Jan 26, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/VascoSch92/bench-lab"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
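The same request can be made from Python. The following is a minimal sketch using only the standard library. The helper names `build_url` and `fetch_quality` are hypothetical. The `quality/<category>/<owner>/<repo>` path layout is inferred from the single URL shown above, and the assumption that the response body is JSON is not confirmed by this page.

```python
import json
import urllib.request

# Base path taken from the curl example; the layout beyond it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repo given as 'owner/name' (hypothetical helper)."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the record for a repo.

    Assumes the endpoint returns JSON; adjust the parsing if it does not.
    """
    with urllib.request.urlopen(build_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("rag", "VascoSch92/bench-lab")
```

With the keyless limit of 100 requests/day, it is probably worth caching results locally rather than refetching the same repository repeatedly.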
Higher-rated alternatives
modelscope/evalscope
A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation...
Kareem-Rashed/rubric-eval
Independent framework to test, benchmark, and evaluate LLMs & AI agents locally.
izam-mohammed/ragrank
🎯 Your free LLM evaluation toolkit helps you assess the accuracy of facts, how well it...
justplus/llm-eval
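A large language model evaluation platform that supports multiple evaluation benchmarks, custom datasets, and performance testing. Supports RAG evaluation based on custom datasets.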
dokimos-dev/dokimos
Evaluation Framework for LLM applications in Java and Kotlin