vibrantlabsai/ragas
Supercharge Your LLM Application Evaluations 🚀
Ragas helps AI engineers and product managers objectively assess the quality of their Large Language Model (LLM) applications. It takes your application's outputs and evaluates them against pre-defined metrics or custom criteria, producing clear, data-driven scores and feedback so you can pinpoint weaknesses and improve your AI's performance.
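As a rough illustration, a metric-based evaluation looks like the sketch below. It assumes the 0.1-era ragas API; metric names and dataset fields have shifted across releases, so check the docs for your installed version:

# Score one application output against pre-defined metrics.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

# One record: the question, the retrieved context, and the answer
# your LLM application produced.
data = {
    "question": ["Where is the Eiffel Tower?"],
    "contexts": [["The Eiffel Tower is in Paris, France."]],
    "answer": ["The Eiffel Tower is located in Paris."],
}

# evaluate() uses an LLM judge under the hood, so a model API key
# (e.g. OPENAI_API_KEY) must be set in the environment.
result = evaluate(Dataset.from_dict(data), metrics=[faithfulness, answer_relevancy])
print(result)  # e.g. {'faithfulness': 1.00, 'answer_relevancy': 0.98}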
12,927 stars. Used by 6 other packages. Available on PyPI.
Use this if you are building or managing an LLM application and need to systematically measure its effectiveness and generate comprehensive test data without subjective manual reviews.
Not ideal if you are looking for a general-purpose analytics tool for traditional software or only need qualitative, human-in-the-loop feedback for your AI outputs.
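The test-data generation mentioned above can be sketched in the same hedged spirit; TestsetGenerator and its LangChain integration below follow the 0.1-era interface, which has also changed between releases:

# Generate a synthetic test set from your own document corpus.
from langchain_community.document_loaders import DirectoryLoader
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context

documents = DirectoryLoader("docs/").load()  # your source documents

# with_openai() wires up default generator and critic LLMs,
# so OPENAI_API_KEY must be set.
generator = TestsetGenerator.with_openai()
testset = generator.generate_with_langchain_docs(
    documents,
    test_size=10,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)
print(testset.to_pandas().head())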
Stars: 12,927
Forks: 1,294
Language: Python
License: Apache-2.0
Category: LLM tools
Last pushed: Feb 24, 2026
Commits (30d): 0
Dependencies: 19
Reverse dependents: 6
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/vibrantlabsai/ragas"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
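The same endpoint can also be queried programmatically; the X-API-Key header name below is an assumption, so confirm the actual authentication scheme in the API docs:

import requests

URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/vibrantlabsai/ragas"

# Anonymous access: 100 requests/day.
resp = requests.get(URL, timeout=10)
resp.raise_for_status()
print(resp.json())

# With a free key (1,000 requests/day); the header name is a guess.
# resp = requests.get(URL, headers={"X-API-Key": "YOUR_KEY"}, timeout=10)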
Related tools
EvolvingLMMs-Lab/lmms-eval
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
open-compass/VLMEvalKit
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
Giskard-AI/giskard-oss
🐢 Open-Source Evaluation & Testing library for LLM Agents
EuroEval/EuroEval
The robust European language model benchmark.
evalplus/evalplus
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024