vibrantlabsai/ragas

Supercharge Your LLM Application Evaluations 🚀

Quality score: 70/100 (Verified)

This tool helps AI engineers and product managers objectively assess the quality of their Large Language Model (LLM) applications. It evaluates your application's outputs against pre-defined metrics or custom criteria, producing clear, data-driven scores and feedback that pinpoint weaknesses and guide improvements to your AI's performance.
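As a rough illustration, a minimal metric-based evaluation with the ragas Python package might look like the sketch below. This assumes the ragas 0.1-era evaluate() API and an OpenAI key in the environment; column names and imports differ in newer releases, and the sample data is hypothetical.

from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

# Hypothetical sample: one question, the app's answer, and the retrieved contexts.
data = {
    "question": ["What does ragas do?"],
    "answer": ["It scores LLM application outputs against quality metrics."],
    "contexts": [["ragas evaluates LLM outputs with metrics such as faithfulness."]],
}
dataset = Dataset.from_dict(data)

# Each metric yields a 0-1 score; evaluate() runs them over every sample.
# Requires an LLM provider key (OpenAI by default in this API version).
result = evaluate(dataset, metrics=[faithfulness, answer_relevancy])
print(result)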

12,927 stars. Used by 6 other packages. Available on PyPI.

Use this if you are building or managing an LLM application and need to systematically measure its effectiveness and generate comprehensive test data without subjective manual reviews.

Not ideal if you are looking for a general-purpose analytics tool for traditional software or only need qualitative, human-in-the-loop feedback for your AI outputs.
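The test data generation mentioned above can be sketched roughly as follows, assuming the ragas 0.1-era TestsetGenerator API (module paths and distribution names changed in later releases); the input documents here are placeholders.

from langchain_core.documents import Document
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context

# Placeholder corpus; in practice these come from a document loader.
documents = [Document(page_content="Your product documentation text...",
                      metadata={"filename": "docs.md"})]

# with_openai() wires up default generator/critic LLMs (needs OPENAI_API_KEY).
generator = TestsetGenerator.with_openai()
testset = generator.generate_with_langchain_docs(
    documents,
    test_size=10,  # number of synthetic question-answer samples to create
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)
print(testset.to_pandas().head())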

Tags: LLM-evaluation, AI-product-management, machine-learning-operations, AI-testing, RAG-systems
Maintenance: 10/25
Adoption: 15/25
Maturity: 25/25
Community: 20/25

The four component scores sum to the overall rating of 70/100.

Stars: 12,927
Forks: 1,294
Language: Python
License: Apache-2.0
Last pushed: Feb 24, 2026
Commits (30d): 0
Dependencies: 19
Reverse dependents: 6

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/vibrantlabsai/ragas"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
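In Python, the same endpoint can be queried as shown below; the JSON field names used here are assumptions, so inspect the actual response to confirm them.

import requests

url = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/vibrantlabsai/ragas"
resp = requests.get(url, timeout=10)
resp.raise_for_status()

data = resp.json()
# Hypothetical field names; check the real payload before relying on them.
print(data.get("score"), data.get("stars"))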