izam-mohammed/ragrank

🎯 Your free LLM evaluation toolkit helps you assess the accuracy of facts, how well it understands context, its tone, and more. This helps you see how good your LLM applications are.

/ 100

Established

Specialized for RAG pipeline evaluation with metrics like response relevancy, context understanding, and factual accuracy. Built as a Python toolkit that integrates with OpenAI's API by default but supports custom LLM models, enabling flexible assessment workflows through a dataset-to-metrics evaluation pattern. Provides structured evaluation results exportable to dataframes for analysis and integration with downstream data processing pipelines.

No Package No Dependents

Maintenance 10 / 25

Adoption 8 / 25

Maturity 16 / 25

Community 18 / 25

How are scores calculated?

Stars

Forks

Language

Python

License

Apache-2.0

Compare

ragrank and evalscope

Related tools

modelscope/evalscope

A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation...

Kareem-Rashed/rubric-eval

Independent framework to test, benchmark, and evaluate LLMs & AI agents locally.

justplus/llm-eval

大语言模型评估平台，支持多种评估基准、自定义数据集和性能测试。支持基于自定义数据集的RAG评估。

relari-ai/continuous-eval

Data-Driven Evaluation for LLM-Powered Applications

Addepto/contextcheck

MIT-licensed Framework for LLMs, RAGs, Chatbots testing. Configurable via YAML and integrable...

Explore RAG Tools

All categories Trending RAG directory Insights