rageval and RAG-evaluation-harnesses
These repositories are alternatives to each other: both provide an evaluation suite designed specifically for Retrieval-Augmented Generation (RAG) methods, so they can serve the same purpose in a RAG development workflow.
About rageval
gomate-community/rageval
Evaluation tools for Retrieval-augmented Generation (RAG) methods.
Provides modular evaluation across six RAG pipeline stages—query rewriting, retrieval, compression, evidence verification, generation, and validation—with 30+ metrics spanning answer correctness (F1, ROUGE, EM), groundedness (citation precision/recall), and context adequacy. Supports both LLM-based and string-matching evaluators, with pluggable integrations for OpenAI APIs or open-source models via vllm. Includes benchmark implementations on ASQA and other QA datasets with reproducible evaluation scripts.
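To make the string-matching side of this concrete, here is a minimal sketch of the two simplest answer-correctness metrics named above, exact match (EM) and token-level F1, in the SQuAD-style formulation commonly used for QA evaluation. This is an illustration of the metric definitions, not rageval's actual API; the function names are hypothetical.

```python
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """SQuAD-style normalization: lowercase, strip punctuation,
    drop English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(pred: str, gold: str) -> int:
    """1 if the normalized prediction equals the normalized reference."""
    return int(normalize(pred) == normalize(gold))

def token_f1(pred: str, gold: str) -> float:
    """Harmonic mean of token-overlap precision and recall."""
    pred_toks = normalize(pred).split()
    gold_toks = normalize(gold).split()
    common = Counter(pred_toks) & Counter(gold_toks)  # multiset intersection
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```

For example, `exact_match("The Eiffel Tower.", "eiffel tower")` scores 1 after normalization, while `token_f1("Paris, France", "Paris")` gives partial credit (precision 0.5, recall 1.0). LLM-based evaluators replace this string comparison with a model judgment, trading determinism for robustness to paraphrase.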
About RAG-evaluation-harnesses
RulinShao/RAG-evaluation-harnesses
An evaluation suite for Retrieval-Augmented Generation (RAG).