rageval and RAG-evaluation-harnesses

The two repositories are competitors: both provide an evaluation suite designed specifically for Retrieval-Augmented Generation (RAG) methods, making them alternative choices for the same purpose.

                 rageval                              RAG-evaluation-harnesses
Score            36 (Emerging)                        28 (Experimental)
Maintenance      0/25                                 2/25
Adoption         10/25                                6/25
Maturity         16/25                                9/25
Community        10/25                                11/25
Stars            170                                  23
Forks            10                                   3
Downloads
Commits (30d)    0                                    0
Language         Python                               Python
License          Apache-2.0                           MIT
Status           Stale 6m, No Package, No Dependents  Stale 6m, No Package, No Dependents

About rageval

gomate-community/rageval

Evaluation tools for Retrieval-augmented Generation (RAG) methods.

Provides modular evaluation across six RAG pipeline stages (query rewriting, retrieval, compression, evidence verification, generation, and validation), with 30+ metrics spanning answer correctness (F1, ROUGE, EM), groundedness (citation precision/recall), and context adequacy. Supports both LLM-based and string-matching evaluators, with pluggable integrations for OpenAI APIs or open-source models via vLLM. Includes benchmark implementations on ASQA and other QA datasets with reproducible evaluation scripts.
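Among the metrics listed, the string-matching answer-correctness scores (EM and token-level F1) are the most self-contained. The sketch below shows the standard SQuAD-style formulation of both; it is illustrative only, and the function names are ours, not rageval's actual API:

```python
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """SQuAD-style normalization: lowercase, drop punctuation and
    English articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, reference: str) -> float:
    """EM: 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 over the bag of normalized tokens."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty is a miss.
        return float(pred_tokens == ref_tokens)
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

An LLM-based evaluator would replace these string comparisons with a judge-model call, trading determinism for tolerance of paraphrase; the string-matching path shown here stays cheap and reproducible.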

About RAG-evaluation-harnesses

RulinShao/RAG-evaluation-harnesses

An evaluation suite for Retrieval-Augmented Generation (RAG).

Scores updated daily from GitHub, PyPI, and npm data.