rag-evaluator and RAG-evaluation-harnesses

rag-evaluator: 56 (Established)
  Maintenance 0/25, Adoption 12/25, Maturity 25/25, Community 19/25

RAG-evaluation-harnesses: 28 (Experimental)
  Maintenance 2/25, Adoption 6/25, Maturity 9/25, Community 11/25
rag-evaluator:
  Stars: 42 | Forks: 18 | Downloads: 65 | Commits (30d): 0
  Language: Python | License: MIT
  Flags: Stale (6 months)

RAG-evaluation-harnesses:
  Stars: 23 | Forks: 3 | Downloads: n/a | Commits (30d): 0
  Language: Python | License: MIT
  Flags: Stale (6 months), No Package, No Dependents

About rag-evaluator

AIAnytime/rag-evaluator

A library for evaluating Retrieval-Augmented Generation (RAG) systems (The traditional ways).

Computes eleven evaluation metrics including BLEU, ROUGE, BERT Score, METEOR, and MAUVE to assess generated responses across semantic similarity, fluency, readability, and bias dimensions. Provides both a Python API for programmatic evaluation and a Streamlit web interface for interactive analysis. Designed for end-to-end RAG pipeline assessment without requiring external model APIs.
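To illustrate the kind of reference-based metric these libraries compute, here is a minimal plain-Python sketch of a ROUGE-1-style unigram recall score. This is not the rag-evaluator API, just an assumed, self-contained example of comparing a generated response against a reference answer; real ROUGE implementations also report precision and F1 and handle n-grams beyond unigrams.

```python
import re
from collections import Counter

def _tokens(text: str) -> list[str]:
    # Lowercase and keep alphanumeric runs; a deliberately simple tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())

def rouge1_recall(generated: str, reference: str) -> float:
    """Fraction of reference unigrams that also appear in the generated text,
    counting repeated words at most as often as they occur in each side."""
    gen_counts = Counter(_tokens(generated))
    ref_counts = Counter(_tokens(reference))
    overlap = sum(min(gen_counts[w], c) for w, c in ref_counts.items())
    total = sum(ref_counts.values())
    return overlap / total if total else 0.0

response = "The capital of France is Paris."
reference = "Paris is the capital of France."
print(rouge1_recall(response, reference))  # every reference word appears: 1.0
```

A metric like this captures lexical overlap only; the BERT Score and MAUVE metrics mentioned above exist precisely because paraphrases with little word overlap can still be semantically correct.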

About RAG-evaluation-harnesses

RulinShao/RAG-evaluation-harnesses

An evaluation suite for Retrieval-Augmented Generation (RAG).

Scores updated daily from GitHub, PyPI, and npm data.