rag-evaluator and RAG-evaluation-harnesses

rag-evaluator: 56 (Established)
  Maintenance 0/25, Adoption 12/25, Maturity 25/25, Community 19/25

RAG-evaluation-harnesses: 28 (Experimental)
  Maintenance 2/25, Adoption 6/25, Maturity 9/25, Community 11/25
rag-evaluator:
  Stars: 42 | Forks: 18 | Downloads: 65 | Commits (30d): 0
  Language: Python | License: MIT
  Flags: Stale (6 months)

RAG-evaluation-harnesses:
  Stars: 23 | Forks: 3 | Downloads: n/a | Commits (30d): 0
  Language: Python | License: MIT
  Flags: Stale (6 months), No Package, No Dependents

About rag-evaluator

AIAnytime/rag-evaluator

A library for evaluating Retrieval-Augmented Generation (RAG) systems (The traditional ways).

Computes eleven evaluation metrics including BLEU, ROUGE, BERT Score, METEOR, and MAUVE to assess generated responses across semantic similarity, fluency, readability, and bias dimensions. Provides both a Python API for programmatic evaluation and a Streamlit web interface for interactive analysis. Designed for end-to-end RAG pipeline assessment without requiring external model APIs.
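To illustrate the kind of reference-based metric these libraries compute, here is a minimal plain-Python sketch of a ROUGE-1-style unigram recall score. This is not the rag-evaluator API, just an assumed, self-contained example of comparing a generated response against a reference answer; real ROUGE implementations also report precision and F1 and handle n-grams beyond unigrams.

```python
import re
from collections import Counter

def _tokens(text: str) -> list[str]:
    # Lowercase and keep alphanumeric runs; a deliberately simple tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())

def rouge1_recall(generated: str, reference: str) -> float:
    """Fraction of reference unigrams that also appear in the generated text,
    counting repeated words at most as often as they occur in each side."""
    gen_counts = Counter(_tokens(generated))
    ref_counts = Counter(_tokens(reference))
    overlap = sum(min(gen_counts[w], c) for w, c in ref_counts.items())
    total = sum(ref_counts.values())
    return overlap / total if total else 0.0

response = "The capital of France is Paris."
reference = "Paris is the capital of France."
print(rouge1_recall(response, reference))  # every reference word appears: 1.0
```

A metric like this captures lexical overlap only; the BERT Score and MAUVE metrics mentioned above exist precisely because paraphrases with little word overlap can still be semantically correct.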

About RAG-evaluation-harnesses

RulinShao/RAG-evaluation-harnesses

An evaluation suite for Retrieval-Augmented Generation (RAG).

Scores updated daily from GitHub, PyPI, and npm data.