DocAILab/XRAG
XRAG: eXamining the Core - Benchmarking Foundational Component Modules in Advanced Retrieval-Augmented Generation
XRAG provides modular benchmarking for RAG systems through pluggable retrievers (vector, BM25, hybrid, tree-based), embeddings, and LLMs. Its evaluation metrics span traditional scores (F1, NDCG), LLM-based judgments (faithfulness, correctness), and deeper evaluation dimensions. It also implements agentic RAG workflows via five orchestrator types (sequential, conditional, iterative, parallel, hybrid) and integrates with OpenAI APIs, local models (Qwen, LLaMA via Ollama), and vector databases for end-to-end evaluation pipelines.
Stars: 120
Forks: 18
Language: Python
License: Apache-2.0
Last pushed: Mar 07, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/DocAILab/XRAG"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
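The curl call above can also be made from Python. A minimal sketch using only the standard library is shown below; the `X-API-Key` header name and the JSON response shape are assumptions, not documented here — only the endpoint URL comes from the listing above.

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/rag"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a given GitHub repo."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str, api_key: str = "") -> dict:
    """Fetch repo quality data as a dict.

    Passing an API key raises the rate limit from 100 to 1,000
    requests/day; the header name `X-API-Key` is an assumption.
    """
    req = urllib.request.Request(quality_url(owner, repo))
    if api_key:
        req.add_header("X-API-Key", api_key)  # assumed header name
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

For example, `fetch_quality("DocAILab", "XRAG")` requests the same URL as the curl command above.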
Related tools
HZYAI/RagScore
⚡️ The "1-Minute RAG Audit" — Generate QA datasets & evaluate RAG systems in Colab, Jupyter, or...
vectara/open-rag-eval
RAG evaluation without the need for "golden answers"
AIAnytime/rag-evaluator
A library for evaluating Retrieval-Augmented Generation (RAG) systems (The traditional ways).
microsoft/benchmark-qed
Automated benchmarking of Retrieval-Augmented Generation (RAG) systems
2501Pr0ject/RAGnarok-AI
Local-first RAG evaluation framework for LLM applications. 100% local, no API keys required.