HZYAI/RagScore
⚡️ The "1-Minute RAG Audit" — Generate QA datasets & evaluate RAG systems in Colab, Jupyter, or CLI. Privacy-first, async, visual reports.
Supports both local (Ollama) and cloud LLM providers with structured, JSON-based QA generation, enabling fully private evaluation for sensitive domains. Multi-metric evaluation breaks down RAG performance across five dimensions (correctness, completeness, relevance, conciseness, and faithfulness), all computed in a single LLM call. Audience-targeted QA generation tailors assessments to specific user groups such as developers, customers, and auditors. Works as a Python API, CLI tool, or MCP server with async processing, and integrates with any RAG endpoint via HTTP.
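To make the "single LLM call" design concrete, here is a minimal sketch of that kind of multi-metric judge against a local Ollama server (one of the providers the project supports). It is a hypothetical illustration of the technique, not RagScore's actual API: the function name, prompt wording, model, and endpoint are all assumptions.

import asyncio
import json

import httpx

JUDGE_PROMPT = """You are grading a RAG answer. Given the question, the
retrieved context, and the answer, return a JSON object with integer scores
from 1 to 5 for exactly these keys: correctness, completeness, relevance,
conciseness, faithfulness.

Question: {question}
Context: {context}
Answer: {answer}
"""

async def score_answer(question: str, context: str, answer: str) -> dict:
    # One request scores all five dimensions at once; "format": "json"
    # asks Ollama to constrain the model's output to valid JSON.
    prompt = JUDGE_PROMPT.format(question=question, context=context, answer=answer)
    async with httpx.AsyncClient(timeout=120.0) as client:
        r = await client.post(
            "http://localhost:11434/api/generate",  # default Ollama endpoint
            json={"model": "llama3", "prompt": prompt,
                  "stream": False, "format": "json"},
        )
        r.raise_for_status()
        return json.loads(r.json()["response"])  # the five-score dict

scores = asyncio.run(score_answer(
    "What license does RagScore use?",
    "RagScore is released under the Apache-2.0 license.",
    "It is Apache-2.0 licensed.",
))
print(scores)  # e.g. {"correctness": 5, "completeness": 5, ...}

Keeping the judge fully local like this is what makes the privacy-first claim workable for sensitive domains: no question, context, or answer ever leaves the machine.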
30 stars and 1,052 monthly downloads. Used by 1 other package. Available on PyPI.
Stars: 30
Forks: 5
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 13, 2026
Monthly downloads: 1,052
Commits (30d): 0
Dependencies: 8
Reverse dependents: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/HZYAI/RagScore"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
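For scripted access, the same endpoint works from Python. A minimal sketch using the anonymous tier (how a key is passed is not documented here, so the example omits it), assuming the endpoint returns the stats above as JSON:

import requests

# Endpoint copied from the curl example above.
url = "https://pt-edge.onrender.com/api/v1/quality/rag/HZYAI/RagScore"
resp = requests.get(url, timeout=30)
resp.raise_for_status()
print(resp.json())  # assumed to mirror the stats listed on this page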
Related tools
vectara/open-rag-eval: RAG evaluation without the need for "golden answers"
2501Pr0ject/RAGnarok-AI: Local-first RAG evaluation framework for LLM applications. 100% local, no API keys required.
DocAILab/XRAG: eXamining the Core - Benchmarking Foundational Component Modules in Advanced...
AIAnytime/rag-evaluator: A library for evaluating Retrieval-Augmented Generation (RAG) systems (The traditional ways).
microsoft/benchmark-qed: Automated benchmarking of Retrieval-Augmented Generation (RAG) systems