xmpuspus/kb-arena
Benchmark 7 retrieval strategies on your own docs — naive vector, contextual, QnA pairs, knowledge graph, RAPTOR, PageIndex, and hybrid. Find which KB architecture fits your data.
Implements 8 retrieval strategies (including BM25, knowledge graphs, and RAPTOR) that run in parallel with pluggable LLM backends (Anthropic, OpenAI, Ollama), and auto-generates multi-tier benchmark questions from your documents. Ships a bundled React dashboard with a strategy Arena mode for blind A/B comparison, per-strategy cost tracking, and CI/CD integration via `--fail-below` thresholds. Designed for architecture selection rather than pipeline evaluation.
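The `--fail-below` threshold implies a simple CI gating pattern: compare the best strategy's benchmark score against a floor and fail the job if it falls short. A minimal sketch of that pattern in Python, assuming a score-per-strategy mapping; this is illustrative, not kb-arena's actual output format or implementation:

```python
def gate(scores: dict, fail_below: float) -> int:
    """Return a CI exit code: 0 if the best-scoring strategy meets the
    threshold, 1 otherwise. Mirrors the --fail-below idea; the score
    dictionary shape is an assumption, not kb-arena's real schema."""
    best_name, best_score = max(scores.items(), key=lambda kv: kv[1])
    print(f"best strategy: {best_name} ({best_score:.2f})")
    return 0 if best_score >= fail_below else 1

# Example: hybrid clears a 0.70 floor, so the gate passes (exit code 0).
exit_code = gate({"naive_vector": 0.61, "raptor": 0.68, "hybrid": 0.74}, 0.70)
```

In a pipeline, the returned code would be passed to `sys.exit()` so a sub-threshold run marks the build as failed.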
Available on PyPI.
Stars: 6
Forks: 2
Language: Python
License: MIT
Last pushed: Mar 20, 2026
Monthly downloads: 518
Commits (30d): 0
Dependencies: 19
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/xmpuspus/kb-arena"
Open to everyone: 100 requests/day with no key required; a free key raises the limit to 1,000/day.
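The same endpoint can be queried programmatically. A short Python sketch using only the standard library; the `quality_url` helper and the `Authorization: Bearer` header name are assumptions for illustration, since the API's key-passing convention is not documented here:

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality/rag"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a given repository."""
    return f"{BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str, api_key=None) -> dict:
    """Fetch the quality record as a dict. The header used to pass
    an API key is an assumption, not documented behavior."""
    req = urllib.request.Request(quality_url(owner, repo))
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")  # assumed header
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

print(quality_url("xmpuspus", "kb-arena"))
```

Swapping in another `owner`/`repo` pair queries a different tool's record against the same endpoint.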
Related tools
beir-cellar/beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across...
superlinear-ai/raglite
🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL
HKUDS/LightRAG
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
illuin-tech/vidore-benchmark
Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper.
HKUDS/RAG-Anything
"RAG-Anything: All-in-One RAG Framework"