beir-cellar/beir
A heterogeneous benchmark for information retrieval: an easy-to-use framework for evaluating your models across 15+ diverse IR datasets.
Supports multiple retrieval paradigms including lexical (BM25), dense (neural embeddings), sparse, and reranking-based approaches through a unified evaluation interface. Provides standardized implementations for diverse retrieval architectures and integrates with Hugging Face models, enabling zero-shot evaluation across heterogeneous tasks spanning different domains and query-document characteristics. Includes preprocessing pipelines for custom datasets and computes multiple metrics (NDCG, MAP, Recall, Precision, MRR) at standard cutoff points.
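The cutoff metrics mentioned above can be illustrated with a minimal from-scratch sketch. This is not BEIR's own implementation (BEIR delegates metric computation to pytrec_eval); the `ranked_ids` and `qrels` names are hypothetical placeholders for a single query's ranking and relevance judgments:

```python
import math

def ndcg_at_k(ranked_ids, qrels, k=10):
    """NDCG@k with graded relevance and the standard log2 discount.
    ranked_ids: doc IDs in retrieved order; qrels: doc ID -> relevance grade."""
    dcg = sum(
        qrels.get(doc_id, 0) / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
    )
    ideal = sorted(qrels.values(), reverse=True)[:k]
    idcg = sum(rel / math.log2(rank + 2) for rank, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

def recall_at_k(ranked_ids, qrels, k=10):
    """Fraction of relevant documents retrieved in the top k."""
    relevant = {doc_id for doc_id, rel in qrels.items() if rel > 0}
    if not relevant:
        return 0.0
    return len(relevant & set(ranked_ids[:k])) / len(relevant)

# A ranking that places all relevant documents first scores NDCG@10 = 1.0.
print(ndcg_at_k(["d1", "d3", "d2"], {"d1": 1, "d3": 1}))   # 1.0
print(recall_at_k(["d1", "d2", "d3"], {"d1": 1, "d3": 1})) # 1.0
```

In practice these scores are averaged over all queries in a dataset, which is what BEIR reports per cutoff.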
2,105 stars and 33,641 monthly downloads. Used by 7 other packages. Available on PyPI.
Stars: 2,105
Forks: 235
Language: Python
License: Apache-2.0
Category:
Last pushed: Oct 16, 2025
Monthly downloads: 33,641
Commits (30d): 0
Dependencies: 3
Reverse dependents: 7
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/beir-cellar/beir"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
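The same endpoint can be called from Python using only the standard library. A minimal sketch, assuming the endpoint returns JSON (the response shape is not documented here, and `quality_url`/`fetch_quality` are hypothetical helper names):

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/rag"

def quality_url(repo: str) -> str:
    """Build the API URL for an owner/name repo slug."""
    return f"{API_BASE}/{repo}"

def fetch_quality(repo: str) -> dict:
    """Fetch quality data for a repo; assumes a JSON response body."""
    with urllib.request.urlopen(quality_url(repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(fetch_quality("beir-cellar/beir"))
```

With no API key this stays within the 100 requests/day anonymous limit; a free key raises it to 1,000/day.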
Related tools
HKUDS/LightRAG
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
superlinear-ai/raglite
🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL
HKUDS/RAG-Anything
"RAG-Anything: All-in-One RAG Framework"
illuin-tech/vidore-benchmark
Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper.
DataScienceUIBK/Rankify
🔥 Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented...