RulinShao/retrieval-scaling
Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".
Implements an efficient pipeline for building and serving retrieval-augmented language models at massive scale. Supports both dense retrievers (Contriever, SentenceTransformers) and sparse methods (BM25), with optional semantic chunking and memory-efficient passage loading via hashing. Provides distributed datastore construction and retrieval with optimized indexing (IVF-Flat, IVF-PQ) for sub-30ms API latency, plus integrated evaluation of perplexity and downstream tasks via an adapted lm-evaluation-harness. Published 1.4T-token and 140B-token datastores are available on Hugging Face.
Stars
224
Forks
18
Language
Python
License
MIT
Category
Last pushed
Dec 16, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/RulinShao/retrieval-scaling"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
denser-org/denser-retriever
An enterprise-grade AI retriever designed to streamline AI integration into your applications,...
rayliuca/T-Ragx
Enhancing Translation with RAG-Powered Large Language Models
neuml/rag
🚀 Retrieval Augmented Generation (RAG) with txtai. Combine search and LLMs to find insights with...
NovaSearch-Team/RAG-Retrieval
Unify Efficient Fine-tuning of RAG Retrieval, including Embedding, ColBERT, ReRanker.
MozerWang/Loong
[EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA