pat-jj/s3

[EMNLP'25] s3 - ⚡ Efficient & Effective Search Agent Training via RL for RAG (RLVR for Search with Minimal Data)

54 / 100 (Established)

Trains search agents independently of the generator using reinforcement learning, so the resulting RAG system stays modular and works with any black-box LLM without retraining the language model itself. Integrates vLLM for efficient LLM serving, Pyserini for retrieval, and FAISS for dense vector search, with support for multiple corpora (Wikipedia, MedCorp) and retrieval backends (E5 embeddings, BM25). Includes pre-configured baselines (RAG, DeepRetrieval, Search-R1, IRCoT) and provides checkpoint initialization via naive RAG caching to reduce training data requirements.
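The decoupling described above can be illustrated with a minimal sketch. It is not s3's actual code or API: `KeywordRetriever`, `SearchAgent`, and `answer` are hypothetical names, the retriever is a toy keyword matcher standing in for a BM25 or FAISS backend, and the agent's "policy" is a fixed rule where a trained policy would sit. The point is only that the generator is an opaque callable that the search side never modifies.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol


class Retriever(Protocol):
    """Anything that maps a query to a ranked list of passages."""

    def search(self, query: str, k: int) -> List[str]: ...


@dataclass
class KeywordRetriever:
    """Toy stand-in (assumption) for a real BM25 or dense-vector backend."""

    corpus: List[str]

    def search(self, query: str, k: int) -> List[str]:
        terms = set(query.lower().split())
        # Score each passage by how many query terms it shares.
        scored = [(len(terms & set(doc.lower().split())), doc) for doc in self.corpus]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [doc for score, doc in ranked if score > 0][:k]


@dataclass
class SearchAgent:
    """The only component that would be trained; the generator stays frozen."""

    retriever: Retriever
    max_turns: int = 2

    def gather(self, question: str, k: int = 2) -> List[str]:
        context: List[str] = []
        query = question
        for _ in range(self.max_turns):
            new_docs = [d for d in self.retriever.search(query, k) if d not in context]
            if not new_docs:
                break  # nothing new was found, so stop searching
            context.extend(new_docs)
            # A learned policy would rewrite `query` here; this sketch reuses it.
        return context


def answer(question: str, agent: SearchAgent, generator: Callable[[str], str]) -> str:
    """Build a prompt from the gathered passages and hand it to a black-box LLM."""
    context = agent.gather(question)
    prompt = "\n".join(context) + f"\n\nQuestion: {question}"
    return generator(prompt)
```

Because `generator` is just a callable from prompt to text, swapping in a different LLM requires no change to the search agent, which is the modularity the description refers to.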

820 stars.

No Package, No Dependents
Maintenance 6 / 25
Adoption 10 / 25
Maturity 15 / 25
Community 23 / 25


Stars: 820
Forks: 137
Language: Python
License: Apache-2.0
Last pushed: Nov 05, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/pat-jj/s3"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
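The same request can be made from Python using only the standard library. This is a sketch under assumptions: the `category/owner/repo` path layout is inferred from the single curl example above, the response is assumed to be JSON (the page does not say), and `quality_url` and `fetch_quality` are hypothetical helper names.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    segments = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(s, safe="") for s in segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the endpoint returns a JSON body. If it does not,
    json.load will raise a ValueError.
    """
    with urllib.request.urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_quality("rag", "pat-jj", "s3")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, it is worth caching results locally rather than re-fetching on every run.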