pat-jj/s3
[EMNLP'25] s3 - ⚡ Efficient & Effective Search Agent Training via RL for RAG (RLVR for Search with Minimal Data)
Trains search agents independently of the generator using reinforcement learning, enabling modular RAG systems that work with any black-box LLM without retraining the language model itself. Integrates vLLM for efficient LLM serving, Pyserini for retrieval, and FAISS for dense vector search, with support for multiple corpora (Wikipedia, MedCorp) and retrieval backends (e5 embeddings, BM25). Includes pre-configured baselines (RAG, DeepRetrieval, Search-R1, IRCoT) and provides checkpoint initialization via naive RAG caching to reduce training data requirements.
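The separation of searcher and generator described above can be illustrated with a minimal sketch. Every name here (`KeywordSearcher`, `answer`, `echo_generator`) is hypothetical and does not come from the s3 codebase; the toy keyword ranker merely stands in for a trained search agent, and the generator is treated as an opaque callable that is never modified.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical type alias: a black-box generator maps (question, docs) -> answer.
Generator = Callable[[str, List[str]], str]


@dataclass
class KeywordSearcher:
    """Toy stand-in for a trained search agent: ranks documents by term overlap."""
    corpus: List[str]
    top_k: int = 2

    def search(self, question: str) -> List[str]:
        terms = set(question.lower().split())
        # Stable sort: documents with equal overlap keep their corpus order.
        ranked = sorted(
            self.corpus,
            key=lambda doc: len(terms & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[: self.top_k]


def answer(question: str, searcher: KeywordSearcher, generator: Generator) -> str:
    """Only the searcher would be trained; the generator is called as-is."""
    docs = searcher.search(question)
    return generator(question, docs)


def echo_generator(question: str, docs: List[str]) -> str:
    """Placeholder for a frozen LLM: returns the top retrieved document."""
    return docs[0] if docs else ""
```

Because `answer` depends only on the generator's call signature, any other callable with the same shape could be swapped in without touching the searcher, which is the modularity the description refers to.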
Stars: 820
Forks: 137
Language: Python
License: Apache-2.0
Category:
Last pushed: Nov 05, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/pat-jj/s3"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
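For scripted access, the same request can be sketched in Python using only the standard library. This rests on two assumptions not stated on this page: that the URL follows the pattern `/quality/<category>/<owner>/<repo>` (inferred from the single curl example) and that the endpoint returns JSON. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the pattern beyond it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL, keeping the '/' in 'owner/name' unescaped."""
    return f"{BASE_URL}/{quote(category)}/{quote(repo, safe='/')}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0) -> dict:
    """Fetch the record for one repository (assumes a JSON response body)."""
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("rag", "pat-jj/s3")` would issue the same request as the curl command; since unauthenticated access is limited to 100 requests/day, caching results locally is advisable when polling many repositories.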
Related tools
deepsense-ai/ragbits
Building blocks for rapid development of GenAI applications
infiniflow/ragflow
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses...
GiovanniPasq/agentic-rag-for-dummies
A modular Agentic RAG built with LangGraph — learn Retrieval-Augmented Generation Agents in minutes.
truefoundry/cognita
RAG (Retrieval Augmented Generation) Framework for building modular, open source applications...
NVIDIA/context-aware-rag
Context-Aware RAG library for Knowledge Graph ingestion and retrieval functions.