pat-jj/s3

[EMNLP'25] s3 - ⚡ Efficient & Effective Search Agent Training via RL for RAG (RLVR for Search with Minimal Data)

/ 100

Established

Trains search agents independently from generators using reinforcement learning, enabling modular RAG systems that work with any black-box LLM without retraining the language model itself. Integrates vLLM for efficient LLM serving, Pyserini for retrieval, and FAISS for dense vector search, with support for multiple corpora (Wikipedia, MedCorp) and retrieval backends (E5 embeddings, BM25). Includes pre-configured baselines (RAG, DeepRetrieval, Search-R1, IRCoT) and initializes training from cached naive RAG results to reduce the amount of training data required.
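The separation of searcher and generator described above can be pictured with a minimal sketch. Everything in it is a hypothetical illustration and not the s3 API: `KeywordSearcher` is a toy stand-in for a trained search agent, and `echo_generator` is a placeholder for a frozen black-box LLM. The only point is that the generator is an opaque callable, so the searcher can be replaced or retrained without touching it.

```python
from typing import Callable, List

# Any black-box LLM fits this shape: prompt in, text out.
Generator = Callable[[str], str]


class KeywordSearcher:
    """Toy stand-in for a trained search agent (hypothetical, not the s3 API)."""

    def __init__(self, corpus: List[str]) -> None:
        self.corpus = corpus

    def search(self, question: str, k: int = 2) -> List[str]:
        terms = set(question.lower().split())
        # Rank documents by how many whitespace-separated terms they share
        # with the question; ties keep the original corpus order.
        ranked = sorted(
            self.corpus,
            key=lambda doc: len(terms & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:k]


def answer(question: str, searcher: KeywordSearcher,
           generator: Generator, k: int = 2) -> str:
    """Retrieve context with the searcher, then call the frozen generator."""
    docs = searcher.search(question, k=k)
    prompt = "Context:\n" + "\n".join(docs) + f"\n\nQuestion: {question}\nAnswer:"
    return generator(prompt)


def echo_generator(prompt: str) -> str:
    """Placeholder for a frozen LLM: returns the first context line."""
    return prompt.splitlines()[1]


corpus = [
    "FAISS performs dense vector search.",
    "BM25 is a sparse lexical ranking function.",
    "vLLM serves large language models efficiently.",
]
searcher = KeywordSearcher(corpus)
result = answer("What does FAISS search?", searcher, echo_generator)
print(result)
```

In a setup like this, only the searcher would be trained; the generator callable could wrap any hosted or local model.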


No package, no dependents

Maintenance 6 / 25

Adoption 10 / 25

Maturity 15 / 25

Community 23 / 25


Stars: 820

Forks: 137

Language: Python

License: Apache-2.0

Related tools

deepsense-ai/ragbits

Building blocks for rapid development of GenAI applications

infiniflow/ragflow

RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses...

GiovanniPasq/agentic-rag-for-dummies

A modular Agentic RAG built with LangGraph — learn Retrieval-Augmented Generation Agents in minutes.

truefoundry/cognita

RAG (Retrieval Augmented Generation) Framework for building modular, open source applications...

NVIDIA/context-aware-rag

Context-Aware RAG library for Knowledge Graph ingestion and retrieval functions.
