Retrieval-Augmented Generation NLP Tools
Tools and frameworks implementing RAG systems that combine document retrieval with LLM-based generation for knowledge-base question answering, semantic search, and context-aware responses. Does NOT include general information retrieval, semantic search without LLM integration, or knowledge graph construction without retrieval components.
This category tracks 31 retrieval-augmented generation tools. Three score above 50 (established tier). The highest-rated is web-arena-x/webarena at 55/100 with 1,398 stars.
Get all 31 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=retrieval-augmented-generation&limit=20"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
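The same request can be issued from a script. The following is a minimal Python sketch, assuming only the endpoint and query parameters shown in the curl command; the helper names `build_quality_url` and `fetch_quality` are hypothetical, and the shape of the JSON response is not documented here, so the parsed value is returned unchanged. Note that the curl command uses `limit=20` while 31 projects are tracked; whether the API accepts a larger `limit` is an assumption to verify.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_quality_url(domain="nlp",
                      subcategory="retrieval-augmented-generation",
                      limit=20):
    """Build the dataset URL with the same query parameters as the curl command.

    The defaults mirror the command above. Passing a larger limit (for example
    31) is an untested assumption about the API.
    """
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_quality(url, timeout=30):
    """GET the URL and parse the response body as JSON.

    The response schema is not documented in this listing, so the parsed
    object is returned as-is for the caller to inspect.
    """
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (performs a network request and counts against the daily quota):
#   data = fetch_quality(build_quality_url())
#   print(json.dumps(data, indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to check the query string before spending any of the 100 unauthenticated requests per day.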
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | web-arena-x/webarena<br>Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" | | Established |
| 2 | nabeelxy/syara<br>SYARA: Super YARA Rules for GenAI Era | | Established |
| 3 | princeton-nlp/WebShop<br>[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with... | | Established |
| 4 | X-LANCE/Mobile-Env<br>A Universal Platform for Training and Evaluation of Mobile Interaction | | Emerging |
| 5 | shbernal/pdfanki<br>Create Anki decks from PDF/EPUB files using NLP with LLMs. | | Emerging |
| 6 | dinhanhx/cpu-ish-rag<br>A very CPU-friendly RAG implementation | | Experimental |
| 7 | princeton-nlp/lwm<br>We develop world models that can be adapted with natural language.... | | Experimental |
| 8 | zimingyou01/DatawiseAgent<br>[EMNLP 2025] DatawiseAgent: A Notebook-Centric LLM Agent Framework for... | | Experimental |
| 9 | dinhanhx/cakewalk-rag<br>A very simple RAG implementation | | Experimental |
| 10 | qdon314/obsidian-vault-RAG<br>Production-Oriented Retrieval-Augmented-Generation against Regulatory Corpora | | Experimental |
| 11 | poojakira/semantic-rag-engine<br>Production-grade RAG pipeline with LangChain, ChromaDB, and semantic search.... | | Experimental |
| 12 | TheAleAle/rag-light<br>A lightweight local RAG system. | | Experimental |
| 13 | jitinkrishnan/NASA-SE<br>A Virtual Assistant for NASA's Systems Engineers (AAAI-MAKE '19 '20) | | Experimental |
| 14 | antononcube/Python-LLMTextualAnswer<br>Python package for finding textual answers via LLMs. | | Experimental |
| 15 | khoj-ai/lantern<br>Lantern manages a waitlist for Khoj. It used to be a lot more, but now it's simple! | | Experimental |
| 16 | hager51/Chatbot<br>Question Answering System For COVID-19 Questions Using NLP Techniques | | Experimental |
| 17 | DominicMukilan/ithkuil-grammar-copilot<br>RAG+validation system demonstrating LLM accuracy improvement from 65% to 95% | | Experimental |
| 18 | Navy10021/AegisRAG<br>A Retrieval-Augmented Threat Analysis Framework with Meta-Evaluation and... | | Experimental |
| 19 | ank090/OpenAi-Models<br>Open Ai is AI based company which is responsible to create some of the most... | | Experimental |
| 20 | deletexiumu/dar-ontology-reasoning<br>Ontology-driven clinical reasoning for rare disease perioperative... | | Experimental |
| 21 | Dhanunjaya18/AI-Classroom-Doubt-Detector<br>Developed an AI-powered Classroom Doubt Detection system that analyzes... | | Experimental |
| 22 | pnvang/llm-agent-rails<br>Rails engine for LLM-powered agents — slot filling, tool orchestration, and... | | Experimental |
| 23 | pahul0303/llassist<br>A tool for processing and analyzing research articles using NLP and Large... | | Experimental |
| 24 | KrishnaNarkhede/NLP-DataProcessor<br>LLM Based Data Analyst Assistant : AI-powered data analysis platform that... | | Experimental |
| 25 | NMHelmy/taming-llms-with-groq-api<br>A Groq API-powered LLM content classifier that analyzes text sentiment... | | Experimental |
| 26 | isurulkh/Small-Language-Model<br>SmallDisMed is a fine-tuned GPT-2 model pre-trained on a medical dataset for... | | Experimental |
| 27 | AnushkaAn/Research_tool<br>An AI-powered financial research tool that extracts structured insights,... | | Experimental |
| 28 | geistmond/wizard-language<br>Game world event language for multi-agent system. | | Experimental |
| 29 | wyu97/RACo<br>Resources for Retrieval Augmentation for Commonsense Reasoning: A Unified... | | Experimental |
| 30 | fx2y/LinguaFlow<br>LinguaFlow - A customizable and conversable system for next-generation large... | | Experimental |
| 31 | 0AlphaZero0/ROR-proto-EMBL<br>ROR prototype developed during the FREYA project | | Experimental |