Retrieval-Augmented Generation NLP Tools

Tools and frameworks implementing RAG systems that combine document retrieval with LLM-based generation for knowledge-base question answering, semantic search, and context-aware responses. Does NOT include general information retrieval, semantic search without LLM integration, or knowledge graph construction without retrieval components.

31 retrieval-augmented generation tools are tracked. 3 score 50 or above (established tier). The highest-rated is web-arena-x/webarena at 55/100 with 1,398 stars.

Get the projects as JSON (the request below sets limit=20)

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=retrieval-augmented-generation&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
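The same request can be made from Python. The following is a minimal sketch using only the standard library. The helper names `build_url` and `fetch_projects` are illustrative, not part of any published client, and the shape of the JSON response is not documented on this page, so the sketch returns the decoded payload without assuming particular fields.

```python
import json
import urllib.parse
import urllib.request

# Endpoint copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="retrieval-augmented-generation", limit=20):
    """Build the request URL with the query parameters used in the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and decode the JSON payload (hypothetical helper).

    The response structure is an unknown here, so the decoded object is
    returned as-is for the caller to inspect.
    """
    with urllib.request.urlopen(build_url(limit=limit), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Print the URL only; call fetch_projects() to perform the network request.
    print(build_url())
```

Unauthenticated calls count against the 100 requests/day allowance, so it may be worth caching the result locally rather than refetching.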

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | web-arena-x/webarena | Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" | 55 | Established |
| 2 | nabeelxy/syara | SYARA: Super YARA Rules for GenAI Era | 52 | Established |
| 3 | princeton-nlp/WebShop | [NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with... | 50 | Established |
| 4 | X-LANCE/Mobile-Env | A Universal Platform for Training and Evaluation of Mobile Interaction | 37 | Emerging |
| 5 | shbernal/pdfanki | Create Anki decks from PDF/EPUB files using NLP with LLMs. | 33 | Emerging |
| 6 | dinhanhx/cpu-ish-rag | A very CPU-friendly RAG implementation | 27 | Experimental |
| 7 | princeton-nlp/lwm | We develop world models that can be adapted with natural language.... | 26 | Experimental |
| 8 | zimingyou01/DatawiseAgent | [EMNLP 2025] DatawiseAgent: A Notebook-Centric LLM Agent Framework for... | 26 | Experimental |
| 9 | dinhanhx/cakewalk-rag | A very simple RAG implementation | 23 | Experimental |
| 10 | qdon314/obsidian-vault-RAG | Production-Oriented Retrieval-Augmented-Generation against Regulatory Corpora | 23 | Experimental |
| 11 | poojakira/semantic-rag-engine | Production-grade RAG pipeline with LangChain, ChromaDB, and semantic search.... | 22 | Experimental |
| 12 | TheAleAle/rag-light | A lightweight local RAG system. | 22 | Experimental |
| 13 | jitinkrishnan/NASA-SE | A Virtual Assistant for NASA's Systems Engineers (AAAI-MAKE '19 '20) | 20 | Experimental |
| 14 | antononcube/Python-LLMTextualAnswer | Python package for finding textual answers via LLMs. | 19 | Experimental |
| 15 | khoj-ai/lantern | Lantern manages a waitlist for Khoj. It used to be a lot more, but now it's simple! | 17 | Experimental |
| 16 | hager51/Chatbot | Question Answering System For COVID-19 Questions Using NLP Techniques | 16 | Experimental |
| 17 | DominicMukilan/ithkuil-grammar-copilot | RAG+validation system demonstrating LLM accuracy improvement from 65% to 95% | 16 | Experimental |
| 18 | Navy10021/AegisRAG | A Retrieval-Augmented Threat Analysis Framework with Meta-Evaluation and... | 16 | Experimental |
| 19 | ank090/OpenAi-Models | Open Ai is AI based company which is responsible to create some of the most... | 14 | Experimental |
| 20 | deletexiumu/dar-ontology-reasoning | Ontology-driven clinical reasoning for rare disease perioperative... | 14 | Experimental |
| 21 | Dhanunjaya18/AI-Classroom-Doubt-Detector | Developed an AI-powered Classroom Doubt Detection system that analyzes... | 14 | Experimental |
| 22 | pnvang/llm-agent-rails | Rails engine for LLM-powered agents: slot filling, tool orchestration, and... | 12 | Experimental |
| 23 | pahul0303/llassist | A tool for processing and analyzing research articles using NLP and Large... | 12 | Experimental |
| 24 | KrishnaNarkhede/NLP-DataProcessor | LLM Based Data Analyst Assistant : AI-powered data analysis platform that... | 12 | Experimental |
| 25 | NMHelmy/taming-llms-with-groq-api | A Groq API-powered LLM content classifier that analyzes text sentiment... | 11 | Experimental |
| 26 | isurulkh/Small-Language-Model | SmallDisMed is a fine-tuned GPT-2 model pre-trained on a medical dataset for... | 11 | Experimental |
| 27 | AnushkaAn/Research_tool | An AI-powered financial research tool that extracts structured insights,... | 11 | Experimental |
| 28 | geistmond/wizard-language | Game world event language for multi-agent system. | 11 | Experimental |
| 29 | wyu97/RACo | Resources for Retrieval Augmentation for Commonsense Reasoning: A Unified... | 11 | Experimental |
| 30 | fx2y/LinguaFlow | LinguaFlow - A customizable and conversable system for next-generation large... | 11 | Experimental |
| 31 | 0AlphaZero0/ROR-proto-EMBL | ROR prototype developed during the FREYA project | 10 | Experimental |
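The tier labels above appear to follow score cutoffs. The sketch below tallies the listed scores per tier under assumed thresholds. The Established cutoff of 50 matches the listing (a score of 50 is labeled Established). The Emerging cutoff of 30 is an assumption: the listing only shows that the boundary lies somewhere above 27 and at or below 33. The names `tier_for` and `tier_counts` are illustrative.

```python
from collections import Counter

# Scores copied from the listing above, in rank order (31 entries).
SCORES = [
    55, 52, 50, 37, 33, 27, 26, 26, 23, 23, 22, 22, 20, 19, 17, 16,
    16, 16, 14, 14, 14, 12, 12, 12, 11, 11, 11, 11, 11, 11, 10,
]

ESTABLISHED_MIN = 50  # consistent with the listing: 50 is labeled Established
EMERGING_MIN = 30     # ASSUMPTION: the real cutoff is not stated on this page


def tier_for(score):
    """Map a 0-100 score to a tier label under the assumed cutoffs."""
    if score >= ESTABLISHED_MIN:
        return "Established"
    if score >= EMERGING_MIN:
        return "Emerging"
    return "Experimental"


def tier_counts(scores):
    """Count how many scores fall into each tier."""
    return Counter(tier_for(s) for s in scores)


if __name__ == "__main__":
    for tier, count in tier_counts(SCORES).items():
        print(f"{tier}: {count}")
```

Under these assumed cutoffs, the tally reproduces the labels in the listing: 3 Established, 2 Emerging, and 26 Experimental.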