Ingestion RAG Tools

There are 37 ingestion tools tracked. The highest-rated is veyliss/ai-localbase at 48/100 with 133 stars.

Get the projects as JSON (the request below sets limit=20, so a larger limit may be needed to return all 37):

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=rag&subcategory=ingestion&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
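
The same request can be made from a script. The following is a minimal sketch using only the Python standard library; the helper names `build_url` and `fetch` are hypothetical, and because the response schema is not described on this page, the parsed JSON is returned unchanged rather than unpacked into fields.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="rag", subcategory="ingestion", limit=20):
    """Build the request URL with the same query parameters as the curl command."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    Assumption: the endpoint returns UTF-8 encoded JSON. The structure of
    the payload is not documented here, so it is returned as-is.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (performs a network request and counts against the daily quota):
# data = fetch(build_url())
# print(json.dumps(data, indent=2, ensure_ascii=False))
```

Keeping URL construction separate from the network call makes it easy to check the query string without spending any of the daily request quota.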

| # | Tool | Description | Score | Tier |
|---|---|---|---|---|
| 1 | veyliss/ai-localbase | A local-first AI knowledge base system (RAG) that connects local documents to assisted search and LLM chat workflows. Currently supports md, txt, and pdf (text) files. | 48 | Emerging |
| 2 | Cloud2BR-MSFTLearningHub/RAG-ChatBot-Implementation | This repository contains an example of a RAG chat bot with a basic architecture... | 45 | Emerging |
| 3 | LakshmiSravyaVedantham/RAG-Based-Chatbot-with-Streamlit | Chat with any document (PDF, CSV, DOCX) using RAG: LangChain + Streamlit + OpenAI | 39 | Emerging |
| 4 | LLMSystems/file2md | file2md is a versatile tool for converting multiple file formats to Markdown. | 28 | Experimental |
| 5 | Tendo33/markio | A powerful document processing service that seamlessly converts a wide range... | 28 | Experimental |
| 6 | tushar10sh/NimbusPDF | Private, self-hosted PDF reader with an offline-capable AI assistant. 🔒 100%... | 24 | Experimental |
| 7 | however-yir/ai-demo | Spring AI demo backend with chat, tool calling, multimodal input, PDF RAG,... | 23 | Experimental |
| 8 | trenknerpeter/mdspin | Document to Markdown converter for AI workflows; try it at https://mdspin.app | 22 | Experimental |
| 9 | Nufeen/pdf-rag | Local RAG over a PDF collection | 22 | Experimental |
| 10 | VesperArch/rag-ingestion-benchmark | Benchmark: GopherDoc (Go) vs LangChain (Python) - 340× throughput, 3.3× less... | 22 | Experimental |
| 11 | mulkatz/mulder | Config-driven Document Intelligence Platform on GCP. PDFs → Knowledge Graph,... | 22 | Experimental |
| 12 | PatienceQi/sge_lightrag | SGE: Structure-Guided Extraction for GraphRAG - faithful graph construction... | 22 | Experimental |
| 13 | harshbhanushali26/hArI | AI-powered PDF & CSV analysis assistant using Groq LLM, ChromaDB, and RAG... | 22 | Experimental |
| 14 | Sanya003/Scribe | I read your PDFs so you don’t have to. 👀 | 21 | Experimental |
| 15 | CHIRABRATA/vagacore | VagaCore: Context-aware NLP engine for extracting structured, time-aware... | 19 | Experimental |
| 16 | sniperx-19/rag-chatbot | Chat with multiple PDFs locally | 19 | Experimental |
| 17 | DhruvShah510/ai-meeting-assistant | AI-powered meeting assistant that summarizes transcripts, extracts action... | 19 | Experimental |
| 18 | jmatias2411/RAG | 🧠 Query your PDFs with local AI using LangChain, Ollama, and Streamlit. Upload... | 19 | Experimental |
| 19 | shivaacodes/document-rag-service | FastAPI RAG microservice for document ingestion and contextual content... | 19 | Experimental |
| 20 | a2Fsa2k/eigen | MS Edge PDF viewer, but simply superior | 19 | Experimental |
| 21 | drewid74/ai_skills | AI skills and workflow templates for Claude Code, Copilot, Gemini, any AI... | 18 | Experimental |
| 22 | leosantos2003/Sabia-QA-System-on-Scientific-Articles | Question-Answer RAG-based system with Sabiá on scientific articles in PDF format. | 17 | Experimental |
| 23 | EtheXReal/basiclaw-rag | RAG demo that turns the Hong Kong Basic Law PDF into a FAISS + Redis... | 17 | Experimental |
| 24 | sam-k0/ExamGen | Generate exam questions based on slides, notes or other PDFs. Answer and... | 17 | Experimental |
| 25 | reezuleanu/pdf_deconstructor | Decompose a PDF file based on its headers for RAG ingestion. | 17 | Experimental |
| 26 | MsheesAI/CortexDocs | A smart PDF summarization tool built with AI to convert large documents into... | 16 | Experimental |
| 27 | deBUGger404/navexa-docs | Navexa Docs: documentation site for the Navexa PDF/document processing... | 16 | Experimental |
| 28 | DevPedroGomes/voice_rag | Voice RAG: Upload PDFs and ask questions with voice-powered answers.... | 16 | Experimental |
| 29 | AnshumanMahanta/Cyra-Analytics | Cyra Analytics is a RAG-based CSV analyzer for automated dataset profiling and... | 15 | Experimental |
| 30 | am2998/RAG-cli | Local-first RAG CLI that ingests documents, stores embeddings in Qdrant, and... | 15 | Experimental |
| 31 | seantlee88/ai-operations-copilot | AI document assistant that summarizes estimates, extracts costs, timelines,... | 14 | Experimental |
| 32 | willweimike/RAGAgent | Agentic PDF RAG with LangGraph & Ollama | 14 | Experimental |
| 33 | wahhabriaz/rag-chat-pro | RAG chatbot for PDF Q&A with switchable AI providers and Streamlit UI | 14 | Experimental |
| 34 | Rehman110-F/docmind | Full-stack RAG application: chat with your PDF documents using Google... | 14 | Experimental |
| 35 | aseseri/agent-society-user-simulation | LLM-based User Simulation Agent for the AgentSociety Challenge. Features... | 13 | Experimental |
| 36 | MekdelawitGebre/student-notes-rag | RAG application that lets students upload PDF notes and ask questions using... | 13 | Experimental |
| 37 | Hema4640/AI-Document-Assistant | AI-powered PDF Question Answering System using RAG, LangChain, and ChromaDB | 13 | Experimental |