Document Chunking Embedding Pipelines Vector Databases

65 document chunking and embedding pipeline tools are tracked. The highest-rated is Siddhant-K-code/distill, at 46/100 with 136 stars.

Get all 65 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=vector-db&subcategory=document-chunking-embedding-pipelines&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
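The same request can be made from a script. The sketch below only rebuilds the URL from the three query parameters visible in the curl call (`domain`, `subcategory`, `limit`) and decodes the response as JSON. The helper names `build_url` and `fetch` are hypothetical, the response shape is not documented here and so is not assumed, and whether a `limit` above 20 or any paging parameter is accepted is not stated on this page.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Endpoint taken verbatim from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain: str, subcategory: str, limit: int) -> str:
    """Assemble the query URL from the parameters used in the curl call."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch(url: str, timeout: float = 10.0):
    """GET the URL and decode the body as JSON; no response schema is assumed."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Build the same URL as the curl example; no network request is made here.
url = build_url("vector-db", "document-chunking-embedding-pipelines", 20)
print(url)
```

Calling `fetch(url)` performs the actual request, and each call counts against the daily quota described above.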

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | Siddhant-K-code/distill | Reliable LLM outputs start with clean context. Deterministic deduplication,... | 46 | Emerging |
| 2 | louisbrulenaudet/ragoon | High level library for batched embeddings generation, blazingly-fast... | 42 | Emerging |
| 3 | pesu-dev/ask-pesu | A RAG pipeline for question answering about PES University | 38 | Emerging |
| 4 | namtroi/RAGBase | Open Source RAG ETL Platform. Turns PDFs, Docs & Slides into queryable... | 33 | Emerging |
| 5 | B-A-M-N/FlockParser | Distributed document RAG system with intelligent GPU/CPU orchestration.... | 33 | Emerging |
| 6 | aws-samples/rag-with-amazon-postgresql-using-pgvector-and-sagemaker | Question Answering application with Large Language Models (LLMs) and Amazon... | 30 | Emerging |
| 7 | PerciValXIII/CAFB-food-wise-ai | AI-powered content automation tool for the Capital Area Food Bank (CAFB),... | 28 | Experimental |
| 8 | aws-samples/rag-with-amazon-opensearch-and-sagemaker | Question Answering Generative AI application with Large Language Models... | 25 | Experimental |
| 9 | CarlosManuelDiaz/rag-ready-extractor | Stop indexing noise. Turn messy websites and PDFs into clean, structured... | 25 | Experimental |
| 10 | libraryofcelsus/LLM_File_Parser | AutoML/Unstructured Data Processing for RAG and LLM Dataset Creation. ... | 23 | Experimental |
| 11 | Daddy-Myth/D-RAGon_System | Local Retrieval-Augmented Generation (RAG) system for PDF question answering... | 22 | Experimental |
| 12 | devangvyas-it/fastapi-rag-starter | Lightweight, self-contained RAG application built with FastAPI. It enables... | 22 | Experimental |
| 13 | mpessis/rag-doc-search | Semantic search over technical documentation using natural language. RAG... | 22 | Experimental |
| 14 | tuitige/fijian-rag-app | Public-benefit GenAI platform for the Fijian language, combining Claude +... | 22 | Experimental |
| 15 | Abdellatif404/Eigen-Field | A local Retrieval-Augmented Generation (RAG) system for agricultural... | 22 | Experimental |
| 16 | Amayes985-stack/Mimir | Privacy-first RAG pipeline application that transforms personal documents... | 22 | Experimental |
| 17 | himaenshuu/Multi_modal_rag-application | A powerful, easy-to-use platform for question answering over documents, web... | 21 | Experimental |
| 18 | noaman680/rag-from-scratch | Production-ready RAG (Retrieval Augmented Generation) system built from... | 19 | Experimental |
| 19 | bharghavaram/rag-knowledge-assistant | A lightweight Retrieval-Augmented Generation (RAG) system for answering... | 19 | Experimental |
| 20 | tanmay271/RAG-Qdrant-AI | High-performance RAG pipeline engineered to eliminate LLM hallucinations... | 19 | Experimental |
| 21 | neehanthreddym/doc_query_rag | A basic RAG pipeline which uses gpt-oss-20b model to answer the user query... | 19 | Experimental |
| 22 | pashpashpash/python-rag-scaffold | A comprehensive RAG FastAPI service that handles document uploads and... | 19 | Experimental |
| 23 | josephsenior/Microbione | Multimodal RAG system for microbiome data analysis with cross-modal search,... | 19 | Experimental |
| 24 | Ashish-Abraham/DocWhisperer-Qdrant | A Retrieval-Augmented Generation (RAG) System for PDF Chat using Qdrant... | 18 | Experimental |
| 25 | Debasish-87/rag-based-document-qa | rag-based-document-qa is a Retrieval-Augmented Generation (RAG) based... | 17 | Experimental |
| 26 | johnIT56/STAR-RAG | STAR-RAG is a self-reflective, retrieval-augmented question answering system... | 16 | Experimental |
| 27 | B-A-M-N/FlockParser-legacy | Legacy version of FlockParser PDF processing system | 16 | Experimental |
| 28 | RoodyCode/rag | A modular, self-hosted RAG pipeline for building a private, searchable... | 16 | Experimental |
| 29 | ajitsingh98/Building-RAG-System-with-Deepseek-R1-Locally | This repository contains an end-to-end Retrieval-Augmented Generation (RAG)... | 16 | Experimental |
| 30 | olexmal/ragu | RAGU - Retrieval-Augmented Generation Universal. A privacy-focused RAG... | 15 | Experimental |
| 31 | LEADisDEAD/Vector-Forge | Production-style Retrieval-Augmented Generation (RAG) system with... | 15 | Experimental |
| 32 | gurbaj5124871/rag-app-deepseek | A RAG (Retrieval-Augmented Generation) application which combines... | 15 | Experimental |
| 33 | SrijanShovit/HomeoRAG | A RAG application to search documents for homeopathic remedies based on... | 15 | Experimental |
| 34 | rithunkp/RAG-Codebase | Retrieval-Augmented Generation (RAG) assistant that lets users ask natural... | 15 | Experimental |
| 35 | RijuSaha-01/RAG-Document-Assistant-with-Azure-Cosmos-DB | A RAG pipeline implementation using Azure Cosmos DB (MongoDB vCore) and... | 15 | Experimental |
| 36 | smoothemerson/ragscope | Q&A over documents using RAG (FastAPI + ChromaDB + Ollama + MLflow) | 15 | Experimental |
| 37 | RAK0152/doc-watch-rag | Async document watcher that keeps your RAG index hot. Automatically ingests... | 15 | Experimental |
| 38 | Mohamed-samy2/Arabic-Islamic-Assessment | This repository implements a compact, efficient Retrieval-Augmented... | 15 | Experimental |
| 39 | ramyasri-m/RAG_Property_Document_Pipeline | A RAG pipeline for property documents using Weaviate, sentence-transformers,... | 14 | Experimental |
| 40 | Boney-massiveness357/ragscope | Build a Q&A API that indexes PDFs and text using RAG, logging queries with... | 14 | Experimental |
| 41 | ashankgupta/rag-flow | A visual, node-based RAG (Retrieval-Augmented Generation) pipeline builder... | 14 | Experimental |
| 42 | razevedo1994/paper-rag-pipeline | A complete RAG ingestion pipeline for scientific papers. | 14 | Experimental |
| 43 | QuantumDrizzy/rag-scientific-papers | Full RAG pipeline over 30 seminal AI/ML papers · FAISS vector store · ReAct... | 14 | Experimental |
| 44 | felix-dowl/ResearchPal | Basic RAG pipeline for uploading documents and making natural language queries | 14 | Experimental |
| 45 | Abs01ute000/policymind-rag-showcase | Semantic search and RAG showcase built with FastAPI, ChromaDB,... | 14 | Experimental |
| 46 | Vaibhavii3/AI-Knowlendge-Base-RAG | Built a Retrieval-Augmented Generation system that allows users to upload... | 14 | Experimental |
| 47 | Selam1431/Rag-Document-Search | AI-powered document search system using Retrieval-Augmented Generation (RAG)... | 14 | Experimental |
| 48 | alunoshacker-beep/ragscope | Build an offline Q&A API using RAG to query PDFs and texts, with automated... | 14 | Experimental |
| 49 | tahamohmadf19-dev/rag-document-search | Document search with retrieval-augmented generation using FastAPI, Qdrant... | 14 | Experimental |
| 50 | jy02140251/rag-document-loader | Load documents for RAG pipelines: PDF, DOCX, HTML, Markdown. Smart chunking,... | 13 | Experimental |
| 51 | srinivas-sateesh/RAG-query-classifier | Smart Query Classifier to earn user trust and save $$$ | 13 | Experimental |
| 52 | shubham5027/RAG-Qwen-2.5-72b-instruct | I built a production-style RAG system focused on grounded generation, not... | 12 | Experimental |
| 53 | sjlewis25/rag-pipeline | Hybrid RAG pipeline with local/cloud LLM support for semantic document... | 12 | Experimental |
| 54 | ankit123nag/pdf-rag-assistant | Production-grade RAG backend for document ingestion and semantic retrieval... | 12 | Experimental |
| 55 | PrinceKay145/multiDocRAG | Multi-Document RAG System with source attribution and query logging | 12 | Experimental |
| 56 | raza242k5-sys/rag-ai-system | Retrieval-Augmented Generation (RAG) based Intelligent QA System using... | 12 | Experimental |
| 57 | Powerostad/talk_to_github | A Retrieval-Augmented Generation (RAG) system enabling natural language... | 11 | Experimental |
| 58 | DRJ-14/context-aware-email-assistant-RAG | RAG system to query Gmail Takeout (.mbox) with semantic search + local LLM... | 11 | Experimental |
| 59 | thendralmagudapathi/RAG-for-NCERT | A professional-grade Retrieval-Augmented Generation (RAG) system designed... | 11 | Experimental |
| 60 | GowriPriyanka27/adaptive-rag-auto-optimizer | Adaptive Retrieval-Augmented Generation (RAG) system with dynamic... | 11 | Experimental |
| 61 | bijay-odyssey/Personal-Knowledge-Base-RAG-API | Personal Knowledge Base RAG API – FastAPI-based RAG system for querying... | 11 | Experimental |
| 62 | AbhashK1/Verbo | RAG based document query system that performs OCR(Tesseract) for text... | 11 | Experimental |
| 63 | Farhaj499/RAG_with_Weaviate_DB | This project implements a Retrieval Augmented Generation (RAG) system that... | 11 | Experimental |
| 64 | tolios/XPL | A simple cli tool for RAG on documents | 10 | Experimental |
| 65 | daviaraujocc/rag-docs | A simple project about implementing RAG (Retrieval-Augmented Generation) for... | 10 | Experimental |