fredsiika/huxley-pdf

Upload personal docs and chat with your PDF files with this GPT-4-powered app. Built with LangChain and the Pinecone vector database, and deployed on Streamlit.

33 / 100 (Emerging)

Implements semantic search over PDF documents using FAISS vector indexing with OpenAI embeddings, enabling similarity-based retrieval before passing context to GPT-4 for question-answering. The architecture chunks PDFs with configurable overlap (400-char chunks, 80-char overlap) using LangChain's text splitters, then constructs a retrieval-augmented generation (RAG) pipeline that surfaces the most relevant document segments to answer user queries while tracking token usage via OpenAI callbacks.
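The chunk-then-retrieve flow described above can be sketched in plain Python. This is a simplified illustration, not the repository's code: `chunk_text` uses a fixed sliding window (LangChain's splitters also try to break on separators), and `top_k_chunks` ranks by word-count cosine similarity as a stand-in for OpenAI embeddings with a FAISS index. Both function names are hypothetical; only the 400/80 sizes come from the description.

```python
import math
from collections import Counter
from typing import List


def chunk_text(text: str, chunk_size: int = 400, overlap: int = 80) -> List[str]:
    """Split text into fixed-size character windows that share `overlap` chars."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap  # each window starts 320 chars after the last
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


def _word_counts(text: str) -> Counter:
    # Toy "embedding": lowercase bag of words (assumption, not a real embedding).
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query: str, chunks: List[str], k: int = 3) -> List[str]:
    """Return the k chunks most similar to the query, best match first."""
    q = _word_counts(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, _word_counts(c)), reverse=True)
    return ranked[:k]
```

In a RAG pipeline of this shape, the chunks returned by the retrieval step would be joined into the prompt sent to the language model together with the user's question.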

No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 9 / 25
Community 17 / 25

Stars 37
Forks 9
Language Python
License MIT
Last pushed Dec 15, 2024
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/vector-db/fredsiika/huxley-pdf"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
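
The same request can be made from Python with only the standard library. This is a minimal sketch under two assumptions: that the URL path follows the pattern category/owner/repo seen in the curl example, and that the endpoint returns JSON (the response schema is not documented here). `build_url` and `fetch_quality` are hypothetical helper names.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL (path layout inferred from the curl example)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record and decode it, assuming a JSON response body."""
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_quality("vector-db", "fredsiika", "huxley-pdf")` issues the same GET request as the curl command and counts against the same daily request limit.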