LexiestLeszek/scrapeGPT
ScrapeGPT is a RAG-based Telegram bot that scrapes and analyzes websites, then answers questions based on the scraped content. It uses Retrieval-Augmented Generation (RAG) and web scraping to return natural-language answers to the user's queries.
The project combines web scraping with embedding-based retrieval to build a searchable knowledge base from website content. It supports multiple deployment modes (Telegram bot, CLI, or Gradio UI) and flexible LLM backends, including local models via Ollama and remote APIs. It implements robots.txt compliance and rotating proxy support for ethical scraping, and stores indexed content in a database so that it can be reused across queries.
No commits in the last 6 months.
Stars: 87
Forks: 15
Language: Python
License: MIT
Category:
Last pushed: Feb 17, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/LexiestLeszek/scrapeGPT"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
gpt-open/rag-gpt
RAG-GPT, leveraging LLM and RAG technology, learns from user-customized knowledge bases to...
leon0204/fast-rag
LLM Rag Intelligent Q&A Robot
PatentTRIZbasedAI20260226110030/Patent-GPT
Patent-GPT is an Agentic RAG-based invention copilot combining TRIZ methodology with LLMs. It...
SujalKamate/Intel-Unnati-Industrial-Training-2025--Slot-3
Problem Statement-1: Multilingual NCERT Doubt-Solver using OPEA-based RAG Pipeline. A...
gptscript-ai/gptparse
Document parser for RAG