LexiestLeszek/scrapeGPT

ScrapeGPT is a RAG-based Telegram bot that scrapes and analyzes websites, then answers questions based on the scraped content. The bot uses Retrieval-Augmented Generation and web scraping to return natural-language answers to the user's queries.

42 / 100 (Emerging)

Combines web scraping with embedding-based retrieval to build a searchable knowledge base from website content, supporting multiple deployment modes (Telegram bot, CLI, or Gradio UI) and flexible LLM backends including local models via Ollama and remote APIs. Implements robots.txt compliance and rotating proxy support for ethical scraping, while storing indexed content in a database for persistent reuse across queries.
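The flow described above (check robots.txt, index scraped text, retrieve the most relevant chunk for a question) can be sketched roughly as follows. This is a minimal illustration, not scrapeGPT's actual code: the function names are hypothetical, the robots.txt text is passed in as a string rather than fetched, and a bag-of-words count vector stands in for a real embedding model.

```python
import math
import re
from collections import Counter
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model (assumption): word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical robots.txt and scraped chunks, for illustration only.
ROBOTS_TXT = "User-agent: *\nDisallow: /private/\n"
CHUNKS = [
    "Our office is open Monday to Friday.",
    "Shipping takes three to five business days.",
    "Returns are accepted within 30 days.",
]

if allowed_by_robots(ROBOTS_TXT, "demo-bot", "https://example.com/docs"):
    context = retrieve("How long does shipping take?", CHUNKS)
    print(context)
```

In a real pipeline the retrieved chunk would be passed to the LLM backend as context for the answer, and the vectors would be stored in a database so they can be reused across queries.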

No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 9 / 25
Maturity 16 / 25
Community 17 / 25

Stars: 87
Forks: 15
Language: Python
License: MIT
Last pushed: Feb 17, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/LexiestLeszek/scrapeGPT"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
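
The same request can be made from Python using only the standard library. This is a rough sketch under stated assumptions: the helper names are hypothetical, the category/owner/repo path layout is inferred from the single curl URL above, and the response is assumed (not confirmed here) to be JSON.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the path layout is inferred from the curl example."""
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the API returns JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    print(fetch_quality("rag", "LexiestLeszek", "scrapeGPT"))
```

With no key, the 100 requests/day limit applies, so caching the result locally is advisable when polling several repositories.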