intergalacticalvariable/reader
📚 An adapted version of Jina AI's Reader for local deployment with Docker. Convert any URL into LLM-friendly input by prefixing it with the local server address, for example: http://127.0.0.1:3000/https://website-to-scrape.com/
Extracts web content as markdown, HTML, text, or screenshots, with the output format selected via HTTP headers and screenshots stored locally rather than uploaded to the cloud. Built on Jina's Reader architecture and packaged with Docker for self-hosting, it requires no API keys and is reported to run on minimal hardware (0.5 GB RAM). Designed for RAG pipelines and LLM agents that need clean, structured web content without external dependencies.
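As a rough illustration of the URL-prefix pattern and header-based format selection described above, here is a minimal Python sketch. The base address comes from the example prefix; the `X-Respond-With` header name is an assumption carried over from upstream Jina Reader and should be checked against this fork's README. The helper names `build_reader_request` and `fetch_page` are hypothetical.

```python
from urllib.request import Request, urlopen

# Local address taken from the example prefix in the description.
READER_BASE = "http://127.0.0.1:3000/"


def build_reader_request(target_url: str, respond_with: str = "markdown") -> Request:
    """Build (but do not send) a request to a locally running Reader instance."""
    # The target URL is appended verbatim after the base address.
    reader_url = READER_BASE + target_url
    # Assumption: the output format is chosen with the X-Respond-With header,
    # as in upstream Jina Reader (e.g. markdown, html, text, screenshot).
    return Request(reader_url, headers={"X-Respond-With": respond_with})


def fetch_page(target_url: str, respond_with: str = "markdown", timeout: float = 30.0) -> str:
    """Send the request and return the response body as text."""
    request = build_reader_request(target_url, respond_with)
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

With the container running, `fetch_page("https://website-to-scrape.com/")` would return the page in the default format; building the request separately from sending it makes it easy to inspect the URL and headers without a live server.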
295 stars. No commits in the last 6 months.
Stars: 295
Forks: 55
Language: TypeScript
License: Apache-2.0
Category
Last pushed: Jul 18, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/intergalacticalvariable/reader"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
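The same request can be sketched in Python using only the standard library. The endpoint is copied from the curl command; reading the path as `quality/<category>/<owner>/<repo>` is an assumption, as are the hypothetical helper names `build_quality_url` and `fetch_quality`. The response format is not documented here, so the body is returned as raw text.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, category: str = "rag") -> str:
    """Assemble the endpoint URL; the category/owner/repo layout is an assumption."""
    # Percent-encode each path segment so unusual characters cannot break the URL.
    segments = [quote(part, safe="") for part in (category, owner, repo)]
    return API_BASE + "/" + "/".join(segments)


def fetch_quality(owner: str, repo: str, category: str = "rag", timeout: float = 30.0) -> str:
    """Fetch the raw response body; keyless access is limited to 100 requests/day."""
    with urlopen(build_quality_url(owner, repo, category), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch_quality("intergalacticalvariable", "reader")` issues the same GET request as the curl command.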
Higher-rated alternatives
any4ai/AnyCrawl
AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts...
ScrapeGraphAI/Scrapegraph-ai
Python scraper based on AI
adbar/trafilatura
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping,...
kreuzberg-dev/html-to-markdown
High performance and CommonMark compliant HTML to Markdown converter. Maintained by the...
lightfeed/extractor
Using LLMs and AI browser automation to robustly extract web data