mlibre/Clean-Web-Scraper
A Node.js web scraper that extracts clean, readable content from websites - perfect for AI/LLM training datasets. Features smart crawling, Mozilla Readability integration, and organized content storage 🤖
Stars: 3
Forks: —
Language: JavaScript
License: —
Category:
Last pushed: Oct 25, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/mlibre/Clean-Web-Scraper"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
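The same request can be issued from a script. The following is a minimal Python sketch using only the standard library. It assumes the endpoint returns a JSON body, which this page does not document, so no response fields are named. The helper names `quality_url` and `fetch_quality` are hypothetical and exist only for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for an owner/repo pair, percent-encoding each part."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Send a keyless GET request and decode the body.

    Assumption: the response is JSON. If it is not, json.loads raises
    a ValueError and the caller should inspect the raw body instead.
    """
    request = urllib.request.Request(
        quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("mlibre", "Clean-Web-Scraper")` performs one request against the keyless quota of 100 requests/day, so a script that polls many repositories may want to cache results locally.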
Higher-rated alternatives
carlosplanchon/spidercreator
Automated web scraping spider generation using Browser Use and LLMs. Streamline the creation of...
raznem/parsera
Lightweight library for scraping websites with LLMs
Riddhish1/CogniScrape
Intelligent Web Scraping Library with LLMs
poodle64/supacrawl
Zero-infrastructure web scraping for the terminal
rednafi/html-to-text
Extract pure text from any webpage