lorey/mlscraper
🤖 Scrape data from HTML websites automatically by just providing examples
Uses statistical heuristics to infer extraction rules from labeled training examples, so you do not have to write CSS selectors or specify DOM paths by hand. The library compares HTML structure across the samples to find patterns that fit the examples, then applies those rules to new pages. Built as a Python package for web scraping workflows that need little configuration.
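The idea of learning an extraction rule from examples can be illustrated with a toy sketch. This is not mlscraper's API or algorithm: the names `infer_rule` and `extract` are hypothetical, the "rule" is just a (tag, class) pair, and only the Python standard library is used. It assumes each sample is an HTML string paired with the exact text that should be extracted from it.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Record ((tag, class), text) for each text node, keyed by its enclosing element."""

    def __init__(self):
        super().__init__()
        self._stack = []    # open elements as (tag, class) pairs
        self.records = []   # ((tag, class), stripped text)

    def handle_starttag(self, tag, attrs):
        self._stack.append((tag, dict(attrs).get("class")))

    def handle_endtag(self, tag):
        # Pop until the matching open tag is found; this tolerates unclosed tags.
        while self._stack:
            if self._stack.pop()[0] == tag:
                break

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack:
            self.records.append((self._stack[-1], text))


def _records(html):
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    return parser.records


def infer_rule(samples):
    """Return a (tag, class) pair that locates the labeled text in every sample.

    `samples` is a list of (html, expected_text) pairs (hypothetical format).
    """
    candidates = None
    for html, expected in samples:
        keys = {key for key, text in _records(html) if text == expected}
        candidates = keys if candidates is None else candidates & keys
    if not candidates:
        raise ValueError("no (tag, class) pair matches every sample")
    # Pick deterministically if several candidates survive.
    return min(candidates, key=lambda k: (k[0], k[1] or ""))


def extract(rule, html):
    """Apply an inferred rule to a new page; return the first matching text or None."""
    for key, text in _records(html):
        if key == rule:
            return text
    return None
```

Given two product pages labeled with their prices, `infer_rule` keeps only the (tag, class) pairs that locate the labeled text in both, and `extract` reuses the surviving pair on an unseen page. A real tool has to cope with nested structure, lists of items, and ambiguous matches, which this sketch ignores.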
1,379 stars and 336 monthly downloads. No commits in the last 6 months. Available on PyPI.
Stars: 1,379
Forks: 93
Language: Python
License: —
Category: —
Last pushed: Mar 17, 2024
Monthly downloads: 336
Commits (30d): 0
Dependencies: 6
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/lorey/mlscraper"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
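The same request can be sketched in Python with the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, the category/owner/repo path layout is inferred only from the single URL shown above, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the three-segment layout is an assumption."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the quality record and decode it, assuming a JSON response body."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "lorey", "mlscraper")` would request the URL from the curl example. With the keyless limit of 100 requests/day, caching the result locally is probably worthwhile.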
Higher-rated alternatives
alirezamika/autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
YoongiKim/AutoCrawler
Google, Naver multiprocess image web crawler (Selenium)
machine-learning-apps/Issue-Label-Bot
Code For The Issue Label Bot, an App that automatically labels issues using machine learning,...
nuhmanpk/Webtrench
A powerful and easy-to-use web scrapper for collecting data from the web. Supports scraping of...
Tuhin-thinks/instagram-unfollower-tracker-meerkit
Analyze Instagram followers, find unfollowers, automate follow/unfollow, and predict follow-backs.