Web Scraping Tools ML Frameworks
Tools and frameworks for automatically extracting data from websites through web scraping, crawling, and HTML parsing. Does NOT include data cleaning libraries, NLP analysis tools, or downstream ML applications that use scraped data.
This category tracks 38 web scraping tools and frameworks. One scores above 50 (the Established tier). The highest-rated is alirezamika/autoscraper at 64/100, with 7,122 stars and 1,197 monthly downloads.
Get all 38 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=web-scraping-tools&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
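The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the function names `build_url` and `fetch_projects` are hypothetical, the query parameters are copied from the curl example above, and the response is returned as parsed JSON without assuming any particular keys, since the response shape is not documented here. Whether the endpoint accepts a `limit` larger than the 20 shown in the curl example is also an assumption.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="ml-frameworks", subcategory="web-scraping-tools", limit=20):
    """Build the query URL; defaults mirror the curl example (hypothetical helper)."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Perform the GET request and return the parsed JSON body as-is.

    No key is sent, which matches the keyless tier described above.
    The structure of the returned JSON is not assumed here.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` would issue the same request as the curl command. Keeping URL construction separate from the network call makes it easy to inspect the URL before spending one of the daily requests.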
| # | Framework | Score | Tier |
|---|---|---|---|
| 1 | alirezamika/autoscraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python | | Established |
| 2 | YoongiKim/AutoCrawler: Google, Naver multiprocess image web crawler (Selenium) | | Emerging |
| 3 | lorey/mlscraper: 🤖 Scrape data from HTML websites automatically by just providing examples | | Emerging |
| 4 | machine-learning-apps/Issue-Label-Bot: Code For The Issue Label Bot, an App that automatically labels issues using... | | Emerging |
| 5 | nuhmanpk/Webtrench: A powerful and easy-to-use web scrapper for collecting data from the web.... | | Emerging |
| 6 | Tuhin-thinks/instagram-unfollower-tracker-meerkit: Analyze Instagram followers, find unfollowers, automate follow/unfollow, and... | | Emerging |
| 7 | shaohua0116/ICLR2020-OpenReviewData: Script that crawls meta data from ICLR OpenReview webpage. Tutorials on... | | Emerging |
| 8 | NYX-VORAX/lightning-image-scraper: ⚡ Lightning-fast Python image scraper \| Download 10K+ images/min from any... | | Emerging |
| 9 | tal95shah/OLX_Scraper: 📻 An OLX Scraper using Scrapy + MongoDB. It Scrapes recent ads posted... | | Emerging |
| 10 | gridaco/figma-archives: Figma Files Scraper for Research & Studies | | Emerging |
| 11 | garysieling/video-crawler: Crawl websites for videos from Youtube, Vimeo, Soundcloud, etc | | Experimental |
| 12 | DevGlitch/botwizer: Social media AI bot using computer vision to imitate human behaviors. Final... | | Experimental |
| 13 | ganeshkavhar/Web-Scraping-in-python: ganesh kavhar python project | | Experimental |
| 14 | b1t0nese/MacLearn: A program that will assemble a high-quality dataset for you in a matter of minutes... | | Experimental |
| 15 | Tsujimar/tsuki-wscp: Web scraper for AI/ML training | | Experimental |
| 16 | zt8812/lightning-image-scraper: 🖼️ Download thousands of images fast with asynchronous scraping and... | | Experimental |
| 17 | udit-git/Python-WebScraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python | | Experimental |
| 18 | jigusp/urls-le: 🔗 Extract thousands of URLs per second from various formats like HTML, JSON,... | | Experimental |
| 19 | YafetGetu/Data_scraper-from-jiji-ethiopia: A professional web scraping tool for extracting product listings from the... | | Experimental |
| 20 | Eshtiaque/Multi-Agent-Instagram-bot: Instagram-bot | | Experimental |
| 21 | dpuentel/github-issues-labeller-cohere: This is a GitHub issue labeller. Insert the url of a repository and using... | | Experimental |
| 22 | ismailazdad/stackoverflowTags: flask website that automatically assigns multiple relevant tags to a... | | Experimental |
| 23 | OwenOrcan/YiraBot-Crawler: YiraBot: Simplifying Web Scraping for All. A user-friendly tool for... | | Experimental |
| 24 | Decodo/soundcloud-scraper: Scraper for SoundCloud that extracts audio metadata and download URLs using... | | Experimental |
| 25 | david2442/rscari: 📁 Read files from IPFS trustless gateways with an async API using the rs-car... | | Experimental |
| 26 | gmk418/Python-web-scraping: 🔍 Discover Python web scraping techniques, libraries, and examples to... | | Experimental |
| 27 | Mrsultan7890/crl: CRL Pure Python crawler The Semantic Web Crawler For AI & Security | | Experimental |
| 28 | bhavanaaroy-sketch/AI-Code-Complexity-Analyzer: AI-based tool to analyze code complexity using Python and Streamlit | | Experimental |
| 29 | gabryelvieiramusico/instagram-content-intelligence-pro: 📊 Transform Instagram content into actionable insights with AI-driven... | | Experimental |
| 30 | robertilepot/imgur-scraper: 📊 Collect Imgur posts, tags, comments, and user data effortlessly with this... | | Experimental |
| 31 | mate3424/easy-zoot-data-scraper: 🛍️ Scrape structured fashion product data effortlessly from multiple... | | Experimental |
| 32 | Gulilil/nusava: Development of Social media bot in Instagram, Nusava. | | Experimental |
| 33 | BlazeInferno64/ScrapyPy: ScrapyPy is a free, open-source, and powerful web scraping tool that... | | Experimental |
| 34 | ArtificialOSS/WebCrawl: Crawls the web to generate a huge dataset for training | | Experimental |
| 35 | bright-data-de/web-scraping-for-machine-learning: Scrape web data for machine learning, set up ETL pipelines... | | Experimental |
| 36 | ozgesadet/silver-invention: AI based tender finding | | Experimental |
| 37 | MaximumOverflow/Philia: An easy to use imageboard scraper. | | Experimental |
| 38 | Epsilon-Ventures/document-similarity-frontend: Major Project Frontend | | Experimental |