Web Scraping Tools ML Frameworks

Tools and frameworks for automatically extracting data from websites through web scraping, crawling, and HTML parsing. This category does not include data cleaning libraries, NLP analysis tools, or downstream ML applications that use scraped data.

38 web scraping tools and frameworks are tracked. One scores above 50 (Established tier). The highest-rated is alirezamika/autoscraper at 64/100, with 7,122 stars and 1,197 monthly downloads.

Get the projects as JSON (38 are tracked; the example request sets limit=20)

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=web-scraping-tools&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
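
The same request can be made from Python. The following is a minimal sketch using only the standard library; the endpoint and query parameters are copied from the curl command above, while the helper names `build_quality_url` and `fetch_quality` are hypothetical. The response schema is not documented on this page, so the sketch returns the parsed JSON as-is without assuming any field names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_quality_url(domain: str, subcategory: str, limit: int) -> str:
    """Build the request URL with the three query parameters from the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_quality(domain: str, subcategory: str, limit: int = 20, timeout: float = 30.0):
    """Fetch the listing and return the decoded JSON (structure not assumed here)."""
    url = build_quality_url(domain, subcategory, limit)
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("ml-frameworks", "web-scraping-tools", limit=20)
```

Whether a larger `limit` returns all 38 projects in one response is not stated here, so treat that as something to verify against the API.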

Each entry lists: rank and framework, description, score (out of 100), tier.
1 alirezamika/autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

64
Established
2 YoongiKim/AutoCrawler

Google, Naver multiprocess image web crawler (Selenium)

44
Emerging
3 lorey/mlscraper

🤖 Scrape data from HTML websites automatically by just providing examples

43
Emerging
4 machine-learning-apps/Issue-Label-Bot

Code For The Issue Label Bot, an App that automatically labels issues using...

42
Emerging
5 nuhmanpk/Webtrench

A powerful and easy-to-use web scraper for collecting data from the web....

41
Emerging
6 Tuhin-thinks/instagram-unfollower-tracker-meerkit

Analyze Instagram followers, find unfollowers, automate follow/unfollow, and...

35
Emerging
7 shaohua0116/ICLR2020-OpenReviewData

Script that crawls meta data from ICLR OpenReview webpage. Tutorials on...

34
Emerging
8 NYX-VORAX/lightning-image-scraper

⚡ Lightning-fast Python image scraper | Download 10K+ images/min from any...

33
Emerging
9 tal95shah/OLX_Scraper

📻 An OLX Scraper using Scrapy + MongoDB. It Scrapes recent ads posted...

31
Emerging
10 gridaco/figma-archives

Figma Files Scraper for Research & Studies

30
Emerging
11 garysieling/video-crawler

Crawl websites for videos from Youtube, Vimeo, Soundcloud, etc

29
Experimental
12 DevGlitch/botwizer

Social media AI bot using computer vision to imitate human behaviors. Final...

27
Experimental
13 ganeshkavhar/Web-Scraping-in-python

ganesh kavhar python project

24
Experimental
14 b1t0nese/MacLearn

A program that will collect a quality dataset for you in a matter of minutes...

23
Experimental
15 Tsujimar/tsuki-wscp

Web scraper for AI/ML training

22
Experimental
16 zt8812/lightning-image-scraper

🖼️ Download thousands of images fast with asynchronous scraping and...

22
Experimental
17 udit-git/Python-WebScraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

22
Experimental
18 jigusp/urls-le

🔗 Extract thousands of URLs per second from various formats like HTML, JSON,...

22
Experimental
19 YafetGetu/Data_scraper-from-jiji-ethiopia

A professional web scraping tool for extracting product listings from the...

20
Experimental
20 Eshtiaque/Multi-Agent-Instagram-bot

Instagram-bot

20
Experimental
21 dpuentel/github-issues-labeller-cohere

This is a GitHub issue labeller. Insert the url of a repository and using...

16
Experimental
22 ismailazdad/stackoverflowTags

flask website that automatically assigns multiple relevant tags to a...

15
Experimental
23 OwenOrcan/YiraBot-Crawler

YiraBot: Simplifying Web Scraping for All. A user-friendly tool for...

15
Experimental
24 Decodo/soundcloud-scraper

Scraper for SoundCloud that extracts audio metadata and download URLs using...

15
Experimental
25 david2442/rscari

📁 Read files from IPFS trustless gateways with an async API using the rs-car...

14
Experimental
26 gmk418/Python-web-scraping

🔍 Discover Python web scraping techniques, libraries, and examples to...

14
Experimental
27 Mrsultan7890/crl

CRL Pure Python crawler The Semantic Web Crawler For AI & Security

14
Experimental
28 bhavanaaroy-sketch/AI-Code-Complexity-Analyzer

AI-based tool to analyze code complexity using Python and Streamlit

14
Experimental
29 gabryelvieiramusico/instagram-content-intelligence-pro

📊 Transform Instagram content into actionable insights with AI-driven...

14
Experimental
30 robertilepot/imgur-scraper

📊 Collect Imgur posts, tags, comments, and user data effortlessly with this...

14
Experimental
31 mate3424/easy-zoot-data-scraper

🛍️ Scrape structured fashion product data effortlessly from multiple...

14
Experimental
32 Gulilil/nusava

Development of Social media bot in Instagram, Nusava.

13
Experimental
33 BlazeInferno64/ScrapyPy

ScrapyPy is a free, open-source, and powerful web scraping tool that...

12
Experimental
34 ArtificialOSS/WebCrawl

Crawls the web to generate a huge dataset for training

12
Experimental
35 bright-data-de/web-scraping-for-machine-learning

Scrape web data for machine learning, set up ETL pipelines...

11
Experimental
36 ozgesadet/silver-invention

AI based tender finding

11
Experimental
37 MaximumOverflow/Philia

An easy to use imageboard scraper.

11
Experimental
38 Epsilon-Ventures/document-similarity-frontend

Major Project Frontend

11
Experimental
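
The tier labels in the listing appear to follow score cutoffs. The sketch below is an inference, not the site's documented rule: the "above 50" cutoff for Established is stated in the summary, while the 30-point boundary between Emerging and Experimental is assumed from the listed scores (the lowest Emerging entry scores 30, the highest Experimental entry scores 29). No listed project scores between 45 and 50, so the behavior in that range is a guess. The function name `tier_for` is hypothetical.

```python
from collections import Counter


def tier_for(score: int) -> str:
    """Map a 0-100 score to a tier label using cutoffs inferred from the listing."""
    if score > 50:  # stated in the summary: "above 50" is the Established tier
        return "Established"
    if score >= 30:  # assumption: lowest Emerging score in the listing is 30
        return "Emerging"
    return "Experimental"  # assumption: highest Experimental score listed is 29


# Scores copied from the listing, in rank order (38 entries).
listed_scores = [
    64, 44, 43, 42, 41, 35, 34, 33, 31, 30,
    29, 27, 24, 23, 22, 22, 22, 22, 20, 20,
    16, 15, 15, 15, 14, 14, 14, 14, 14, 14,
    14, 13, 12, 12, 11, 11, 11, 11,
]

# Tally how many listed projects fall into each inferred tier.
tier_counts = Counter(tier_for(score) for score in listed_scores)
print(dict(tier_counts))
```

Under these assumed cutoffs the tally reproduces the labels shown in the listing: one Established, nine Emerging, and 28 Experimental projects.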

Comparisons in this category