Intelligent Web Data Extraction AI Agents

Tools that use AI agents to automatically extract, parse, and structure data from websites through natural language instructions and intent-based scraping. Does NOT include general web crawlers, SEO audit platforms, lead database services, or non-agentic scraping libraries.

45 intelligent web data extraction agents are tracked. One scores above 50 (Established tier). The highest-rated is vakra-dev/reader at 56/100, with 474 stars and 196 monthly downloads.

Get all 45 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=agents&subcategory=intelligent-web-data-extraction&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
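The same request can be made from Python. This is a minimal sketch using only the standard library. The endpoint and query parameters are copied from the curl command above; the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the returned JSON is not documented here, so the sketch returns the parsed payload without assuming any field names. The command above passes `limit=20`, so retrieving all 45 projects presumably needs a larger `limit`; that is an assumption about the API, not something this page confirms.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20):
    """Build the dataset URL with the query parameters shown on this page."""
    params = {
        "domain": "agents",
        "subcategory": "intelligent-web-data-extraction",
        "limit": limit,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and parse the JSON payload (hypothetical helper).

    The response schema is not documented on this page, so the parsed
    object is returned as-is.
    """
    with urlopen(build_url(limit), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Print the URL only; call fetch_projects() to hit the network.
    print(build_url())
```

Without a key the stated quota is 100 requests/day, so it is worth caching the response locally rather than refetching on every run.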

| # | Agent | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | vakra-dev/reader | Open-source, production-grade web scraping engine built for LLMs. Scrape and... | 56 | Established |
| 2 | joaobenedetmachado/scrapit | A (really) easy way to web scrape | 49 | Emerging |
| 3 | firecrawl/open-scouts | 🔥 AI-powered web monitoring platform. Create automated scouts that search... | 43 | Emerging |
| 4 | memvid/maw | Crawl any website into a single searchable file. Query it forever, offline. | 42 | Emerging |
| 5 | BrowserCash/teracrawl | High-performance web crawler API optimized for LLMs. Turn any search or... | 41 | Emerging |
| 6 | ma-pony/deepspider | Intelligent crawler engineering platform: an AI crawler agent based on DeepAgents + Patchright / Intelligent Web... | 40 | Emerging |
| 7 | jufeng-2022/mtywatch | Monitor web page content changes with a single sentence: AI, crawler, web page monitoring, page update alerts, web content subscription | 34 | Emerging |
| 8 | poneoneo/Alibaba-CLI-Scraper | Create your own Alibaba dataset and interact with it in plain English. | 33 | Emerging |
| 9 | oxylabs/ai-crawler-py | Crawl a website starting from a URL, find relevant pages, and extract data –... | 31 | Emerging |
| 10 | hmshb/scraping-agent-ai | AI-powered web scraping agent built with LangGraph, LangSmith, Firecrawl,... | 30 | Emerging |
| 11 | tinaponting/ai-robots-scrapers | AI robots.txt, AI scrapers block ai scrapers | 26 | Experimental |
| 12 | spider-rs/web-crawling-guides | How to guides on web-crawling or scraping | 24 | Experimental |
| 13 | ScrapeGraphAI/just-scrape | CLI for AI-powered web scraping, data extraction, search, and crawling ... | 24 | Experimental |
| 14 | 1nn0k3sh4/trendevourer | Trend Devourer 👗✨ AI-Powered Visual Style Analyst | 24 | Experimental |
| 15 | kaymen99/ai-web-scraper | AI web scraper built with Crawl4AI for extracting structured leads data from... | 24 | Experimental |
| 16 | isweerasingha/Auditeo-AI | An enterprise-grade, agentic website audit engine powered by GPT-5.4 and... | 23 | Experimental |
| 17 | Dieans/Universal-News-Scraper | 🌍 Scrape and aggregate news effortlessly with Universal News Scraper, your... | 23 | Experimental |
| 18 | ScrapeGraphAI/ScrapeHubAI | 🌟 AI-powered tool to analyze GitHub stargazers, identify companies, and... | 23 | Experimental |
| 19 | sirToby99/swipenode | Lightning-fast, zero-render web extraction CLI built for AI agents. Extracts... | 22 | Experimental |
| 20 | Chaitya44/AI-WebScraper | An intelligent, universal web scraper powered by Google Gemini AI. Features... | 22 | Experimental |
| 21 | Kaus-code/Neuroscout-oss | An autonomous AI agent powered by Gemini 2.5 Flash that scouts GitHub for... | 22 | Experimental |
| 22 | rbhatia1997/artist-scout | Open-source AI A&R toolkit for artist scouting, shortlist building, and... | 22 | Experimental |
| 23 | lout33/scout-oss | Local web research agent and mission-driven intelligence scanner that writes... | 22 | Experimental |
| 24 | phia-francis/nesta-signal-scout | An AI-powered foresight agent for Nesta's Discovery Hub. Signal Scout... | 20 | Experimental |
| 25 | NickEinstein1/Scrapper-Enricher | Scrapping Agent - CrewAI | 20 | Experimental |
| 26 | oxylabs/ai-scraper-py | AI Scraper is a powerful scraping tool and scrape agent built to automate... | 19 | Experimental |
| 27 | breezy89757/AgentScraper | AgentScraper: AI-Powered Web Scraper (v1.0) with Visual Extraction | 19 | Experimental |
| 28 | brightdata/trendscan | TrendScan is a multi-source company intelligence platform for automated... | 18 | Experimental |
| 29 | breezy89757/SmartScraper | 🤖 AI-Powered Web Scraper Generator - Turns URLs into Python code with... | 16 | Experimental |
| 30 | Musubi-ai/Musubi | Musubi: A convenient crawling tool for collecting web text data in Python. | 16 | Experimental |
| 31 | smoothemerson/scout | AI-powered multi-agent system that analyzes your GitHub profile and CV to... | 15 | Experimental |
| 32 | musadiq7860/AI_growth_auditor | AI-powered business growth audit tool - scrapes website, generates custom... | 15 | Experimental |
| 33 | rosasbehoundja/tech-trends-monitor | Automated RSS flux monitoring system | 15 | Experimental |
| 34 | nomiS0614/mtywatch | 📧 Monitor webpage content with AI and receive real-time updates on topics... | 14 | Experimental |
| 35 | Tomefy5/scout-agent | Autonomous AI Agent for B2B Lead Generation & Enrichment | 14 | Experimental |
| 36 | Ascentia-Sandbox/StartInsight | Daily automated startup intelligence: 6 scrapers (Reddit/HN/PH/Trends/X) → 8... | 14 | Experimental |
| 37 | stell619/scraper-agent | AI-powered research agent - scrapes YouTube, Etsy, crypto, stocks & trends... | 14 | Experimental |
| 38 | FlowExtractAPI/ai-lead-extractor | Extract any information from websites using intelligent AI - from contact... | 13 | Experimental |
| 39 | itallstartedwithaidea/google-ai-agent-audit-engine | AI-powered Google Ads audit engine - automated account analysis, scoring,... | 12 | Experimental |
| 40 | vinay-852/AI-Agent-for-Sheets | The primary objective of this project is to harness Google’s Generative AI... | 11 | Experimental |
| 41 | afrexai-cto/ai-ops-audit | Free AI operations audit checklist for mid-market companies. Score your... | 11 | Experimental |
| 42 | BraaMohammed/microwave-ai | Microwave AI is a chat-based AI agent for vibe data enrichment. Upload a... | 11 | Experimental |
| 43 | michalboryczko/crawler-generator-agent | Autonomous agent analyzes websites and generates production-ready crawling... | 11 | Experimental |
| 44 | Atqiyanabila01/AI-Lead-Scout | An AI-powered web research agent that crawls company data and generates... | 11 | Experimental |
| 45 | Hirsun/Website-Crawler | An HTML web page crawling service designed for AI agents that efficiently fetches web page content and cleans it. | 11 | Experimental |
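
The tier labels in this listing appear to follow the score. The sketch below shows one way to reproduce them, under stated assumptions. The only threshold the page states is that Established means a score above 50. The Emerging cutoff of 30 is inferred from the listing (the lowest Emerging score is 30 and the highest Experimental score is 26), so the real boundary could sit anywhere from 27 to 30. The function name `tier_for` and its parameters are hypothetical.

```python
def tier_for(score, established_above=50, emerging_from=30):
    """Map a 0-100 quality score to a tier label.

    established_above: stated on the page ("above 50").
    emerging_from: ASSUMED cutoff, inferred from the listed scores only.
    """
    if score > established_above:
        return "Established"
    if score >= emerging_from:
        return "Emerging"
    return "Experimental"


if __name__ == "__main__":
    # Scores taken from rows 1, 2, 10, and 11 of the listing.
    for score in (56, 49, 30, 26):
        print(score, tier_for(score))
```

Keeping both cutoffs as parameters makes the inferred boundary easy to adjust if the site publishes its actual tier rules.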