LLM Web Scraping (LLM Tools)

Tools for extracting and parsing structured data from websites using LLM-powered methods, including web crawlers, HTML extractors, and scraping APIs optimized for AI agent integration. Does NOT include general-purpose web scrapers without LLM integration, browser automation tools, or proxy/VPN services.

There are 33 LLM web scraping tools tracked. The highest-rated is carlosplanchon/spidercreator, which scores 46/100 and has 217 stars and 10 monthly downloads.

Get all 33 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-web-scraping&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
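The same request can be made from a script. The following is a minimal Python sketch that uses only the standard library and the endpoint and query parameters shown in the curl command above. The function names `build_url` and `fetch_projects` are hypothetical, and the response schema is not documented on this page, so the decoded JSON is returned untouched. The curl command passes `limit=20` while the page counts 33 projects; whether a higher limit returns all of them is not confirmed here, so `limit` is left as a parameter.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools", subcategory="llm-web-scraping", limit=20):
    """Build the dataset URL with the same query parameters as the curl command."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Request the dataset and decode the JSON body.

    Assumption: the endpoint answers a plain unauthenticated GET with JSON,
    as the curl example suggests. No schema is assumed for the result.
    """
    with urllib.request.urlopen(build_url(limit=limit), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_projects()` performs one request against the daily quota described above. Inspecting the returned value, for example with `json.dumps(data, indent=2)`, is a reasonable first step before relying on any field names.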

# | Tool | Description | Score | Tier
1 | carlosplanchon/spidercreator | Automated web scraping spider generation using Browser Use and LLMs.... | 46 | Emerging
2 | raznem/parsera | Lightweight library for scraping web-sites with LLMs | 41 | Emerging
3 | Riddhish1/CogniScrape | Intelligent Web Scraping Library with LLMs | 39 | Emerging
4 | poodle64/supacrawl | Zero-infrastructure web scraping for the terminal | 38 | Emerging
5 | rednafi/html-to-text | Extract pure text from any webpage | 37 | Emerging
6 | yeahhe365/JustSearch | Autonomous AI search agent built on Playwright. Supports iterative task planning, deep web crawling, and multi-source knowledge synthesis with cited sources. | 37 | Emerging
7 | supadata-ai/js | Official TypeScript/JavaScript SDK for the Supadata API. | 34 | Emerging
8 | ElysiumOSS/enterprise-ai-recursive-web-scraper | AI assisted web scraper, w/ content summarization, screenshots, and filter 🤖🕷️ | 33 | Emerging
9 | SiluPanda/agent-crawl | High performance, lightweight and typesafe library to crawl and scrape web,... | 32 | Emerging
10 | cipher-rc5/fire_ctrl | Spec-compliant self-hosted Firecrawl v2 runtime in native Rust | 29 | Experimental
11 | AndreaBozzo/Ares | Next-gen AI scraper - LLM-powered structured data extraction | 25 | Experimental
12 | cameronking4/nextjs-firecrawl-starter | Nextjs 15 Firecrawl app to scrape doc links for an LLM. Use it as a starter... | 25 | Experimental
13 | us/crw | ⚡Lightweight Firecrawl alternative in Rust - 91.5% coverage, 5x faster, 3MB... | 25 | Experimental
14 | lee-lou2/distill | High-performance Rust-based web scraper & LLM analysis API server | 25 | Experimental
15 | rowyio/LLM-Web-Crawler | Web Scraper and Crawler for LLM Apps and AI Workflows with NoCode / LowCode.... | 25 | Experimental
16 | sammcj/firecrawler | A lightweight frontend for self-hosted Firecrawl instances | 24 | Experimental
17 | plater7/docrawl | Web crawler for documentation sites - converts pages to Markdown... | 23 | Experimental
18 | firecrawl/firecrawl-py | Crawl and convert any website into clean markdown | 23 | Experimental
19 | flyrank-bih/flyscrape | The Most Powerful Open-source LLM Friendly Typescript Web Crawler & Scraper | 23 | Experimental
20 | kubernetes-bad/metachar | Scraper for Chub.ai and JanitorAI.com | 23 | Experimental
21 | Daedae147/flyscrape | 🕷️ Streamline web scraping and crawling with FlyScrape, the Node.js package... | 22 | Experimental
22 | TheFishPilot/Verity-Agentic-Web-Scraper | Verity API for verified web extraction in AI pipelines (Fastify +... | 22 | Experimental
23 | iamagirlwithtechnicalmonstermind/firecrawl-swift-sdk | 🔥 Scrape, crawl, search, extract, and map websites with the powerful... | 22 | Experimental
24 | ruchit-p/essence | A fast, open-source web retrieval engine built in Rust. | 19 | Experimental
25 | greysquirr3l/stygian | High-performance graph-based web scraping engine + anti-detection browser... | 19 | Experimental
26 | Awin36/houzz-product-reviews-scraper | 🏠 Extract Houzz product reviews into structured data for easy analysis,... | 16 | Experimental
27 | Pankaj3112/pluckr | Schema-first, self-healing HTML extraction powered by LLMs | 15 | Experimental
28 | ChenTaHung/HTML-Text-Parser | This project is designed to extract text from documents and prepare it for... | 13 | Experimental
29 | aglasencnik/Parsera.NET | A lightweight NuGet package for the Parsera API, designed to simplify... | 12 | Experimental
30 | parsera-labs/parsera-ts | A Typesafe SDK for Scraping LLMs with Parsera.org and JavaScript | 11 | Experimental
31 | davidyen1124/ai-crawler | AI web scraper using GPT to dynamically optimize CSS selectors for reliable... | 11 | Experimental
32 | 1amageek/Scouter | A Swift library for recursive web content searching and link extraction... | 11 | Experimental
33 | mlibre/Clean-Web-Scraper | A Node.js web scraper that extracts clean, readable content from websites -... | 10 | Experimental