carlosplanchon/spidercreator
Automated web scraping spider generation using Browser Use and LLMs. Streamline the creation of Playwright-based spiders with minimal manual coding. Ideal for large enterprises with recurring data extraction needs.
Uses Browser Use to record interactive scraping sessions, then applies a multi-stage LLM pipeline to generate optimized XPath-based Playwright spiders that execute cheaply without further LLM calls. Integrates with Parsel for HTML parsing and includes a virtual execution environment (ctxexec) to validate candidate spider implementations before selecting the best performer for each navigation stage.
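The "validate candidates, keep the best performer" step described above can be illustrated with a minimal sketch. It is not spidercreator's actual ctxexec API: the snapshot, the candidate XPath expressions, the expected values, and every function name below are hypothetical. To stay dependency-free it uses the standard library's limited XPath support in `xml.etree.ElementTree` as a stand-in for Parsel.

```python
import xml.etree.ElementTree as ET

# Hypothetical recorded page snapshot (well-formed so the stdlib parser accepts it).
SNAPSHOT = """
<html><body>
  <div class="product"><h2>Widget</h2><span class="price">9.99</span></div>
  <div class="product"><h2>Gadget</h2><span class="price">19.50</span></div>
  <div class="ad"><h2>Buy now</h2></div>
</body></html>
""".strip()

# Hypothetical candidate XPath expressions, as an LLM stage might propose them.
CANDIDATES = [
    ".//h2",                        # too broad: also matches the ad heading
    ".//div[@class='product']/h2",  # matches only product titles
    ".//div[@class='item']/h2",     # wrong class: matches nothing
]

# Hypothetical ground truth observed during the recorded session.
EXPECTED = ["Widget", "Gadget"]


def run_candidate(xpath: str, html: str) -> list[str]:
    """Apply one candidate XPath to the snapshot and return the matched texts."""
    root = ET.fromstring(html)
    return [(el.text or "").strip() for el in root.findall(xpath)]


def score(extracted: list[str], expected: list[str]) -> float:
    """F1 score of extracted values against the expected values."""
    hits = len(set(extracted) & set(expected))
    if hits == 0:
        return 0.0
    precision = hits / len(extracted)
    recall = hits / len(expected)
    return 2 * precision * recall / (precision + recall)


def pick_best(candidates: list[str], html: str, expected: list[str]) -> str:
    """Return the candidate whose output best matches the expected values."""
    return max(candidates, key=lambda xp: score(run_candidate(xp, html), expected))


best = pick_best(CANDIDATES, SNAPSHOT, EXPECTED)
print(best, run_candidate(best, SNAPSHOT))
```

Once a winning expression is chosen, it can be replayed on later runs with no LLM call, which is where the low execution cost comes from.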
217 stars and 10 monthly downloads. No commits in the last 6 months. Available on PyPI.
Stars: 217
Forks: 22
Language: Python
License: AGPL-3.0
Category:
Last pushed: Aug 25, 2025
Monthly downloads: 10
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/carlosplanchon/spidercreator"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
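The same request can be made from Python with only the standard library. This is a rough sketch: the URL comes from the curl command above, but the meaning of the path segments (named `category`, `owner`, and `repo` here) and the assumption that the endpoint returns JSON are guesses, not documented behavior.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is parameterized.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the segment names are assumptions for illustration."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the listing data, assuming (unverified) that the response body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("llm-tools", "carlosplanchon", "spidercreator")
```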
Related tools
raznem/parsera
Lightweight library for scraping websites with LLMs
yeahhe365/JustSearch
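An autonomous AI search agent built on Playwright. Supports iterative task planning, deep web crawling, and multi-source knowledge synthesis with cited sources.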
Riddhish1/CogniScrape
Intelligent Web Scraping Library with LLMs
poodle64/supacrawl
Zero-infrastructure web scraping for the terminal
rednafi/html-to-text
Extract pure text from any webpage