joaobenedetmachado/scrapit

A (really) easy way to web scrape

Established

Defines scraping targets declaratively in YAML (selectors, transforms, validation, and output formats), so adding a new source requires no Python code. Supports five fetch backends (BeautifulSoup, Playwright for JavaScript-rendered pages, async httpx, GraphQL, Bright Data) with 28+ field transforms, pagination, spider discovery, and parallel crawling. Writes to eight output formats (JSON, CSV, SQLite, MongoDB, PostgreSQL, Excel, Google Sheets, Parquet) with optional webhooks, change detection, Redis caching, and a built-in web dashboard.
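To make the declarative idea concrete, here is a minimal standard-library sketch of a spec-driven scraper. It is not scrapit's actual schema or API: the `SPEC` layout, the transform names (`strip`, `to_float`), and the `scrape` helper are all hypothetical, and tag/class matching stands in for real CSS selectors. The point is only that a new source is described by data, while the extraction code stays unchanged.

```python
import json
import re
from html.parser import HTMLParser

# Hypothetical spec, written as a dict the way a YAML file might load.
# This is NOT scrapit's schema; the keys are assumptions for illustration.
SPEC = {
    "fields": {
        "title": {"tag": "h1", "transforms": ["strip"]},
        "price": {
            "tag": "span",
            "class": "price",
            "transforms": ["strip", "to_float"],
            "required": True,
        },
    }
}

# Named transforms that the spec refers to by string (hypothetical names).
TRANSFORMS = {
    "strip": str.strip,
    "to_float": lambda s: float(re.sub(r"[^\d.]", "", s)),
}


class FieldExtractor(HTMLParser):
    """Collects the text of the first element matching each field rule."""

    def __init__(self, fields):
        super().__init__()
        self.fields = fields
        self.current = None  # name of the field being captured, if any
        self.raw = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name, rule in self.fields.items():
            if name in self.raw:
                continue  # keep only the first match per field
            wanted = rule.get("class")
            if rule["tag"] == tag and (wanted is None or wanted in classes):
                self.current = name
                self.raw[name] = ""
                break

    def handle_data(self, data):
        if self.current is not None:
            self.raw[self.current] += data

    def handle_endtag(self, tag):
        if self.current is not None and self.fields[self.current]["tag"] == tag:
            self.current = None


def scrape(html, spec):
    """Extract, transform, and validate one record as described by spec."""
    parser = FieldExtractor(spec["fields"])
    parser.feed(html)
    parser.close()
    record = {}
    for name, rule in spec["fields"].items():
        value = parser.raw.get(name)
        if value is None:
            if rule.get("required"):
                raise ValueError(f"missing required field: {name}")
            record[name] = None
            continue
        for transform_name in rule.get("transforms", []):
            value = TRANSFORMS[transform_name](value)
        record[name] = value
    return record


if __name__ == "__main__":
    HTML = '<h1> Widget </h1><span class="price"> $19.99 </span>'
    print(json.dumps(scrape(HTML, SPEC)))  # {"title": "Widget", "price": 19.99}
```

In this sketch, supporting another page means editing `SPEC` only. A fuller tool would presumably replace the tag/class matcher with a real selector engine and add output writers beyond JSON.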

No Package

No Dependents

Maintenance 13 / 25

Adoption 8 / 25

Maturity 13 / 25

Community 19 / 25

Language: Python

License: MIT

Related agents

vakra-dev/reader

Open-source, production-grade web scraping engine built for LLMs. Scrape and crawl the entire...

firecrawl/open-scouts

🔥 AI-powered web monitoring platform. Create automated scouts that search the web and send email...

BrowserCash/teracrawl

High-performance web crawler API optimized for LLMs. Turn any search or website into clean...

memvid/maw

Crawl any website into a single searchable file. Query it forever, offline.

ma-pony/deepspider

Intelligent web scraping engineering platform: an AI scraping agent built on DeepAgents + Patchright | Intelligent Web Scraping Platform -...