any4ai/AnyCrawl

AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google, Bing, Baidu, and other search engines. Native multi-threading for bulk processing.

Score: 69 / 100 (Established)

Supports multiple scraping engines (Cheerio for static parsing, Playwright/Puppeteer for JavaScript rendering) and integrates Redis for caching and batch task management. Features LLM-powered JSON extraction via JSON Schema, enabling structured data generation directly from crawled pages without separate post-processing. Offers self-hosted deployment with Docker Compose and Bearer token authentication, alongside a REST API for web scraping, full-site crawling with path filtering, and multi-engine SERP extraction.
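
As a rough illustration of the JSON Schema extraction and Bearer token authentication described above, the following TypeScript sketch builds a scrape request for a self-hosted instance. The `/v1/scrape` path and the body field names (`url`, `engine`, `json_options`) are assumptions made for illustration and are not taken from this listing; the project's API documentation is the authority on the real request shape.

```typescript
// Hypothetical sketch: the endpoint path and body field names are assumptions.
type Engine = "cheerio" | "playwright" | "puppeteer";

interface ScrapeRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// JSON Schema describing the structured data to extract from a page.
const productSchema = {
  type: "object",
  properties: {
    title: { type: "string" },
    price: { type: "number" },
  },
  required: ["title"],
};

// Assemble the URL, headers, and JSON body without sending anything.
function buildScrapeRequest(
  baseUrl: string,
  apiKey: string,
  pageUrl: string,
  engine: Engine,
  schema: object,
): ScrapeRequest {
  return {
    // Strip trailing slashes so the path is joined cleanly.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/scrape`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ url: pageUrl, engine, json_options: { schema } }),
    },
  };
}
```

The result can be passed to `fetch(req.url, req.init)`. Choosing `cheerio` suits static pages, while `playwright` or `puppeteer` would be the choice when the page needs JavaScript rendering.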

2,763 stars. Actively maintained with 32 commits in the last 30 days.

No Package, No Dependents
Maintenance 23 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 20 / 25


Stars: 2,763
Forks: 289
Language: TypeScript
License: MIT
Last pushed: Mar 08, 2026
Commits (30d): 32

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/any4ai/AnyCrawl"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
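
The same request can be made from TypeScript. This is a minimal sketch that assumes the path generalizes to `/{owner}/{repo}` as in the curl example and that the endpoint returns JSON; the helper names `buildQualityUrl` and `fetchQuality` are hypothetical, and the response schema is not described here, so it is typed as `unknown`.

```typescript
// Build the quality-data URL, following the path layout of the curl example.
function buildQualityUrl(owner: string, repo: string): string {
  const base = "https://pt-edge.onrender.com/api/v1/quality/rag";
  return `${base}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Fetch and parse the response (assumed to be JSON).
// Uses the global fetch available in Node.js 18+ and in browsers.
async function fetchQuality(owner: string, repo: string): Promise<unknown> {
  const res = await fetch(buildQualityUrl(owner, repo));
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  return res.json();
}
```

Calling `fetchQuality("any4ai", "AnyCrawl")` requests the same URL as the curl command. Each call counts against the daily request limit.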