mishushakov/llm-scraper

Turn any webpage into structured data using LLMs

Score: 74 / 100 (Verified)

llm-scraper is built on Playwright for browser automation and integrates with the Vercel AI SDK to support multiple LLM providers (OpenAI, Anthropic, Google, Groq, Ollama). It maintains full TypeScript type safety through Zod or JSON Schema validation. It supports six content extraction modes, ranging from preprocessed HTML and markdown to screenshots and custom formats, along with streaming object responses and code generation for reusable scraping scripts.

6,234 stars and 3,777 monthly downloads. Actively maintained with 1 commit in the last 30 days. Available on npm.

Maintenance 13 / 25
Adoption 18 / 25
Maturity 25 / 25
Community 18 / 25
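
The four category scores above add up to the overall score (13 + 18 + 25 + 18 = 74). A minimal TypeScript sketch of that arithmetic, assuming the total is a plain unweighted sum of the four 25-point categories; the page does not state the formula, and the names `subscores` and `totalScore` are illustrative only.

```typescript
// Category scores as listed on this page; each is out of 25.
const subscores = {
  maintenance: 13,
  adoption: 18,
  maturity: 25,
  community: 18,
};

// Assumption: the overall score is the unweighted sum of the categories.
function totalScore(parts: Record<string, number>): number {
  return Object.values(parts).reduce((sum, n) => sum + n, 0);
}

console.log(`${totalScore(subscores)} / 100`); // prints "74 / 100"
```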

Stars: 6,234
Forks: 370
Language: TypeScript
License: MIT
Last pushed: Mar 03, 2026
Monthly downloads: 3,777
Commits (30d): 1
Dependencies: 3

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/mishushakov/llm-scraper"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
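
The same request can be made from TypeScript. The sketch below reuses the URL from the curl command above and assumes a runtime with a global `fetch` (for example Node 18+). The helper names `qualityUrl` and `fetchQuality` are hypothetical, and the response is assumed to be JSON; its shape is not documented on this page, so it is returned as `unknown` rather than a guessed type.

```typescript
// Base URL copied from the curl command; the "transformers" path
// segment is kept exactly as shown there.
const API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers";

// Build the endpoint URL for a given repository owner and name.
function qualityUrl(owner: string, repo: string): string {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Fetch the quality data. Assumption: the body is JSON. The fields are
// not documented here, so the caller must inspect the result.
async function fetchQuality(owner: string, repo: string): Promise<unknown> {
  const res = await fetch(qualityUrl(owner, repo));
  if (!res.ok) {
    // Covers rate limiting and any other non-2xx response.
    throw new Error(`Request failed: ${res.status} ${res.statusText}`);
  }
  return res.json();
}
```

Calling `fetchQuality("mishushakov", "llm-scraper")` requests the same URL as the curl example. With the keyless limit of 100 requests/day, it is worth caching the result rather than fetching on every page load.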