mishushakov/llm-scraper
Turn any webpage into structured data using LLMs
Built on Playwright for browser automation, llm-scraper integrates with the Vercel AI SDK to support multiple LLM providers (OpenAI, Anthropic, Google, Groq, Ollama) while maintaining full TypeScript type safety through Zod or JSON Schema validation. It supports six content extraction modes, ranging from preprocessed HTML and markdown to screenshots and custom formats, and offers streaming object responses plus code generation for reusable scraping scripts.
6,234 stars and 3,777 monthly downloads. Actively maintained with 1 commit in the last 30 days. Available on npm.
Stars: 6,234
Forks: 370
Language: TypeScript
License: MIT
Category:
Last pushed: Mar 03, 2026
Monthly downloads: 3,777
Commits (30d): 1
Dependencies: 3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/mishushakov/llm-scraper"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
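The same request can be made from TypeScript. The following is a minimal sketch, not the service's documented client: the helper names `qualityUrl` and `fetchQuality` are hypothetical, the URL pattern is inferred from the single curl example above (including the fixed `transformers` path segment), and the assumption that the endpoint returns JSON is not confirmed by this page, so the result is typed as `unknown`.

```typescript
// Base path copied from the curl example; treating "transformers" as a fixed
// segment for other repositories is an assumption.
const BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers";

// Hypothetical helper: builds the endpoint URL for an owner/repo pair.
function qualityUrl(owner: string, repo: string): string {
  return `${BASE_URL}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Hypothetical helper: performs the keyless GET request.
// Assumes a JSON body; the response shape is not documented here.
async function fetchQuality(owner: string, repo: string): Promise<unknown> {
  const response = await fetch(qualityUrl(owner, repo));
  if (!response.ok) {
    // Covers any non-2xx status, which would include a rate-limit rejection.
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

Calling `fetchQuality("mishushakov", "llm-scraper")` would request the same URL as the curl command. Because the response shape is unknown, callers should validate the parsed value before reading fields from it.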
Related models
run-llama/LlamaIndexTS
Data framework for your LLM applications, with a focus on server-side solutions.
Mobile-Artificial-Intelligence/maid
Maid is a free and open source application for interfacing with llama.cpp models locally, and...
serge-chat/serge
A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy to use API.
nuance1979/llama-server
LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.
JHubi1/ollama-app
A modern and easy-to-use client for Ollama