reader and scraping-agent-ai
These tools are complementary: vakra-dev/reader provides core web scraping and markdown cleaning infrastructure, while hmshb/scraping-agent-ai adds agentic orchestration (LangGraph, Anthropic) to automate intelligent extraction workflows.
About reader
vakra-dev/reader
Open-source, production-grade web scraping engine built for LLMs. Scrape and crawl the entire web, clean markdown, ready for your agents.
Built on [Ulixee Hero](https://ulixee.org/), a headless browser with built-in anti-bot defenses (TLS fingerprinting, Cloudflare bypass, DNS-over-TLS). Browser instances are managed through a pool with automatic recycling and health monitoring. The library exposes two core primitives: `scrape()`, which converts URLs to cleaned markdown/HTML, and `crawl()`, which performs breadth-first site discovery. Browser pooling, proxy rotation strategies, batch concurrency, and graceful degradation are configurable and handled internally.
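To illustrate what breadth-first site discovery means in practice, here is a minimal Python sketch. It is not reader's implementation or API: `crawl_bfs`, `fetch_links`, `max_pages`, and `max_depth` are hypothetical names chosen for this example, and the link fetcher is passed in as a plain callable so the traversal logic can be read on its own.

```python
from collections import deque
from typing import Callable, Iterable, List
from urllib.parse import urldefrag, urljoin, urlparse


def crawl_bfs(
    start_url: str,
    fetch_links: Callable[[str], Iterable[str]],  # hypothetical: returns hrefs found on a page
    max_pages: int = 50,
    max_depth: int = 2,
) -> List[str]:
    """Visit same-host pages level by level and return them in visit order."""
    origin = urlparse(start_url).netloc
    start = urldefrag(start_url).url          # drop any #fragment
    seen = {start}
    queue = deque([(start, 0)])               # FIFO queue gives breadth-first order
    visited: List[str] = []

    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue                          # do not expand links past the depth limit
        for href in fetch_links(url):
            nxt = urldefrag(urljoin(url, href)).url   # resolve relative links
            if urlparse(nxt).netloc != origin or nxt in seen:
                continue                      # skip external hosts and duplicates
            seen.add(nxt)
            queue.append((nxt, depth + 1))
    return visited
```

Because the queue is first-in, first-out, every page at depth 1 is visited before any page at depth 2, which is what distinguishes this from a depth-first crawl. In a real crawler, `fetch_links` would load the page in a browser and extract its anchors.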
About scraping-agent-ai
hmshb/scraping-agent-ai
AI-powered web scraping agent built with LangGraph, LangSmith, Firecrawl, and Anthropic AI. Automates intelligent crawling, structured data extraction, and LLM-powered content formatting. Efficiently handles anti-bot mechanisms, error recovery, and batch processing. 🚀
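As a rough illustration of batch processing with error recovery, the sketch below retries each URL a few times and records failures instead of aborting the batch. It uses only the Python standard library and does not reflect scraping-agent-ai's actual code: `with_retries`, `process_batch`, and the `task` callable are hypothetical names for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def with_retries(
    task: Callable[[str], Any],   # hypothetical: scrapes/extracts one URL
    url: str,
    max_attempts: int = 3,
    base_delay: float = 0.0,      # seconds; doubled after each failed attempt
) -> Dict[str, Any]:
    """Run task(url), retrying on any exception; never raises."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return {"url": url, "ok": True, "data": task(url), "attempts": attempt}
        except Exception as exc:
            last_error = exc
            if attempt < max_attempts:
                time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
    return {"url": url, "ok": False, "error": str(last_error), "attempts": max_attempts}


def process_batch(
    urls: List[str],
    task: Callable[[str], Any],
    concurrency: int = 4,
    **retry_kwargs: Any,
) -> List[Dict[str, Any]]:
    """Process URLs concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda u: with_retries(task, u, **retry_kwargs), urls))
```

Returning a result record per URL, rather than raising, lets one failing page be reported without discarding the rest of the batch.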
Scores updated daily from GitHub, PyPI, and npm data.