Thordata/thordata-firecrawl

Thordata Firecrawl – Firecrawl-compatible web crawling & scraping API built on Thordata, turning any website into AI-ready Markdown/JSON/HTML/screenshots.

/ 100

Emerging

Uses Thordata's unified infrastructure (Web Scraper, Scraping Browser, SERP API, and proxy network) to handle JavaScript-heavy sites and improve success rates over basic crawling. Built on FastAPI and self-hostable with Docker/docker-compose, it offers both a Python client and an HTTP REST API with an OpenAPI spec for framework integration. It also includes agentic capabilities for structured data extraction via LLM prompts and JSON schemas, plus web search, which supports RAG pipelines and use as a LangChain tool.
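As a rough illustration of what "Firecrawl-compatible HTTP REST API" could look like from a client, the sketch below builds a scrape request in the style of Firecrawl's `POST /v1/scrape` endpoint using only the Python standard library. The base URL `http://localhost:8000`, the `/v1/scrape` path, the bearer-token header, and the helper name `build_scrape_request` are all assumptions made for illustration; this listing does not confirm them, so check the project's OpenAPI spec for the actual routes and fields.

```python
import json
import urllib.request


def build_scrape_request(base_url: str, api_key: str, target_url: str) -> urllib.request.Request:
    """Build (but do not send) a Firecrawl-style scrape request.

    Hypothetical helper: the endpoint path and payload shape follow
    Firecrawl's v1 API and are assumed, not confirmed, for this project.
    """
    payload = {"url": target_url, "formats": ["markdown"]}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/scrape",  # assumed path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder base URL and key for a self-hosted instance.
    req = build_scrape_request("http://localhost:8000", "YOUR_API_KEY", "https://example.com")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the example runnable without a live server and makes the assumed route and payload easy to adjust once the real API details are known.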

Available on PyPI.

Maintenance 13 / 25

Adoption 8 / 25

Maturity 18 / 25

Community 0 / 25



Language: Python

License: MIT

Higher-rated alternatives

any4ai/AnyCrawl

AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts...

kreuzberg-dev/html-to-markdown

High performance and CommonMark compliant HTML to Markdown converter. Maintained by the...

lightfeed/extractor

Using LLMs and AI browser automation to robustly extract web data

ScrapeGraphAI/Scrapegraph-ai

Python scraper based on AI

paulpierre/markdown-crawler

A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file...
