Thordata/thordata-firecrawl
Thordata Firecrawl is a Firecrawl-compatible web crawling and scraping API built on Thordata that turns any website into AI-ready Markdown, JSON, HTML, or screenshots.
Thordata Firecrawl uses Thordata's unified infrastructure (Web Scraper, Scraping Browser, SERP API, and proxy network) to handle JavaScript-heavy sites and to achieve higher success rates than basic crawling. The service is built on FastAPI, can be self-hosted with Docker or docker-compose, and is available through both a Python client and an HTTP REST API with an OpenAPI spec for framework integration. It also includes agentic capabilities for structured data extraction via LLM prompts and JSON schemas, plus web search integration, which makes it usable in RAG pipelines and as a LangChain tool.
Available on PyPI.
Stars: 2
Forks: n/a
Language: Python
License: MIT
Category:
Last pushed: Mar 09, 2026
Monthly downloads: 247
Commits (30d): 0
Dependencies: 3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/Thordata/thordata-firecrawl"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
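
The same request can be made from Python. The sketch below is only an illustration built on assumptions: it treats the three trailing path segments of the URL above as category, owner, and repository (inferred from the example, not from any documented API), and it assumes the endpoint returns a JSON body. The function names `build_quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; everything after it is assumed
# to be <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    segments = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send an unauthenticated GET and parse the body as JSON (assumed format)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("rag", "Thordata", "thordata-firecrawl")` would issue the same GET as the curl command. Without a key, such calls count against the 100 requests/day limit, so a script that polls many repositories may want to cache results.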
Higher-rated alternatives
any4ai/AnyCrawl
AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts...
kreuzberg-dev/html-to-markdown
High performance and CommonMark compliant HTML to Markdown converter. Maintained by the...
lightfeed/extractor
Using LLMs and AI browser automation to robustly extract web data
ScrapeGraphAI/Scrapegraph-ai
Python scraper based on AI
paulpierre/markdown-crawler
A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file...