spider and spider-clients
The first is a standalone web crawler library, while the second provides client bindings for a cloud-hosted version of that same crawler, accessed via API. They are complements: users choose between local execution and a managed service.
About spider
spider-rs/spider
Web crawler and scraper for Rust
Supports multiple rendering engines (HTTP, Chrome DevTools Protocol, WebDriver) with feature-gated compilation for minimal binary size, plus built-in proxy rotation, caching layers (memory/disk/hybrid), and anti-bot fingerprinting. Leverages Tokio's async runtime and lock-free data structures for concurrent crawling of 100k+ pages, with optional io_uring on Linux. Integrates with OpenAI/Gemini for content analysis, Spider Cloud for managed proxy/CAPTCHA bypass, and includes a multimodal LLM agent for web automation tasks.
About spider-clients
spider-rs/spider-clients
Python, JavaScript, and Rust libraries for the Spider Cloud API.
Multi-language bindings (Python, JavaScript, Rust, Go, CLI) for Spider Cloud's web crawling API, enabling concurrent scraping with real-time streaming and headless Chrome rendering. Core features include AI-driven smart mode for automated crawling strategies, HTTP proxy support, cron scheduling, and dynamic prompt scripting for JavaScript-heavy sites. Designed for both local deployment and cloud-hosted workflows with fine-grained control via URL blacklisting, whitelisting, and crawl depth budgeting.
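As a rough sketch of how a crawl request to a hosted API like Spider Cloud might be assembled, the stdlib-only Python snippet below builds a JSON payload exposing the knobs described above (page budgeting, URL blacklisting, output format) and posts it with a bearer token. The endpoint URL, header names, and field names (`url`, `limit`, `return_format`, `blacklist`) are illustrative assumptions, not confirmed API details; consult the official spider-clients documentation for the real interface.

```python
import json
import os
import urllib.request

# Hypothetical endpoint for illustration only; see the spider-clients
# docs for the actual Spider Cloud API.
API_URL = "https://api.spider.cloud/crawl"

def build_crawl_payload(url, limit=10, return_format="markdown",
                        blacklist=None):
    """Assemble the JSON body for a crawl request (pure function, no
    network I/O). Field names here are assumptions mirroring the
    features described above: page budgeting and URL blacklisting."""
    payload = {
        "url": url,
        "limit": limit,                   # page budget for the crawl
        "return_format": return_format,   # e.g. markdown or raw HTML
    }
    if blacklist:
        payload["blacklist"] = blacklist  # URL patterns to skip
    return payload

def crawl(url, api_key, **kwargs):
    """Send the crawl request; requires a Spider Cloud API key."""
    body = json.dumps(build_crawl_payload(url, **kwargs)).encode()
    req = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Only fires a real request when an API key is provided.
    key = os.environ.get("SPIDER_API_KEY")
    if key:
        print(crawl("https://example.com", key, limit=5))
```

In practice the published Python, JavaScript, Rust, and Go bindings wrap exactly this kind of request construction, authentication, and response handling, so users call a client method instead of shaping HTTP payloads by hand.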