spider and spider-clients

spider is a standalone web crawler library, while spider-clients provides client bindings for a cloud-hosted version of the same crawler, accessed via API. The two are complements: users choose between local execution and a managed service.

Metric            spider             spider-clients
Score             70 (Verified)      45 (Emerging)
Maintenance       25/25              13/25
Adoption          10/25               6/25
Maturity          16/25               9/25
Community         19/25              17/25
Stars             2,323              23
Forks             183                 9
Downloads         (none listed)      (none listed)
Commits (30d)     171                 0
Language          Rust               Rust
License           MIT                MIT
Package           none published     none published
Dependents        none               none

About spider

spider-rs/spider

Web crawler and scraper for Rust

Supports multiple rendering engines (HTTP, Chrome DevTools Protocol, WebDriver) with feature-gated compilation for minimal binary size, plus built-in proxy rotation, caching layers (memory/disk/hybrid), and anti-bot fingerprinting. Leverages Tokio's async runtime and lock-free data structures for concurrent crawling of 100k+ pages, with optional io_uring on Linux. Integrates with OpenAI/Gemini for content analysis, Spider Cloud for managed proxy/CAPTCHA bypass, and includes a multimodal LLM agent for web automation tasks.
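A minimal usage sketch, assuming the crate's basic `Website` API as shown in its README (`Website::new`, `crawl`, `get_links`); engine selection and the other features above are driven by Cargo feature flags and configuration not shown here:

```rust
use spider::tokio;
use spider::website::Website;

#[tokio::main]
async fn main() {
    // Build a crawler for one site; the default engine is plain HTTP,
    // with Chrome DevTools Protocol / WebDriver rendering behind feature gates.
    let mut website = Website::new("https://example.com");

    // Crawl the site concurrently on the Tokio runtime.
    website.crawl().await;

    // Links discovered during the crawl.
    for link in website.get_links() {
        println!("{}", link.as_ref());
    }
}
```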

About spider-clients

spider-rs/spider-clients

Python, JavaScript, and Rust libraries for the Spider Cloud API.

Multi-language bindings (Python, JavaScript, Rust, Go, CLI) for Spider Cloud's web crawling API, enabling concurrent scraping with real-time streaming and headless Chrome rendering. Core features include AI-driven smart mode for automated crawling strategies, HTTP proxy support, cron scheduling, and dynamic prompt scripting for JavaScript-heavy sites. Designed for both local deployment and cloud-hosted workflows with fine-grained control via URL blacklisting, whitelisting, and crawl depth budgeting.
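Since the exact surface of the Rust bindings isn't shown here, the sketch below calls the underlying HTTP API directly with reqwest. The `/crawl` endpoint, the body fields, and the `SPIDER_API_KEY` variable name are assumptions based on the Spider Cloud documentation, not verified against spider-clients itself:

```rust
use reqwest::Client;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // API key read from the environment; SPIDER_API_KEY is an assumed name.
    let api_key = std::env::var("SPIDER_API_KEY")?;

    // Assumed endpoint and request shape: crawl one site, capped at 5 pages.
    let body = Client::new()
        .post("https://api.spider.cloud/crawl")
        .bearer_auth(api_key)
        .json(&json!({ "url": "https://example.com", "limit": 5 }))
        .send()
        .await?
        .text()
        .await?;

    println!("{body}");
    Ok(())
}
```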

Scores updated daily from GitHub, PyPI, and npm data.