YoongiKim/AutoCrawler

Google, Naver multiprocess image web crawler (Selenium)

Quality score: 44 / 100 (Emerging)

Supports full-resolution image downloads, a configurable thread count, and a face detection search mode, and flags data imbalance across keyword directories. Uses Selenium with XPath-based link extraction that can be customized per search engine, plus headless mode and proxy rotation for distributed crawling. Can also run on a remote server over SSH using a virtual display (Xvfb). Site-specific selectors can be updated as Google and Naver page layouts change.
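As a rough illustration of what data imbalance detection across keyword directories could look like, here is a minimal sketch. It is not taken from the AutoCrawler source: the function names, the one-subdirectory-per-keyword layout, and the 0.5 ratio threshold are all assumptions made for the example.

```python
from pathlib import Path


def count_files_per_keyword(download_dir):
    """Map each keyword subdirectory name to the number of files it holds.

    Assumes one subdirectory per keyword under download_dir (hypothetical layout).
    """
    root = Path(download_dir)
    return {
        d.name: sum(1 for f in d.iterdir() if f.is_file())
        for d in sorted(root.iterdir())
        if d.is_dir()
    }


def find_imbalanced(counts, ratio=0.5):
    """Return keywords whose file count is below ratio * average count.

    The default ratio of 0.5 is an assumed threshold, not a documented value.
    """
    if not counts:
        return {}
    average = sum(counts.values()) / len(counts)
    return {name: n for name, n in counts.items() if n < average * ratio}
```

Calling `find_imbalanced(count_files_per_keyword("download"))` after a crawl would list the keywords that ended up with far fewer images than the rest, which is a hint to re-crawl or drop them before training on the data.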

1,692 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 9 / 25
Community 25 / 25

Stars: 1,692
Forks: 429
Language: Python
License: Apache-2.0
Last pushed: Apr 15, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/YoongiKim/AutoCrawler"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
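The same request can be made from Python with only the standard library. This is a sketch under two assumptions that the page does not confirm: that the path follows the category/owner/repo pattern seen in the single URL above, and that the endpoint returns JSON. The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL, assuming a category/owner/repo path pattern."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the quality record; assumes the response body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For this repository the call would be `fetch_quality("ml-frameworks", "YoongiKim", "AutoCrawler")`. No API key is sent, so the call counts against the keyless daily limit.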