Giving AI Agents Eyes: Browser Automation in 2026
Data extraction, agentic browser control, and stealth infrastructure — scored on quality daily. A decision guide for builders navigating scraping, automation, and the anti-detection arms race.
You need your AI agent to interact with the web. Maybe it needs to extract data from a page. Maybe it needs to fill out a form, navigate a workflow, or monitor a site for changes. You search GitHub, find 340+ repositories in the perception domain alone, and now you need to decide what to actually build on.
PT-Edge tracks these projects and scores them daily on maintenance, adoption, maturity, and community. This guide cuts through the noise: three layers of browser automation, what's actually in production (downloads tell a different story than stars), and which layers foundation model vendors are about to make obsolete.
The three layers of giving AI eyes
The browser automation space has stratified into three distinct layers. Most confusion comes from treating them as one problem. They aren't.
- Layer 1: Data extraction — turn web pages into structured data an LLM can consume
- Layer 2: Agentic browser control — let LLMs drive browsers autonomously to complete tasks
- Layer 3: Stealth infrastructure — evade bot detection at the protocol level
You probably need one or two of these layers, not all three. The decision starts with which layer your problem lives in.
Layer 1: Data extraction — giving AI eyes to read
This is the most mature layer and where the majority of production downloads live. The job: take a URL, return structured data.
| Project | Score | Stars | Downloads/mo | Best for |
|---|---|---|---|---|
| firecrawl | /100 | 92,265 | 332,616 | LLM-ready markdown, schema-driven extraction, managed API |
| scrapy | /100 | 60,973 | 3,013,707 | Large-scale crawling, mature ecosystem, battle-tested |
| crawlee | /100 | 22,542 | 346,203 | Node.js browser automation with Playwright/Puppeteer |
| Scrapling | /100 | 28,517 | 392,823 | Adaptive scraping that survives site redesigns |
| curl_cffi | /100 | 5,297 | 22,184,679 | HTTP fingerprint impersonation at massive scale |
| crawlee-python | /100 | 8,682 | — | Python equivalent of Crawlee for browser-based crawling |
Firecrawl (/100, 92,265 stars) is the dominant force in this layer. It combines JavaScript rendering, proxy rotation, and schema-driven JSON extraction into a single API. With 173 commits in the last 30 days and an official MCP server (5,738 stars), it's the safe bet for teams that want data extraction without managing infrastructure. The trade-off: AGPL license and a SaaS pricing model that scales with volume.
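Firecrawl's scrape endpoint is a plain HTTPS POST, so the call can be sketched with nothing but the standard library. The `/v1/scrape` path and `formats` field below follow Firecrawl's public API docs at the time of writing; treat them as assumptions and check the current reference before depending on them.

```python
import json
import urllib.request

FIRECRAWL_ENDPOINT = "https://api.firecrawl.dev/v1/scrape"  # assumed v1 scrape endpoint

def build_scrape_request(url, api_key, formats=("markdown",)):
    """Build a Firecrawl scrape request asking for LLM-ready markdown."""
    body = json.dumps({"url": url, "formats": list(formats)}).encode()
    return urllib.request.Request(
        FIRECRAWL_ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Passing this to urllib.request.urlopen would return JSON whose `markdown`
# field is ready to hand straight to an LLM.
req = build_scrape_request("https://example.com", "fc-YOUR-KEY")
```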
Scrapy (/100, 60,973 stars, 3,013,707 downloads/mo) is the veteran. Twelve years old, BSD-licensed, and still pulling 3 million downloads a month. If you need a crawling framework you control end-to-end, Scrapy is the answer. But it doesn't render JavaScript natively, and 36 commits in 30 days suggests maintenance mode rather than active innovation.
Scrapling (/100, 28,517 stars) is the most interesting newcomer. Created October 2024 and already at 28K+ stars, it handles adaptive parsing that automatically relocates selectors when websites change structure. It integrates with AI agent ecosystems via MCP and bypasses anti-bot systems like Cloudflare Turnstile. At 126 commits in 30 days, it's shipping fast.
But the real story in this layer is hidden in the download numbers. curl_cffi has 5,297 stars — modest by GitHub standards — but pulls 22,184,679 downloads per month. That's more than the rest of this table combined. It's a Python binding for curl-impersonate that can mimic browser TLS/JA3/HTTP2 fingerprints, and it's what production scraping infrastructure actually runs on. Stars measure interest. Downloads measure deployment.
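The whole trick is one keyword argument: curl_cffi's `requests`-compatible API takes an `impersonate` label naming the browser build whose TLS/JA3/HTTP2 fingerprint to present. The helper below is a hypothetical convenience for rotating targets across sessions; the `chrome110`/`chrome116`/`chrome120` labels match builds curl_cffi has shipped, but verify them against your installed version's supported list.

```python
import random

# Impersonation labels curl_cffi has shipped for Chrome builds
# (check the installed version for the current list).
CHROME_TARGETS = ["chrome110", "chrome116", "chrome120"]

def pick_target(seed=None):
    """Hypothetical helper: vary the presented fingerprint between sessions."""
    return random.Random(seed).choice(CHROME_TARGETS)

# Usage (requires `pip install curl_cffi`):
# from curl_cffi import requests
# r = requests.get("https://example.com", impersonate=pick_target())
# The request now carries a real Chrome TLS handshake instead of
# Python's default, which most JA3-based detectors flag on sight.
```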
Layer 2: Agentic browser control — giving AI hands to act
This is where the excitement lives. Instead of extracting data from a page, these tools let an LLM navigate, click, type, and complete multi-step workflows in a real browser.
| Project | Score | Stars | Commits/30d | Approach |
|---|---|---|---|---|
| browser-use | /100 | 80,598 | 362 | Vision + DOM, multi-LLM, cloud stealth browsers |
| skyvern | /100 | 20,791 | 279 | AI workflow automation, visual understanding |
| nanobrowser | 52/100 | 12,440 | 0 | Chrome extension, multi-agent architecture |
| page-agent | 88/100 | 6,693 | 217 | In-page GUI agent, text-based DOM analysis |
| steel-browser | 64/100 | 6,593 | 5 | Browser API sandbox for AI agents |
| browser (Lightpanda) | /100 | 25,461 | 784 | From-scratch Zig browser, 16x less memory than Chrome |
| maxun | /100 | 15,314 | 56 | No-code platform for non-technical users |
browser-use (/100, 80,598 stars) is the clear leader. Created October 2024, it hit 80K+ stars in under 18 months. It works by combining vision-based browser control (interpreting screenshots) with DOM element access, supporting multiple LLM backends including Claude and Gemini. At 7,302,913 downloads per month, it's not just hype — it's in production. 362 commits in the last 30 days and 5 releases in the past month confirm active shipping.
Skyvern (/100, 20,791 stars) takes a different approach: it focuses on automating specific business workflows rather than general browsing. With 279 commits in 30 days and 6 releases this month, it's among the most actively shipping projects in the space. If you need to automate a specific multi-step web process (filling insurance forms, navigating procurement portals), Skyvern is purpose-built for that.
Lightpanda deserves special attention. At 25,461 stars and 784 commits in 30 days — the highest commit velocity in the entire browser automation space — it's building a headless browser from scratch in Zig specifically for AI and automation. The claims: 16x less memory and 9x faster than Chrome. It exposes a CDP server compatible with Playwright, Puppeteer, and chromedp, so you can use existing tooling without Chrome's overhead. This is a from-scratch bet that the current approach (forking Chromium, patching it for stealth) is fundamentally wrong. Worth watching closely.
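Because Lightpanda speaks CDP, existing Playwright code connects to it unchanged. A minimal sketch, assuming Playwright is installed and a Lightpanda server is listening on its default address (adjust `LIGHTPANDA_CDP` to your deployment):

```python
LIGHTPANDA_CDP = "ws://127.0.0.1:9222"  # assumed default CDP address

def fetch_rendered_html(url, cdp_url=LIGHTPANDA_CDP):
    """Drive Lightpanda through stock Playwright via its CDP endpoint."""
    from playwright.sync_api import sync_playwright  # pip install playwright
    with sync_playwright() as p:
        # connect_over_cdp attaches to a running engine instead of
        # launching Chrome, so no Chromium binary is involved.
        browser = p.chromium.connect_over_cdp(cdp_url)
        page = browser.new_page()
        page.goto(url)
        html = page.content()
        browser.close()
        return html
```

The point of the design: `connect_over_cdp` is the same call you'd make against a remote Chrome, so switching engines is a URL change, not a rewrite.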
page-agent from Alibaba (6,693 stars, 217 commits/30d) represents the large-company approach: text-based DOM analysis that doesn't rely on screenshots, making it faster and cheaper per interaction. The fact that Alibaba is investing 217 commits a month into an open-source browser agent tells you where enterprise is heading.
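Despite the different interfaces, the tools in this layer share one control loop: observe the page, ask the model for the next action, execute it, repeat. A minimal sketch of that loop, with all names hypothetical (each project's real API differs):

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str         # "click", "type", "navigate", or "done"
    target: str = ""  # element selector or URL
    value: str = ""   # text to type, if any

def run_agent(task, browser, llm, max_steps=20):
    """Observe-decide-act loop common to browser-use, Skyvern, and page-agent."""
    history = []
    for _ in range(max_steps):
        state = browser.observe()                  # screenshot and/or DOM snapshot
        action = llm.decide(task, state, history)  # model picks the next action
        if action.kind == "done":
            break
        browser.act(action)                        # execute in the real browser
        history.append(action)
    return history
```

Where the tools actually differ is inside `observe()`: browser-use mixes screenshots with DOM access, while page-agent stays text-only, which is what makes it faster and cheaper per step.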
Layer 3: Stealth infrastructure — the uncomfortable layer
Nobody talks about this layer in conference talks, but the download numbers don't lie. Anti-detection is a massive, quietly thriving ecosystem.
| Project | Score | Stars | Downloads/mo | What it does |
|---|---|---|---|---|
| patchright | /100 | 2,719 | 732,736 | Undetected Playwright (TypeScript) |
| BotBrowser | /100 | 2,297 | — | Unified fingerprint defense browser |
| HeadlessX | /100 | 1,818 | — | Self-hosted undetected platform (Camoufox/Firefox) |
| pinchtab | /100 | 8,305 | — | Multi-instance orchestrator with stealth injection |
| SeleniumBase | /100 | 12,538 | 4,925,372 | Automation framework with built-in anti-detection |
The Patchright ecosystem tells the real story. Patchright is an undetected fork of Playwright — same API, patched to evade bot detection. Combined across its TypeScript, Python, and Node.js packages, it pulls 2,822,399 downloads per month. That's 2.8 million monthly downloads of a tool whose entire purpose is bypassing anti-bot systems. For context, that's more than Crawlee, Scrapling, and Firecrawl combined.
SeleniumBase (/100, 12,538 stars, 4,925,372 downloads/mo) is the stealth tool that hides in plain sight. Marketed as a testing and automation framework, its anti-detection capabilities are a core feature. Nearly 5 million monthly downloads make it one of the most-deployed browser automation tools in the space, behind only curl_cffi and browser-use.
pinchtab (8,305 stars, 604 commits/30d) is the second-highest commit velocity project in the space behind Lightpanda, focused on multi-instance browser orchestration with stealth injection. It's Go-based, aimed at teams running hundreds of concurrent browser sessions.
The trade-off here is straightforward: the anti-detection arms race never ends. Every stealth technique has a shelf life. If you're building core infrastructure on top of undetected browsing, you're signing up for continuous maintenance as detection evolves. That said, the download numbers suggest most production scraping already depends on this layer.
The MCP bridge: browsers as agent tools
Browser-automation-mcp is the fastest-growing subcategory in the entire perception space: 200 repos, 19 new in the last 7 days, with an acceleration score of 0.83 and a classic nucleation signal — creation without buzz. Developers are building before the media notices.
| Project | Score | Stars | Downloads/mo | What it does |
|---|---|---|---|---|
| chrome-devtools-mcp | 87/100 | 28,511 | 2,689,135 | Chrome DevTools Protocol bridge for AI coding agents |
| firecrawl-mcp-server | 64/100 | 5,738 | — | Official Firecrawl MCP server for data extraction |
| agentql | /100 | 1,304 | — | AI-native web query language for agent interaction |
chrome-devtools-mcp (28,511 stars, 2,689,135 downloads/mo) is the breakout hit. Created September 2025 and already at 28K+ stars with 6 releases in the last 30 days, it bridges AI coding agents with Chrome's DevTools Protocol. This is the pattern that's winning: rather than building a new browser automation framework, wrap the existing one (Chrome DevTools) in an MCP interface so any AI agent can use it.
The MCP layer matters because it's standardising how agents talk to browsers. Instead of every agent framework building its own browser integration, MCP provides a common protocol. Browse the full browser-automation-mcp category to see the 200 repos building on this pattern.
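Wiring chrome-devtools-mcp into an agent is a config entry, not code. The shape below follows the common `mcpServers` convention used by Claude Desktop and similar MCP clients; the `npx` invocation matches the project's README at the time of writing, but confirm it against the current docs:

```json
{
  "mcpServers": {
    "chrome-devtools": {
      "command": "npx",
      "args": ["-y", "chrome-devtools-mcp@latest"]
    }
  }
}
```

Once registered, any MCP-capable agent can open pages, inspect the DOM, and read network traffic through DevTools without a bespoke browser integration.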
What foundation models change
Claude Computer Use and GPT-5.4 Computer Use both shipped native browser control capabilities in early 2026. UI-TARS from ByteDance (28,739 stars) brings the same capability to open-source with a vision model that understands screenshots and drives GUI interactions.
This changes the calculus for each layer differently:
- Layer 1 (data extraction) survives. Foundation models are expensive per page. Running Claude Computer Use to scrape 10,000 product pages is economically absurd. curl_cffi at 22 million downloads a month isn't going anywhere — the cost per request is orders of magnitude lower.
- Layer 2 (agentic control) gets disrupted. If Claude can drive a browser natively, do you need browser-use? The answer is: yes, for now. Foundation model computer use is slow, expensive, and unreliable on complex workflows. But the ceiling is rising fast. Projects in this layer need to differentiate on reliability, cost, and specialisation — not just "LLM drives browser."
- Layer 3 (stealth) becomes more important. Paradoxically, as more agents hit the web, anti-bot systems get more aggressive, and stealth infrastructure becomes more valuable. Foundation model computer use doesn't solve the detection problem — it makes it worse.
The safe bet: invest in Layer 1 for data extraction, use Layer 2 tools for workflows that require interaction, and assume Layer 3 will be necessary for any production-scale web access.
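The Layer 1 economics are worth making concrete. Every figure below is an illustrative assumption, not a measured price; swap in your own model and proxy rates.

```python
# Back-of-envelope: 10,000 product pages via a vision-model agent vs raw HTTP.
# Every price here is an assumed illustration, not a quoted rate.
PAGES = 10_000

# Vision agent: assume ~3 model calls per page at ~$0.02 each
# (screenshots are token-heavy).
agent_cost = PAGES * 3 * 0.02

# curl_cffi behind rotating proxies: assume ~$0.0005 per request all-in.
http_cost = PAGES * 0.0005

print(f"agent ~ ${agent_cost:,.0f}, raw HTTP ~ ${http_cost:,.2f}, "
      f"ratio ~ {agent_cost / http_cost:,.0f}x")
```

Even with generous assumptions for the agent, the raw HTTP path comes out two orders of magnitude cheaper, which is why Layer 1 keeps its job.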
The anti-scraping backlash
The Hacker News signal is unambiguous. "News publishers limit Internet Archive access due to AI scraping concerns" hit 569 points. "Miasma: A tool to trap AI web scrapers in an endless poison pit" hit 309 points with 222 comments. The sentiment is shifting.
This isn't just cultural backlash. LinkedIn probes visitors for 2,953 browser extensions (534 HN points). Publishers are blocking crawlers. Sites are deploying increasingly sophisticated detection. The web is becoming adversarial to automated access in ways it wasn't two years ago.
For builders, this means three things:
- API-first data access is the legal safe path. Firecrawl's managed service model, where they handle the compliance and proxy infrastructure, is winning partly because teams don't want to own that risk.
- The stealth arms race has real consequences. Patchright's 2.8 million monthly downloads tell you the demand is there. But building core business logic on anti-detection is building on sand.
- Structured data standards (llms.txt, sitemaps, APIs) will grow. The long-term answer isn't better scrapers — it's publishers providing machine-readable access. We're in the messy transition period.
Downloads tell the real story
Star counts measure developer interest. Downloads measure production adoption. They tell very different stories.
| Project | Stars | Downloads/mo | What this tells you |
|---|---|---|---|
| curl_cffi | 5,297 | 22,184,679 | Actual production scraping infrastructure |
| SeleniumBase | 12,538 | 4,925,372 | Anti-detection automation in CI/CD pipelines |
| browser-use | 80,598 | 7,302,913 | Agentic control reaching real adoption |
| scrapy | 60,973 | 3,013,707 | Legacy crawling still in heavy production use |
| chrome-devtools-mcp | 28,511 | 2,689,135 | MCP browser bridge scaling fast |
curl_cffi at 22,184,679 monthly downloads is the quiet giant. It has roughly 6% of Firecrawl's stars but about 67x its monthly package downloads. This is the tool that production scraping teams actually install. Firecrawl's downloads (332,616/mo) are respectable for a SaaS SDK, but the raw HTTP layer where curl_cffi lives is where the real volume sits.
The Patchright ecosystem (2.8M combined) outdownloads every agentic browser tool except browser-use. Undetected Playwright is, quietly, one of the most-deployed browser automation approaches in existence.
The decision framework
Here's how to choose:
- You need to extract data from web pages → Start with Firecrawl (managed) or Scrapy (self-hosted). At scale, you'll end up using curl_cffi.
- You need an LLM to complete web tasks → browser-use is the default. For specific business workflows, look at Skyvern.
- You need to evade bot detection → Patchright (drop-in Playwright replacement) or SeleniumBase (broader toolkit). Accept that this is an ongoing arms race.
- You're building an AI agent that needs web access via MCP → chrome-devtools-mcp for IDE-integrated browsing, firecrawl-mcp-server for data extraction.
- You want to bet on the future → Watch Lightpanda. A purpose-built browser for AI, written from scratch, with the highest commit velocity in the space. If it delivers on its performance claims, it changes the economics of everything above.
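If you want the framework above as something executable, here is a sketch. The routing keys and recommendations mirror the bullets; the function itself is editorial, not any project's API.

```python
def choose_tooling(need):
    """Route a requirement to a starting point, per the framework above."""
    table = {
        "extract_data":    ["firecrawl (managed)", "scrapy (self-hosted)",
                            "curl_cffi (at scale)"],
        "llm_web_tasks":   ["browser-use", "skyvern (business workflows)"],
        "evade_detection": ["patchright", "seleniumbase"],
        "mcp_access":      ["chrome-devtools-mcp", "firecrawl-mcp-server"],
        "future_bet":      ["lightpanda"],
    }
    return table.get(need, ["work out which layer your problem lives in first"])
```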
How to use this data
Every project mentioned in this guide has a quality-scored page in our directory, updated daily. You can:
- Browse perception categories across 340+ repos covering scrapers, browser automation, and frameworks
- Explore browser-automation-mcp to see the 200 repos building MCP browser bridges
- Check browser-agent-automation for the 196 agentic browser projects
- Compare any two projects side by side on maintenance, adoption, maturity, and community scores
Quality scores update daily from live GitHub, PyPI, and npm data. If a project stops being maintained, the score drops. If a stealth tool gets detected and abandoned, you'll see it in the numbers before you see it on Twitter. The data does the work so you don't have to manually check each repo's pulse before making a dependency decision.
Related analysis
Your Docs Are Written for Humans. Your Users Are Agents.
Agents choose developer tools based on documentation quality. llms.txt is the new robots.txt. The MCP data proves...
Your Agent Doesn't Have an Email Address (Yet)
30+ repos are building identity, credentials, email, and payment infrastructure for agents as first-class entities....
Agent Platforms Are Four Problems, Not One
You'll deploy a coding agent and think you're done. You won't be told you also need sandboxing, governance, and...
The Claude Code Ecosystem: Everything You Can Plug In
2,400 repos. 370 new ones per week. A practitioner's guide to what's mature, what's emerging, and what's noise.