Giving AI Agents Eyes: Browser Automation in 2026

Data extraction, agentic browser control, and stealth infrastructure — scored on quality daily. A decision guide for builders navigating scraping, automation, and the anti-detection arms race.

Graham Rowe · April 01, 2026 · Updated daily with live data
Tags: perception · agents · mcp

You need your AI agent to interact with the web. Maybe it needs to extract data from a page. Maybe it needs to fill out a form, navigate a workflow, or monitor a site for changes. You search GitHub, find 340+ repositories in the perception domain alone, and now you need to decide what to actually build on.

PT-Edge tracks these projects and scores them daily on maintenance, adoption, maturity, and community. This guide cuts through the noise: three layers of browser automation, what's actually in production (downloads tell a different story than stars), and which layers foundation model vendors are about to make obsolete.

The three layers of giving AI eyes

The browser automation space has stratified into three distinct layers. Most confusion comes from treating them as one problem. They aren't.

  • Layer 1: Data extraction — turn web pages into structured data an LLM can consume
  • Layer 2: Agentic browser control — let LLMs drive browsers autonomously to complete tasks
  • Layer 3: Stealth infrastructure — evade bot detection at the protocol level

You probably need one or two of these layers, not all three. The decision starts with which layer your problem lives in.
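
The routing logic reads like a tiny classifier. Here is a purely hypothetical sketch (the function name and keyword lists are illustrative, not taken from any of the projects below):

```python
def pick_layers(task: str) -> set[str]:
    """Map a rough task description to the browser-automation layers it needs.
    Purely illustrative: real routing needs more signal than keyword matching."""
    task = task.lower()
    layers = set()
    if any(k in task for k in ("extract", "scrape", "parse", "monitor")):
        layers.add("L1: data extraction")
    if any(k in task for k in ("fill", "click", "navigate", "workflow", "form")):
        layers.add("L2: agentic control")
    if any(k in task for k in ("blocked", "captcha", "detection", "cloudflare")):
        layers.add("L3: stealth")
    # Default: most web tasks start life as an extraction problem.
    return layers or {"L1: data extraction"}
```

The point of the sketch is the shape of the decision: a task like "fill out a form behind cloudflare" lands in Layers 2 and 3 at once, while plain monitoring never leaves Layer 1.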

Layer 1: Data extraction — giving AI eyes to read

This is the most mature layer and where the majority of production downloads live. The job: take a URL, return structured data.

| Project | Score | Stars | Downloads/mo | Best for |
|---|---|---|---|---|
| firecrawl | /100 | 92,265 | 332,616 | LLM-ready markdown, schema-driven extraction, managed API |
| scrapy | /100 | 60,973 | 3,013,707 | Large-scale crawling, mature ecosystem, battle-tested |
| crawlee | /100 | 22,542 | 346,203 | Node.js browser automation with Playwright/Puppeteer |
| Scrapling | /100 | 28,517 | 392,823 | Adaptive scraping that survives site redesigns |
| curl_cffi | /100 | 5,297 | 22,184,679 | HTTP fingerprint impersonation at massive scale |
| crawlee-python | /100 | 8,682 | | Python equivalent of Crawlee for browser-based crawling |

Firecrawl (/100, 92,265 stars) is the dominant force in this layer. It combines JavaScript rendering, proxy rotation, and schema-driven JSON extraction into a single API. With 173 commits in the last 30 days and an official MCP server (5,738 stars), it's the safe bet for teams that want data extraction without managing infrastructure. The trade-off: AGPL license and a SaaS pricing model that scales with volume.
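
To make "schema-driven extraction" concrete: you declare the fields you want, and the extractor returns matching JSON. Here is a toy standard-library illustration of the idea (this is the concept only, not Firecrawl's API):

```python
from html.parser import HTMLParser

class FieldExtractor(HTMLParser):
    """Collect text from elements whose class attribute matches a schema field."""
    def __init__(self, schema: dict):
        super().__init__()
        self.schema = schema          # field name -> class attribute to match
        self.result = {}
        self._current = None          # field currently being captured

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "")
        for field, cls in self.schema.items():
            if cls in classes.split():
                self._current = field

    def handle_data(self, data):
        if self._current and data.strip():
            self.result[self._current] = data.strip()
            self._current = None

html = '<div><h1 class="title">Widget</h1><span class="price">$9.99</span></div>'
parser = FieldExtractor({"name": "title", "price": "price"})
parser.feed(html)
# parser.result now maps schema fields to extracted values
```

A managed service does the same job at a different altitude: it renders JavaScript, rotates proxies, and validates the output against a real JSON schema before returning it.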

Scrapy (/100, 60,973 stars, 3,013,707 downloads/mo) is the veteran. Nearly two decades old, BSD-licensed, and still pulling 3 million downloads a month. If you need a crawling framework you control end-to-end, Scrapy is the answer. But it doesn't render JavaScript natively, and 36 commits in 30 days suggests maintenance mode rather than active innovation.

Scrapling (/100, 28,517 stars) is the most interesting newcomer. Created October 2024 and already at 28K+ stars, it handles adaptive parsing that automatically relocates selectors when websites change structure. It integrates with AI agent ecosystems via MCP and bypasses anti-bot systems like Cloudflare Turnstile. At 126 commits in 30 days, it's shipping fast.

But the real story in this layer is hidden in the download numbers. curl_cffi has 5,297 stars — modest by GitHub standards — but pulls 22,184,679 downloads per month. That's more than the rest of this table combined. It's a Python binding for curl-impersonate that can mimic browser TLS/JA3/HTTP2 fingerprints, and it's what production scraping infrastructure actually runs on. Stars measure interest. Downloads measure deployment.
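
What curl_cffi actually impersonates is the TLS handshake. A JA3 fingerprint, for instance, is just an MD5 over five comma-separated handshake fields, so any client whose handshake parameters differ from a real browser's is trivially identifiable. A sketch with made-up field values:

```python
import hashlib

def ja3_fingerprint(version, ciphers, extensions, curves, point_formats):
    """JA3 = MD5 over 'version,ciphers,extensions,curves,point_formats',
    each list joined by dashes. Identical handshake fields produce an
    identical fingerprint, which is why a non-browser TLS stack stands out."""
    parts = [str(version)] + [
        "-".join(str(v) for v in field)
        for field in (ciphers, extensions, curves, point_formats)
    ]
    return hashlib.md5(",".join(parts).encode()).hexdigest()

# Two clients with identical handshake parameters collide (as intended);
# change a single cipher and the hash changes completely.
a = ja3_fingerprint(771, [4865, 4866], [0, 23, 65281], [29, 23], [0])
b = ja3_fingerprint(771, [4865, 4866], [0, 23, 65281], [29, 23], [0])
```

curl_cffi sidesteps detection by replaying a real browser's handshake values, so the resulting fingerprint matches the browser it claims to be.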

Layer 2: Agentic browser control — giving AI hands to act

This is where the excitement lives. Instead of extracting data from a page, these tools let an LLM navigate, click, type, and complete multi-step workflows in a real browser.

| Project | Score | Stars | Commits/30d | Approach |
|---|---|---|---|---|
| browser-use | /100 | 80,598 | 362 | Vision + DOM, multi-LLM, cloud stealth browsers |
| skyvern | /100 | 20,791 | 279 | AI workflow automation, visual understanding |
| nanobrowser | 52/100 | 12,440 | None | Chrome extension, multi-agent architecture |
| page-agent | 88/100 | 6,693 | 217 | In-page GUI agent, text-based DOM analysis |
| steel-browser | 64/100 | 6,593 | 5 | Browser API sandbox for AI agents |
| browser (Lightpanda) | /100 | 25,461 | 784 | From-scratch Zig browser, 16x less memory than Chrome |
| maxun | /100 | 15,314 | 56 | No-code platform for non-technical users |

browser-use (/100, 80,598 stars) is the clear leader. Created October 2024, it hit 80K+ stars in under 18 months. It works by combining vision-based browser control (interpreting screenshots) with DOM element access, supporting multiple LLM backends including Claude and Gemini. At 7,302,913 downloads per month, it's not just hype — it's in production. 362 commits in the last 30 days and 5 releases in the past month confirm active shipping.
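
Whatever the backend model, tools in this layer share one control loop: observe the page, ask the model for the next action, execute it, repeat. A minimal sketch with stubbed components (none of these names come from browser-use; the scripted policy stands in for the LLM call):

```python
class FakePage:
    """Stand-in for a real browser page; records actions instead of acting."""
    def __init__(self):
        self.log = []
    def observe(self):
        return {"interactive": ["#q", "#go"]}  # would be a screenshot/DOM dump
    def click(self, selector):
        self.log.append(("click", selector))
    def type_text(self, selector, text):
        self.log.append(("type", selector, text))

def scripted_policy(actions):
    """Stand-in for the LLM call: replays a fixed action script."""
    it = iter(actions)
    return lambda goal, observation: next(it)

def run_agent(page, policy, goal, max_steps=10):
    """The observe -> decide -> act loop at the core of agentic browser tools."""
    for _ in range(max_steps):
        observation = page.observe()         # screenshot and/or DOM snapshot
        action = policy(goal, observation)   # the model picks the next action
        if action["kind"] == "done":
            return action.get("result")
        if action["kind"] == "click":
            page.click(action["selector"])
        elif action["kind"] == "type":
            page.type_text(action["selector"], action["text"])
    raise TimeoutError("goal not reached within step budget")

page = FakePage()
policy = scripted_policy([
    {"kind": "type", "selector": "#q", "text": "llm agents"},
    {"kind": "click", "selector": "#go"},
    {"kind": "done", "result": "submitted"},
])
result = run_agent(page, policy, goal="search for llm agents")
```

The step budget matters in practice: real agents cap the loop because each iteration costs a model call, and runaway loops are the dominant failure mode.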

Skyvern (/100, 20,791 stars) takes a different approach: it focuses on automating specific business workflows rather than general browsing. With 279 commits in 30 days and 6 releases this month, it's among the most actively shipping projects in the space. If you need to automate a specific multi-step web process (filling insurance forms, navigating procurement portals), Skyvern is purpose-built for that.

Lightpanda deserves special attention. At 25,461 stars and 784 commits in 30 days — the highest commit velocity in the entire browser automation space — it's building a headless browser from scratch in Zig specifically for AI and automation. The claims: 16x less memory and 9x faster than Chrome. It exposes a CDP server compatible with Playwright, Puppeteer, and chromedp, so you can use existing tooling without Chrome's overhead. This is a from-scratch bet that the current approach (forking Chromium, patching it for stealth) is fundamentally wrong. Worth watching closely.

page-agent from Alibaba (6,693 stars, 217 commits/30d) represents the large-company approach: text-based DOM analysis that doesn't rely on screenshots, making it faster and cheaper per interaction. The fact that Alibaba is investing 217 commits a month into an open-source browser agent tells you where enterprise is heading.
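
The screenshot-free approach can be sketched with the standard library: serialise only the interactive elements into numbered text lines, and let the model answer with an index ("click 0") rather than pixel coordinates. This illustrates the general technique, not page-agent's actual serialiser:

```python
from html.parser import HTMLParser

class InteractiveIndex(HTMLParser):
    """Flatten a page's interactive elements into numbered text lines.
    A text-only observation is far cheaper per step than a screenshot."""
    INTERACTIVE = {"a", "button", "input", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.lines = []
        self._open = None  # tag whose visible text we're still collecting

    def handle_starttag(self, tag, attrs):
        if tag in self.INTERACTIVE:
            attr_map = dict(attrs)
            label = attr_map.get("aria-label") or attr_map.get("placeholder") or ""
            self._open = tag
            self.lines.append(f"[{len(self.lines)}] <{tag}> {label}".rstrip())

    def handle_data(self, data):
        if self._open and data.strip():
            self.lines[-1] = f"{self.lines[-1]} {data.strip()}"
            self._open = None

html = '<p>Hello</p><button>Save</button><input placeholder="Email">'
idx = InteractiveIndex()
idx.feed(html)
# idx.lines is a compact, indexable observation for the model
```

A few dozen such lines replace a multi-megapixel screenshot, which is where the "faster and cheaper per interaction" claim comes from.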

Layer 3: Stealth infrastructure — the uncomfortable layer

Nobody talks about this layer in conference talks, but the download numbers don't lie. Anti-detection is a massive, quietly thriving ecosystem.

| Project | Score | Stars | Downloads/mo | What it does |
|---|---|---|---|---|
| patchright | /100 | 2,719 | 732,736 | Undetected Playwright (TypeScript) |
| BotBrowser | /100 | 2,297 | | Unified fingerprint defense browser |
| HeadlessX | /100 | 1,818 | | Self-hosted undetected platform (Camoufox/Firefox) |
| pinchtab | /100 | 8,305 | | Multi-instance orchestrator with stealth injection |
| SeleniumBase | /100 | 12,538 | 4,925,372 | Automation framework with built-in anti-detection |

The Patchright ecosystem tells the real story. Patchright is an undetected fork of Playwright — same API, patched to evade bot detection. Combined across its TypeScript, Python, and Node.js packages, it pulls 2,822,399 downloads per month. That's 2.8 million monthly downloads of a tool whose entire purpose is bypassing anti-bot systems. For context, that's more than Crawlee, Scrapling, and Firecrawl combined.

SeleniumBase (/100, 12,538 stars, 4,925,372 downloads/mo) is the stealth tool that hides in plain sight. Marketed as a testing and automation framework, its anti-detection capabilities are a core feature. Nearly 5 million monthly downloads make it the most-deployed browser automation tool in the entire space after curl_cffi.

pinchtab (8,305 stars, 604 commits/30d) is the second-highest commit velocity project in the space behind Lightpanda, focused on multi-instance browser orchestration with stealth injection. It's Go-based, aimed at teams running hundreds of concurrent browser sessions.

The trade-off here is straightforward: the anti-detection arms race never ends. Every stealth technique has a shelf life. If you're building core infrastructure on top of undetected browsing, you're signing up for continuous maintenance as detection evolves. That said, the download numbers suggest most production scraping already depends on this layer.

The MCP bridge: browsers as agent tools

Browser-automation-mcp is the fastest-growing subcategory in the entire perception space: 200 repos, 19 new in the last 7 days, with an acceleration score of 0.83 and a classic nucleation signal — creation without buzz. Developers are building before the media notices.

| Project | Score | Stars | Downloads/mo | What it does |
|---|---|---|---|---|
| chrome-devtools-mcp | 87/100 | 28,511 | 2,689,135 | Chrome DevTools Protocol bridge for AI coding agents |
| firecrawl-mcp-server | 64/100 | 5,738 | | Official Firecrawl MCP server for data extraction |
| agentql | /100 | 1,304 | | AI-native web query language for agent interaction |

chrome-devtools-mcp (28,511 stars, 2,689,135 downloads/mo) is the breakout hit. Created September 2025 and already at 28K+ stars with 6 releases in the last 30 days, it bridges AI coding agents with Chrome's DevTools Protocol. This is the pattern that's winning: rather than building a new browser automation framework, wrap the existing one (Chrome DevTools) in an MCP interface so any AI agent can use it.

The MCP layer matters because it's standardising how agents talk to browsers. Instead of every agent framework building its own browser integration, MCP provides a common protocol. Browse the full browser-automation-mcp category to see the 200 repos building on this pattern.
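
Concretely, MCP rides on JSON-RPC 2.0: every tool invocation is a `tools/call` request naming a tool and its arguments. A minimal sketch of the message an agent would send (the `navigate_page` tool name is hypothetical; a real client first reads the server's `tools/list` response):

```python
import json

def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialise an MCP tool invocation as a JSON-RPC 2.0 request.
    The protocol is the point: any agent that can emit this shape can
    drive any browser exposed behind an MCP server."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

msg = mcp_tool_call(1, "navigate_page", {"url": "https://example.com"})
```

This is why wrapping an existing engine wins: the hard part (Chrome, DevTools) already exists, and the MCP layer is a thin, standardised envelope around it.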

What foundation models change

Claude Computer Use and GPT-5.4 Computer Use both shipped native browser control capabilities in early 2026. UI-TARS from ByteDance (28,739 stars) brings the same capability to open-source with a vision model that understands screenshots and drives GUI interactions.

This changes the calculus for each layer differently:

  • Layer 1 (data extraction) survives. Foundation models are expensive per page. Running Claude Computer Use to scrape 10,000 product pages is economically absurd. curl_cffi at 22 million downloads a month isn't going anywhere — the cost per request is orders of magnitude lower.
  • Layer 2 (agentic control) gets disrupted. If Claude can drive a browser natively, do you need browser-use? The answer is: yes, for now. Foundation model computer use is slow, expensive, and unreliable on complex workflows. But the ceiling is rising fast. Projects in this layer need to differentiate on reliability, cost, and specialisation — not just "LLM drives browser."
  • Layer 3 (stealth) becomes more important. Paradoxically, as more agents hit the web, anti-bot systems get more aggressive, and stealth infrastructure becomes more valuable. Foundation model computer use doesn't solve the detection problem — it makes it worse.

The safe bet: invest in Layer 1 for data extraction, use Layer 2 tools for workflows that require interaction, and assume Layer 3 will be necessary for any production-scale web access.
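
The Layer 1 economics are easy to check on the back of an envelope. All figures below are illustrative assumptions, not vendor pricing:

```python
def scrape_cost(pages: int, tokens_per_page: int, usd_per_mtok: float) -> float:
    """Rough LLM-driven scraping cost: tokens consumed per page times the
    per-million-token price. Every input here is an assumed round number."""
    return pages * tokens_per_page * usd_per_mtok / 1_000_000

# Assume a vision agent burns ~20k tokens per page at $3 per million input
# tokens, versus a plain HTTP fetch costing ~$0.000002/page in bandwidth:
llm_cost = scrape_cost(10_000, 20_000, 3.0)   # 600.0 USD for 10k pages
http_cost = 10_000 * 0.000002                 # 0.02 USD for the same pages
```

Even if the assumed prices are off by an order of magnitude in either direction, the gap between the two approaches stays in the tens of thousands, which is the whole argument for Layer 1 surviving.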

The anti-scraping backlash

The Hacker News signal is unambiguous. "News publishers limit Internet Archive access due to AI scraping concerns" hit 569 points. "Miasma: A tool to trap AI web scrapers in an endless poison pit" hit 309 points with 222 comments. The sentiment is shifting.

This isn't just cultural backlash. "LinkedIn is checking for 2,953 browser extensions" drew 534 points of its own. Publishers are blocking crawlers. Sites are deploying increasingly sophisticated detection. The web is becoming adversarial to automated access in ways it wasn't two years ago.

For builders, this means three things:

  1. API-first data access is the legal safe path. Firecrawl's managed service model, where they handle the compliance and proxy infrastructure, is winning partly because teams don't want to own that risk.
  2. The stealth arms race has real consequences. Patchright's 2.8 million monthly downloads tell you the demand is there. But building core business logic on anti-detection is building on sand.
  3. Structured data standards (llms.txt, sitemaps, APIs) will grow. The long-term answer isn't better scrapers — it's publishers providing machine-readable access. We're in the messy transition period.

Downloads tell the real story

Star counts measure developer interest. Downloads measure production adoption. They tell very different stories.

| Project | Stars | Downloads/mo | What this tells you |
|---|---|---|---|
| curl_cffi | 5,297 | 22,184,679 | Actual production scraping infrastructure |
| SeleniumBase | 12,538 | 4,925,372 | Anti-detection automation in CI/CD pipelines |
| browser-use | 80,598 | 7,302,913 | Agentic control reaching real adoption |
| scrapy | 60,973 | 3,013,707 | Legacy crawling still in heavy production use |
| chrome-devtools-mcp | 28,511 | 2,689,135 | MCP browser bridge scaling fast |

curl_cffi at 22,184,679 monthly downloads is the quiet giant. It has about 6% of the stars that Firecrawl has, but roughly 67x the monthly package downloads. This is the tool that production scraping teams actually install. Firecrawl's downloads (333K/mo) are respectable for a SaaS SDK, but the raw HTTP layer — where curl_cffi lives — is where the real volume sits.

The Patchright ecosystem (2.8M combined) outdownloads every agentic browser tool except browser-use. Undetected Playwright is, quietly, one of the most-deployed browser automation approaches in existence.

The decision framework

Here's how to choose:

  • You need to extract data from web pages → Start with Firecrawl (managed) or Scrapy (self-hosted). At scale, you'll end up using curl_cffi.
  • You need an LLM to complete web tasksbrowser-use is the default. For specific business workflows, look at Skyvern.
  • You need to evade bot detectionPatchright (drop-in Playwright replacement) or SeleniumBase (broader toolkit). Accept that this is an ongoing arms race.
  • You're building an AI agent that needs web access via MCPchrome-devtools-mcp for IDE-integrated browsing, firecrawl-mcp-server for data extraction.
  • You want to bet on the future → Watch Lightpanda. A purpose-built browser for AI, written from scratch, with the highest commit velocity in the space. If it delivers on its performance claims, it changes the economics of everything above.

How to use this data

Every project mentioned in this guide has a quality-scored page in our directory, updated daily. You can:

  • Browse perception categories across 340+ repos covering scrapers, browser automation, and frameworks
  • Explore browser-automation-mcp to see the 200 repos building MCP browser bridges
  • Check browser-agent-automation for the 196 agentic browser projects
  • Compare any two projects side by side on maintenance, adoption, maturity, and community scores

Quality scores update daily from live GitHub, PyPI, and npm data. If a project stops being maintained, the score drops. If a stealth tool gets detected and abandoned, you'll see it in the numbers before you see it on Twitter. The data does the work so you don't have to manually check each repo's pulse before making a dependency decision.

Related analysis