Web Scraping Agents

Tools and SDKs, optimized for AI agents, for extracting structured data from web pages. Includes framework integrations, language-specific clients, and performance benchmarks. Does NOT include general web scraping libraries, crawlers without agent optimization, or non-web data extraction tools.

There are 17 web scraping agents tracked. The highest-rated is plasmate-labs/plasmate-a11y at 22/100 with 0 stars.

Get all 17 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=agents&subcategory=web-scraping-agents&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
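The same request can be made from a script. The following is a minimal Python sketch using only the standard library; it assumes the endpoint returns a JSON body, whose exact shape is not documented on this page, so the parsed result is returned as-is. The helper names `build_url` and `fetch_projects` are hypothetical and chosen only for illustration.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="agents", subcategory="web-scraping-agents", limit=20):
    """Build the dataset URL with the same query parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL without an API key and return the parsed JSON.

    Assumption: the response is JSON. Its structure is not described
    here, so no fields are extracted or renamed.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` issues one unauthenticated request, which counts against the 100 requests/day limit mentioned above, so caching the result locally is a reasonable choice when iterating.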

| # | Agent | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | plasmate-labs/plasmate-a11y | Accessibility auditing via Plasmate's Semantic Object Model. 7 WCAG rules,... | 22 | Experimental |
| 2 | plasmate-labs/plasmate | The browser engine for agents. HTML in, Semantic Object Model out. 5x faster... | 22 | Experimental |
| 3 | plasmate-labs/autogen-plasmate | Plasmate web browsing tool for Microsoft AutoGen agents. Structured SOM... | 22 | Experimental |
| 4 | plasmate-labs/plasmate-audit | Site auditing powered by Plasmate. 10x faster than Chrome-based auditors.... | 22 | Experimental |
| 5 | plasmate-labs/crewai-plasmate | Plasmate web browsing tool for CrewAI agents. 10x fewer tokens than raw HTML... | 22 | Experimental |
| 6 | plasmate-labs/smolagents-plasmate | Plasmate web browsing tool for HuggingFace Smolagents. Give your agent... | 22 | Experimental |
| 7 | plasmate-labs/plasmate-benchmarks | Reproducible benchmarks for Plasmate. Token compression, latency, and... | 22 | Experimental |
| 8 | plasmate-labs/notebooks | Interactive Jupyter notebooks for Plasmate. Getting started, benchmarks, and... | 22 | Experimental |
| 9 | plasmate-labs/plasmate-extension | Chrome extension for exporting auth cookies to Plasmate | 22 | Experimental |
| 10 | plasmate-labs/plasmate-python | Python SDK for Plasmate - fetch web pages as structured SOM JSON. pip... | 22 | Experimental |
| 11 | plasmate-labs/scrapy-plasmate | Scrapy middleware for Plasmate - replace Splash/Playwright with 10x faster... | 22 | Experimental |
| 12 | plasmate-labs/som-action | GitHub Action: fetch a web page with Plasmate and output SOM JSON. Use in CI... | 14 | Experimental |
| 13 | plasmate-labs/awesome-plasmate | A curated list of tools, integrations, and resources for Plasmate - the... | 14 | Experimental |
| 14 | plasmate-labs/quickstart-rust | Quickstart template: use Plasmate with Rust. Fetch any URL and get... | 14 | Experimental |
| 15 | plasmate-labs/quickstart-node | Quickstart template: use Plasmate with Node.js. Fetch any URL and get... | 14 | Experimental |
| 16 | plasmate-labs/replit-template | Try Plasmate in your browser. Replit template for the browser engine for AI agents. | 14 | Experimental |
| 17 | plasmate-labs/quickstart-python | Quickstart template: use Plasmate with Python. Fetch any URL and get... | 14 | Experimental |