pmbstyle/gemini-computer-use
A minimal browser automation agent using Google's Gemini 2.5 Computer Use Preview model and Playwright for web browser control.
Implements vision-based browser automation by feeding screenshots to Gemini 2.5 for visual understanding, so the model can locate and interact with page elements without DOM parsing. Includes built-in safety guardrails with human-in-the-loop confirmation for sensitive operations. Its action API covers clicks, typing, scrolling, drag-and-drop, and navigation primitives, all executed through Playwright's browser control layer.
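The dispatch step described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: `Action`, `FakePage`, `execute`, and the `needs_confirmation` flag are hypothetical names, and the fake page stands in for a Playwright page so the sketch runs without a browser. With real Playwright, the equivalent calls would be `page.mouse.click(x, y)`, `page.keyboard.type(text)`, `page.mouse.wheel(dx, dy)`, and `page.goto(url)`.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    """Hypothetical record of one model-proposed action (not the real response format)."""
    name: str                          # e.g. "click", "type", "scroll", "navigate"
    args: dict = field(default_factory=dict)
    needs_confirmation: bool = False   # assumed flag for sensitive operations


class FakePage:
    """Stand-in for a Playwright page; it only records the calls it receives."""

    def __init__(self) -> None:
        self.log: list = []

    def click(self, x: int, y: int) -> None:
        self.log.append(("click", x, y))

    def type_text(self, text: str) -> None:
        self.log.append(("type", text))

    def scroll(self, dx: int, dy: int) -> None:
        self.log.append(("scroll", dx, dy))

    def goto(self, url: str) -> None:
        self.log.append(("goto", url))


def execute(page: FakePage, action: Action, confirm: Callable[[Action], bool]) -> str:
    """Run one action, asking a human first when the action is flagged as sensitive."""
    if action.needs_confirmation and not confirm(action):
        return "skipped"
    handlers = {
        "click": lambda a: page.click(a["x"], a["y"]),
        "type": lambda a: page.type_text(a["text"]),
        "scroll": lambda a: page.scroll(a.get("dx", 0), a.get("dy", 0)),
        "navigate": lambda a: page.goto(a["url"]),
    }
    handler = handlers.get(action.name)
    if handler is None:
        return "unsupported"
    handler(action.args)
    return "done"


if __name__ == "__main__":
    page = FakePage()
    # Deny every sensitive action in this demo; a real agent would prompt the user.
    deny_all = lambda action: False
    print(execute(page, Action("navigate", {"url": "https://example.com"}), deny_all))
    print(execute(page, Action("click", {"x": 5, "y": 7}, needs_confirmation=True), deny_all))
    print(page.log)
```

Routing every action through a single `execute` function keeps the confirmation check in one place, so a sensitive action cannot reach the browser without passing through it.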
Stars: 23
Forks: 6
Language: Python
License: none listed
Category:
Last pushed: Oct 23, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/agents/pmbstyle/gemini-computer-use"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
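The same request can be made from Python using only the standard library. This is a sketch under assumptions: the helper names `build_url` and `fetch_quality` are hypothetical, and the code assumes the endpoint returns a JSON body, which the listing does not state.

```python
import json
from urllib.request import urlopen

# Base path taken from the curl example; everything after it is owner/repo.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/agents"


def build_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository (assumes a JSON response body)."""
    with urlopen(build_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    print(fetch_quality("pmbstyle", "gemini-computer-use"))
```

With the keyless limit of 100 requests/day, it may be worth caching responses locally rather than refetching on every run.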
Higher-rated alternatives
nth5693/gemini-kit
🚀 19 AI Agents + 44 Commands for Gemini CLI - Code 10x faster with auto planning, testing,...
josstei/maestro-gemini
Turn Gemini CLI into a multi-agent platform — 12 specialized subagents, parallel dispatch,...
lopushok9/gemini_quant
Free, easy-to-use, AI-driven market research tool for the Gemini CLI
jduncan-rva/gemini-agent-creator
AI-powered extension for Gemini CLI that creates custom agents through natural conversation
pmbstyle/gemini-browser-agent
A browser agent with a Google Chrome extension that can work in your browser. Based on Google...