aidriventesting/Agent

Open-source AI agent for UI automation, combining structural and visual understanding of mobile & web interfaces. Toward the next generation of open-source, AI-driven testing.

/ 100

Emerging

The agent integrates with Robot Framework as a natural-language library that converts plain-English instructions into UI actions, using LLMs from multiple providers (OpenAI, Claude, Gemini). It grounds visual elements through vision-based UI parsing with OmniParser and Set-of-Mark techniques, and supports both mobile (via Appium) and web platforms through a unified LLM → context → tool selection pipeline. Keywords such as `Agent.Do`, `Agent.Check`, and `Agent.Ask` let test steps be written semantically, without explicit selectors or coordinates.
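The instruction → context → tool selection flow described above can be illustrated with a minimal Python sketch. This is not the project's code or API: `UIElement`, `ToolCall`, `build_context`, `stub_llm`, and `do` are hypothetical names, and the LLM is replaced by a deterministic stub so the example runs offline. The numeric marks only mimic the idea of Set-of-Mark labeling.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class UIElement:
    mark: int   # numeric label attached to the element (Set-of-Mark style)
    role: str   # e.g. "button" or "textbox"
    text: str   # visible label of the element


@dataclass
class ToolCall:
    tool: str           # hypothetical tool name, e.g. "tap" or "input_text"
    mark: int           # mark of the element the tool should act on
    argument: str = ""  # optional payload, e.g. the text to type


def build_context(elements: List[UIElement]) -> str:
    """Render the marked elements as plain text that a model could read."""
    return "\n".join(f"[{e.mark}] {e.role}: {e.text}" for e in elements)


def stub_llm(instruction: str, context: str) -> ToolCall:
    """Stand-in for a real LLM call (assumption, not a provider API).

    Picks the first element whose label occurs in the instruction and
    chooses a tool from the element's role.
    """
    lowered = instruction.lower()
    quoted = re.search(r"'([^']*)'", instruction)
    for line in context.splitlines():
        mark, _, rest = line.partition("] ")
        role, _, text = rest.partition(": ")
        if text.lower() in lowered:
            if role == "textbox":
                value = quoted.group(1) if quoted else ""
                return ToolCall("input_text", int(mark.lstrip("[")), value)
            return ToolCall("tap", int(mark.lstrip("[")))
    raise LookupError(f"no element matches instruction: {instruction!r}")


def do(instruction: str,
       elements: List[UIElement],
       llm: Callable[[str, str], ToolCall] = stub_llm) -> ToolCall:
    """Instruction -> context -> tool selection, in one step."""
    return llm(instruction, build_context(elements))


if __name__ == "__main__":
    screen = [UIElement(1, "textbox", "Username"),
              UIElement(2, "button", "Log in")]
    print(do("type 'alice' into Username", screen))
    print(do("tap the Log in button", screen))
```

In a real setup the `llm` callable would wrap a provider client and the element list would come from a UI parser; keeping both behind plain function arguments makes the selection step easy to test in isolation.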

No Package

No Dependents

Maintenance 10 / 25

Adoption 5 / 25

Maturity 9 / 25

Community 14 / 25

Language: Python

License: MIT

Higher-rated alternatives

browserwing/browserwing

BrowserWing turns your browser actions into MCP commands or Claude Skill, allowing AI agents to...

theredsix/cerebellum

Browser automation system that uses AI-driven planning to navigate web pages and perform goals.

MigoXLab/webqa-agent

Autonomous web browser agent that audits performance, functionality & UX for engineers and...

nottelabs/notte

🌸 Best framework to build web agents, and deploy serverless web automation functions on reliable...

hyperbrowserai/HyperAgent

AI Browser Automation
