Turbo1123/roubao
Android Automation Tool Based on Vision-Language Models
Executes automation tasks entirely on-device by porting MobileAgent's agentic framework to native Kotlin, using Shizuku for system-level permissions so that no computer or USB connection is required. Implements a dual-layer Tools/Skills architecture inspired by Claude Code: it delegates to AI-capable apps via DeepLink when one is available, and otherwise performs GUI automation by having a vision-language model analyze locally captured screenshots. Supports multiple VLM backends (Qwen, GPT-4V, Claude) with dynamic model discovery and optional local inference endpoints.
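The delegate-or-fall-back routing described above can be sketched roughly as follows. This is an illustrative Python sketch, not the project's Kotlin code: the `Skill` class, `plan`, `run_gui_loop`, the keyword matching, the `examplefood://` scheme, and the `"done"` sentinel are all hypothetical names and assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set, Tuple
from urllib.parse import quote


@dataclass(frozen=True)
class Skill:
    """Hypothetical skill: a task pattern that an AI-capable app can handle."""
    name: str
    keywords: Tuple[str, ...]   # naive keyword match, assumed for illustration
    package: str                # app that must be installed to delegate
    deeplink_template: str      # made-up scheme, e.g. "examplefood://ask?q={query}"


def match_skill(task: str, skills: Iterable[Skill]) -> Optional[Skill]:
    """Return the first skill whose keywords appear in the task text."""
    lowered = task.lower()
    for skill in skills:
        if any(keyword in lowered for keyword in skill.keywords):
            return skill
    return None


def plan(task: str, skills: Iterable[Skill], installed: Set[str]) -> Tuple[str, str]:
    """Prefer deep-link delegation; otherwise fall back to GUI automation."""
    skill = match_skill(task, skills)
    if skill is not None and skill.package in installed:
        return ("deeplink", skill.deeplink_template.format(query=quote(task)))
    return ("gui", task)


def run_gui_loop(
    task: str,
    take_screenshot: Callable[[], bytes],
    ask_vlm: Callable[[str, bytes], str],
    perform: Callable[[str], None],
    max_steps: int = 10,
) -> List[str]:
    """Screenshot -> VLM -> action loop; stops when the model answers "done"."""
    performed: List[str] = []
    for _ in range(max_steps):
        action = ask_vlm(task, take_screenshot())
        if action == "done":
            break
        perform(action)
        performed.append(action)
    return performed
```

The three callables passed to `run_gui_loop` stand in for whatever captures the screen, queries the model, and injects input on a real device; keeping them as parameters lets the loop be exercised with stubs.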
Stars: 1,895
Forks: 204
Language: Kotlin
License: MIT
Category:
Last pushed: Jan 08, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/agents/Turbo1123/roubao"
Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.
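The same request can be made from a script. The sketch below uses only the Python standard library and rests on two assumptions not confirmed by this page: that the endpoint returns JSON, and that the path generalizes to other `owner/repo` pairs. The helper names `build_url` and `fetch_quality` are made up for the example.

```python
import json
from urllib.request import Request, urlopen

# Base path taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/agents"


def build_url(owner: str, repo: str) -> str:
    """Build the endpoint URL (assumes the owner/repo pattern generalizes)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    request = Request(build_url(owner, repo), headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("Turbo1123", "roubao")` issues the same GET as the curl command; the shape of the returned data is not documented here, so inspect it before relying on specific fields.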
Higher-rated alternatives
droidrun/droidrun
Automate your mobile devices with natural language commands - an LLM agnostic mobile Agent 🤖
trycua/cua
Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and...
TurixAI/TuriX-CUA
This is the official website for TuriX Computer-use-Agent
Haervwe/open-webui-tools
Open‑WebUI Tools is a modular toolkit designed to extend and enrich your Open WebUI instance,...
erickjtorres/app-use
📱 Make apps accessible for AI agents