TencentQQGYLab/AppAgent

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

47 / 100 (Emerging)

Operates Android apps through vision-based perception and simplified action primitives (tap, swipe) without requiring backend access, making it applicable across diverse third-party applications. The framework employs a two-phase approach: an exploration phase where the agent builds a knowledge base either through autonomous navigation or human demonstrations, followed by a deployment phase that leverages this documentation to execute complex tasks. Supports multiple multimodal LLM backends including GPT-4V and Qwen-VL, with integration via Android Debug Bridge (ADB) for real devices or Android Studio emulators.
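As a rough illustration of the tap and swipe primitives going through ADB, the sketch below builds the `adb shell input` commands an agent could issue. It is not AppAgent's own code: the function names, the default swipe duration, and the example coordinates are assumptions made for illustration. Only the `adb -s <serial> shell input tap|swipe` command forms are standard ADB.

```python
import subprocess
from typing import List, Optional


def adb_prefix(serial: Optional[str] = None) -> List[str]:
    """Base adb command, optionally pinned to one device or emulator serial."""
    return ["adb"] + (["-s", serial] if serial else [])


def tap_command(x: int, y: int, serial: Optional[str] = None) -> List[str]:
    """Build the argv for a tap at screen pixel (x, y)."""
    return adb_prefix(serial) + ["shell", "input", "tap", str(x), str(y)]


def swipe_command(
    x1: int,
    y1: int,
    x2: int,
    y2: int,
    duration_ms: int = 300,  # assumed default, not taken from AppAgent
    serial: Optional[str] = None,
) -> List[str]:
    """Build the argv for a swipe from (x1, y1) to (x2, y2)."""
    coords = [str(v) for v in (x1, y1, x2, y2, duration_ms)]
    return adb_prefix(serial) + ["shell", "input", "swipe"] + coords


def run(command: List[str]) -> None:
    """Execute a built command; needs adb on PATH and a connected device."""
    subprocess.run(command, check=True)


# Hypothetical usage: build (but do not send) a tap and an upward swipe.
print(tap_command(540, 1200))
print(swipe_command(100, 800, 100, 200, serial="emulator-5554"))
```

Keeping command construction separate from execution means the action logic can be inspected or logged without a device attached; `run` is the only function that touches ADB.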

6,582 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 21 / 25
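The four category scores add up to the overall score shown for this repository. A quick check, assuming the overall score is a plain sum of the four categories (the page does not state the formula here):

```python
# Category scores as listed for this repository, each out of 25.
category_scores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 21,
}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(category_scores.values())
maximum = MAX_PER_CATEGORY * len(category_scores)

print(f"{total} / {maximum}")  # prints "47 / 100"
```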


Stars: 6,582
Forks: 736
Language: Python
License: MIT
Last pushed: Mar 19, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/TencentQQGYLab/AppAgent"
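The same request can be made from Python with only the standard library. This is a minimal sketch: the helper names are hypothetical, treating `llm-tools` as a category path segment is inferred from the URL above, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(owner: str, repo: str, category: str = "llm-tools") -> str:
    """Build the quality endpoint URL for one repository."""
    return f"{BASE_URL}/{quote(category)}/{quote(owner)}/{quote(repo)}"


def fetch_quality(owner: str, repo: str, category: str = "llm-tools",
                  timeout: float = 10.0):
    """Fetch and decode the response (assumed, not confirmed, to be JSON)."""
    url = quality_url(owner, repo, category)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL needs no network access.
print(quality_url("TencentQQGYLab", "AppAgent"))
```

Calling `fetch_quality("TencentQQGYLab", "AppAgent")` performs the actual request and counts against the daily request limit.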

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.