ms-agent and MobileAgent
These tools are complementary and target different deployment contexts. MS-Agent is a lightweight execution framework for orchestrating complex tasks, while Mobile-Agent specializes in GUI automation agents for mobile interfaces. The two can be combined in one system, with MS-Agent handling task planning and Mobile-Agent executing actions in mobile applications.
About ms-agent
modelscope/ms-agent
MS-Agent: a lightweight framework to empower agentic execution of complex tasks
Implements autonomous agent execution with tool calling over MCP (Model Context Protocol), with specialized workflows for deep research, code generation, and video synthesis. Provides context compression with token monitoring, multimodal input handling (image and video), and knowledge search via Sirchmunk for retrieval over local codebases. The architecture includes memory systems for long- and short-term persistence, DAG-based workflow orchestration, and sandboxed execution via ms-enclave for secure code evaluation.
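To make the idea of DAG-based workflow orchestration concrete, here is a minimal sketch using only the Python standard library. It is not MS-Agent's API: the `run_workflow` helper, the step names, and the dependency map are all hypothetical and chosen purely for illustration. Each step runs only after the steps it depends on, and receives their results as input.

```python
from graphlib import TopologicalSorter


def run_workflow(steps, deps):
    """Run every step after its dependencies; return results keyed by step name.

    steps: maps a step name to a callable taking a dict of upstream results.
    deps:  maps a step name to the set of step names it depends on.
    Both are hypothetical structures, not part of MS-Agent.
    """
    results = {}
    # static_order() yields each node only after all of its predecessors.
    for name in TopologicalSorter(deps).static_order():
        inputs = {d: results[d] for d in deps.get(name, ())}
        results[name] = steps[name](inputs)
    return results


# Toy three-step pipeline: search -> summarize -> report.
steps = {
    "search": lambda inputs: ["doc-a", "doc-b"],
    "summarize": lambda inputs: f"summary of {len(inputs['search'])} docs",
    "report": lambda inputs: inputs["summarize"].upper(),
}
deps = {"search": set(), "summarize": {"search"}, "report": {"summarize"}}

results = run_workflow(steps, deps)
print(results["report"])
```

A real orchestrator would likely add parallel execution of independent steps, retries, and persistence; the sketch only shows the ordering guarantee that a DAG provides.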
About MobileAgent
X-PLUG/MobileAgent
Mobile-Agent: The Powerful GUI Agent Family
Implements multimodal vision-language models (the GUI-Owl series, 2B to 235B parameters) optimized for GUI perception and grounding across desktop, mobile, and browser environments, built on a Qwen3-VL backbone. The agentic framework layers planning, reflection, memory management, and tool/MCP calling on top of these vision capabilities, enabling end-to-end task automation across platforms. Achieves state-of-the-art results on 20+ GUI benchmarks, including OSWorld and AndroidWorld, through semi-online RL fine-tuning and native multi-platform support.
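The layering of planning, reflection, and memory described above can be pictured as a simple control loop. The following is a rough sketch under stated assumptions, not Mobile-Agent's actual implementation: `AgentState`, `run_agent`, and the toy `plan`, `act`, and `reflect` functions are hypothetical stand-ins for a vision-language model and a real device.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    goal: str
    # Accepted (action, observation) pairs; a stand-in for agent memory.
    memory: list = field(default_factory=list)


def run_agent(goal, plan, act, reflect, max_steps=10):
    """Plan an action, execute it, then keep the step only if reflection accepts it."""
    state = AgentState(goal)
    for _ in range(max_steps):
        action = plan(state)
        if action is None:  # the planner signals that the goal is reached
            break
        observation = act(action)
        if reflect(state, action, observation):
            state.memory.append((action, observation))
    return state


# Toy environment (hypothetical): a screen with a page counter.
screen = {"page": 0}


def plan(state):
    return None if screen["page"] >= 3 else "tap_next"


def act(action):
    screen["page"] += 1
    return f"page {screen['page']}"


def reflect(state, action, observation):
    # Accept the step if the screen reported a page after the action.
    return observation.startswith("page")


final = run_agent("reach page 3", plan, act, reflect)
print(final.memory)
```

In a real GUI agent, `plan` and `reflect` would be model calls over screenshots and `act` would drive the device; the loop structure is the only part this sketch is meant to convey.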
Scores updated daily from GitHub, PyPI, and npm data.