Mobile-Agent

Autonomous Agents for Any GUI.

Our Vision

We are building a new generation of autonomous agents that can understand and operate any graphical user interface (GUI), from mobile apps to desktop software. Mobile-Agent lets AI interact with the digital world the way humans do, opening up new levels of automation and accessibility.

Showcase

Cross-Platform Control

Mobile-Agent-v3 operates seamlessly across mobile, web, and PC platforms. Here it is tasked with performing a complex, multi-step operation.

Complex PC Tasks

Watch the agent create a PowerPoint presentation from scratch, demonstrating its ability to use complex desktop software.

Our Research

Mobile-Agent-v3 (Preprint)

The foundational model for versatile GUI automation.

GUI-Critic-R1 (Preprint)

A novel method for pre-operative error diagnosis.

PC-Agent (ICLR 2025 Workshop)

A hierarchical multi-agent framework for PC automation.

Mobile-Agent-E (Preprint)

A self-evolving mobile assistant for complex tasks.

Mobile-Agent-v2 (NeurIPS 2024)

Effective navigation via multi-agent collaboration on mobile.

Mobile-Agent-v1 (ICLR 2024 Workshop)

An autonomous multi-modal mobile device agent with visual perception.

Join the Future

Explore our code, contribute to the project, or build your own GUI agents. The journey starts here.