Mobile-use is an open-source AI agent that controls Android and iOS devices through natural language. Its UI-aware automation handles a wide range of tasks, from sending messages to navigating complex apps.
Natural Language Control
Mobile-use lets you interact with your phone in your native language. Whether you want to send a message or navigate through an app, you describe the task as a natural language command and the agent carries it out.
UI-Aware Automation
Mobile-use's UI-aware automation lets it navigate app interfaces intelligently, which makes complex multi-step tasks practical to automate. Its main current limitation is games: they don't provide accessibility tree data, so the agent is less effective with them. The project is actively improving this area.
Data Scraping
Mobile-use can also extract specific information from apps and structure it in a format that you describe in natural language. For example, you can ask it for the list of unread emails in your Gmail inbox and get the result back as JSON.
Extensibility and Customization
Mobile-use is extensible and customizable. You can configure different large language models (LLMs) to power the agents that drive it, so you can tailor the setup to your specific needs.
Mobile-Use Achieves 100% on the AndroidWorld Benchmark
We're proud to announce that mobile-use has achieved a major milestone by becoming the top performer and the first to complete 100% of the AndroidWorld benchmark. To learn more about how we reached this achievement, check out our Minitap Benchmark page.
Get Started with Mobile-Use
Ready to automate your mobile experience? First install the prerequisites and clone the repository as described under Common Requirements below, then follow these steps:
Step 1: Set up Environment Variables
To start using mobile-use, you'll need to set up environment variables. Copy the .env.example file to a new file named .env, then open .env and add your API keys.
Optional: Customize LLM Configuration
If you want to use different models or providers, create your own LLM configuration file by copying llm-config.override.template.jsonc and renaming it to llm-config.override.jsonc. Then, edit this file to fit your needs.
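The two copy steps above (.env and the optional LLM override) can be scripted so that an existing file is never overwritten. The following is a minimal sketch, not part of mobile-use itself: the helper names `copy_template` and `main` are made up for illustration, and it assumes both template files sit in the directory you run it from.

```python
from pathlib import Path
import shutil


def copy_template(template: Path, target: Path) -> bool:
    """Copy template to target unless target already exists.

    Returns True if a copy was made, False if the target was left untouched.
    """
    if target.exists():
        # Never overwrite a file that may already hold API keys or edits.
        return False
    shutil.copyfile(template, target)
    return True


def main(repo_root: Path = Path(".")) -> None:
    # File names come from the steps above; their location in the
    # repository root is an assumption.
    pairs = [
        (".env.example", ".env"),
        ("llm-config.override.template.jsonc", "llm-config.override.jsonc"),
    ]
    for template_name, target_name in pairs:
        template = repo_root / template_name
        if not template.exists():
            print(f"skipped {target_name}: {template_name} not found")
            continue
        created = copy_template(template, repo_root / target_name)
        status = "created" if created else "already exists, left as-is"
        print(f"{target_name}: {status}")


if __name__ == "__main__":
    main()
```

After running it, you still need to open .env and fill in your API keys by hand, and edit llm-config.override.jsonc if you want non-default models.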
Running Mobile-Use
To run mobile-use, simply pass your command as an argument. For example, you can use the following command to open Gmail, find the first three unread emails, and list their sender and subject line:
```bash
python ./src/mobile_use/main.py "Open Gmail, find first 3 unread emails, and list their sender and subject line" --output-description "A JSON list of objects, each with 'sender' and 'subject' keys"
```
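If you drive this command from a script, passing the arguments as a list avoids shell quoting problems with the apostrophes in the output description. The sketch below builds that argument list and validates a result of the shape requested above. It makes several assumptions: `build_command` and `parse_email_list` are hypothetical helpers, the sample JSON is made up, and how mobile-use delivers its result (stdout, a file, or something else) is not covered here, so the parser only checks a string once you have it.

```python
import json
import shlex
import sys


def build_command(goal: str, output_description: str) -> list[str]:
    """Assemble the argument list for the CLI call shown above."""
    return [
        sys.executable,
        "./src/mobile_use/main.py",
        goal,
        "--output-description",
        output_description,
    ]


def parse_email_list(raw: str) -> list[dict[str, str]]:
    """Check that raw is a JSON list of objects with 'sender' and 'subject'."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("expected a JSON list")
    emails = []
    for item in data:
        if not isinstance(item, dict) or not {"sender", "subject"}.issubset(item):
            raise ValueError(f"unexpected item: {item!r}")
        emails.append({"sender": str(item["sender"]), "subject": str(item["subject"])})
    return emails


command = build_command(
    "Open Gmail, find first 3 unread emails, and list their sender and subject line",
    "A JSON list of objects, each with 'sender' and 'subject' keys",
)
# Print a shell-safe rendering of the command; pass `command` itself to
# subprocess.run when you actually want to execute it.
print(shlex.join(command))

# Made-up sample in the requested shape, for illustration only.
sample = '[{"sender": "alice@example.com", "subject": "Quarterly report"}]'
for email in parse_email_list(sample):
    print(f"{email['sender']}: {email['subject']}")
```

Validating the shape before using the data is a cheap safeguard, since the structure is requested in natural language rather than enforced by a schema.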
Common Requirements
Before you begin, ensure you have the following installed:
- uv: A lightning-fast Python package manager
Clone the repository:
```bash
git clone https://github.com/minitap-ai/mobile-use.git && cd mobile-use
```
Create & activate the virtual environment:
```bash
# Create a .venv directory using the Python version in .python-version
uv venv
# Activate the environment (macOS/Linux)
source .venv/bin/activate
# On Windows, use this instead:
# .venv\Scripts\activate
```
Install dependencies:
```bash
uv sync
```
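Before running the setup commands above, it can help to confirm that the required tools are on your PATH. This is a small sketch using only the Python standard library; `missing_tools` is a hypothetical helper, and including git in the check is an assumption based on the clone step.

```python
import shutil


def missing_tools(tools: list[str]) -> list[str]:
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # uv is the stated requirement; git is needed for the clone step.
    missing = missing_tools(["uv", "git"])
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools found.")
```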
Conclusion
Mobile-use is an AI agent for automating mobile apps. With natural language control, UI-aware automation, data scraping, and configurable LLMs, it can automate complex tasks and make your mobile experience more efficient.