Install
openclaw skills install yyds-auto
Control Android devices via MCP: tap, swipe, OCR, screenshot, UI automation, shell, file management, and AI agent orchestration for Android RPA.
Let LLMs directly control Android devices through the MCP protocol.
Yyds.Auto is a production-grade Android RPA (Robotic Process Automation) platform that exposes 60 MCP tools covering the full spectrum of Android device automation — from pixel-level touch injection and OCR to UI hierarchy inspection, file management, and on-device AI agent orchestration.
| Category | Tools | Capabilities |
|---|---|---|
| 📱 Device Info | 4 | Device model, screen size, IMEI, foreground app, network status |
| 👆 Touch & Input | 8 | Tap, swipe, long press, drag, text input, clipboard, key press |
| 📸 Screenshot | 2 | Screenshot as base64 image (LLM can see it directly), save to device |
| 🌲 UI Automation | 5 | UI hierarchy dump, find elements by attributes, element relations, wait & scroll |
| 🔍 OCR & Image | 8 | Screen OCR, tap-on-text, template matching, pixel color, image comparison |
| 💻 Shell | 1 | Execute shell commands with ROOT/SHELL privileges |
| 📦 App Management | 8 | Launch/stop apps, list installed, install/uninstall APK, open URL, toast |
| 📁 File Operations | 7 | List, read, write, delete, rename files and directories on device |
| 🐍 Script Projects | 5 | List/start/stop Python projects, execute Python code snippets |
| 📚 Pip Management | 4 | List, install, uninstall, inspect Python packages |
| 🤖 AI Agent | 8 | Configure and run an on-device AI agent with natural language instructions |
AI Agent (Claude / GPT / Gemini / Cursor / Windsurf / ...)
↓ MCP Protocol (stdio, JSON-RPC)
yyds-auto-mcp (Node.js, this skill)
↓ HTTP REST (JSON, port 61140)
yyds.py engine (Android, aiohttp server)
↓ IPC
yyds.auto engine (Android, kernel-level UI automation)
The MCP server communicates with the on-device engine via HTTP REST. When connected via USB, ADB port forwarding is set up automatically. Remote devices over WiFi/LAN are also supported.
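To make the top hop of the diagram concrete, here is a minimal sketch of the JSON-RPC message an MCP client would write to the server's stdin to invoke the tap tool. The `tools/call` method and the `name`/`arguments` envelope follow the MCP specification; the argument names `x`, `y`, and `count` are taken from the tool listing in this document, and the helper name `build_tool_call` is hypothetical.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize one MCP `tools/call` request as a single JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # The stdio transport is newline-delimited, so the payload itself
    # must not contain raw newlines; compact separators keep it on one line.
    return json.dumps(message, separators=(",", ":"))


# Double-tap at (540, 1200); the coordinates are illustrative only.
line = build_tool_call(1, "tap", {"x": 540, "y": 1200, "count": 2})
print(line)
```

A real session would first complete the MCP `initialize` handshake; MCP clients such as Claude Desktop handle that automatically, so this is only useful for understanding or debugging the wire format.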
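# Install the MCP server globally
npm install -g yyds-auto-mcp
# Default: 127.0.0.1:61140, ADB forward set up automatically
yyds-auto-mcp
# Remote device over WiFi/LAN: point the server at the device's address
YYDS_DEVICE_HOST=192.168.1.100 YYDS_DEVICE_PORT=61140 yyds-auto-mcp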
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "yyds-auto": {
      "command": "npx",
      "args": ["-y", "yyds-auto-mcp"],
      "env": {
        "YYDS_DEVICE_HOST": "127.0.0.1",
        "YYDS_DEVICE_PORT": "61140"
      }
    }
  }
}
Add the same MCP server configuration in your editor's MCP settings.
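If you would rather script the change than edit the file by hand, the following is a minimal sketch that merges the yyds-auto entry into an existing config while leaving other servers untouched. The function name `add_yyds_server` is hypothetical, and the caller supplies the config path because its location depends on the client and operating system; the demo writes to a temporary directory so it never touches a real config.

```python
import json
import tempfile
from pathlib import Path


def add_yyds_server(config_path: Path, host: str = "127.0.0.1", port: int = 61140) -> dict:
    """Add or update the yyds-auto entry, keeping any other servers intact."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["yyds-auto"] = {
        "command": "npx",
        "args": ["-y", "yyds-auto-mcp"],
        # Environment values are strings, matching the JSON shown above.
        "env": {"YYDS_DEVICE_HOST": host, "YYDS_DEVICE_PORT": str(port)},
    }
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file that already defines another server.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "claude_desktop_config.json"
    path.write_text('{"mcpServers": {"other": {"command": "node"}}}')
    result = add_yyds_server(path, host="192.168.1.100")
    print(sorted(result["mcpServers"]))
```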
| Variable | Default | Description |
|---|---|---|
| YYDS_DEVICE_HOST | 127.0.0.1 | Device IP address |
| YYDS_DEVICE_PORT | 61140 | Engine port number |
| YYDS_DEVICE_SERIAL | (first device) | Specify ADB device serial |
| YYDS_ADB_PATH | (auto-detect) | Custom ADB binary path |
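As an illustration of how these variables and their defaults combine, here is a minimal sketch of a resolver. The names `DeviceConfig` and `load_config` are hypothetical, and the `http://` base URL is an assumption based on the HTTP REST transport described earlier, not the server's actual code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DeviceConfig:
    host: str
    port: int
    serial: Optional[str]    # None means "use the first device"
    adb_path: Optional[str]  # None means "auto-detect"

    @property
    def base_url(self) -> str:
        return f"http://{self.host}:{self.port}"


def load_config(env: Mapping[str, str] = os.environ) -> DeviceConfig:
    """Read the four YYDS_* variables, falling back to the documented defaults."""
    return DeviceConfig(
        host=env.get("YYDS_DEVICE_HOST", "127.0.0.1"),
        port=int(env.get("YYDS_DEVICE_PORT", "61140")),
        serial=env.get("YYDS_DEVICE_SERIAL") or None,
        adb_path=env.get("YYDS_ADB_PATH") or None,
    )


print(load_config({}).base_url)
print(load_config({"YYDS_DEVICE_HOST": "192.168.1.100"}).base_url)
```

Passing the environment in as a mapping keeps the resolver easy to test without mutating the real process environment.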
Device Info
- device_info: Comprehensive device info (engine version, screen size, IMEI, foreground app)
- get_foreground_app: Current foreground app & Activity
- get_screen_size: Device screen resolution
- is_network_online: Check network connectivity

Touch & Input
- tap (x, y, count?, interval?): Tap at coordinates (supports multi-tap)
- swipe (x1, y1, x2, y2, duration?): Swipe gesture
- long_press (x, y, duration?): Long press at coordinates
- drag (x1, y1, x2, y2, duration?): Drag from point A to B
- input_text (text): Input text into the focused field
- set_clipboard / get_clipboard: Clipboard operations
- press_key (key): Press a key (home, back, enter, etc.)

Screenshot
- take_screenshot (quality?): Returns a base64 JPEG image that the LLM can interpret directly
- save_screenshot (path?): Save screenshot to device storage

UI Automation
- dump_ui_hierarchy: Full UI tree (auto-trimmed when >15KB to save tokens)
- find_ui_elements (text?, resourceId?, className?, clickable?, ...): Find elements by attributes
- get_element_relation (hashcode, type?): Get parent/children/sibling of an element
- wait_for_element (text?, resourceId?, timeout?): Wait until an element appears
- scroll_to_find (text?, direction?, maxScrolls?): Scroll until an element is found

OCR & Image
- screen_ocr (x?, y?, w?, h?): Recognize text on screen (region supported)
- tap_text (text, index?): OCR + tap on the matching text
- image_ocr (path): Recognize text from an image file
- find_image_on_screen (templates, threshold?): Template matching
- get_pixel_color (x, y): Get pixel color at coordinates
- compare_images (image1, image2): Image similarity comparison
- wait_for_screen_change (timeout?, threshold?): Wait for the screen to change

Shell
- run_shell (command): Execute shell commands with elevated privileges

App Management
- launch_app / stop_app (packageName): Start/stop apps
- list_installed_apps: List all non-system installed apps
- is_app_running (packageName): Check if an app is running
- open_url (url): Open URL in browser
- show_toast (message): Display a toast notification
- install_apk / uninstall_app: Install/uninstall apps

File Operations
- list_files / read_file / write_file: Browse, read, write files
- file_exists / delete_file / rename_file / create_directory

Script Projects
- list_projects / project_status: View Python projects
- start_project / stop_project: Control project execution
- run_python_code (code): Execute Python code snippets on the device

Pip Management
- pip_list / pip_install / pip_uninstall / pip_show

AI Agent
- agent_run (instruction): Run an on-device AI agent with natural language
- agent_stop / agent_status: Control and monitor the agent
- agent_get_config / agent_set_config: Configure AI provider & model
- agent_get_providers / agent_get_models: List available providers & models
- agent_test_connection: Verify AI model connectivity

USB connection drops are handled gracefully: when the device disconnects, the MCP server automatically re-establishes ADB port forwarding and retries the request.
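The reconnect behaviour just described can be pictured as a retry wrapper. This is a minimal sketch under assumptions, not the server's actual implementation: `call_with_reconnect`, the `reforward` callback, and the three-attempt limit are all hypothetical.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def call_with_reconnect(
    request: Callable[[], T],
    reforward: Callable[[], None],
    max_attempts: int = 3,
) -> T:
    """Run `request`; on a connection error, re-forward the port and retry."""
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        try:
            return request()
        except ConnectionError as exc:
            last_error = exc
            if attempt < max_attempts - 1:
                reforward()  # e.g. re-run `adb forward` before the next try
    raise RuntimeError(f"device unreachable after {max_attempts} attempts") from last_error


# Demo: a request that fails twice (cable unplugged) and then succeeds.
attempts = {"n": 0}
forwards: List[str] = []


def flaky_request() -> str:
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("USB unplugged")
    return "ok"


result = call_with_reconnect(flaky_request, lambda: forwards.append("adb forward"))
print(result, len(forwards))
```

Bounding the number of attempts keeps a permanently disconnected device from stalling the calling agent indefinitely.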
On first connection via USB, the server automatically sets up ADB forwarding and starts the engine on the device if it's not already running.
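For reference, the forwarding step corresponds to a standard `adb forward` invocation. The sketch below builds that command line; it assumes the device-side port equals the local port (61140), which the document does not state explicitly, and the function names are hypothetical. How the server starts the on-device engine is not documented here, so that step is omitted.

```python
import shutil
import subprocess
from typing import List, Optional


def adb_forward_args(
    port: int = 61140,
    serial: Optional[str] = None,
    adb_path: Optional[str] = None,
) -> List[str]:
    """Build the argv for `adb [-s SERIAL] forward tcp:PORT tcp:PORT`."""
    adb = adb_path or shutil.which("adb") or "adb"
    args = [adb]
    if serial:
        args += ["-s", serial]  # target one device when several are attached
    args += ["forward", f"tcp:{port}", f"tcp:{port}"]
    return args


def forward_port(**kwargs) -> None:
    """Run the forward command; raises CalledProcessError if adb fails."""
    subprocess.run(adb_forward_args(**kwargs), check=True, timeout=10)


# Printing instead of running, so the example works without a device attached.
print(" ".join(adb_forward_args(serial="emulator-5554", adb_path="adb")))
```

Running the printed command by hand is a quick way to check connectivity when the automatic setup fails.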
UI hierarchy dumps over 15KB are automatically trimmed to keep only actionable elements (those with text, resource-id, content-desc, or clickable/scrollable attributes), reducing LLM token usage.
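The trimming rule can be sketched as a size check followed by a filter. This is an illustration under assumptions: nodes are modelled as a flat list of dictionaries with boolean `clickable`/`scrollable` values, "15KB" is read as 15 × 1024 bytes of serialized JSON, and the function names are hypothetical. The real engine works on a tree and may differ in detail.

```python
import json
from typing import Any, Dict, List

TRIM_THRESHOLD_BYTES = 15 * 1024  # assumed reading of "15KB"


def is_actionable(node: Dict[str, Any]) -> bool:
    """True if the node has text, an id, a description, or is interactive."""
    return bool(
        node.get("text")
        or node.get("resource-id")
        or node.get("content-desc")
        or node.get("clickable")
        or node.get("scrollable")
    )


def trim_hierarchy(nodes: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return nodes unchanged when small; otherwise keep only actionable ones."""
    raw = json.dumps(nodes).encode("utf-8")
    if len(raw) <= TRIM_THRESHOLD_BYTES:
        return nodes
    return [n for n in nodes if is_actionable(n)]


# Demo: 500 empty layout containers plus one button exceed the threshold.
filler = [
    {"class": "android.widget.FrameLayout", "text": "", "clickable": False}
    for _ in range(500)
]
button = {"class": "android.widget.Button", "text": "Sign in", "clickable": True}
trimmed = trim_hierarchy(filler + [button])
print(len(trimmed))
```

Purely structural containers carry little information for an LLM deciding where to tap, which is why dropping them reduces token usage with little loss.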
Touch events are injected at the Linux kernel level, making them work in any app including games, locked-down apps, and areas that block accessibility-based input.
Once connected, try prompting your AI agent, for example by asking it to run a Python snippet on the device such as print('Hello from Android!').