mano-cua

Computer use for GUI automation tasks via VLA models. Use when the user describes a task in natural language that requires visual screen interaction and no A...

MIT-0 · Free to use, modify, and redistribute. No attribution required.

by @hanningwang

Security Scan

VirusTotal: Suspicious

OpenClaw: Benign (medium confidence)

✓ Purpose & Capability

The name and description describe desktop GUI automation, and the SKILL.md, usage, and install steps all align with that purpose. The required artifact (a local mano-cua binary) is coherent for a client that executes clicks and typing locally while relying on a remote vision model.

ℹ Instruction Scope

Runtime instructions explicitly require capturing screenshots of the primary display and sending them (plus the task text) to a remote host (mano.mininglamp.com) — this is expected for a cloud-backed VLA automation tool, but the SKILL.md's claim that "no local files, clipboard content, or system credentials are read or transmitted" is a developer assertion that cannot be verified from the instruction file alone. Users should treat screenshot transmission as a significant privacy action.

ℹ Install Mechanism

The install spec uses a third-party Homebrew tap (HanningWang/tap), which installs a mano-cua binary. Homebrew taps are common, but third-party taps and GitHub release downloads (the Windows instructions) carry supply-chain risk; verify the tap, the formula, and the releases before installing.

✓ Credentials

The skill declares no required environment variables, no credentials, and no config paths. That is proportionate to the described client behavior (local binary + remote service) — nothing extraneous is requested by the skill manifest.

✓ Persistence & Privilege

The always setting is false, and the skill does not request system-wide persistence or permission to modify other agent settings. The expected privileges are local input injection (mouse/keyboard) and network access, which are inherent to the stated functionality.

Assessment

This skill appears to do what it says (local automation driven by a cloud vision model), but before installing you should:

1) Inspect the GitHub repo and Homebrew formula to confirm the binary is built from the claimed source.
2) Review the network endpoint (mano.mininglamp.com) and the module referenced (task_model.py) to ensure only screenshots and task text are sent.
3) Avoid running it while interacting with sensitive apps (password managers, banking, private documents), because screenshots will be transmitted.
4) Consider testing in an isolated environment (VM) or blocking network access if you only want local automation.
5) If you rely on strict privacy, do not install until you can audit the source and confirm the release artifacts match the repository.
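One way to act on the advice about release artifacts is to compare a downloaded archive against a checksum you trust. The sketch below is illustrative only: the helper names `sha256_of` and `matches` are hypothetical, and it assumes you have obtained an expected SHA-256 value from a source you trust (for example, the value recorded in the Homebrew formula); this page does not state whether checksums are published for the Windows zip.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Return True if the file's digest equals the expected value (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

A matching checksum only shows that the file is the one the publisher hashed; it does not replace reviewing the source itself.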

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.3

latest: vk978xschc0xq4p5jev76cnjhsd83f9jq

License

MIT-0

Free to use, modify, and redistribute. No attribution required.

Terms: https://spdx.org/licenses/MIT-0.html

Runtime requirements

🖥️ Clawdis

Install

Install mano-cua (brew)

Bins: mano-cua

brew install HanningWang/tap/mano-cua

SKILL.md

mano-cua

Desktop GUI automation driven by natural language. Captures screenshots, sends them to a cloud-based hybrid vision model, and executes the returned actions on the local machine — click, type, scroll, drag, and more.

Requirements

A system with a graphical desktop (macOS / Windows / Linux)
mano-cua binary installed

Installation

macOS / Linux (Homebrew):

brew install HanningWang/tap/mano-cua

Windows:

Download the latest mano-cua-windows.zip from GitHub Releases, extract it, and add the folder to your PATH.

Usage

# Run a task
mano-cua run "your task description"

# Stop the current running task
mano-cua stop

usage: fty-nb [-h] command [task]

VLA Desktop Automation Client

positional arguments:
  command     Command: 'run' or 'stop'
  task        Task description (required for 'run')

options:
  -h, --help  show this help message and exit

Note: Only one task can run at a time per device. If you need to start a new task, first stop the current one with mano-cua stop.
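The one-task-per-device rule can be wrapped in a small launcher. The following is a minimal sketch, not part of mano-cua: it uses only the two documented commands (`mano-cua stop` and `mano-cua run`), the helper names `build_commands` and `restart_task` are hypothetical, and it assumes that calling `mano-cua stop` when nothing is running is harmless, which the documentation does not confirm.

```python
import shutil
import subprocess


def build_commands(task: str) -> list[list[str]]:
    """Return the documented CLI calls: stop any current task, then run the new one."""
    if not task.strip():
        raise ValueError("a task description is required for 'run'")
    return [["mano-cua", "stop"], ["mano-cua", "run", task]]


def restart_task(task: str) -> None:
    """Stop the current task (if any) and start a new one."""
    if shutil.which("mano-cua") is None:
        raise FileNotFoundError("mano-cua binary not found on PATH")
    stop_cmd, run_cmd = build_commands(task)
    # Assumption: 'stop' may exit non-zero when no task is running, so it is not checked.
    subprocess.run(stop_cmd, check=False)
    subprocess.run(run_cmd, check=True)
```

Passing the task as a single list element avoids shell quoting problems when the description contains spaces or quotes.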

Examples

# Run a task
mano-cua run "Open WeChat and tell FTY that the meeting is postponed"
mano-cua run "Search for AI news in Xiaohongshu and show the first post"

# Stop the current task (use before starting a new one)
mano-cua stop

How It Works

The current screenshot is captured and sent to the cloud at each step. A hybrid vision solution decides the next action:

Mano model — handles straightforward, lightweight tasks with rapid output.
Claude CUA model — handles complex tasks requiring deeper reasoning.

The system automatically selects the appropriate model based on task complexity.
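The capture, decide, execute cycle described above can be pictured as a simple loop. This is a conceptual sketch, not the client's actual code: the `Action` type, the `"done"` sentinel, the `max_steps` limit, and the three callables are all hypothetical stand-ins for screenshot capture, the cloud model call, and local input injection.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    kind: str      # e.g. "click", "type", "scroll", or the hypothetical "done"
    payload: dict  # action parameters, such as coordinates or text


def run_loop(
    task: str,
    capture: Callable[[], bytes],
    decide: Callable[[str, bytes], Action],
    execute: Callable[[Action], None],
    max_steps: int = 50,
) -> int:
    """Repeat capture -> decide -> execute until the model signals completion.

    Returns the number of actions executed.
    """
    executed = 0
    for _ in range(max_steps):
        screenshot = capture()            # current screen as image bytes
        action = decide(task, screenshot) # the remote model picks the next action
        if action.kind == "done":
            break
        execute(action)                   # perform the action locally
        executed += 1
    return executed
```

Because a fresh screenshot is taken before every decision, the model always reasons about the screen as it is after the previous action.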

Supported Interactions

click · type · hotkey · scroll · drag · mouse move · screenshot · wait · app launch · URL navigation

Status Panel

A small UI panel is displayed in the top-right corner of the screen to track and manage the current session status.

Data, Privacy & Safety

What is sent: Screenshots of the primary display and the task description are sent to mano.mininglamp.com — these are the minimal inputs required for the vision model to determine the next action.
What is NOT sent: No local files, clipboard content, or system credentials are read or transmitted. All network calls are in a single module (task_model.py) for easy review.
Authentication: No API key or credentials are required. The client identifies itself with a locally generated device ID (~/.myapp_device_id) — no secrets are embedded in the binary.
Supply chain: The full client is open source. The Homebrew formula builds directly from this public source, ensuring the installed binary is fully auditable.
User control: Users can stop any session at any time via the UI panel or mano-cua stop.
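To illustrate what a locally generated device ID usually means, here is a minimal sketch of the common pattern: create a random identifier on first use and reuse it afterwards. It is an assumption about the general approach, not the client's implementation; the function name `load_or_create_device_id` is hypothetical, and the real contents of ~/.myapp_device_id are not documented here.

```python
import uuid
from pathlib import Path


def load_or_create_device_id(path: Path) -> str:
    """Return the ID stored at `path`, creating a random one on first use."""
    if path.exists():
        existing = path.read_text(encoding="utf-8").strip()
        if existing:
            return existing
    # uuid4 is purely random, so the ID carries no secret or hardware information.
    device_id = uuid.uuid4().hex
    path.write_text(device_id, encoding="utf-8")
    return device_id
```

Under this pattern, deleting the file yields a new ID on the next run, so the identifier distinguishes installations rather than authenticating a user.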

Important Notes

Do not use the mouse or keyboard during the task. Manual input while mano-cua is running may cause unexpected behavior.
Multiple displays: only the primary display is used. All mouse movements, clicks, and screenshots are restricted to that display.
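Restricting actions to the primary display amounts to keeping every target point inside that display's bounds. The sketch below shows one way to do that by clamping; it is an assumption for illustration (the `Display` type and `clamp_to_primary` are hypothetical, and the client may instead reject out-of-range coordinates), and it assumes the primary display's origin is (0, 0).

```python
from typing import NamedTuple


class Display(NamedTuple):
    width: int   # pixels
    height: int  # pixels


def clamp_to_primary(x: int, y: int, display: Display) -> tuple[int, int]:
    """Clamp a point to the primary display, spanning (0, 0) to (width-1, height-1)."""
    cx = min(max(x, 0), display.width - 1)
    cy = min(max(y, 0), display.height - 1)
    return cx, cy
```

For example, on a 1920x1080 primary display, a requested point of (2000, -5) would be clamped to (1919, 0).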

Platform Support

macOS is the preferred and most tested platform. Windows and Linux support is not yet complete, so minor issues are expected.
