Desktop Gui

WarnAudited by ClawScan on May 10, 2026.

Overview

This is a coherent desktop automation skill, but it can upload full-screen screenshots to a remote AI service and execute that service’s click instructions on your real desktop without clear per-action approval.

Install only if you are comfortable with powerful desktop-wide automation. Use it first in a VM or test account, keep failsafe and pauses enabled, avoid sensitive screens, verify the model endpoint, require confirmation before actions, and do not send screenshots or API tokens over untrusted HTTP connections.

Findings (4)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

A mistaken or manipulated model response could click, type, submit forms, close windows, or change real account/business data in whatever application is visible.

Why it was flagged

The skill instructs the agent to execute model-produced GUI actions with xdotool on the live desktop.

Skill content
截图 (scrot) → Qwen3.5-27b 分析 → 返回坐标/操作 → xdotool 执行 ... if data['action'] == 'click': ... subprocess.run(['xdotool', 'click', '1'])
Recommendation

Use only in a VM or isolated desktop, restrict automation to a chosen window/app, and require explicit user confirmation before clicks, typing, submissions, or window-closing actions.

What this means

Anything visible on the desktop, including private messages, documents, secrets, or account pages, could be sent to that model service.

Why it was flagged

The visual mode captures a full screenshot, base64-encodes it, and sends it to a model endpoint over HTTP with no stated data minimization, retention, or trust boundary.

Skill content
subprocess.run(['scrot', '/tmp/screen.png']) ... requests.post("http://10.6.207.56:8000/v1/chat/completions", ... "image_url": {"url": f"data:image/png;base64,{img_b64}"})
Recommendation

Do not use this mode on sensitive screens unless you fully trust the endpoint; prefer HTTPS, crop screenshots to the minimum region, and add explicit user approval before each upload.

What this means

A model API token may be mishandled or exposed on the network, and users may not realize a credential is needed for the recommended mode.

Why it was flagged

The example uses a bearer credential for the model endpoint, and it is shown being sent to an HTTP URL while the registry declares no required credential or environment variable.

Skill content
requests.post("http://10.6.207.56:8000/v1/chat/completions", headers={"Authorization": "Bearer VLLM_API_KEY"}, ...)
Recommendation

Declare the required credential clearly, store it in a scoped environment variable or secret manager, use HTTPS or a trusted local endpoint, and avoid sending bearer tokens over plain HTTP.

What this means

Installing these dependencies changes the local system and trusts upstream package sources.

Why it was flagged

The setup relies on live package repositories and privileged system package installation; this is purpose-aligned for GUI automation but still affects the local environment.

Skill content
pip3 install pyautogui pygetwindow pymsgbox opencv-python-headless pillow pytesseract ... sudo apt-get install -y xdotool scrot tesseract-ocr ...
Recommendation

Install from trusted repositories, consider pinning Python package versions, and review the packages before using the skill on an important workstation.