Desktop Gui
Pass
Audited by VirusTotal on May 9, 2026.
Overview
Type: OpenClaw Skill
Name: desktop-gui
Version: 1.0.0
The skill provides powerful desktop GUI automation capabilities (mouse and keyboard control, screen capture), which are inherently high-risk. It is classified as suspicious primarily because the 'Vision Model' implementation in SKILL.md includes code that sends full-screen screenshots to a hardcoded private IP address (http://10.6.207.56:8000). Although this is framed as a feature for intelligent automation, sending screen data to an external endpoint, even a private one, poses a significant security and privacy risk unless the destination is verified.
Findings (0)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
A mistaken or manipulated model response could click, type, submit forms, close windows, or change real account/business data in whatever application is visible.
The skill instructs the agent to execute model-produced GUI actions with xdotool on the live desktop.
Screenshot (scrot) → Qwen3.5-27b analysis → return coordinates/action → xdotool execution ... if data['action'] == 'click': ... subprocess.run(['xdotool', 'click', '1'])
Use only in a VM or isolated desktop, restrict automation to a chosen window/app, and require explicit user confirmation before clicks, typing, submissions, or window-closing actions.
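One way to apply the confirmation recommendation is to put an allowlist and an explicit approval step between the model response and xdotool. The following is a minimal sketch, not the skill's own code: the function names, the `x` and `y` keys, and the single-entry allowlist are assumptions made for illustration. Only the `data['action'] == 'click'` check and the `xdotool click 1` call come from SKILL.md.

```python
import subprocess
from typing import Callable

# Hypothetical allowlist: any action not named here is rejected outright.
ALLOWED_ACTIONS = {"click"}


def build_command(data: dict) -> list[str]:
    """Turn a model-produced action into an xdotool argv, or raise ValueError."""
    if data.get("action") not in ALLOWED_ACTIONS:
        raise ValueError(f"action not allowed: {data.get('action')!r}")
    x, y = data.get("x"), data.get("y")  # assumed key names for the coordinates
    for value in (x, y):
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError("coordinates must be non-negative integers")
    # xdotool supports chaining: move the pointer, then press button 1.
    return ["xdotool", "mousemove", str(x), str(y), "click", "1"]


def ask_user(prompt: str) -> bool:
    """Default approval step: anything other than 'y' counts as a refusal."""
    return input(f"{prompt} [y/N] ").strip().lower() == "y"


def run_confirmed(
    data: dict,
    confirm: Callable[[str], bool] = ask_user,
    runner: Callable[..., object] = subprocess.run,
) -> bool:
    """Run the action only after validation and explicit approval."""
    cmd = build_command(data)
    if not confirm("Run: " + " ".join(cmd) + "?"):
        return False
    runner(cmd, check=True)
    return True
```

Passing `confirm` and `runner` as parameters keeps the gate testable without touching a real desktop. It does not replace running the skill in a VM or isolated session.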
Anything visible on the desktop, including private messages, documents, secrets, or account pages, could be sent to that model service.
The visual mode captures a full screenshot, base64-encodes it, and sends it to a model endpoint over HTTP with no stated data minimization, retention, or trust boundary.
subprocess.run(['scrot', '/tmp/screen.png']) ... requests.post("http://10.6.207.56:8000/v1/chat/completions", ... "image_url": {"url": f"data:image/png;base64,{img_b64}"})
Do not use this mode on sensitive screens unless you fully trust the endpoint; prefer HTTPS, crop screenshots to the minimum region, and add explicit user approval before each upload.
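The "prefer HTTPS" part of this advice can be enforced before any screenshot leaves the machine. The sketch below uses only the Python standard library. The function name and the policy itself (accept HTTPS anywhere, accept plain HTTP only for loopback) are assumptions for illustration, not something SKILL.md defines.

```python
import ipaddress
from urllib.parse import urlsplit


def is_acceptable_endpoint(url: str) -> bool:
    """Accept HTTPS URLs, or plain HTTP only when the host is loopback."""
    parts = urlsplit(url)
    host = parts.hostname
    if host is None:
        return False
    if parts.scheme == "https":
        return True
    if parts.scheme != "http":
        return False
    if host == "localhost":
        return True
    try:
        # Plain HTTP is tolerated only for 127.0.0.0/8 and ::1.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # A non-loopback hostname over plain HTTP is rejected.
        return False
```

Under this policy the hardcoded `http://10.6.207.56:8000` endpoint is rejected, because it is a private but non-loopback address reached over plain HTTP. Cropping and per-upload approval would still need to be handled separately.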
A model API token may be mishandled or exposed on the network, and users may not realize a credential is needed for the recommended mode.
The example uses a bearer credential for the model endpoint, and it is shown being sent to an HTTP URL while the registry declares no required credential or environment variable.
requests.post("http://10.6.207.56:8000/v1/chat/completions", headers={"Authorization": "Bearer VLLM_API_KEY"}, ...)
Declare the required credential clearly, store it in a scoped environment variable or secret manager, use HTTPS or a trusted local endpoint, and avoid sending bearer tokens over plain HTTP.
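A small helper can keep the token out of the source and refuse to attach it to a non-HTTPS request. This is a sketch under two assumptions: that `VLLM_API_KEY` is meant to be the name of an environment variable (SKILL.md shows it only as a literal in the header), and that an HTTPS-only rule is acceptable, which is stricter than the recommendation's allowance for a trusted local endpoint. The function name is hypothetical.

```python
import os
from urllib.parse import urlsplit


def build_auth_headers(url: str, env_var: str = "VLLM_API_KEY") -> dict[str, str]:
    """Return an Authorization header, but only for HTTPS URLs."""
    if urlsplit(url).scheme != "https":
        raise ValueError("refusing to attach a bearer token to a non-HTTPS URL")
    token = os.environ.get(env_var)
    if not token:
        # Fail loudly so a missing credential is noticed before any request.
        raise RuntimeError(f"environment variable {env_var} is not set")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as the `headers` argument of `requests.post`. Because the helper raises before any network call, a misconfigured endpoint cannot leak the token.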
Installing these dependencies changes the local system and trusts upstream package sources.
The setup relies on live package repositories and privileged system package installation; this is purpose-aligned for GUI automation but still affects the local environment.
pip3 install pyautogui pygetwindow pymsgbox opencv-python-headless pillow pytesseract ... sudo apt-get install -y xdotool scrot tesseract-ocr ...
Install from trusted repositories, consider pinning Python package versions, and review the packages before using the skill on an important workstation.
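If the Python packages are moved into a requirements file, a quick check can flag entries that are not pinned to an exact version. The sketch below is a deliberately simple assumption-laden check: it treats any line without `==` as unpinned and does not parse the full requirement syntax. The function name is hypothetical.

```python
def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that lack an exact '==' version pin."""
    loose = []
    for raw in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and pip option lines such as '-r other.txt'.
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            loose.append(line)
    return loose
```

An empty result means every listed package carries an exact pin. It says nothing about whether the pinned versions themselves are trustworthy, so the packages still need review.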
