Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

UI Element Ops

v1.0.2

Parse UI screenshots into structured element JSON (type, OCR text, bbox) and operate desktop UI from parsed elements. Use when a user asks to detect/locate U...

by MuRong (@murongg)
MIT-0
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Suspicious
OpenClaw: Benign (medium confidence)
Purpose & Capability
The name/description (parse UI screenshots and operate desktop UI) matches the code and scripts: parse_ui.py uses OmniParser models for detection and captioning, bootstrap installs ML libraries and downloads weights, and operate_ui.py uses pyautogui to click, type, and take screenshots. The one small mismatch: bootstrap installs the 'openai' package (and some general-purpose libraries) that are not used by the included scripts; likely unnecessary, but not evidence of malicious intent.
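The "structured element JSON (type, OCR text, bbox)" output described above suggests a simple record shape per detected element. A minimal sketch of working with such records, assuming hypothetical field names (`type`, `text`, `bbox`) that are not taken from parse_ui.py itself:

```python
# Hypothetical element records matching the skill's described schema
# (type, OCR text, bounding box); field names are assumptions.
elements = [
    {"type": "button", "text": "Submit", "bbox": [412, 310, 508, 342]},
    {"type": "text", "text": "Username", "bbox": [120, 200, 210, 224]},
]

def find_element(elements, query):
    """Return the first element whose OCR text contains the query."""
    for el in elements:
        if query.lower() in el["text"].lower():
            return el
    return None

def bbox_center(el):
    """Center point of an element's [x1, y1, x2, y2] bounding box."""
    x1, y1, x2, y2 = el["bbox"]
    return ((x1 + x2) // 2, (y1 + y2) // 2)

el = find_element(elements, "submit")
print(bbox_center(el))  # → (460, 326)
```

A "find element, then click its bbox center" pipeline like this is the usual way parsed-screenshot output gets turned into a pyautogui action.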
Instruction Scope
SKILL.md stays on-topic (bootstrapping, parsing screenshots, listing/finding elements, and performing UI actions). The runtime instructions explicitly enable desktop control (click/type/hotkey) via pyautogui, which is expected for the stated purpose but high-privilege. A minor inconsistency: capture_and_parse.sh invokes operate_ui.py via the system 'python3' rather than the venv python created by bootstrap, which can cause an environment mismatch and unexpected failures if the system Python lacks the required packages.
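The system-python3-versus-venv mismatch can be caught at runtime. A minimal sketch of a guard that could sit at the top of a script like operate_ui.py (the guard itself is illustrative, not part of the skill):

```python
import sys

def running_in_venv():
    """True when the interpreter is a virtual environment's python:
    inside a venv, sys.prefix differs from sys.base_prefix."""
    return sys.prefix != sys.base_prefix

# Hypothetical guard: warn when the script was launched with the
# system python3 instead of the venv created by bootstrap.
if not running_in_venv():
    print("warning: not running inside a venv; "
          "required packages may be missing", file=sys.stderr)
```

The alternative fix is on the shell side: have capture_and_parse.sh call the venv's interpreter explicitly instead of whatever `python3` resolves to.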
Install Mechanism
There is no registry install spec, but the included bootstrap script creates a venv, pip-installs many ML packages, clones the OmniParser GitHub repo, and uses the Hugging Face CLI to download model weights. GitHub and Hugging Face are common release hosts; even so, downloading and extracting model weights and installing many packages is high-impact and should be done deliberately, preferably in an isolated environment.
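The "prefer an isolated environment" advice amounts to: create a venv, then install only through that venv's own interpreter. A minimal sketch using the standard-library `venv` module (package names and paths here are placeholders, not the skill's actual dependency list):

```python
import subprocess
import sys
import tempfile
import venv
from pathlib import Path

def make_venv(target: Path, with_pip: bool = True) -> Path:
    """Create a venv and return the path to its python executable."""
    venv.create(target, with_pip=with_pip)
    bindir = "Scripts" if sys.platform == "win32" else "bin"
    return target / bindir / "python"

def pip_install(python: Path, *packages: str) -> None:
    """Install packages via the venv's own interpreter, so nothing
    leaks into the system site-packages."""
    subprocess.run([str(python), "-m", "pip", "install", *packages],
                   check=True)

# Demo without pip to keep the example fast; a real bootstrap would
# keep with_pip=True and then call pip_install(env_python, ...).
env_python = make_venv(Path(tempfile.mkdtemp()) / "skill-venv",
                       with_pip=False)
print(env_python.exists())  # → True
```

Installing via `python -m pip` from the venv's interpreter (rather than a bare `pip`) is what guarantees the packages land in the isolated environment.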
Credentials
The skill does not declare or require any sensitive environment variables or credentials. It optionally respects OMNIPARSER_DIR and TYPE_RULES. Note: the bootstrap uses the HF CLI to download weights; if a requested model version were private, the CLI would prompt for a Hugging Face token, but no HF token is declared as required here. No other unrelated credentials are requested.
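Resolving the two optional variables is a small amount of code. A sketch, assuming illustrative defaults (the fallback path below is an assumption, not the skill's documented default):

```python
from pathlib import Path

def resolve_config(env):
    """Resolve the skill's two optional settings from an environment
    mapping (e.g. os.environ). Defaults here are illustrative."""
    return {
        "omniparser_dir": env.get("OMNIPARSER_DIR",
                                  str(Path.home() / "OmniParser")),
        "type_rules": env.get("TYPE_RULES"),  # None when unset
    }

cfg = resolve_config({"OMNIPARSER_DIR": "/opt/omniparser"})
print(cfg["omniparser_dir"])  # → /opt/omniparser
```

Taking the environment as a parameter (instead of reading `os.environ` directly) keeps the resolution testable and makes it obvious that no credential-bearing variables are consulted.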
Persistence & Privilege
always: false (normal). The skill can autonomously perform desktop actions via pyautogui; that capability is coherent with its purpose but grants broad control over the user's desktop. Autonomous invocation combined with desktop control is a meaningful risk vector; exercise caution before allowing the agent to call this skill without user confirmation.
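The "user confirmation" mitigation can be made concrete with a gate that refuses to execute any desktop action without explicit approval. A hypothetical wrapper (not part of the skill) around actions like the click/type/hotkey calls operate_ui.py is described as performing:

```python
# Hypothetical confirmation gate: nothing executes without an explicit
# approval callback. The default approver asks on stdin; tests and
# automation can inject their own.
def gated(action_name, perform, approve=input):
    answer = approve(f"Allow action '{action_name}'? [y/N] ")
    if answer.strip().lower() == "y":
        perform()
        return True
    return False

# Usage with a stubbed approver (auto-deny), so nothing runs:
ran = gated("click(460, 326)", lambda: None, approve=lambda prompt: "n")
print(ran)  # → False
```

Defaulting to deny (anything other than an explicit "y" refuses the action) is the safe polarity for a gate in front of desktop control.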
Assessment
This skill appears to do what it says: it installs ML dependencies, downloads OmniParser code/weights, parses screenshots, and can automate your desktop using pyautogui. Before installing:
1. Review and run the bootstrap script in an isolated environment or VM (it installs many packages and downloads models).
2. Verify you trust the OmniParser GitHub repo and the HF model being downloaded.
3. Be aware that operate_ui.py can click/type/press keys; test in dry-run mode first and do not allow unattended/autonomous runs unless you trust the skill and its inputs.
4. Note that the capture script calls the system python3 (not the venv); prefer running commands with the venv python to avoid unexpected behavior.
5. If you are concerned about privacy, inspect what screenshots/elements are stored and where (defaults are /tmp and the current working directory).
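The dry-run advice can be sketched as an executor that records actions instead of performing them. The interface below is hypothetical and only illustrates the pattern; it is not operate_ui.py's actual API:

```python
# Dry-run executor sketch: with dry_run=True, actions are logged but
# never performed, so a plan can be inspected before any real clicks.
class UIExecutor:
    def __init__(self, dry_run=True):
        self.dry_run = dry_run
        self.log = []

    def click(self, x, y):
        self.log.append(f"click({x}, {y})")
        if not self.dry_run:
            pass  # a real run would call pyautogui.click(x, y)

    def type_text(self, text):
        self.log.append(f"type({text!r})")
        if not self.dry_run:
            pass  # a real run would call pyautogui.write(text)

ex = UIExecutor(dry_run=True)
ex.click(460, 326)
ex.type_text("hello")
print(ex.log)  # → ['click(460, 326)', "type('hello')"]
```

Reviewing the log before re-running with `dry_run=False` gives exactly the "test in dry-run mode first" workflow the assessment recommends.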

Like a lobster shell, security has layers — review code before you run it.

latest · vk9745x751aqsptq96m2gc8csm581zd22

