Desktop automation ultra

Security checks across malware telemetry and agentic risk

Overview

This is a real local desktop automation skill, but it needs review because it can read the screen, record keystrokes, and replay GUI actions with weak boundaries.

Install only if you are comfortable granting broad local desktop-control authority. Use it for non-sensitive workflows, keep passwords and private screens out of view while recording or OCR is active, run only macros you created or reviewed, prefer dry-run first, and treat saved macros, screenshots, logs, and reports as sensitive files.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
Findings (30)

Dynamic attribute access via getattr()

Low
Category
Dangerous Code Execution
Content
# Execute action
if hasattr(self.action_manager, action):
    result = getattr(self.action_manager, action)(**params)
    if result.get('status') != 'ok':
        errors.append(f"Event {executed}: {action} failed - {result.get('message')}")
    executed += 1
Confidence
95% confidence
Finding
result = getattr(self.action_manager, action)(**params)

Dynamic attribute access via getattr()

Low
Category
Dangerous Code Execution
Content
    time.sleep(wait / speed)

    if hasattr(self.action_manager, action):
        return getattr(self.action_manager, action)(**params)

return {"status": "error", "message": f"Unknown action: {action}"}
Confidence
95% confidence
Finding
return getattr(self.action_manager, action)(**params)
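
One common way to narrow this kind of dynamic dispatch is to check the action name against an explicit allowlist before calling `getattr()`. The sketch below is illustrative only: `dispatch_action`, `ALLOWED_ACTIONS`, and the action names in it are hypothetical and are not taken from the skill's code.

```python
from typing import Any, Dict

# Hypothetical action names; a real allowlist would list only the
# methods the macro format is meant to expose.
ALLOWED_ACTIONS = frozenset({"click", "type_text", "press_key"})


def dispatch_action(manager: Any, action: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Call manager.<action>(**params) only if the name is allowlisted."""
    if action not in ALLOWED_ACTIONS:
        return {"status": "error", "message": f"Action not allowed: {action}"}
    handler = getattr(manager, action, None)
    if not callable(handler):
        return {"status": "error", "message": f"Unknown action: {action}"}
    return handler(**params)
```

With this shape, a macro that names `__init__` or any other attribute outside the allowlist is rejected before any attribute lookup happens.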

Intent-Code Divergence

Medium
Confidence
93% confidence
Finding
The module advertises 'safety' and 'thread-safe' desktop automation, but the actual controls are weak and easily bypassed. Safe mode only scans a few string patterns for a subset of actions, while still allowing powerful actions like window activation, screenshot capture, clipboard manipulation, and arbitrary keyboard/mouse input that can exfiltrate or alter user data.

Intent-Code Divergence

Medium
Confidence
97% confidence
Finding
The class advertises a configurable emergency stop hotkey, but the implementation never actually detects the configured key combination and `check_stop()` only returns a manual flag. In an automation skill that can click, type, and replay macros, a non-functional stop mechanism is a real safety issue because users may be unable to interrupt unintended actions.
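
A stop flag only helps if something sets it and the playback loop polls it between actions. The following is a minimal sketch of that pattern using `threading.Event`. The names `EmergencyStop` and `run_steps` are hypothetical, and the hotkey listener itself is not shown: it is assumed to run in its own thread and call `trigger()` when the configured key combination is pressed.

```python
import threading
from typing import Callable, Iterable


class EmergencyStop:
    """Thread-safe stop flag shared by a hotkey listener and the playback loop."""

    def __init__(self) -> None:
        self._event = threading.Event()

    def trigger(self) -> None:
        # Assumed to be called by the hotkey listener thread.
        self._event.set()

    def check_stop(self) -> bool:
        return self._event.is_set()

    def wait(self, seconds: float) -> bool:
        """Sleep for up to `seconds`; return True as soon as a stop is requested."""
        return self._event.wait(timeout=seconds)


def run_steps(steps: Iterable[Callable[[], None]], stop: EmergencyStop, delay: float = 0.0) -> int:
    """Run steps in order, checking the stop flag before each one."""
    executed = 0
    for step in steps:
        # The inter-step delay is an interruptible wait instead of time.sleep().
        if stop.check_stop() or stop.wait(delay):
            break
        step()
        executed += 1
    return executed
```

Using `Event.wait()` for the delay means a stop request interrupts long pauses between actions rather than waiting for them to finish.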

Intent-Code Divergence

Medium
Confidence
87% confidence
Finding
The module advertises 'safety' and 'robust error handling', but the implementation executes action names directly from macro files and therefore does not provide meaningful safety controls. This mismatch can cause downstream developers or operators to trust untrusted macros and deploy the feature in higher-risk environments, increasing likelihood of abuse.

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The screenshot function captures the current desktop and writes it to an arbitrary filesystem path without any user-facing warning, consent, or path restrictions. In an agent skill, this can silently collect sensitive on-screen data such as credentials, messages, tokens, or personal information and persist it for later access or exfiltration.
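
A path restriction of the kind this finding calls for could look like the sketch below, which confines output files to one directory before anything is written. `resolve_capture_path`, `ALLOWED_SUFFIXES`, and the directory layout are assumptions for illustration and do not reflect the skill's actual code.

```python
from pathlib import Path

# Assumed set of acceptable image extensions.
ALLOWED_SUFFIXES = frozenset({".png", ".jpg"})


def resolve_capture_path(base_dir: Path, requested: str) -> Path:
    """Return an absolute path inside base_dir, or raise ValueError."""
    base = base_dir.resolve()
    # Resolving collapses ".." segments; an absolute `requested` replaces base.
    candidate = (base / requested).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"Refusing to write outside {base}: {requested}") from None
    if candidate.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported screenshot extension: {candidate.suffix}")
    return candidate
```

A caller would pass the returned path to the capture routine, so `../secrets.png` or an absolute path elsewhere on disk fails before any screenshot is taken.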

Missing User Warnings

Medium
Confidence
95% confidence
Finding
This function reads clipboard contents and injects them into the active application via simulated typing with no disclosure, confirmation, or destination validation. Clipboard data often contains secrets, and blindly pasting into whatever window is focused can leak credentials, API keys, or private text into chats, terminals, browsers, or untrusted apps.

Missing User Warnings

High
Confidence
93% confidence
Finding
This routine continuously monitors screen contents and automatically executes UI actions when conditions match, enabling unattended interaction with arbitrary applications. In the context of a desktop automation skill, that materially increases the risk of unintended or harmful actions, especially because actions are driven by external condition data and there is no strong gating, confirmation, or reliable stop control.

Missing User Warnings

High
Confidence
96% confidence
Finding
Macro playback loads external JSON files and executes actions by name via `globals()`, including nested sub-macros, with minimal validation. In an automation module this is dangerous because untrusted macro files can drive keyboard/mouse actions against the system, effectively turning data files into a command source for uncontrolled UI automation.
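
Validating a macro before playback is one way to keep data files from acting as an unrestricted command source. The sketch below assumes a hypothetical macro layout: a JSON array of objects with an `action` key, where a `run_macro` event carries its sub-macro inline under `events`. The real file format may differ, and `validate_macro`, `ALLOWED_ACTIONS`, and `MAX_DEPTH` are illustrative names.

```python
from typing import Any, List

# Hypothetical action names and nesting limit.
ALLOWED_ACTIONS = frozenset({"click", "type_text", "wait", "run_macro"})
MAX_DEPTH = 3


def validate_macro(events: Any, depth: int = 0) -> List[str]:
    """Return a list of problems; an empty list means the macro passed."""
    if depth > MAX_DEPTH:
        return [f"sub-macro nesting exceeds {MAX_DEPTH} levels"]
    if not isinstance(events, list):
        return ["macro must be a JSON array of events"]
    problems: List[str] = []
    for index, event in enumerate(events):
        if not isinstance(event, dict):
            problems.append(f"event {index}: not an object")
            continue
        action = event.get("action")
        if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
            problems.append(f"event {index}: action {action!r} is not allowed")
        elif action == "run_macro":
            # Recurse into the nested macro with a depth counter.
            problems.extend(validate_macro(event.get("events"), depth + 1))
    return problems
```

Playback would proceed only when the returned list is empty, which also bounds recursion through nested sub-macros.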

Missing User Warnings

Medium
Confidence
89% confidence
Finding
The reporting function persists execution details, including actions and parameters, to JSON and HTML files in a reports directory. In a desktop automation context those logs can contain sensitive typed text, file paths, window targets, or workflow details, creating a local data exposure risk if other users or processes can read them.
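
Two mitigations apply to report files like these: redact parameter values likely to hold typed text, and create the file with owner-only permissions. The sketch below shows both under stated assumptions. `SENSITIVE_KEYS`, `redact_params`, and `write_report` are hypothetical, and the entry layout (`action` plus `params`) is assumed rather than taken from the skill.

```python
import json
import os
from typing import Any, Dict, List

# Hypothetical parameter names that may carry typed or copied text.
SENSITIVE_KEYS = frozenset({"text", "password", "clipboard"})


def redact_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """Replace values of sensitive keys with a fixed placeholder."""
    return {key: ("[REDACTED]" if key in SENSITIVE_KEYS else value)
            for key, value in params.items()}


def write_report(path: str, entries: List[Dict[str, Any]]) -> None:
    """Write redacted entries as JSON, requesting owner-only permissions."""
    safe = [{"action": entry.get("action"),
             "params": redact_params(entry.get("params", {}))}
            for entry in entries]
    # Mode 0o600 applies when the file is created; Windows largely ignores it.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(safe, handle, indent=2)
```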

Missing User Warnings

Medium
Confidence
82% confidence
Finding
The screenshot and spreadsheet-writing features persist potentially sensitive desktop contents or extracted data to disk without requiring explicit disclosure, confirmation, or secure storage controls. That increases the chance of silent data retention, accidental leakage through predictable file locations, and exposure to other local users or processes.

Missing User Warnings

Medium
Confidence
90% confidence
Finding
The method captures a full-screen screenshot via pyautogui without any user-facing notice, consent check, or scope limitation. In an agent skill context, screen contents may include sensitive information such as credentials, messages, or personal data, so silent capture creates a privacy and data-exposure risk even if the apparent purpose is benign image matching.

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The multi-scale search path also takes a screenshot without informing the user, repeating the same privacy issue across another code path. Because this function may run repeatedly across scales, it can process sensitive on-screen data without transparency, increasing the chance of unnoticed collection during automation.

Missing User Warnings

Medium
Confidence
90% confidence
Finding
This method captures the entire screen and performs OCR on it, then returns matched text and coordinates without any built-in user notice, consent check, or scope restriction. In an agent skill context, that can expose passwords, messages, tokens, financial data, or other sensitive on-screen content to downstream callers or logs, even if the code appears intended for automation rather than theft.

Missing User Warnings

Medium
Confidence
91% confidence
Finding
These extraction functions read text from either a specified region or the whole screen and return raw OCR output directly to the caller, again without explicit privacy warning, consent, or data minimization. Because the returned content may include arbitrary sensitive text visible on screen, this creates a meaningful privacy and data-exposure risk in an agent environment.
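
Data minimization for OCR output can be approximated by masking secret-looking substrings before the text is returned or logged. The sketch below is a rough heuristic, not a complete filter: `minimize_ocr_text` and both patterns are assumptions, and real secrets will not always match them.

```python
import re

# Heuristic patterns (assumptions): email addresses and long token-like runs.
_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    re.compile(r"\b[A-Za-z0-9_\-]{24,}\b"),
]


def minimize_ocr_text(text: str, mask: str = "[REDACTED]") -> str:
    """Mask substrings that look like emails or long tokens."""
    for pattern in _PATTERNS:
        text = pattern.sub(mask, text)
    return text
```

Ordinary UI labels such as button captions pass through unchanged, so automation that matches on visible text keeps working.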

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The screenshot function captures the full primary monitor and is later used by OCR helpers, which can expose sensitive on-screen data such as passwords, emails, tokens, financial information, or private messages. There is no built-in user notice, scope restriction, consent check, or minimization, so any caller can silently collect broad visual data from the user's desktop.

Missing User Warnings

Medium
Confidence
89% confidence
Finding
The click helpers can move the mouse and click on matched UI elements without any confirmation, rate limiting, or safety interlock. In an agent setting, this can trigger unintended actions such as sending messages, confirming prompts, changing settings, authorizing transactions, or dismissing security dialogs based solely on image matching.

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The script replays arbitrary GUI actions from an external JSON file immediately, including clicks, keystrokes, window activation, clipboard pasting, dragging, OCR/image-driven waits, and typing, without an execution consent prompt, target restriction, or visible safety interlock. In the context of an agent skill, this can be abused to trigger unintended actions in whatever application is focused, including sending messages, altering settings, approving prompts, or interacting with sensitive desktop apps.
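
An execution consent prompt can be combined with a dry-run listing so the user sees what a macro will do before it runs. The sketch below is a minimal illustration: `describe_events`, `play_with_consent`, and the event layout (`action` plus `params`) are hypothetical, and `execute` stands in for whatever function performs a single GUI action.

```python
from typing import Any, Callable, Dict, List


def describe_events(events: List[Dict[str, Any]]) -> List[str]:
    """Build a human-readable dry-run listing of the macro."""
    return [f"{index + 1}. {event.get('action')} {event.get('params', {})}"
            for index, event in enumerate(events)]


def play_with_consent(
    events: List[Dict[str, Any]],
    execute: Callable[[Dict[str, Any]], None],
    confirm: Callable[[str], str] = input,
) -> int:
    """Print the planned actions, ask for confirmation, then run them."""
    for line in describe_events(events):
        print(line)
    answer = confirm(f"Run these {len(events)} actions? [y/N] ")
    if answer.strip().lower() != "y":
        return 0  # Nothing is executed without an explicit "y".
    for event in events:
        execute(event)
    return len(events)
```

Passing `confirm` as a parameter keeps the prompt testable and lets a host application substitute its own dialog.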

Missing User Warnings

Medium
Confidence
90% confidence
Finding
The clipboard write action allows arbitrary text from the macro file to be copied into the system clipboard with no user disclosure or consent. Because the clipboard is shared OS state, this can overwrite sensitive existing contents and stage hidden payloads for later paste actions into terminals, chats, browsers, or password fields, increasing the risk of data loss or unintended command/input injection.

Vague Triggers

Medium
Confidence
93% confidence
Finding
The skill advertises unrestricted 'full desktop automation' including mouse, keyboard, window control, OCR, image recognition, and macro recording/replay, but provides no activation boundaries, target-application restrictions, consent requirements, or safety limits. In an agent context, this broad capability can be abused to interact with arbitrary applications, capture sensitive on-screen data, and perform unintended destructive actions on the host system.

Ssd 3

Medium
Confidence
91% confidence
Finding
The README includes an explicit example that watches for login fields and types credentials automatically, normalizing insecure handling of secrets in a desktop automation tool that already records keystrokes and manipulates the UI. In this context, such examples increase the likelihood of credential exposure through logs, macros, shoulder-surfing, screen capture, or unsafe reuse of plaintext secrets.

Unpinned Dependencies

Low
Category
Supply Chain
Content
pyautogui>=0.9.53
pygetwindow>=0.0.9
Pillow>=8.0.0
opencv-python>=4.5.0
Confidence
91% confidence
Finding
pyautogui>=0.9.53

Unpinned Dependencies

Low
Category
Supply Chain
Content
pyautogui>=0.9.53
pygetwindow>=0.0.9
Pillow>=8.0.0
opencv-python>=4.5.0
pytesseract>=0.3.10
Confidence
91% confidence
Finding
pygetwindow>=0.0.9

Unpinned Dependencies

Low
Category
Supply Chain
Content
pyautogui>=0.9.53
pygetwindow>=0.0.9
Pillow>=8.0.0
opencv-python>=4.5.0
pytesseract>=0.3.10
pyperclip>=1.8.2
Confidence
97% confidence
Finding
Pillow>=8.0.0
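
The `>=` specifiers flagged above allow any later release to be installed. A simple check for this can be sketched as follows. `find_unpinned` is a hypothetical helper that treats any requirement line without `==` as unpinned, which is a simplification of the full requirement syntax.

```python
from typing import List


def find_unpinned(requirements_text: str) -> List[str]:
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned: List[str] = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned
```

Exact pins can be strengthened further with pip's hash-checking mode (`--require-hashes`), which rejects packages whose hashes are not listed in the requirements file.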

VirusTotal

VirusTotal findings are pending for this skill version.
