desktop-automation-100per100-local

Security checks across malware telemetry and agentic risk

Overview

This appears to be a real local desktop automation skill, but it grants very broad screen, keyboard, mouse, clipboard, and macro powers with incomplete safety boundaries.

Install only if you explicitly need local desktop automation and are comfortable giving the skill control over your screen, mouse, keyboard, clipboard, and macro files. Do not record while entering credentials or private data, keep saved macros and logs protected, use dry-run first, and avoid replaying macros from untrusted sources.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Behavioral AST: exec() Call, eval() Call, Dynamic Import
YARA Signatures: Malware Match, Webshell Match, Cryptominer Match

Findings (22)

Dynamic attribute access via getattr()

Low

Category: Dangerous Code Execution
Content:
    # Execute action
    if hasattr(self.action_manager, action):
        result = getattr(self.action_manager, action)(**params)
        if result.get('status') != 'ok':
            errors.append(f"Event {executed}: {action} failed - {result.get('message')}")
    executed += 1
Confidence: 97%
Finding: result = getattr(self.action_manager, action)(**params)

Dynamic attribute access via getattr()

Low

Category: Dangerous Code Execution
Content:
    time.sleep(wait / speed)
    if hasattr(self.action_manager, action):
        return getattr(self.action_manager, action)(**params)
    return {"status": "error", "message": f"Unknown action: {action}"}
Confidence: 97%
Finding: return getattr(self.action_manager, action)(**params)
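
One common way to narrow the two getattr() findings above is to check the action name against an explicit allowlist before resolving it. The following is a minimal sketch under stated assumptions: `ALLOWED_ACTIONS`, `DemoActionManager`, and `dispatch` are hypothetical names, and the skill's real action names are not shown in this report.

```python
from typing import Any, Dict

# Hypothetical allowlist; the skill's real action names are not listed in the report.
ALLOWED_ACTIONS = frozenset({"click", "type_text", "press_key"})


class DemoActionManager:
    """Stand-in for the skill's action manager (hypothetical)."""

    def click(self, x: int, y: int) -> Dict[str, Any]:
        return {"status": "ok", "message": f"clicked {x},{y}"}

    def _internal_reset(self) -> Dict[str, Any]:
        # A method that macro input should never be able to reach.
        return {"status": "ok", "message": "reset"}


def dispatch(manager: Any, action: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Call manager.<action>(**params) only if the name is on the allowlist."""
    if action not in ALLOWED_ACTIONS:
        return {"status": "error", "message": f"Action not allowed: {action}"}
    handler = getattr(manager, action, None)
    if not callable(handler):
        return {"status": "error", "message": f"Unknown action: {action}"}
    return handler(**params)
```

With this shape, a macro event naming `_internal_reset` is rejected before any attribute lookup happens, while `click` still resolves as before.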

Intent-Code Divergence

Medium

Confidence: 86%
Finding: The README makes a security claim that Safe Mode is enabled by default and blocks risky actions, yet other examples document direct click/type execution without showing Safe Mode being disabled first. This inconsistency can cause users to trust protections that may not actually be active, increasing the chance of unintended UI interaction, credential entry, or destructive automation.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The code presents the class as having 'safety' controls, but the implemented safe mode only checks for a few string substrings and does not constrain the actual desktop actions being performed. In a desktop automation skill, clicks, keypresses, typing, window activation, clipboard use, and screenshots can all directly affect the user environment, so overstating protection can lead downstream agents or users to trust dangerous operations that remain fully possible.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The class advertises a configurable emergency stop hotkey, but the implementation never actually detects the configured key combination and `check_stop()` only returns a flag set by `stop()`. In a GUI automation skill that can click, type, drag, and play macros, a nonfunctional stop mechanism is a real safety issue because runaway automation may continue interacting with sensitive applications without an effective user abort path.
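
For illustration, a stop mechanism is only effective if something other than the automation loop can set the flag and the loop checks it between actions. The sketch below assumes a thread-safe flag; `EmergencyStop` and `run_actions` are hypothetical names, and wiring `trigger()` to a real global hotkey listener (for example through the pynput keyboard module the skill already imports) is deliberately left out.

```python
import threading
from typing import Callable, Iterable


class EmergencyStop:
    """Thread-safe stop flag (hypothetical helper, not the skill's class)."""

    def __init__(self) -> None:
        self._event = threading.Event()

    def trigger(self) -> None:
        # Intended to be called from a hotkey listener running on another thread.
        self._event.set()

    def check_stop(self) -> bool:
        return self._event.is_set()


def run_actions(actions: Iterable[Callable[[], None]], stop: EmergencyStop) -> int:
    """Run actions in order, checking the stop flag before each one."""
    executed = 0
    for action in actions:
        if stop.check_stop():
            break
        action()
        executed += 1
    return executed
```

The flag is checked before every action, so a triggered stop prevents the next click or keypress rather than waiting for the macro to finish.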

Intent-Code Divergence

Medium

Confidence: 95%
Finding: `find_all_text_on_screen` accepts a `text` argument and its docstring says it finds all occurrences of that text, but the implementation returns every non-empty OCR token on the screen. In an agent skill context, this can cause over-collection of unrelated on-screen data, including sensitive information such as messages, passwords, or personal data, violating least-privilege expectations and potentially exposing far more data than the caller intended.
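
A minimal sketch of the behavior the docstring describes, assuming OCR output has already been collected as a list of dictionaries with a `text` key. The token shape and the `filter_ocr_tokens` helper are assumptions for illustration, not the skill's actual data model.

```python
from typing import Any, Dict, List


def filter_ocr_tokens(tokens: List[Dict[str, Any]], text: str) -> List[Dict[str, Any]]:
    """Return only the OCR tokens whose text contains the query (case-insensitive)."""
    needle = text.strip().lower()
    if not needle:
        # An empty query returns nothing rather than every token on screen.
        return []
    return [t for t in tokens if needle in str(t.get("text", "")).lower()]
```

Returning only matching tokens keeps unrelated on-screen text out of the caller's hands, which is the least-privilege expectation the finding refers to.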

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The module includes active UI-control functions (`click_position`, `click_image`) even though its top-level description only advertises screenshot, image detection, and OCR. This capability expansion is dangerous because it enables real interaction with the user's desktop, which can be abused for unintended clicks, workflow manipulation, or launching sensitive actions without clear disclosure.

Intent-Code Divergence

Medium

Confidence: 93%
Finding: The docstring states the module is for screenshot, image detection, and OCR, but omits that it can move the mouse and click on the user's screen. This misrepresentation increases security risk because downstream users or agents may trust the module as read-only while it actually has write-like control over the UI.

Missing User Warnings

Medium

Confidence: 93%
Finding: The screenshot method captures the full screen and writes it to disk without any user-facing prompt, disclosure, or policy check. In a desktop automation context, screenshots can expose credentials, personal data, messages, or confidential documents, making silent capture a meaningful privacy and data-exposure risk.

Missing User Warnings

Medium

Confidence: 95%
Finding: The method reads clipboard contents and injects them into the active application without warning or confirmation. Clipboard data often contains secrets such as passwords, tokens, personal data, or proprietary text, so blindly typing it into whichever window is focused can cause accidental disclosure or unintended actions.

Missing User Warnings

Medium

Confidence: 87%
Finding: This function captures the full screen and runs OCR over on-screen text without any user-facing notice, consent flow, or scope restriction. In an automation context, screens may contain passwords, personal messages, tokens, or business data, so silent screen scraping increases privacy and data-exposure risk even if the code does not immediately exfiltrate the results.

Missing User Warnings

Medium

Confidence: 87%
Finding: The screenshot helper captures the full screen and writes the image to disk by default without any interactive notice or confirmation. In a desktop automation skill, screenshots may contain credentials, personal messages, tokens, or regulated data, so silent capture and persistence meaningfully increases privacy and data-exposure risk.

Missing User Warnings

Medium

Confidence: 92%
Finding: This function performs OCR over the whole screen or a selected region and returns structured extracted text and coordinates, which can expose highly sensitive on-screen data at scale. In an automation agent, screen scraping is a powerful exfiltration primitive because it can harvest secrets from applications that are otherwise outside normal file/API boundaries.

Missing User Warnings

Medium

Confidence: 93%
Finding: The function takes a live screenshot via pyautogui without any user-facing notice, consent check, or access control. Screen captures can expose sensitive information such as credentials, personal data, messages, or confidential documents, so silent capture creates a real privacy and data-exposure risk even if the feature is intended for automation.

Missing User Warnings

Medium

Confidence: 92%
Finding: The multi-scale search path also captures the current screen silently before processing it for template matching. Repeated or automated screen capture increases the chance of collecting sensitive on-screen content without the user's awareness, which is especially risky in an agent skill that may be invoked programmatically.

Missing User Warnings

Medium

Confidence: 82%
Finding: These OCR methods capture screenshots of the full screen or specified regions and process whatever text is visible, but there is no user-facing consent, disclosure, or access control in the module itself. In an agent skill, screen OCR is inherently privacy-sensitive because it may ingest secrets, personal communications, tokens, or enterprise data from unrelated applications.

Missing User Warnings

Medium

Confidence: 95%
Finding: The function logs the full `params` dictionary for every action without any redaction or filtering. In a desktop automation skill, action parameters can easily contain sensitive data such as typed text, file paths, credentials, tokens, personal data, or screen-derived content, and those values would be written to persistent log files under the user's home directory.
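
One possible mitigation is to redact sensitive values before the parameters reach the logger. This is a sketch only: the key names in `SENSITIVE_KEYS` and the `redact_params` helper are hypothetical, since the report does not list the skill's actual parameter names.

```python
from typing import Any, Dict

# Hypothetical parameter names that may carry secrets; adjust to the real schema.
SENSITIVE_KEYS = frozenset({"text", "password", "token", "clipboard"})


def redact_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params that is safe to log, with sensitive values masked."""
    redacted: Dict[str, Any] = {}
    for key, value in params.items():
        if key.lower() in SENSITIVE_KEYS:
            # Keep only the length so logs stay useful for debugging.
            redacted[key] = f"<redacted:{len(str(value))} chars>"
        else:
            redacted[key] = value
    return redacted
```

A caller would log `redact_params(params)` instead of `params`, so coordinates and action names remain visible while typed text does not.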

Missing User Warnings

Medium

Confidence: 84%
Finding: The screenshot functionality captures the full primary monitor without any explicit user-facing consent, warning, or scoping controls. Because screenshots can include passwords, messages, financial data, or other sensitive material, silent full-screen capture creates a meaningful privacy and data-exposure risk.

Missing User Warnings

Medium

Confidence: 91%
Finding: The module performs automated mouse movement and clicking via `pyautogui` without a clear warning or gating mechanism. In an agent skill context, this is more dangerous than passive analysis because it can trigger purchases, permission prompts, destructive UI actions, or security-relevant changes on behalf of the user.

Missing User Warnings

Medium

Confidence: 95%
Finding: This script replays untrusted JSON-defined GUI actions directly against the user's desktop, including clicks, keypresses, typing, clipboard operations, window activation, OCR, and image-based interaction, without any confirmation prompt, sandboxing, allowlist, or kill-switch beyond whatever pyautogui provides by default. In the context of an agent skill, that makes a supplied macro file an implicit capability to drive arbitrary UI actions on the host, which can trigger destructive operations, submit sensitive data, or manipulate other applications unexpectedly.
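
As an illustration of the missing allowlist step, a macro file can be validated before any event is replayed. The sketch below assumes a macro is a JSON list of objects with an `action` field; that format, `ALLOWED_MACRO_ACTIONS`, and `validate_macro` are assumptions for the example, not the skill's documented schema.

```python
import json
from typing import Any, Dict, List, Tuple

# Hypothetical set of actions considered safe to replay without review.
ALLOWED_MACRO_ACTIONS = frozenset({"click", "press_key", "wait"})


def validate_macro(raw_json: str) -> Tuple[List[Dict[str, Any]], List[str]]:
    """Split macro events into accepted events and human-readable problems."""
    events = json.loads(raw_json)
    if not isinstance(events, list):
        return [], ["Macro must be a JSON list of events"]
    accepted: List[Dict[str, Any]] = []
    problems: List[str] = []
    for index, event in enumerate(events):
        action = event.get("action") if isinstance(event, dict) else None
        if action not in ALLOWED_MACRO_ACTIONS:
            problems.append(f"Event {index}: action {action!r} is not allowed")
        else:
            accepted.append(event)
    return accepted, problems
```

A replay script could refuse to run when `problems` is non-empty, or show the list to the user and ask for confirmation before touching the desktop.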

Vague Triggers

Medium

Confidence: 91%
Finding: The manifest advertises full desktop automation, including mouse, keyboard, window control, OCR, image recognition, and macro recording/replay, without any stated scope limits, consent requirements, or safety boundaries. In an agent skill, this creates a powerful capability surface that could enable unintended or abusive control of the host desktop, data capture from the screen, and replay of sensitive user actions.

YARA rule 'keylogger_indicators': Keylogger functionality in scripts or source code [malware]

High

Category: YARA Match
Content:
    from datetime import datetime
    from pynput import mouse, keyboard
    from pynput.mouse import Button
    from pynput.keyboard import Key

    # ============ CONFIGURATION LOGGING ============
    logging.basicConfig(
Confidence: 95%
Finding: from pynput.keyboard import

VirusTotal

65/65 vendors flagged this skill as clean.
