Security audit

pyautogui

Security checks across malware telemetry and agentic risk

Overview

This is a disclosed desktop automation skill with real screen, keyboard, clipboard, OCR, and cleanup risks, but the behavior matches its stated purpose and is user-invoked.

Install only if you intentionally want local desktop automation. Grant OS Accessibility/admin permissions narrowly, avoid using screenshots/OCR/clipboard around secrets, prefer region screenshots, confirm the active window before clicks/hotkeys/paste/OCR-click actions, and preview cleanup before using --execute or auto cleanup.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (13)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 88%
Finding: The skill advertises and documents file-reading behavior (screenshots, image metadata/cropping, cleanup analysis) but does not declare corresponding permissions. Undeclared file access weakens policy enforcement and user awareness, making it easier for the skill to read local files or process sensitive screenshots without explicit authorization boundaries.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 94%
Finding: The documented behavior exceeds the declared purpose by including clipboard manipulation and direct click execution from image/OCR matches. In a UI automation skill, these hidden or under-declared capabilities are particularly risky because they can trigger unintended actions, exfiltrate or overwrite clipboard contents, and interact with sensitive UI elements with minimal user visibility.

Missing User Warnings

Medium

Confidence: 92%
Finding: The README documents screenshot capture, OCR-based text extraction, and clipboard manipulation without clearly warning that these operations can expose sensitive on-screen data such as credentials, messages, tokens, personal information, or proprietary content. In a UI automation skill, these capabilities materially increase privacy and data-handling risk because users may invoke them on live desktops containing confidential information.

Vague Triggers

Medium

Confidence: 81%
Finding: The activation criteria are broad enough to trigger on many generic automation, screenshot, or image-analysis requests, increasing the chance the skill is invoked in contexts involving sensitive applications or data. Because this skill can control keyboard/mouse, capture screens, and manipulate files, overbroad activation materially increases misuse risk.

Vague Triggers

Medium

Confidence: 82%
Finding: The trigger examples are generic and lack scope constraints, so they may match routine requests without checking whether the target window, application, or data is sensitive. Given the skill's capability to click, type, screenshot, OCR, and delete files, this loose routing can cause unintended execution in high-impact contexts.

Missing User Warnings

Medium

Confidence: 90%
Finding: The documentation promotes screenshot capture and OCR/text recognition without warning that these operations may collect passwords, messages, tokens, personal information, or other sensitive on-screen content. In this skill's context, screen capture is core functionality, so omission of privacy and sensitivity guidance makes accidental over-collection more likely.

Missing User Warnings

Medium

Confidence: 87%
Finding: The cleanup examples include commands that can delete files, including with broad targets like '.', without strong warnings about irreversible removal or verification of scope. In a skill that generates and processes files, such examples can normalize destructive usage and lead to accidental deletion of unrelated data if the working directory is wrong.

Vague Triggers

Medium

Confidence: 90%
Finding: The trigger list contains broad, common-user phrases such as clicking, typing, screenshots, image lookup, and cleanup, which can cause the skill to activate in situations where the user did not explicitly request powerful UI automation. In this skill's context, accidental activation is more dangerous than usual because the skill can control mouse/keyboard, capture screens, run OCR, and delete files, creating real confidentiality and integrity risks from misrouting alone.

Missing User Warnings

Medium

Confidence: 92%
Finding: The documentation advertises screenshot, OCR, and image-finding features without clearly warning that these operations may capture sensitive on-screen data such as credentials, personal messages, tokens, financial data, or confidential documents. In a desktop automation skill, this omission materially increases privacy risk because users may not realize that full-screen capture and text extraction can expose far more data than the immediate target element.
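
One way to limit over-collection is to capture only a small region around the intended element rather than the whole screen. The sketch below is illustrative and not taken from the skill: `padded_region` and its `padding` default are hypothetical names, and the function only computes a bounding box clamped to the screen.

```python
from typing import Tuple

Region = Tuple[int, int, int, int]  # (left, top, width, height)


def padded_region(target: Region, screen_size: Tuple[int, int], padding: int = 20) -> Region:
    """Return the target box grown by `padding` pixels and clamped to the screen.

    Hypothetical helper: the name and the default padding are assumptions.
    """
    left, top, width, height = target
    screen_w, screen_h = screen_size

    # Grow the box on every side, then clamp each edge to the screen bounds.
    x0 = max(0, left - padding)
    y0 = max(0, top - padding)
    x1 = min(screen_w, left + width + padding)
    y1 = min(screen_h, top + height + padding)

    if x1 <= x0 or y1 <= y0:
        raise ValueError("target lies outside the screen")
    return (x0, y0, x1 - x0, y1 - y0)
```

The resulting tuple has the `(left, top, width, height)` shape that `pyautogui.screenshot(region=...)` accepts, with `pyautogui.size()` supplying the screen size.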

Missing User Warnings

Medium

Confidence: 91%
Finding: The cleanup section documents deletion and auto-cleanup operations, including examples that can delete files broadly, but does not clearly emphasize that deletion may be irreversible and may remove user-important data if the path or filters are misapplied. Because this skill mixes generated artifacts with arbitrary directory cleanup commands, insufficient warnings can lead to unintended data loss from operator error or overbroad execution.
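
A preview-by-default cleanup that only touches files the skill itself generated would reduce this risk. The following is a minimal sketch under stated assumptions: the `pyautogui_` filename prefix, the `.png` suffix, and both function names are hypothetical and do not reflect the skill's actual cleanup script.

```python
from pathlib import Path
from typing import List


def plan_cleanup(directory: str, prefix: str = "pyautogui_", suffix: str = ".png") -> List[Path]:
    """List deletion candidates without deleting anything.

    Only regular, non-symlink files directly inside `directory` whose names
    carry the assumed artifact prefix and suffix are considered.
    """
    root = Path(directory).resolve()
    return sorted(
        p for p in root.iterdir()
        if p.is_file()
        and not p.is_symlink()
        and p.name.startswith(prefix)
        and p.name.endswith(suffix)
    )


def cleanup(directory: str, execute: bool = False, **filters: str) -> List[Path]:
    """Return the planned targets; delete them only when execute=True."""
    targets = plan_cleanup(directory, **filters)
    if execute:
        for path in targets:
            path.unlink()  # irreversible: the file is not moved to a trash folder
    return targets
```

Calling `cleanup(".")` only reports what would be removed. Deletion requires an explicit `execute=True`, so a wrong working directory shows up in the preview first.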

Missing User Warnings

Medium

Confidence: 92%
Finding: The OCR path writes a full screenshot to a predictable local file (.temp_screenshot.png), which may contain sensitive on-screen data such as messages, credentials, or personal information. If the process crashes, runs in a shared directory, or another local process monitors files, screen contents can be exposed unintentionally.
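
A common mitigation for this pattern is to write the screenshot to a private, randomly named temporary file and delete it even when the OCR step fails. The sketch below uses only the Python standard library. `private_temp_png` is a hypothetical helper name, not part of the skill.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def private_temp_png(prefix: str = "ocr_") -> Iterator[str]:
    """Yield the path of a private temp file and always remove it afterwards.

    tempfile.mkstemp picks an unpredictable name and, on POSIX systems,
    creates the file readable and writable by the current user only.
    """
    fd, path = tempfile.mkstemp(prefix=prefix, suffix=".png")
    os.close(fd)  # the caller reopens the path to write the image
    try:
        yield path
    finally:
        # Runs on normal exit and when the body raises an exception.
        try:
            os.remove(path)
        except FileNotFoundError:
            pass
```

Inside the `with` block the path could be handed to `pyautogui.screenshot(path)` and then to the OCR step. This does not cover a hard kill of the process, so keeping the image in memory where the OCR library allows it is still preferable.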

Missing User Warnings

Medium

Confidence: 86%
Finding: This code performs an immediate desktop click on the first match once --click is supplied, with no runtime confirmation, countdown, or contextual validation of the target. Misidentification, stale screenshots, or ambiguous matches could trigger unintended actions on the user's system, including sending messages, confirming dialogs, or interacting with privileged windows.
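
A runtime confirmation step could be sketched as below. This is an illustration rather than the skill's code: `guarded_click` is a hypothetical name, and the `click` and `confirm` callables are injected so the gate can be tested without a display. In practice `click` might be `pyautogui.click`, and `confirm` might prompt the user with the target coordinates.

```python
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[int, int]


def guarded_click(
    matches: Sequence[Point],
    click: Callable[[int, int], None],
    confirm: Callable[[Point, int], bool],
) -> Optional[Point]:
    """Click the first match only after an explicit confirmation.

    `confirm` receives the candidate point and the total number of matches,
    so the user can see when the match was ambiguous. Returns the clicked
    point, or None when nothing was clicked.
    """
    if not matches:
        return None
    target = matches[0]
    if not confirm(target, len(matches)):
        return None
    x, y = target
    click(x, y)
    return target
```

Because the default outcome is "no click", a stale screenshot or a misidentified image produces a declined prompt rather than an action on the wrong window.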

Missing User Warnings

Medium

Confidence: 87%
Finding: The text-search workflow can click the first OCR match immediately, even though OCR results may be noisy, partial, or ambiguous. In a UI automation skill, this is especially risky because text like 'OK', 'Send', or 'Delete' may appear in multiple places, causing unintended or destructive interactions.
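
To illustrate how ambiguous OCR matches could be rejected before any click, the sketch below keeps only exact, high-confidence matches and refuses to proceed unless exactly one remains. Everything here is an assumption made for illustration: the `OcrHit` record, the `AmbiguousMatch` exception, and the threshold of 80 are not taken from the skill or from any particular OCR library.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class OcrHit:
    """Hypothetical OCR result: recognized text, confidence (0-100), position."""
    text: str
    confidence: float
    x: int
    y: int


class AmbiguousMatch(Exception):
    """Raised when zero or several OCR hits qualify as the click target."""


def select_ocr_target(hits: Iterable[OcrHit], wanted: str, min_confidence: float = 80.0) -> OcrHit:
    """Return the single hit whose text equals `wanted`, ignoring case and outer spaces."""
    key = wanted.strip().casefold()
    candidates = [
        h for h in hits
        if h.text.strip().casefold() == key and h.confidence >= min_confidence
    ]
    if len(candidates) != 1:
        raise AmbiguousMatch(
            f"expected exactly one match for {wanted!r}, found {len(candidates)}"
        )
    return candidates[0]
```

Exact matching means a search for 'Send' does not match 'Sender', and two visible 'OK' buttons raise an error instead of clicking whichever was found first.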

VirusTotal

66/66 vendors flagged this skill as clean.
