
Security checks across malware telemetry and agentic risk

Overview

This appears to be a real desktop automation skill, but it needs Review because it can control the whole desktop, read screenshots and clipboard data, and optionally send screenshots to an LLM without strong built-in consent boundaries.

Install only if you need powerful supervised desktop automation. Keep failsafe enabled, use confirmation modes for risky workflows, close sensitive windows before running it, avoid using untrusted remote LLM clients, and treat any screenshot sent to an LLM or saved to disk as potentially containing private information.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (21)

Tp4

High

Category: MCP Tool Poisoning
Confidence: 92%
Finding: The skill is presented as desktop automation for mouse, keyboard, and screen control, but the documentation also exposes broader capabilities including window enumeration/activation, clipboard access, key monitoring, screenshot capture, and an optional autonomous AI agent. This scope expansion matters because these features enable data collection, application targeting, and autonomous action beyond what a caller may reasonably expect from the description, increasing the risk of stealthy misuse and over-privileged invocation.

Description-Behavior Mismatch

Medium

Confidence: 87%
Finding: The documented skill includes clipboard access and an optional AI task-executing agent, which go beyond the manifest’s stated mouse/keyboard/screen-control purpose. In an agent ecosystem, that mismatch can cause unsafe trust decisions, because operators may approve or route the skill assuming it only performs low-level UI actions while it can also read sensitive clipboard contents and autonomously plan tasks.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: An optional AI agent capable of executing tasks significantly increases risk relative to a basic desktop control skill because it can combine screenshots, UI manipulation, and autonomous planning into higher-level actions. That expanded autonomy makes unintended or unsafe operations more likely, especially if users invoke the skill based on the narrower desktop-control description.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The agent captures a full desktop screenshot, base64-encodes it, and sends it to an external LLM as part of planning. That creates a real data-exposure risk because screenshots can contain credentials, messages, documents, or other sensitive on-screen content unrelated to the requested task. In a desktop automation skill, exporting the user's full screen to a third party materially expands the trust boundary beyond local control.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The skill metadata advertises mouse, keyboard, and screen control, but the implementation also includes window enumeration/activation and clipboard read/write capabilities. This expands the effective privilege and data-access surface beyond what a caller or reviewer would reasonably expect, increasing the risk of stealthy focus hijacking or access to sensitive clipboard contents.

Description-Behavior Mismatch

Low

Confidence: 91%
Finding: The screenshot API can save captured screen contents to an arbitrary filesystem path, but the manifest does not disclose file-writing behavior. Even if intended as convenience functionality, undeclared file output can enable covert persistence of sensitive screen data or unexpected overwriting of user files.
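
One way to narrow this finding is to confine screenshot output to a single directory and refuse silent overwrites. The sketch below is illustrative only: `ALLOWED_DIR` and `resolve_screenshot_path` are hypothetical names, not part of the scanned skill, and it assumes Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path

# Hypothetical allow-listed directory; the scanned skill defines no such setting.
ALLOWED_DIR = Path.home() / "automation_screenshots"


def resolve_screenshot_path(requested: str, allowed_dir: Path = ALLOWED_DIR) -> Path:
    """Return an absolute path inside allowed_dir, or raise if the request is unsafe."""
    base = allowed_dir.resolve()
    # Joining with an absolute 'requested' discards 'base', so the check below
    # catches both "../x" style traversal and absolute paths such as "/etc/x".
    candidate = (base / requested).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"refusing to write outside {base}: {requested!r}")
    # Refuse to overwrite an existing file silently.
    if candidate.exists():
        raise FileExistsError(f"refusing to overwrite {candidate}")
    return candidate
```

A caller would pass the user-supplied filename through this function before saving, so that a request such as `"../../.bashrc"` fails with `ValueError` instead of writing to an unexpected location.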

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The class documents `require_approval` as requiring user confirmation for actions, but multiple state-changing or sensitive methods bypass `_check_approval`, including clipboard access, scrolling, key hold/release, alerts/confirms, screenshots, and window operations. This creates a misleading safety boundary: operators may enable approval mode believing actions are gated when they are not.
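
A gap like this can be closed by routing every sensitive method through the same gate, for example with a decorator. The following is a minimal sketch under stated assumptions: only the names `require_approval` and `_check_approval` come from the finding; the class `GatedController`, the `ask` callback, the `ApprovalDenied` exception, and the signature of `_check_approval` are hypothetical, and the clipboard is simulated with an attribute.

```python
import functools
from typing import Callable


class ApprovalDenied(Exception):
    """Raised when the operator declines an action (hypothetical)."""


def requires_approval(method):
    """Route a method through self._check_approval before it runs."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        self._check_approval(method.__name__)
        return method(self, *args, **kwargs)
    return wrapper


class GatedController:
    """Hypothetical stand-in for the skill's controller class."""

    def __init__(self, require_approval: bool = True,
                 ask: Callable[[str], bool] = lambda action: False):
        self.require_approval = require_approval
        self._ask = ask          # asks the operator; denies by default
        self._clipboard = ""     # simulated clipboard, not the system one

    def _check_approval(self, action: str) -> None:
        # Assumed signature: the real method may take different arguments.
        if self.require_approval and not self._ask(action):
            raise ApprovalDenied(action)

    @requires_approval
    def set_clipboard(self, text: str) -> None:
        self._clipboard = text

    @requires_approval
    def get_clipboard(self) -> str:
        return self._clipboard
```

Applying the decorator to each state-changing or data-reading method makes an ungated method visible in review, because the missing `@requires_approval` line stands out.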

Missing User Warnings

Medium

Confidence: 87%
Finding: The guide explicitly states the agent takes screenshots during autonomous execution and retains them in results, but provides no warning that screenshots may capture secrets, personal data, tokens, emails, or other sensitive on-screen content. In a desktop automation skill, this omission is materially risky because screenshots are a core telemetry artifact and may be stored, shared, or logged beyond the immediate session.

Missing User Warnings

Medium

Confidence: 84%
Finding: The form-filling example encourages use of resume data and other personal information without any warning about handling PII, consent, destination verification, or data minimization. Because this is an autonomous desktop agent that can type into arbitrary applications and websites, users could inadvertently expose sensitive personal data to the wrong form or malicious interface.

Missing User Warnings

Medium

Confidence: 80%
Finding: The social media posting example normalizes autonomous public posting without warning that posts may be irreversible, reputationally damaging, or sent from the wrong account/context. In an agent that observes screens and acts autonomously, accidental or misdirected posting is a realistic risk and can cause immediate external impact.

Missing User Warnings

Medium

Confidence: 90%
Finding: The guide presents disabling the failsafe as a simple performance option but does not explain that this removes an important control against runaway mouse/keyboard automation and unintended actions. In the context of a fully autonomous desktop agent, removing the failsafe can materially increase the chance of uncontrolled interaction, destructive clicks, unintended submissions, or difficulty interrupting the agent.

Missing User Warnings

Medium

Confidence: 91%
Finding: The quick-reference includes examples for save, search/replace, screenshots, clipboard access, and file manipulation, but it does not prominently warn that these actions can alter user data, capture sensitive information, or affect the active application if focus is wrong. In a desktop automation skill, concise examples are expected, but documenting destructive or privacy-impacting actions without clear guardrails increases the chance of accidental misuse or unsafe agent behavior.

Vague Triggers

Medium

Confidence: 84%
Finding: The description is broad and does not define invocation constraints, acceptable use boundaries, or sensitive-action limitations. In agent systems, overly broad trigger language can cause a powerful skill to be selected in many contexts, including those involving sensitive applications or data, increasing the chance of overreach and misuse.

Missing User Warnings

Medium

Confidence: 88%
Finding: The documentation describes screenshot capture to files and nearby sensitive capabilities without any warning about privacy, secret exposure, or regulated data handling. Screenshots and clipboard-like desktop artifacts can contain credentials, personal information, tokens, or confidential documents, so omitting warnings and guardrails increases the likelihood of unsafe use.

Missing User Warnings

Medium

Confidence: 90%
Finding: This package explicitly exposes desktop automation primitives such as mouse movement, clicking, typing, hotkeys, and screenshots, but the public module interface and usage example provide no user-facing warning, consent requirement, or safety notice about the ability to control input devices and capture screen contents. In an agent context, these capabilities can enable unintended destructive actions, credential entry, UI manipulation, or sensitive data capture if invoked without clear disclosure and guardrails.

Missing User Warnings

Medium

Confidence: 98%
Finding: The code sends screenshots to `self.llm.generate(..., images=[img_b64])` automatically whenever an LLM client is configured, without any user confirmation at the time of transfer. This is dangerous because a normal task request can silently trigger exfiltration of sensitive visual data from the desktop, including secrets not needed for task completion. The risk is elevated by the broad screenshot capture of the current desktop state rather than a narrowly scoped region.
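
A consent step at the moment of transfer could look roughly like the sketch below. Only the call shape `llm.generate(..., images=[img_b64])` is taken from the finding; the function `plan_with_consent`, the `confirm` callback, and the assumption that the first argument to `generate` is a prompt string are hypothetical.

```python
import base64
from typing import Callable, List, Optional, Protocol


class LLMClient(Protocol):
    """Assumed client shape; the real client's signature may differ."""
    def generate(self, prompt: str, images: List[str]) -> str: ...


def plan_with_consent(llm: LLMClient, prompt: str, screenshot_png: bytes,
                      confirm: Callable[[str], bool]) -> Optional[str]:
    """Send the screenshot to the LLM only if the user confirms; otherwise return None."""
    notice = f"Send a {len(screenshot_png)}-byte screenshot to the configured LLM?"
    if not confirm(notice):
        return None  # nothing is encoded or transmitted without consent
    img_b64 = base64.b64encode(screenshot_png).decode("ascii")
    return llm.generate(prompt, images=[img_b64])
```

The `confirm` callback keeps the policy outside the agent: an interactive session might prompt the user, while an unattended run could pass a callback that always returns `False`.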

Missing User Warnings

Low

Confidence: 84%
Finding: The screenshot action writes an image file to disk without any user-facing warning, which can persist sensitive desktop contents locally in an easily discoverable file. Even if not exfiltrated, silent screenshot creation can expose confidential information to other local users, backup systems, syncing tools, or later accidental sharing. In an automation agent, persistence of screen contents should be treated as sensitive data handling.
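
If screenshots must be persisted, exposure to other local users can be reduced by creating the file with owner-only permissions. This is a sketch of one possible mitigation, not the skill's behavior: `write_private_file` is a hypothetical helper, and the `0o600` mode is honored on POSIX systems but largely ignored on Windows.

```python
import os
from pathlib import Path


def write_private_file(path: Path, data: bytes) -> None:
    """Create a new file readable and writable by its owner only, then write data."""
    # O_EXCL makes creation fail if the file already exists, so nothing is overwritten.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
```

This does not address backup or sync tools that run as the same user; it only narrows access for other accounts on the machine.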

Missing User Warnings

Medium

Confidence: 91%
Finding: The demo saves full-screen and region screenshots directly to disk, which can capture sensitive on-screen data such as emails, tokens, documents, or personal information without an explicit advance warning or consent step. In a desktop automation skill, screen capture is expected functionality, but persisting those captures to files increases exposure because data remains on disk after the demo ends.

Missing User Warnings

Medium

Confidence: 95%
Finding: The demo reads the current clipboard, overwrites it with new content, and may restore the original clipboard, all without clear advance disclosure before accessing potentially sensitive copied data. Clipboard contents often include passwords, API keys, personal data, or proprietary text, so silent access and modification can leak or disrupt user data even in a demonstration context.

Missing User Warnings

Medium

Confidence: 98%
Finding: Writing arbitrary text to the system clipboard occurs with no user-facing warning or approval. In a desktop automation context, clipboard contents are often used to transfer secrets, commands, or payment/account data, so silent modification can facilitate deception, data replacement, or downstream command injection into user workflows.

Missing User Warnings

Medium

Confidence: 99%
Finding: Reading the system clipboard silently can expose passwords, tokens, personal data, or proprietary information that the user copied for unrelated tasks. In this skill's desktop-control context, clipboard access is especially sensitive because it enables opportunistic data harvesting outside the core mouse/keyboard/screen automation scope.

VirusTotal

67/67 vendors reported this skill as clean.