desktop-control

Security checks across malware telemetry and agentic risk

Overview

This appears to be a legitimate desktop automation skill, but it gives broad control over the user's desktop and its safety controls are incomplete.

Install only if you intentionally want an agent that can control your full desktop. Use it in a clean or separate desktop session, keep sensitive windows and clipboard contents clear, leave failsafe enabled, avoid unattended autonomous runs, and treat saved screenshots and optional LLM/API integrations as sensitive data flows.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (19)

Tp4

High

Category: MCP Tool Poisoning
Confidence: 89%
Finding: The skill is presented as desktop automation, but the documented behavior expands into broader sensitive capabilities such as window enumeration/activation, clipboard read/write, simulated application launching, and semi-autonomous workflows. This mismatch reduces informed consent and can cause operators to enable a skill without realizing it can access sensitive on-screen data, manipulate other applications, or trigger unintended actions.

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The guide includes concrete code for sending task content to an external LLM service and retrieving an API key from environment variables, even though the skill is framed as desktop automation. In this context, desktop tasks and screen-derived context can contain sensitive local data, so encouraging outbound transmission without strict scope, disclosure, or data-handling controls creates a real privacy and data-exposure risk.
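One way to narrow this data flow is to require an explicit operator opt-in and to redact secret-looking strings before anything leaves the machine. The following is a minimal sketch, not the skill's actual integration code: the names `redact`, `build_outbound_payload`, and `allow_external` are hypothetical, and the patterns are illustrative rather than exhaustive.

```python
import re

# Illustrative patterns only (assumption); a real deployment would need a broader set.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{16,}"),                        # API-key-like tokens
    re.compile(r"(?i)(password|token|secret)\s*[:=]\s*\S+"),   # key/value style secrets
]


def redact(text):
    """Replace secret-looking substrings with a fixed marker."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def build_outbound_payload(task, allow_external=False):
    """Build the request body for an external service.

    Refuses unless the operator has explicitly enabled external transmission,
    and redacts the task text before it is placed in the payload.
    """
    if not allow_external:
        raise PermissionError("external transmission is not enabled by the operator")
    return {"prompt": redact(task)}
```

Defaulting `allow_external` to `False` means outbound transmission has to be a deliberate choice rather than a side effect of an API key being present in the environment.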

Intent-Code Divergence

Medium

Confidence: 78%
Finding: The documentation is internally inconsistent: earlier sections provide live integration code for an external LLM API, while a later section presents LLM integration as only a future enhancement. This mismatch can mislead operators about what capabilities are already active, weakening informed consent and safe deployment decisions.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The skill advertises mouse, keyboard, and screen automation, but it also exposes window discovery/activation and clipboard read/write features that expand its effective privileges beyond the declared scope. This mismatch increases the chance that callers grant trust or install the skill without realizing it can enumerate applications and access sensitive clipboard contents such as passwords, tokens, or copied documents.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: Clipboard read and write access is not necessary for the stated purpose and can directly expose or tamper with sensitive user data. A skill with clipboard access can silently collect copied secrets or replace clipboard contents with attacker-controlled data, enabling data theft or social engineering workflows.

Context-Inappropriate Capability

Low

Confidence: 84%
Finding: Window enumeration and activation go beyond the stated desktop-control scope and can be used to discover what applications or documents a user has open, then bring target windows to the foreground for further automated interaction. While less sensitive than direct clipboard or screenshot exfiltration, this still expands surveillance and control capabilities without transparent disclosure.

Intent-Code Divergence

High

Confidence: 98%
Finding: The class documents an approval mode for actions, but several methods bypass `_check_approval`, including scrolling, screenshot capture, pixel reads, key hold/release, clipboard access, dialogs, and some utility behaviors. This creates a broken security boundary: operators may believe approval mode protects all sensitive actions when in reality important input/output and data-capture functions can execute silently.
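A common way to close this kind of gap is to route every sensitive method through the same check with a decorator, so a new method cannot skip approval by omission. The sketch below is an assumption-laden illustration: only the name `_check_approval` comes from the finding, while its signature, the `GatedController` class, the `requires_approval` decorator, and the `approver` callback are hypothetical.

```python
import functools


class ApprovalDenied(Exception):
    """Raised when the operator declines an action."""


def requires_approval(action_name):
    """Wrap a method so it runs only after the controller's approval check passes."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            if self.require_approval and not self._check_approval(action_name, args, kwargs):
                raise ApprovalDenied(action_name)
            return method(self, *args, **kwargs)
        return wrapper
    return decorator


class GatedController:
    """Hypothetical stand-in for the skill's controller class."""

    def __init__(self, require_approval=True, approver=None):
        self.require_approval = require_approval
        # approver: callable(action_name, args, kwargs) -> bool; denies by default.
        self._approver = approver or (lambda name, args, kwargs: False)
        self.log = []

    def _check_approval(self, action_name, args, kwargs):
        # Assumed signature; the real method may differ.
        return bool(self._approver(action_name, args, kwargs))

    @requires_approval("scroll")
    def scroll(self, clicks):
        self.log.append(("scroll", clicks))
        return clicks

    @requires_approval("screenshot")
    def screenshot(self, region=None):
        self.log.append(("screenshot", region))
        return b""  # placeholder for image bytes
```

With this shape, scrolling and screenshot capture are gated the same way as clicks and typing, and the default approver denies everything until the operator supplies one.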

Missing User Warnings

Medium

Confidence: 92%
Finding: The guide promotes screenshot capture and screen analysis without warning that on-screen content may include credentials, personal data, messages, documents, or other sensitive information. In a desktop automation skill, this is especially risky because the entire user session may be visible and repeatedly captured.

Missing User Warnings

Medium

Confidence: 94%
Finding: The guide documents sending task content to an external API but does not warn users that prompts and possibly desktop-derived context may be transmitted off-device. Because this skill operates on local desktop state, outbound requests may expose sensitive operational or personal information to third parties.

Missing User Warnings

Medium

Confidence: 90%
Finding: The examples encourage form filling, social posting, and moving data between applications without warning about privacy, account misuse, irreversible actions, or data quality issues. In a desktop-control skill, these actions can directly affect user accounts and sensitive records, making omission of safeguards materially risky.

Missing User Warnings

Medium

Confidence: 93%
Finding: The guide advertises disabling the failsafe without explaining that uncontrolled automation may become difficult to interrupt, especially when mouse and keyboard control are autonomous. In a desktop agent, removing emergency stop protections increases the chance of runaway actions, unintended clicks/typing, and operational disruption.
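To illustrate why an emergency stop matters, the sketch below shows a generic action loop that checks a stop signal before every action and enforces an action budget. It does not reproduce the skill's own failsafe mechanism; `run_actions`, `stop_requested`, `max_actions`, and `EmergencyStop` are hypothetical names chosen for this example.

```python
class EmergencyStop(Exception):
    """Raised when automation is interrupted or exceeds its budget."""


def run_actions(actions, stop_requested, max_actions=50):
    """Run zero-argument callables in order and return how many were executed.

    Before each action, the loop checks the stop signal and the action budget,
    so a runaway sequence can be interrupted between any two actions.
    """
    executed = 0
    for action in actions:
        if stop_requested():
            raise EmergencyStop("stop requested by operator")
        if executed >= max_actions:
            raise EmergencyStop("action budget exhausted")
        action()
        executed += 1
    return executed
```

Disabling the failsafe is equivalent to making `stop_requested` always return `False`: the budget remains the only limit, which is why the option deserves a prominent warning.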

Missing User Warnings

Medium

Confidence: 90%
Finding: The quick reference includes concrete examples for screen capture, clipboard access, hotkeys, window switching, file selection, and launching applications, but provides little contextual warning about unintended actions, data exposure, or system effects. In a desktop automation skill, these examples can be directly repurposed for unauthorized interaction with user applications, copying sensitive content, or capturing on-screen information, especially when paired with options like disabling failsafe checks.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill explicitly supports screenshots, image recognition, and clipboard access, all of which can expose secrets such as passwords, tokens, personal data, or proprietary information. Documenting these features without prominent privacy warnings or handling guidance increases the risk of accidental collection, storage, or disclosure of sensitive user data.

Missing User Warnings

Medium

Confidence: 92%
Finding: The examples demonstrate live UI actions such as form filling, submitting with Enter, file selection/copying, window activation, and drag-and-drop. Without explicit cautions, users may run these patterns against real applications and unintentionally modify data, submit forms, move files, or interact with the wrong window due to focus or coordinate errors.

Missing User Warnings

Medium

Confidence: 96%
Finding: The screenshot function can capture the full screen or a region and save it to disk without any approval by default. In a desktop automation context, this can expose emails, chats, documents, credentials, or other on-screen secrets and create a persistent artifact on disk that is easier to exfiltrate later.

Missing User Warnings

Medium

Confidence: 98%
Finding: Clipboard contents are read and overwritten without warning or approval, allowing silent access to sensitive copied data and undetected modification of what the user later pastes. In a desktop-control skill, this is particularly risky because clipboard data often contains passwords, API keys, payment info, and private text copied from other applications.
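The risk of undetected clipboard modification can be reduced by restoring the previous contents after an automated paste and recording each overwrite. The sketch below assumes a hypothetical clipboard backend exposing `get()` and `set()`; `InMemoryClipboard` and `temporary_clipboard` are illustrative names, not part of the skill.

```python
from contextlib import contextmanager


class InMemoryClipboard:
    """Hypothetical backend used here in place of the real system clipboard."""

    def __init__(self, text=""):
        self._text = text

    def get(self):
        return self._text

    def set(self, text):
        self._text = text


@contextmanager
def temporary_clipboard(backend, text, audit_log):
    """Place text on the clipboard for the duration of the block, then restore it.

    The previous contents are held only in a local variable and are written
    back even if the block raises.
    """
    previous = backend.get()
    audit_log.append("clipboard overwritten")
    backend.set(text)
    try:
        yield
    finally:
        backend.set(previous)
        audit_log.append("clipboard restored")
```

This still reads the user's clipboard in order to restore it, so it complements an approval step rather than replacing one.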

Missing User Warnings

Medium

Confidence: 93%
Finding: The agent accepts a natural-language task and can autonomously launch apps, activate windows, and inject keystrokes without an explicit consent or safety confirmation step. In a desktop-control context, this is dangerous because prompt misunderstanding, misuse, or task injection could trigger system-impacting actions on the user's machine.
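A confirmation step could take the form of reviewing the planned steps against an operator-defined allowlist before anything executes. The sketch below is hypothetical throughout: the `Step` type, the step kinds, and `review_plan` are assumptions made for illustration, since the report does not describe the agent's internal plan format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    kind: str    # e.g. "launch_app", "activate_window", "type_text" (assumed kinds)
    target: str  # application or window the step acts on


def review_plan(steps, allowed_kinds, allowed_targets):
    """Return the steps that fall outside the operator's allowlists.

    An empty result means the plan may proceed; otherwise the caller should
    stop and ask the operator about the returned steps.
    """
    return [
        step for step in steps
        if step.kind not in allowed_kinds or step.target not in allowed_targets
    ]
```

For example, a plan that types into a text editor passes an allowlist containing that editor, while a step that launches a terminal is returned for explicit confirmation.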

Missing User Warnings

Medium

Confidence: 95%
Finding: The agent captures screenshots before and after each step and stores them in results without any user-facing privacy notice or consent control. Desktop screenshots can contain credentials, personal messages, tokens, and other sensitive on-screen data, making silent capture a meaningful privacy and security risk.
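A consent control for per-step screenshots might make capture opt-in and bound how many images are retained. The following is a rough sketch under those assumptions; `ScreenshotLog`, `capture_enabled`, and `max_items` are hypothetical names and do not reflect how the agent stores its results.

```python
from collections import deque


class ScreenshotLog:
    """Keeps at most max_items screenshots, and only when capture is enabled."""

    def __init__(self, capture_enabled=False, max_items=4):
        self.capture_enabled = capture_enabled
        self._items = deque(maxlen=max_items)  # oldest entries are dropped first

    def record(self, label, image_bytes):
        """Store a screenshot if capture is enabled; return whether it was stored."""
        if not self.capture_enabled:
            return False
        self._items.append((label, image_bytes))
        return True

    def labels(self):
        return [label for label, _ in self._items]

    def clear(self):
        """Discard all retained screenshots."""
        self._items.clear()
```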

Missing User Warnings

Medium

Confidence: 94%
Finding: This step writes screenshots to disk automatically using a caller-controlled filename, without explicit confirmation or retention controls. Persisting desktop images increases exposure because sensitive visual data remains on disk and may be accessible to other users, processes, backups, or later compromise.
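Two mitigations for caller-controlled filenames are to confine the path to a single directory and to create the file with owner-only permissions. The sketch below uses only the Python standard library; the function names `safe_screenshot_path` and `write_private` and the PNG-only rule are assumptions for illustration, not the skill's behavior.

```python
import os
from pathlib import Path


def safe_screenshot_path(base_dir, requested_name):
    """Resolve a caller-supplied filename to a path directly inside base_dir.

    Directory components in the requested name are discarded, so "../x.png"
    cannot escape the screenshot directory.
    """
    base = Path(base_dir).resolve()
    name = Path(requested_name).name
    if name in ("", ".", ".."):
        raise ValueError("invalid screenshot filename")
    if not name.lower().endswith(".png"):
        raise ValueError("screenshot filename must end in .png")
    return base / name


def write_private(path, data):
    """Create a new file with mode 0o600 and refuse to overwrite an existing one."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
```

On POSIX systems the `0o600` mode limits the file to its owner; on Windows the mode argument has limited effect, so directory-level access controls would still be needed there.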

VirusTotal

67/67 vendors rated this skill as clean.
