Xia Desktop Agent

Security checks across malware telemetry and agentic risk

Overview

This skill provides the Windows desktop automation it advertises, but it can also run arbitrary shell commands, capture remote-access credentials, and send WeChat messages or files, all without clear approval safeguards.

Review before installing. Only use this on a Windows machine you intentionally want an agent to control, and do not let untrusted users invoke it. Require manual confirmation before ToDesk credential capture, WeChat messages, file transfers, app launches, screenshots, or natural-language task execution. Treat saved screenshots and ToDesk codes/passwords as sensitive secrets, and avoid using the open_app fallback with arbitrary text unless it is changed to a strict allowlist.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
Findings (17)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
"code": "code",
            }
            cmd = app_map.get(app_name.lower(), app_name)
            subprocess.Popen(cmd, shell=True)
            logger.info(f"启动应用: {app_name} -> {cmd}")
            return True
        except Exception as e:
Confidence
99% confidence
Finding
subprocess.Popen(cmd, shell=True)

Lp3

Medium
Category
MCP Least Privilege
Confidence
84% confidence
Finding
The skill exposes capabilities that can drive desktop automation, invoke scripts, and interact with networked tools like WeChat and ToDesk, but it does not declare corresponding permissions or boundaries. Undeclared powerful capabilities reduce transparency and make it easier for a broadly-invoked skill to perform sensitive local or remote actions without adequate review or user understanding.

Context-Inappropriate Capability

High
Confidence
98% confidence
Finding
The skill advertises desktop automation and app launching, but open_app actually accepts arbitrary strings and executes them as shell commands. That expands the capability from GUI automation to unrestricted command execution, which is substantially more dangerous in this context because the agent may be driven by natural-language instructions.
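
One way to close this gap is to resolve app names against a strict allowlist and launch without a shell. The sketch below is illustrative only: ALLOWED_APPS and resolve_app are hypothetical names, and the allowlist entries are assumptions, since the skill's actual app_map contents are not shown in this report.

```python
import subprocess
from typing import Dict, List

# Hypothetical allowlist (assumption): maps a friendly name to a fixed argv.
# The skill's real app_map is not visible in this report.
ALLOWED_APPS: Dict[str, List[str]] = {
    "notepad": ["notepad.exe"],
    "calc": ["calc.exe"],
}


def resolve_app(app_name: str) -> List[str]:
    """Return the argv for an allowlisted app, or raise ValueError."""
    key = app_name.strip().lower()
    if key not in ALLOWED_APPS:
        # No fallback to the raw string: unknown names are rejected.
        raise ValueError(f"app not in allowlist: {app_name!r}")
    return list(ALLOWED_APPS[key])


def open_app(app_name: str) -> bool:
    """Launch an allowlisted app without involving a shell."""
    argv = resolve_app(app_name)
    # A list argv with the default shell=False means the input is never
    # interpreted as a shell command line.
    subprocess.Popen(argv)
    return True
```

With this shape, an input such as "calc & whoami" fails in resolve_app instead of reaching the shell.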

Intent-Code Divergence

Medium
Confidence
99% confidence
Finding
The module and method names/docstrings claim to enforce safety, but both check_task() and check_plan() still return success after detecting dangerous patterns. In a desktop automation agent that can click, type, launch apps, send WeChat messages, and establish remote sessions, this creates a misleading safety boundary: destructive or administrative actions are only logged, not blocked, so unsafe instructions can still execute.
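
For comparison, a check that actually enforces its result could look like the following minimal sketch. The pattern list and the CheckResult type are assumptions made for illustration; the skill's real patterns and return type are not shown in this report.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical patterns (assumption); the skill's real list is not shown here.
DANGEROUS_PATTERNS = [
    r"\bformat\s+[a-z]:",
    r"\brm\s+-rf\b",
    r"\bshutdown\b",
]


@dataclass
class CheckResult:
    allowed: bool
    reasons: List[str]


def check_task(task: str) -> CheckResult:
    """Block the task when any dangerous pattern matches, not just log it."""
    hits = [p for p in DANGEROUS_PATTERNS if re.search(p, task, re.IGNORECASE)]
    # allowed is False whenever at least one pattern matched.
    return CheckResult(allowed=not hits, reasons=hits)
```

The caller would then need to refuse execution whenever allowed is False, so that detection and enforcement cannot drift apart.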

Vague Triggers

Medium
Confidence
90% confidence
Finding
The skill description says to use it for 'any Windows GUI task' and similar broad requests, making activation likely to overlap with ordinary desktop-help conversations. Overly broad routing increases the chance the agent will invoke a high-privilege automation skill when the user only intended advice, leading to unintended clicks, typing, file changes, or message sending.

Vague Triggers

Medium
Confidence
92% confidence
Finding
Trigger examples like '远程连接' ("remote connection"), 'ToDesk', or '连接你' ("connect to you") are ambiguous and can easily appear in normal conversation, yet they map to a workflow that launches remote-access software, captures screenshots, extracts credentials, and shares them. Because the triggered action is highly sensitive, vague phrases materially raise the risk of accidental activation and credential exposure.

Missing User Warnings

Medium
Confidence
86% confidence
Finding
The natural-language mode advertises arbitrary desktop automation including examples that create and save files, but it does not clearly warn users that free-form requests may write to disk or alter local state. In a high-privilege desktop-control context, missing disclosure makes unintended destructive or privacy-impacting actions more likely because users may treat the skill as informational rather than operational.

Missing User Warnings

High
Confidence
95% confidence
Finding
The ToDesk flow explicitly instructs the system to screenshot the remote-access window, OCR the temporary password, and send the recognized credentials to the user, but it lacks a clear privacy and security warning. Remote-access credentials are highly sensitive, and capturing plus transmitting them creates a direct avenue for unauthorized system access if activation is mistaken, the conversation is exposed, or the credentials are mishandled.

Missing User Warnings

Medium
Confidence
83% confidence
Finding
Screenshots are silently captured and persisted to C:\temp\desktop_agent, which may contain sensitive information such as chats, passwords, or business data visible on screen. In a desktop automation skill, this is materially risky because screenshots are a core function and retained on disk beyond immediate use.
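
A possible mitigation is to keep screenshots in a private temporary directory and delete them as soon as they have been used. The sketch below assumes a capture callable that writes an image to a given path; ephemeral_screenshot is a hypothetical helper and not part of the skill.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def ephemeral_screenshot(capture):
    """Yield a screenshot path that is removed when the block exits.

    `capture` is an assumed callable that writes an image to the given path.
    """
    # mkdtemp creates a directory readable only by the current user on POSIX.
    tmp_dir = tempfile.mkdtemp(prefix="desktop_agent_")
    path = os.path.join(tmp_dir, "screen.png")
    try:
        capture(path)
        yield path
    finally:
        # Remove the image and its directory even if processing failed.
        shutil.rmtree(tmp_dir, ignore_errors=True)
```

Any OCR or image matching would run inside the with block, so nothing is retained on disk afterwards.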

Missing User Warnings

Medium
Confidence
86% confidence
Finding
The fallback typing path copies arbitrary text into the system clipboard, which can overwrite user clipboard contents and expose sensitive data to other applications or later pastes. In a desktop agent, clipboard use is especially sensitive because it affects global OS state outside the immediate target application.

Missing User Warnings

Low
Confidence
77% confidence
Finding
Listing all visible windows exposes titles and geometry, which can reveal sensitive information such as document names, websites, chat contacts, or active applications. Although useful for automation, this increases privacy exposure because it inventories the user's desktop state without any user-facing disclosure or scope restriction.

Missing User Warnings

Medium
Confidence
88% confidence
Finding
Applications or commands are launched without confirmation, which can trigger unintended program execution and side effects. In this file the risk is amplified because the same function also accepts arbitrary strings, so the absence of confirmation compounds the command-execution issue.

Missing User Warnings

High
Confidence
98% confidence
Finding
The preset captures a screenshot of ToDesk and returns a path plus a fixed device code in a workflow explicitly intended to recover remote-connection credentials. In the context of a desktop-control skill, this is dangerous because screenshots may expose temporary passwords or other remote-access secrets that can enable unauthorized remote control if accessed by the agent, logs, or downstream tools.

Missing User Warnings

Medium
Confidence
95% confidence
Finding
This function automatically activates WeChat, selects a contact, and sends arbitrary message content with no user confirmation or preview. In a desktop-agent context, that materially increases the risk of unintended data disclosure, social engineering, or abuse of the user's trusted messaging identity.

Missing User Warnings

High
Confidence
97% confidence
Finding
This function can transmit any local file to a WeChat contact without confirmation, which creates a straightforward path for exfiltrating sensitive documents. In the context of a desktop automation skill with filesystem access and GUI control, the combination of arbitrary file path input and immediate send action makes the risk especially significant.
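
A confirmation gate combined with a directory restriction would narrow this path considerably. The following is a sketch under stated assumptions: send_file_guarded, the confirm callback, and the send callback are all hypothetical names, and the real WeChat send routine is not shown in this report.

```python
from pathlib import Path
from typing import Callable


def send_file_guarded(
    path: str,
    contact: str,
    allowed_root: str,
    confirm: Callable[[str], bool],
    send: Callable[[str, str], None],
) -> bool:
    """Send a file only if it sits under allowed_root and the user approves."""
    resolved = Path(path).resolve()
    root = Path(allowed_root).resolve()
    # Resolving first defeats "../" tricks; the file must be inside root.
    if root not in resolved.parents:
        raise PermissionError(f"{resolved} is outside {root}")
    # Ask the user before anything leaves the machine.
    if not confirm(f"Send {resolved} to {contact}?"):
        return False
    send(str(resolved), contact)
    return True
```

The same confirm callback could front the message-sending function, so that both outbound paths share one approval step.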

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The function sends the full user task text to an LLM HTTP endpoint without any consent flow, disclosure, or minimization. In a desktop automation skill, task text can easily contain sensitive data such as messages to send, file paths, credentials, contact names, or remote-access instructions, so transmitting it externally increases privacy and data-handling risk.

External Transmission

Medium
Category
Data Exfiltration
Content
}

    try:
        resp = requests.post(LLM_URL, json=payload, timeout=30)
        resp.raise_for_status()
        content = resp.json()["choices"][0]["message"]["content"]
Confidence
91% confidence
Finding
requests.post(LLM_URL, json=
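
One way to reduce what leaves the machine is to redact obvious secrets from the task text before building the payload. This is a minimal sketch, not a complete data-loss control: minimize_task_text is a hypothetical helper, and the patterns (including the assumed nine-digit device code format) are illustrative guesses.

```python
import re

# Illustrative patterns (assumptions); tune these to the data actually seen.
_REDACTIONS = [
    # Assumed device code format: nine digits, optionally grouped in threes.
    (re.compile(r"\b\d{3}\s?\d{3}\s?\d{3}\b"), "[DEVICE_CODE]"),
    # "password: value" style fragments.
    (re.compile(r"(?i)(password|passwd|pwd)\s*[:=]\s*\S+"), r"\1=[REDACTED]"),
    # Windows absolute paths such as C:\Users\...
    (re.compile(r"[A-Za-z]:\\[^\s\"']+"), "[PATH]"),
]


def minimize_task_text(task: str) -> str:
    """Replace likely secrets in the task text with placeholders."""
    for pattern, replacement in _REDACTIONS:
        task = pattern.sub(replacement, task)
    return task
```

The redacted string, rather than the raw task text, would then go into the payload passed to requests.post, ideally after the user has been told that task text is sent to an external endpoint.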

VirusTotal

66/66 vendors flagged this skill as clean.
