dingtalk-gui-message

Security checks across malware telemetry and agentic risk

Overview

This skill is a real DingTalk message-sending automation tool, but it needs review because it can control the desktop, capture screenshots, send messages, and optionally upload screenshots, and because it has a concrete shell-injection weakness.

Install only if you intentionally want a desktop-control tool that can send real DingTalk messages. Review the recipient and message text before each invocation. Until the shell=True command construction is fixed, do not pass untrusted contact names or message content. Treat QR-code screenshots as sensitive, and leave --vision disabled unless you explicitly accept sending screenshots to DashScope.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (7)

subprocess module call

Medium

Category: Dangerous Code Execution
Content:
    def run(cmd, timeout=15):
        log(f"→ {cmd[:120]}")
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=timeout)
        if result.returncode != 0 and result.stderr:
            log(f" stderr: {result.stderr[:200]}")
        return result
Confidence: 98%
Finding: result = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=timeout)
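One way to remove this injection risk is to stop building a single command string and pass the command as an argument list, so that no shell is involved. The sketch below is illustrative only: `run_safe` is a hypothetical name, the `log` helper is a stand-in for the skill's own (which the report does not show), and it assumes the callers can be changed to supply a list.

```python
import subprocess
import sys


def log(msg):
    # Stand-in for the skill's own log() helper (assumed; not shown in the report).
    print(msg, file=sys.stderr)


def run_safe(argv, timeout=15):
    """Run a command given as a list of arguments, without invoking a shell."""
    if isinstance(argv, str):
        # Refuse strings so callers cannot fall back to shell-style command lines.
        raise TypeError("argv must be a list of arguments, not a string")
    log(f"→ {' '.join(argv)[:120]}")
    # shell=False is the default: each list item reaches the program verbatim,
    # so metacharacters in contact names or message text are not interpreted.
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    if result.returncode != 0 and result.stderr:
        log(f" stderr: {result.stderr[:200]}")
    return result


if __name__ == "__main__":
    # The ';' stays inside one argument instead of starting a second command.
    demo = run_safe([sys.executable, "-c", "import sys; print(sys.argv[1])",
                     "hello; echo injected"])
    print(demo.stdout, end="")
```

With this shape, a contact name or message body containing `;`, `$()`, or backticks is delivered to the target program as ordinary text rather than being parsed by a shell.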

Lp3

Medium

Category: MCP Least Privilege
Confidence: 93%
Finding: The skill performs sensitive actions through shell tools, screen capture/OCR, and likely external vision analysis, but the manifest does not declare permissions or clearly constrain those capabilities. This creates a transparency and consent gap: a user or orchestrator may invoke a skill that can read screen contents, automate input, and potentially send data over the network without explicit authorization boundaries.

Description-Behavior Mismatch

Medium

Confidence: 84%
Finding: The optional Vision mode sends screenshots of the user's desktop/application window to a remote DashScope API, which expands the skill from local GUI automation into external data transfer. Those screenshots may contain chat contents, contact lists, QR login codes, or unrelated on-screen sensitive information, creating a privacy and data-exfiltration risk.

Missing User Warnings

Medium

Confidence: 94%
Finding: The README clearly documents behavior that can send messages through a real messaging client and capture login QR screenshots, but it does not prominently warn about privacy, consent, or the risk of acting on the wrong contact/window. In an agent-skill context, that omission is security-relevant because it normalizes potentially sensitive GUI automation and credential-adjacent screen capture without explicit operator acknowledgment.

Vague Triggers

Medium

Confidence: 88%
Finding: The trigger phrases include broad natural-language expressions such as '给XX发消息' ("send a message to XX") and '钉钉发消息' ("send a DingTalk message"), which can overlap with ordinary user conversation and cause unintended activation. In this skill, accidental triggering is more dangerous than usual because activation can lead directly to GUI automation that searches contacts and sends messages on the user's behalf.

Missing User Warnings

Medium

Confidence: 96%
Finding: The skill requires Screen Recording, Accessibility, screenshots, and OCR, but the documentation does not prominently warn users about the privacy implications of capturing on-screen content. Because the workflow also mentions semantic analysis via a vision model, sensitive screen contents, QR login data, contact names, or message text may be exposed or retained without informed user consent.

Missing User Warnings

Medium

Confidence: 90%
Finding: When --vision is enabled, the code base64-encodes a full screenshot and transmits it to an external service without an additional user-facing confirmation at the time of upload. Because this is a GUI automation skill for messaging, screenshots can easily contain private conversations, login QR codes, and other incidental secrets.
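The missing confirmation step could take the form of a small gate that runs immediately before the screenshot is encoded for upload. This is a minimal sketch under stated assumptions: `encode_screenshot_if_confirmed` and its `confirm` parameter are hypothetical names, the prompt wording is invented, and the actual upload call is deliberately left out because the report does not show it.

```python
import base64


def encode_screenshot_if_confirmed(image_bytes, confirm=input):
    """Return a base64 payload only when the operator explicitly agrees.

    `confirm` is any callable that shows a prompt and returns the reply;
    it defaults to the built-in input() and can be replaced in tests.
    """
    size_kib = len(image_bytes) / 1024
    answer = confirm(
        f"Upload a {size_kib:.1f} KiB screenshot to the external vision service? [y/N] "
    )
    if answer.strip().lower() not in ("y", "yes"):
        # Anything other than an explicit yes means nothing is encoded or sent.
        return None
    return base64.b64encode(image_bytes).decode("ascii")
```

Defaulting to "no" means an empty reply or a non-interactive run never produces a payload, which keeps the upload opt-in at the moment it happens rather than only at flag-parsing time.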

VirusTotal

64/64 vendors flagged this skill as clean.
