dingtalk-gui-message

Security checks across malware telemetry and agentic risk

Overview

This skill is a real DingTalk message-sending automation tool, but it needs review for two reasons: it can control the desktop, capture screenshots, send messages, and optionally upload screenshots, and it has a concrete shell-injection weakness.

Install only if you intentionally want a desktop-control tool that can send real DingTalk messages. Review recipient and message text before invocation, avoid using untrusted contact names or message content until shell=True command construction is fixed, treat QR-code screenshots as sensitive, and leave --vision disabled unless you explicitly accept sending screenshots to DashScope.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Findings (7)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
def run(cmd, timeout=15):
    log(f"→ {cmd[:120]}")
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=timeout)
    if result.returncode != 0 and result.stderr:
        log(f"  stderr: {result.stderr[:200]}")
    return result
Confidence
98% confidence
Finding
result = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=timeout)
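
One way to close this injection path is to stop handing a single string to a shell. The following is a minimal sketch under stated assumptions, not the skill's actual fix: `run_argv` and the `contact` value are hypothetical names, and it assumes every caller can supply its command as a list of arguments.

```python
import subprocess
import sys


def run_argv(argv, timeout=15):
    """Run a command given as a list of arguments, without invoking a shell."""
    if isinstance(argv, str):
        # A plain string would tempt callers back toward shell-style quoting.
        raise TypeError("pass a list of arguments, not a command string")
    # shell defaults to False, so each list item reaches the program verbatim.
    return subprocess.run(list(argv), capture_output=True, text=True, timeout=timeout)


# Hypothetical untrusted contact name containing shell metacharacters.
contact = 'Alice"; echo INJECTED; "'

# The child process simply prints its first argument back.
result = run_argv([sys.executable, "-c", "import sys; print(sys.argv[1])", contact])
echoed = result.stdout.strip()
```

Because no shell parses the arguments, the quotes and semicolons in `contact` arrive as literal characters instead of being interpreted as command separators.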

Lp3

Medium
Category
MCP Least Privilege
Confidence
93% confidence
Finding
The skill performs sensitive actions through shell tools, screen capture/OCR, and likely external vision analysis, but the manifest does not declare permissions or clearly constrain those capabilities. This creates a transparency and consent gap: a user or orchestrator may invoke a skill that can read screen contents, automate input, and potentially send data over the network without explicit authorization boundaries.

Description-Behavior Mismatch

Medium
Confidence
84% confidence
Finding
The optional Vision mode sends screenshots of the user's desktop/application window to a remote DashScope API, which expands the skill from local GUI automation into external data transfer. Those screenshots may contain chat contents, contact lists, QR login codes, or unrelated on-screen sensitive information, creating a privacy and data-exfiltration risk.

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The README clearly documents behavior that can send messages through a real messaging client and capture login QR screenshots, but it does not prominently warn about privacy, consent, or the risk of acting on the wrong contact/window. In an agent-skill context, that omission is security-relevant because it normalizes potentially sensitive GUI automation and credential-adjacent screen capture without explicit operator acknowledgment.

Vague Triggers

Medium
Confidence
88% confidence
Finding
The trigger phrases include broad natural-language expressions such as '给XX发消息' ("send a message to XX") and '钉钉发消息' ("send a DingTalk message"), which can overlap with ordinary user conversation and cause unintended activation. In this skill, accidental triggering is more dangerous than usual because activation can lead directly to GUI automation that searches contacts and sends messages on the user's behalf.
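
The overlap described above can be illustrated with a small sketch. It assumes, purely for illustration, that triggers are matched as substrings of the user's utterance. The `/dingtalk-send` command form and both helper functions are hypothetical and are not part of the skill.

```python
import re

# Trigger phrase quoted in the finding ("send a DingTalk message").
LOOSE_TRIGGER = "钉钉发消息"

# Hypothetical stricter form: the whole utterance must be an explicit command
# that names a recipient and a message body.
STRICT_COMMAND = re.compile(r"^/dingtalk-send\s+(?P<to>\S+)\s+(?P<text>.+)$")


def loose_match(utterance):
    """Return True if the broad phrase appears anywhere in the utterance."""
    return LOOSE_TRIGGER in utterance


def strict_match(utterance):
    """Return the parsed recipient and text, or None if not an explicit command."""
    m = STRICT_COMMAND.match(utterance.strip())
    return m.groupdict() if m else None


# Ordinary conversation: "Yesterday he sent me a message via DingTalk."
casual = "昨天他用钉钉发消息给我了"
```

Here `loose_match(casual)` is true even though the user is only describing a past event, while `strict_match(casual)` returns `None` and fires only on the explicit command form.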

Missing User Warnings

Medium
Confidence
96% confidence
Finding
The skill requires Screen Recording, Accessibility, screenshots, and OCR, but the documentation does not prominently warn users about the privacy implications of capturing on-screen content. Because the workflow also mentions semantic analysis via a vision model, sensitive screen contents, QR login data, contact names, or message text may be exposed or retained without informed user consent.

Missing User Warnings

Medium
Confidence
90% confidence
Finding
When --vision is enabled, the code base64-encodes a full screenshot and transmits it to an external service without an additional user-facing confirmation at the time of upload. Because this is a GUI automation skill for messaging, screenshots can easily contain private conversations, login QR codes, and other incidental secrets.
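
A confirmation at the moment of upload could look roughly like the sketch below. All names here (`upload_if_confirmed`, `upload`, `ask`) are hypothetical. The sketch makes no claim about how the skill talks to DashScope, so the real network call is stood in for by a caller-supplied function.

```python
def upload_if_confirmed(image_bytes, upload, ask=input):
    """Call upload(image_bytes) only after an explicit yes from the operator.

    upload: caller-supplied function that performs the real transfer (assumed).
    ask:    prompt function; defaults to input() and can be replaced in tests.
    """
    answer = ask(
        f"Upload a {len(image_bytes)}-byte screenshot to the remote vision service? [y/N] "
    )
    # Anything other than an explicit yes is treated as a refusal.
    if answer.strip().lower() not in ("y", "yes"):
        return False
    upload(image_bytes)
    return True


# Usage with stand-ins: a list collects "uploads" instead of a network call.
sent = []
declined = upload_if_confirmed(b"fake-png", sent.append, ask=lambda prompt: "n")
accepted = upload_if_confirmed(b"fake-png", sent.append, ask=lambda prompt: "Y")
```

Defaulting to refusal means an empty answer or a stray keypress does not send the screenshot.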

VirusTotal

64/64 vendors flagged this skill as clean.
