Security audit

小剪刀视频剪辑 (Xiao Jian Dao Video Editing)

Security checks across malware telemetry and agentic risk

Overview

This is a real cloud video-editing integration, but it needs Review because it asks users to share account tokens in chat and can upload/process sensitive media with weak consent boundaries.

Install only if you are comfortable sending videos, derived frames, audio, OCR text, prompts, device metadata, and signed media links to Xiao Jian Dao/Cutflow services. Do not paste tokens into chat; use a secure secret store or environment variable, confirm before uploads or credit-consuming task creation, and treat returned signed URLs as private.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (16)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 86%
Finding: The skill describes use of environment variables and outbound network/API access but does not declare corresponding permissions. That mismatch weakens user and platform visibility into what the skill can access, increasing the risk of unexpected secret handling and remote data transmission during operation.

Context-Inappropriate Capability

Medium

Confidence: 98%
Finding: The skill explicitly instructs users to paste an authentication token directly into chat. Chat is typically an insecure channel for secrets because tokens may be logged, retained in conversation history, exposed to other tools/agents, or accidentally shared, creating a direct path to credential compromise.
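One way to keep the token out of the conversation is to read it from the process environment and fall back to a hidden terminal prompt. The following is a minimal sketch, not the skill's own code: the variable name `CUTFLOW_TOKEN` and the helpers `load_token` and `mask` are hypothetical.

```python
import getpass
import os

# Hypothetical variable name; the skill's real configuration key is not known here.
TOKEN_ENV_VAR = "CUTFLOW_TOKEN"


def load_token(env=None, prompt=getpass.getpass):
    """Return the token from the environment, else ask via a hidden prompt."""
    env = os.environ if env is None else env
    token = env.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        # getpass does not echo input, so the value is not shown on screen.
        token = prompt("Service token: ").strip()
    if not token:
        raise ValueError("no token provided")
    return token


def mask(token):
    """Return a form suitable for logs: at most the last four characters."""
    if len(token) <= 4:
        return "****"
    return "*" * (len(token) - 4) + token[-4:]
```

With this shape the agent only ever needs to echo `mask(token)` back to the user, so the full value does not have to appear in the transcript.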

Intent-Code Divergence

Medium

Confidence: 84%
Finding: The skill claims every step requires user confirmation, yet it directs the agent to start subtitle extraction in the background without informing the user. This is a consent and transparency failure: user media is sent for processing before clear approval of that specific action, undermining expectations around control of uploaded content.
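A per-action confirmation gate is one way to close this gap. The sketch below assumes a hypothetical `submit` callable that performs the actual upload; neither it nor `start_subtitle_extraction` is taken from the skill.

```python
def confirm_action(description, ask=input):
    """Return True only if the user explicitly answers 'yes'."""
    answer = ask(f"{description} Proceed? [yes/no] ")
    return answer.strip().lower() == "yes"


def start_subtitle_extraction(video_path, submit, ask=input):
    """Call submit(video_path) only after explicit approval; else return None."""
    message = f"About to upload '{video_path}' for remote subtitle extraction."
    if not confirm_action(message, ask):
        return None  # nothing is uploaded without approval
    return submit(video_path)
```

Anything other than an explicit "yes" is treated as a refusal, so an empty or ambiguous reply never starts a background job.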

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The file hard-codes package-specific AES keys and derives IVs deterministically from package name and timestamp using MD5, then exposes generic encrypt/decrypt helpers. This is dangerous because embedded secrets are recoverable from the code and the scheme lacks authenticated encryption, which enables reverse engineering and ciphertext tampering and makes the helpers unsafe for protecting API traffic or sensitive data.
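The weakness of deriving an IV from public inputs can be illustrated with a short sketch. The exact concatenation the file uses is not shown in this report, so `derive_iv` is an assumed stand-in, and the package name and timestamp are made-up values.

```python
import hashlib
import secrets


def derive_iv(package_name, timestamp):
    """Assumed stand-in for the flagged scheme: MD5 over name plus timestamp."""
    material = f"{package_name}{timestamp}".encode("utf-8")
    return hashlib.md5(material).digest()  # 16 bytes, the AES block size


def random_iv():
    """Unpredictable 16-byte IV from the operating system's CSPRNG."""
    return secrets.token_bytes(16)


# An observer who knows the package name and sees the timestamp
# recomputes exactly the same IV as the client.
client_iv = derive_iv("com.example.app", 1700000000)
observer_iv = derive_iv("com.example.app", 1700000000)
```

A random IV only addresses predictability. Tamper resistance would additionally need an authenticated mode such as AES-GCM from a vetted library, and no IV scheme helps while the key itself ships inside the code.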

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: The document explicitly instructs users to obtain an authentication token via website login and packet capture, which facilitates credential/session token extraction outside normal supported authentication flows. In the context of a video-editing skill, this is unjustified and enables unauthorized backend API use, account abuse, and potential terms/privacy violations.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The file provides a complete custom encrypted HTTP client, including hardcoded AES key material, IV derivation, header construction, and response decryption for direct backend access. This materially lowers the barrier to interact with private service endpoints outside intended client controls, increasing the risk of unauthorized access, reverse engineering, and misuse of protected APIs.

Description-Behavior Mismatch

Medium

Confidence: 86%
Finding: The skill requires extracted video frames to be uploaded to publicly accessible URLs, which exposes potentially sensitive visual content outside the core editing environment and to third-party retrieval. In a subtitle/OCR workflow this may be functionally convenient, but making frames public broadens data exposure unnecessarily and creates privacy and retention risks.

Context-Inappropriate Capability

High

Confidence: 96%
Finding: The document explicitly states uploads occur in plaintext over a non-encrypted channel, which can expose extracted video frames and related metadata to interception or tampering in transit. Because the content comes from user videos, this creates a direct confidentiality risk and also enables man-in-the-middle manipulation of OCR inputs.
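A client-side guard can refuse to send frames to any endpoint that is not served over TLS. This is a minimal sketch; `require_https` is a hypothetical helper, and the upload endpoint actually used by the skill is not known here.

```python
from urllib.parse import urlsplit


def require_https(url):
    """Return the URL unchanged if it uses HTTPS, else raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError(
            f"refusing non-HTTPS upload target (scheme: {parts.scheme or 'none'})"
        )
    return url
```

Calling `require_https(upload_url)` before every upload turns a silent plaintext transfer into an explicit error that the user can see and act on.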

Vague Triggers

Medium

Confidence: 80%
Finding: The trigger phrases are generic and likely to match ordinary conversation about video editing. Overly broad activation can cause the skill to engage unintentionally, which is more dangerous here because the workflow involves uploads, remote API calls, and possible token handling.

Missing User Warnings

High

Confidence: 99%
Finding: The documentation tells users to send authentication tokens in chat and does so without any privacy or security warning. This normalizes unsafe secret-sharing behavior and significantly increases the chance that credentials are stored in transcripts, leaked to operators, or reused by unauthorized parties.

Missing User Warnings

Medium

Confidence: 80%
Finding: The script silently pulls authentication tokens from environment variables, which can cause the agent to use ambient credentials without an explicit user action or clear disclosure. In an agent setting, this weakens consent boundaries and may transmit user media and requests to third-party services under credentials the user did not intend to expose for this run.

Missing User Warnings

Medium

Confidence: 95%
Finding: The markdown tells users to capture and use a token from traffic without any warning about credential theft, account compromise, or privacy implications. That guidance normalizes unsafe handling of authentication artifacts and can lead to replay of session secrets against the backend service.

Missing User Warnings

Low

Confidence: 82%
Finding: The documentation directs transmission of encrypted device metadata, including OS, architecture, CPU model, and MAC address, to a remote service without meaningful disclosure or consent guidance. Even if encrypted in transit at the application layer, this still exposes potentially identifying device information and can enable tracking or fingerprinting.

Missing User Warnings

High

Confidence: 95%
Finding: The workflow sends extracted frames to publicly accessible URLs and notes a non-encrypted upload path, yet provides no warning or consent mechanism for the privacy implications of exposing video-derived images. In the context of subtitle extraction, users may reasonably expect local or protected processing, so this mismatch increases the danger of unintended disclosure.

Missing User Warnings

Medium

Confidence: 92%
Finding: The function accepts a user token and uses it as `appsecret` for `HookHttpClient` while also sending the video URL and OCR-related data to remote endpoints. This means sensitive credentials are transmitted to external services without any explicit user consent flow, warning, minimization, or indication of transport/handling guarantees in this file. In a video-editing skill, users may not expect their auth token and media-derived data to be forwarded to third-party/cloud OCR services, creating privacy and credential-exposure risk if logs, proxies, or downstream services are compromised.
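Exposure through logs can be reduced by redacting secrets and signed URLs before a request payload is written anywhere. The sketch below is an assumption-laden illustration: `appsecret` comes from the finding above, while the `token` key, the payload layout, and both helper names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url(url):
    """Replace the query string, where URL signatures usually live."""
    parts = urlsplit(url)
    query = "REDACTED" if parts.query else ""
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def redact_fields(payload, secret_keys=("appsecret", "token")):
    """Return a copy of a dict with secret values and signed URLs masked."""
    cleaned = {}
    for key, value in payload.items():
        if key in secret_keys:
            cleaned[key] = "REDACTED"
        elif isinstance(value, str) and value.startswith(("http://", "https://")):
            cleaned[key] = redact_url(value)
        else:
            cleaned[key] = value
    return cleaned
```

Logging `redact_fields(payload)` instead of the raw payload keeps the host and path available for debugging while leaving out the parts that grant access.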

Ssd 3

High

Confidence: 99%
Finding: By creating a natural-language path for users to paste live authentication tokens into the conversation, the skill introduces a straightforward secret-exfiltration vector. In this context, the risk is heightened because the token is then used for remote video-processing actions, so compromise could expose user media, task data, or account-scoped operations.

VirusTotal

No VirusTotal findings

Static analysis

No suspicious patterns detected.