红果短剧审片 (Hongguo short-drama video review)

Security checks across malware telemetry and agentic risk

Overview

This video-review skill is mostly coherent, but its short description understates that it performs content-safety screening and audio transcription on user videos.

Install only if you want uploaded or local videos to be checked for both technical quality and content-policy issues, including extracted frames and possible speech transcription. Avoid using it on confidential videos unless you are comfortable with OpenClaw image analysis and the referenced transcription tool processing derived media.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (13)

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The top-level metadata says the skill is for basic quality auditing, while the body expands scope into content safety and prohibited-content detection. That hidden scope expansion increases privacy and policy risk because users may submit media expecting technical QA, not sensitive content classification or compliance screening.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: Subtitle transcription and banned-word moderation materially exceed a basic quality-review scope and process the video's spoken content, which can contain sensitive personal or confidential information. The use of a transcription CLI without clear disclosure or consent raises privacy, compliance, and data-handling risks.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The skill performs broad visual policy enforcement for categories unrelated to simple technical video quality, such as sexualized content, government symbolism, money, violence, and superstition. This hidden moderation scope can lead to unexpected sensitive inference and misuse of submitted media beyond the user's apparent request.

Description-Behavior Mismatch

High

Confidence: 95%
Finding: The skill metadata describes basic video quality review, but the code additionally performs speech transcription and banned-word content inspection. This is a capability mismatch that can cause unauthorized processing of audio content and unexpected collection of sensitive speech data, making the skill more dangerous than advertised.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: Calling an external transcription CLI expands the skill from quality assurance into speech-content analysis, which is materially different from the stated purpose. In a user-trust context, this hidden capability can expose sensitive spoken information and violate least-surprise and least-privilege expectations.

Intent-Code Divergence

Medium

Confidence: 88%
Finding: The top-level documentation frames the script as a basic initial video review tool, but the implementation includes banned-word scanning of transcribed speech. Misleading documentation is a security-relevant issue because it obscures data processing behavior and undermines informed review and consent.

Description-Behavior Mismatch

High

Confidence: 96%
Finding: The code’s core prompt implements sensitive content moderation categories such as public officials, government insignia, RMB, superstition, and minors rather than the manifest’s stated purpose of basic video quality checks like composition, black borders, audio, or resolution. This is dangerous because it materially expands the skill into compliance/policy surveillance, creating undisclosed screening behavior and potential misuse against user content beyond the declared scope.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The prompt adds sensitive screening for state symbols, law-enforcement imagery, currency, violence, superstition, and minors without justification from a basic quality-audit workflow. That scope creep is risky because it enables covert policy/compliance classification of media, which can trigger inappropriate flagging, privacy concerns, or censorship-like behavior in contexts where users expected only technical quality analysis.

Vague Triggers

Medium

Confidence: 84%
Finding: Broad trigger phrases like '审核视频' ("review video") and '检测视频' ("check video") can cause the skill to activate for generic video-review requests, even when the user did not intend to invoke intensive moderation, shell tooling, file extraction, or transcription. Over-broad activation increases the chance of unintended processing of sensitive media.

Missing User Warnings

Medium

Confidence: 96%
Finding: The skill does not warn users that extracted audio may be processed by an external transcription CLI/service, which may transmit or expose spoken content outside the local environment. For video review workflows, audio often contains personal, confidential, or regulated data, so undisclosed external processing is a meaningful privacy risk.
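
One way to address this finding is an explicit consent step before any audio leaves the review pipeline. The following is a minimal sketch, not the skill's actual code: the function name `transcription_allowed`, the notice wording, and the injectable `ask` callback are all hypothetical and chosen for illustration.

```python
from typing import Callable

# Hypothetical notice text; the real skill would word this to match its tooling.
NOTICE = (
    "This review extracts the audio track and passes it to a transcription "
    "tool, which may process speech outside the local environment. "
    "Continue? [y/N] "
)


def transcription_allowed(ask: Callable[[str], str] = input) -> bool:
    """Return True only on an explicit yes; any other answer declines."""
    answer = ask(NOTICE).strip().lower()
    return answer in {"y", "yes"}
```

Defaulting to "no" means an empty or unexpected reply skips transcription rather than silently proceeding. Passing `ask` as a parameter lets the same check run in a non-interactive context, where the host application supplies the answer.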

Missing User Warnings

Low

Confidence: 90%
Finding: The skill writes extracted frames and audio to predictable locations under /tmp without prominently warning the user. Temporary artifacts can persist, be recovered later, or be exposed to other local processes depending on system configuration, especially when handling sensitive video content.
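
A common mitigation for predictable /tmp paths is a per-run private directory that is removed when the review finishes. The sketch below uses Python's standard `tempfile` module; it assumes the skill's script could adopt this pattern, and the names `private_workdir`, `frame_path`, and the `video-review-` prefix are hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def private_workdir(prefix: str = "video-review-"):
    """Yield a randomly named directory and delete it on exit.

    On POSIX systems tempfile creates the directory readable and
    writable only by the current user.
    """
    with tempfile.TemporaryDirectory(prefix=prefix) as path:
        yield path


def frame_path(workdir: str, index: int) -> str:
    """Build the path for an extracted frame inside the private directory."""
    return os.path.join(workdir, f"frame_{index:04d}.jpg")
```

Because the directory name is random and the directory is deleted when the `with` block exits, extracted frames neither sit at a guessable location nor outlive the review, even if a later step raises an exception.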

Vague Triggers

Medium

Confidence: 91%
Finding: The trigger list includes very generic phrases such as '检测视频' ("check video") and '审核视频' ("review video"), which can overlap with common user requests and cause the skill to activate unintentionally. Overly broad activation increases the chance of misrouting unrelated tasks into this skill, potentially causing incorrect handling, unexpected data exposure to the skill pipeline, or bypass of more appropriate review flows.

Missing User Warnings

Medium

Confidence: 91%
Finding: The script extracts audio from the input video into a sidecar WAV file derived from the original path, without explicit user warning or secure temp-file handling. That can leave sensitive audio material on disk, create files in unintended locations, and increase exposure if cleanup fails or permissions are too broad.
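
The sidecar-file issue can be avoided by writing the WAV into a caller-supplied working directory instead of deriving its location from the input path. The sketch below only builds the command line and does not run it. It assumes ffmpeg is the extraction tool, which the report does not state; the function name `audio_extract_command` and the 16 kHz mono settings are likewise illustrative assumptions.

```python
import os
from typing import List


def audio_extract_command(video_path: str, workdir: str) -> List[str]:
    """Return an ffmpeg argument list that writes the audio track into workdir.

    Only the file's base name is reused, so nothing is created next to
    the original video.
    """
    base = os.path.splitext(os.path.basename(video_path))[0]
    wav_path = os.path.join(workdir, f"{base}.wav")
    return [
        "ffmpeg",
        "-nostdin",            # never wait for interactive input
        "-i", video_path,
        "-vn",                 # drop the video stream
        "-acodec", "pcm_s16le",
        "-ar", "16000",        # assumed sample rate for transcription
        "-ac", "1",            # mono
        wav_path,
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` with `workdir` set to a directory created by `tempfile.TemporaryDirectory()`, so the audio is deleted with the directory even if transcription fails.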

VirusTotal

64/64 vendors flagged this skill as clean.
