Bytedance Visual Recognition

Security checks across malware telemetry and agentic risk

Overview

This skill does what it claims, but it can send local media to an external API immediately, and it retains and copies sensitive local data with too little user control.

Install only if you are comfortable sending selected images, videos, screenshots, prompts, and follow-up context to Volcengine/Doubao. Avoid broad or sensitive directories in batch mode, use least-privilege API/IAM keys, and periodically delete the skill’s Temp, vision_history.json, and .last_response files if they contain private paths or content.
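
The periodic cleanup recommended above could be scripted along these lines. This is a minimal sketch: the three artifact names come from this report, but the assumption that they all sit directly inside one skill directory, and the `purge_artifacts` helper itself, are illustrative only.

```python
from pathlib import Path
import shutil

# Artifact names are taken from the report; their location directly under
# the skill directory is an assumption.
ARTIFACTS = ("Temp", "vision_history.json", ".last_response")


def purge_artifacts(skill_dir: Path) -> list:
    """Delete known artifacts under skill_dir and return the names removed."""
    removed = []
    for name in ARTIFACTS:
        target = skill_dir / name
        if target.is_symlink() or target.is_file():
            # Remove files and symlinks without following the link.
            target.unlink()
            removed.append(name)
        elif target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed
```

Run it against the skill's install directory only, and check the returned list to see what was actually deleted.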

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (8)

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The skill persistently stores user file paths, prompts, partial model responses, timestamps, and token history for 7 days. This exceeds the minimum necessary for basic visual recognition and can expose sensitive local-path information and user content to other local users or later compromise of the host.
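
One way to shrink this footprint is to keep only a short window of history and store file names instead of full paths. The sketch below assumes a hypothetical record layout (a list of dicts with an ISO 8601 `timestamp` and a `path`); the skill's real history schema is not shown in this report.

```python
from datetime import datetime, timedelta, timezone
from pathlib import PurePath


def minimize_history(entries, now, max_age=timedelta(days=1)):
    """Drop entries older than max_age and keep only timestamp and file name.

    The "timestamp" and "path" keys are assumed for illustration; prompts
    and response text are deliberately not carried over.
    """
    kept = []
    for entry in entries:
        recorded = datetime.fromisoformat(entry["timestamp"])
        if now - recorded > max_age:
            continue
        kept.append({
            "timestamp": entry["timestamp"],
            "file": PurePath(entry["path"]).name,  # no directory components
        })
    return kept
```

Passing `now` explicitly keeps the function deterministic; a caller would typically supply `datetime.now(timezone.utc)`.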

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: Batch mode recursively copies an entire user-selected directory into the temp workspace before processing supported files. This expands the tool’s data footprint, duplicates potentially sensitive media locally, and increases the risk of unintended retention or disclosure beyond what users expect from recognition alone.
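
A lower-footprint alternative is to select supported files in place rather than copying the whole directory first. The following is a sketch only: the extension set and the `select_media` helper are assumptions, not the skill's actual code.

```python
from pathlib import Path

# Assumed set of supported extensions, for illustration only.
SUPPORTED = {".jpg", ".jpeg", ".png", ".mp4"}


def select_media(root: Path, recursive: bool = False) -> list:
    """Return supported media files under root without copying anything.

    Recursion is opt-in, so a broad directory is not walked by default.
    """
    pattern = "**/*" if recursive else "*"
    return sorted(
        p for p in root.glob(pattern)
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )
```

Because nothing is duplicated, there is no temp copy to retain or clean up, and unsupported files are never touched.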

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The skill uses separate IAM credentials to call a remote usage API unrelated to core media recognition. This broadens credential scope and network behavior, creating additional attack surface and a secondary channel for account metadata access that users may not expect.

Context-Inappropriate Capability

Low

Confidence: 83%
Finding: The code automatically reads a local .env file and imports its contents into the process environment at runtime. This implicit credential-loading behavior increases the blast radius of any local file exposure and may cause the skill to consume secrets without clear operator awareness.
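
A narrower approach is to read only an allow-list of keys from the file and return them to the caller instead of importing everything into the process environment. In this sketch the key name `VISION_API_KEY` and the `load_allowed_env` helper are hypothetical; the skill's real variable names are not given in this report.

```python
from pathlib import Path

# Hypothetical key name; substitute whatever the skill actually needs.
ALLOWED_KEYS = frozenset({"VISION_API_KEY"})


def load_allowed_env(env_file: Path, allowed=ALLOWED_KEYS) -> dict:
    """Parse KEY=VALUE lines and return only allow-listed keys.

    os.environ is left untouched, so unrelated secrets in the same file
    never enter the process environment.
    """
    values = {}
    for raw in env_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key in allowed:
            values[key] = value.strip().strip('"').strip("'")
    return values
```

The parser handles only simple `KEY=VALUE` lines; quoting rules beyond stripping surrounding quotes are out of scope for this sketch.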

Vague Triggers

Medium

Confidence: 87%
Finding: The trigger patterns are broad phrases such as '分析图片' ("analyze image"), 'vision', and 'image to text', which can match ordinary conversation and cause unintended activation. In this skill, accidental activation is more dangerous because invocation can immediately send user media and prompts to an external API and consume quota without confirmation.
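
The difference between a broad trigger and an explicit one can be illustrated as follows. This sketch assumes triggers are matched as case-insensitive substrings, which the report does not confirm, and the `/vision-recognize` command is a hypothetical name.

```python
# Trigger phrases quoted in the finding above.
BROAD_TRIGGERS = ("分析图片", "vision", "image to text")


def broad_match(message: str) -> bool:
    """Assumed behavior: fire when any trigger phrase appears anywhere."""
    lowered = message.lower()
    return any(trigger in lowered for trigger in BROAD_TRIGGERS)


def explicit_match(message: str, command: str = "/vision-recognize") -> bool:
    """Stricter alternative: fire only when the message starts with a command."""
    return message.strip().split(maxsplit=1)[:1] == [command]
```

An ordinary sentence such as "Our product vision for next year" satisfies `broad_match` but not `explicit_match`, which is the kind of accidental activation the finding describes.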

Missing User Warnings

High

Confidence: 96%
Finding: The instructions explicitly direct immediate execution but do not warn that images, videos, and prompts will be sent to a third-party API service. This undermines informed consent and may leak sensitive personal, proprietary, or regulated content to an external provider unexpectedly.

Missing User Warnings

High

Confidence: 98%
Finding: The skill forbids confirmation and mandates immediate execution even for quota-consuming, externally transmitted operations. This removes a critical safety checkpoint and makes accidental or coerced invocation far more harmful, especially given the broad triggers and support for batch processing.
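
A confirmation checkpoint of the kind this finding calls for might look like the sketch below. The `send_with_consent` function and its `confirm` and `send` callbacks are illustrative assumptions; the skill's real upload path is not shown in this report.

```python
from typing import Callable, Iterable


def send_with_consent(
    files: Iterable[str],
    destination: str,
    confirm: Callable[[str], bool],
    send: Callable[[str], None],
) -> bool:
    """Ask before transmitting; send nothing unless the user agrees.

    confirm receives a one-line summary and returns True to proceed.
    send performs the actual upload of a single file.
    """
    batch = list(files)
    summary = f"Send {len(batch)} file(s) to {destination}?"
    if not confirm(summary):
        return False
    for path in batch:
        send(path)
    return True
```

In an interactive setting, `confirm` could wrap a yes/no prompt; injecting it as a callback keeps the gate testable and makes the default-deny behavior explicit.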

Missing User Warnings

Medium

Confidence: 97%
Finding: The tool base64-encodes local image/video content and sends it to a remote API, but there is no strong user-facing disclosure or consent step explaining that local media leaves the system. For sensitive screenshots, recordings, or documents, this can lead to unintended external exposure of private information.
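
Base64 is a reversible encoding, not encryption, so the remote service receives the full original content. The short sketch below shows the round trip; `encode_media` is a hypothetical helper, not the skill's function.

```python
import base64


def encode_media(data: bytes) -> str:
    """Base64-encode raw media bytes into an ASCII string for a JSON body."""
    return base64.b64encode(data).decode("ascii")


def decode_media(text: str) -> bytes:
    """Recover the exact original bytes from the encoded string."""
    return base64.b64decode(text)
```

Anyone holding the encoded string can call the equivalent of `decode_media` and recover the screenshot or recording byte for byte, and the encoded form is also roughly a third larger than the source file.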

VirusTotal

VirusTotal findings are pending for this skill version.
