Byted Kickart Video Analyzer

Security checks across malware telemetry and agentic risk

Overview

This video-analysis skill has real remote-analysis functionality, but it handles cloud credentials unsafely and includes account-affecting and under-disclosed backend actions.

Install only if you are comfortable sending video files and metadata to the remote Volcengine/Kickart services and using cloud credentials with this skill. Do not paste AK/SK secrets into chat, do not run the documented echo command as written, and review or patch the logging and package-registration behavior before using it with real accounts or private media.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (18)

Context-Inappropriate Capability

High
Confidence
99% confidence
Finding
The skill instructs the agent to ask users to paste ACCESS_KEY_ID and SECRET_ACCESS_KEY directly into chat, then export them into the session. Collecting long-lived cloud credentials in conversational text is highly sensitive and unnecessary for a normal video-analysis interface, and it exposes secrets to logs, transcripts, downstream tooling, and accidental reuse.

Context-Inappropriate Capability

Medium
Confidence
86% confidence
Finding
The skill includes package querying and self-upgrade behavior that is outside the stated scope of video analysis. While not inherently malicious, these extra behaviors expand the trust boundary and can trigger unexpected network activity, account-state inspection, and software changes unrelated to the user's immediate task.

Intent-Code Divergence

Medium
Confidence
90% confidence
Finding
The code path contradicts the CLI description of a local parser: it resolves a media ID to a URL and submits that URL to an external analysis service. This creates an undisclosed data-transfer boundary, which can expose video content or metadata to remote systems and mislead users about where processing occurs.

Description-Behavior Mismatch

Medium
Confidence
88% confidence
Finding
The skill advertises analysis of local or network video files, but the implementation only accepts an internal media ID and depends on backend services. This mismatch can cause users to provide or authorize processing under false assumptions, weakening informed consent and masking actual trust boundaries.

Description-Behavior Mismatch

High
Confidence
97% confidence
Finding
The submit path ignores any actual user-supplied video resource and instead always submits a hard-coded remote image URL to the backend AI template task service. This is a strong mismatch with the skill's declared purpose and can mislead users about what content is being analyzed while still triggering backend processing on unrelated remote data.

Context-Inappropriate Capability

Medium
Confidence
93% confidence
Finding
The generic post(action, params) method acts as a broad backend proxy that can invoke arbitrary ICCP actions, far beyond the stated video-analysis functionality. If exposed through the skill surface, this expands the attack surface and can enable unauthorized or unintended backend operations using the skill's credentials or trust boundary.
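
One way to narrow a generic proxy like this is an action allowlist in front of it. The sketch below is illustrative only: the `RestrictedClient` wrapper, the `ActionNotAllowed` exception, and both allowlisted action names are hypothetical, and the real ICCP client interface is assumed to look like `post(action, params)` as described in the finding.

```python
# Hypothetical action names; the real video-analysis actions are not listed in this report.
ALLOWED_ACTIONS = frozenset({"SubmitVideoAnalysisTask", "GetVideoAnalysisResult"})


class ActionNotAllowed(Exception):
    """Raised when a caller requests an action outside the allowlist."""


class RestrictedClient:
    """Wraps a generic post(action, params) callable with an allowlist check."""

    def __init__(self, backend_post, allowed=ALLOWED_ACTIONS):
        self._post = backend_post
        self._allowed = frozenset(allowed)

    def post(self, action, params):
        # Reject anything that is not explicitly part of the declared scope.
        if action not in self._allowed:
            raise ActionNotAllowed(
                f"action {action!r} is outside the video-analysis scope"
            )
        return self._post(action, params)
```

With a wrapper of this kind, a call such as `post("RegisterArkClawCombo", {})` would fail before any request leaves the process.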

Context-Inappropriate Capability

Medium
Confidence
95% confidence
Finding
The code enumerates IAM users via ListUsers and selects an admin or first available user ID, which is unrelated to the stated purpose of video parsing. In a media-analysis skill, this expands privilege scope and can cause uploads or media operations to run under an unintended principal, increasing the blast radius if the skill is abused or misconfigured.

Description-Behavior Mismatch

High
Confidence
98% confidence
Finding
The file’s implemented behavior does not match the declared video-analysis purpose of the skill. Instead, it instantiates an ICCP service and invokes a package registration endpoint ("RegisterArkClawCombo"), which indicates hidden functionality unrelated to the advertised capability and could trigger unauthorized account/service actions when the skill is run.

Intent-Code Divergence

Medium
Confidence
96% confidence
Finding
The inline documentation explicitly states the command queries/registers a free Ark Claw package, directly contradicting the skill’s stated video-analysis purpose. This kind of deceptive mismatch is dangerous because it masks the real behavior of the skill, undermines user trust, and can facilitate unauthorized service enrollment or unexpected outbound actions under a benign-looking package.

Description-Behavior Mismatch

Medium
Confidence
90% confidence
Finding
This script uploads a user-supplied local video file to an external media service via `SimpleMediaService.add_media`, which expands the skill from passive local analysis into data exfiltration/remote transfer behavior. In a skill advertised primarily for video parsing, metadata extraction, and scene extraction, undisclosed upload functionality can leak sensitive local media and violates least surprise and least privilege.

Intent-Code Divergence

Low
Confidence
82% confidence
Finding
The tool describes itself as a local video upload utility, while the manifest frames the skill as analysis/parsing. This documentation mismatch is security-relevant because users may authorize the skill expecting local inspection, not outbound transfer of files, increasing the chance of unintended disclosure of private videos.

Missing User Warnings

High
Confidence
98% confidence
Finding
The documentation asks users to send AK/SK credentials in chat without prominent risk warnings or safe-handling controls. In context, this is especially dangerous because the same skill also instructs exporting and using those credentials in commands, making accidental disclosure and credential compromise much more likely.

Missing User Warnings

Medium
Confidence
93% confidence
Finding
Media-derived data is submitted over the network without any user-facing warning, confirmation, or disclosure in the main execution path. In a video-analysis skill, this is more sensitive because videos may contain private, proprietary, or regulated content, so silent remote processing increases privacy and compliance risk.
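
A confirmation gate before the network call is one possible mitigation. The following is a minimal sketch, not the skill's code: `confirm_remote_processing` and `submit_with_consent` are hypothetical helpers, and `submit` stands in for whatever function actually sends the media to the backend.

```python
def confirm_remote_processing(resource, destination, ask=input):
    """Ask for explicit consent before media leaves the machine."""
    prompt = (
        f"Send {resource!r} to {destination} for remote analysis? "
        "Type 'yes' to continue: "
    )
    return ask(prompt).strip().lower() == "yes"


def submit_with_consent(resource, destination, submit, ask=input):
    """Call submit(resource) only after the user has agreed."""
    if not confirm_remote_processing(resource, destination, ask):
        raise PermissionError("remote processing declined by user")
    return submit(resource)
```

Passing `ask` as a parameter keeps the prompt testable and lets an agent runtime substitute its own approval dialog.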

Missing User Warnings

High
Confidence
98% confidence
Finding
The client logs full request headers and bodies before sending requests, which includes Authorization credentials and potentially sensitive video-analysis payload data. If logs are accessible to operators, other services, or attackers through log aggregation or crash reports, this can lead to credential theft, replay of authenticated requests, and exposure of user data.

Missing User Warnings

High
Confidence
99% confidence
Finding
The request logger records full headers and request bodies before sending HTTP requests, which includes Authorization tokens and potentially sensitive upload metadata. Anyone with access to logs could recover bearer credentials or signed request details and reuse them to access backend services.
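
Both logging findings could be addressed by redacting sensitive headers and logging only the body size. The sketch below assumes a Python client using the standard `logging` module; the header names in `SENSITIVE_HEADERS` other than `authorization` are guesses, and `describe_request` and `log_request` are hypothetical helpers rather than functions from the skill.

```python
import logging

# Header names are compared case-insensitively; entries besides
# "authorization" are assumptions about what the client might send.
SENSITIVE_HEADERS = {"authorization", "x-security-token", "cookie"}

logger = logging.getLogger("skill.http")


def redact_headers(headers):
    """Return a copy of headers with sensitive values replaced."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


def describe_request(method, url, headers, body=None):
    """Build a log-safe summary: redacted headers and body size only."""
    size = len(body) if body is not None else 0
    return f"{method} {url} headers={redact_headers(headers)} body_bytes={size}"


def log_request(method, url, headers, body=None):
    """Log the safe summary at DEBUG level instead of the raw request."""
    logger.debug(describe_request(method, url, headers, body))
```

This does not cover secrets embedded in the URL itself, such as signed query parameters, which would need separate handling.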

Ssd 3

High
Confidence
99% confidence
Finding
The skill explicitly directs the agent to collect cloud secrets in chat and reuse them in session commands. This creates a direct secret-handling vulnerability: credentials can be exposed in chat history, command logs, error output, analytics, or copied into subsequent commands beyond the user's awareness.

Ssd 3

High
Confidence
100% confidence
Finding
The authentication check command echoes ARK_SKILL_API_KEY and SECRET_ACCESS_KEY values directly to session output. Printing secrets to terminal/chat output is a severe credential disclosure issue because those values may be stored in logs, transcripts, screenshots, or monitoring systems and can be immediately abused if exposed.
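
An authentication check can report whether each variable is present without ever printing its value. This is a minimal sketch under assumptions: the variable names come from the findings in this report, but which of them the skill actually requires is a guess, and `credential_status` is a hypothetical helper.

```python
import os

# Names taken from this report; treating all three as required is an assumption.
REQUIRED_VARS = ("ARK_SKILL_API_KEY", "ACCESS_KEY_ID", "SECRET_ACCESS_KEY")


def credential_status(env=None):
    """Map each variable name to 'set' or 'missing' without exposing values."""
    env = os.environ if env is None else env
    # An empty string counts as missing.
    return {name: "set" if env.get(name) else "missing" for name in REQUIRED_VARS}


if __name__ == "__main__":
    for name, state in credential_status().items():
        print(f"{name}: {state}")
```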

Ssd 3

Medium
Confidence
90% confidence
Finding
The workflow requires persisting full analysis outputs and returning complete raw JSON, which may contain extracted sensitive information from the processed video or metadata. Because this is a media-analysis skill operating on user-provided content, bulk persistence and verbatim disclosure increase the chance of overexposing personal, confidential, or regulated data beyond what the user actually needs.
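
Returning only the fields the user asked for is one way to limit this exposure. The sketch below is illustrative: `select_fields` and `render_summary` are hypothetical helpers, and the field names in the usage note are invented, since the report does not describe the output schema.

```python
import json


def select_fields(result, wanted):
    """Keep only the requested top-level keys that exist in the result."""
    return {key: result[key] for key in wanted if key in result}


def render_summary(raw_json, wanted):
    """Parse the raw analysis JSON and return a trimmed JSON string."""
    trimmed = select_fields(json.loads(raw_json), wanted)
    return json.dumps(trimmed, ensure_ascii=False, sort_keys=True)
```

For example, calling `render_summary(raw, ["duration", "scenes"])` on a result that also contains a transcript would omit the transcript from what is shown to the user.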

VirusTotal

60/60 vendors reported this skill as clean.
