Security audit

Pet Vocal Emotion Analysis Skill | 宠物叫声情绪解析技能

Security checks across malware telemetry and agentic risk

Overview

This skill appears to be a cloud media-analysis wrapper, but it asks for broad media upload, automatic identity handling, local token storage, and history access that are not tightly scoped to pet vocal analysis.

Install only if you are comfortable with pet media, file paths or URLs, and linked report history being sent to the publisher’s cloud service. Treat it as a Review item: check whether automatic identity creation, local token storage, and cloud history retrieval match your privacy expectations before use.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (32)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 93%
Finding: The skill documentation directs use of shell commands, local file handling, network access, and implicit user association, but the manifest declares no permissions. This creates a transparency and policy-enforcement gap: a host may grant or users may assume less access than the skill actually expects, increasing the risk of unintended file, network, or environment exposure.

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The documentation broadens the feature from pet vocal emotion analysis into general image/video analysis and cloud report retrieval. That scope expansion can cause the agent to process unrelated media or perform extra network actions beyond the user’s reasonable expectation, enabling overcollection and off-purpose data handling.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: Allowing images, arbitrary local files, and network URLs exceeds the narrow purpose of analyzing pet vocalizations. Broad input types increase the chance of unauthorized file access, fetching remote content, or processing non-audio data without strong necessity tied to the stated task.
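One way to narrow the accepted inputs to the stated purpose is an extension check that runs before any file is read or uploaded. The sketch below is illustrative only: the function name `is_pet_audio_candidate` and the suffix set are assumptions, not part of the audited skill, and an extension check would not replace inspecting the actual file content.

```python
from pathlib import PurePath

# Hypothetical set of audio extensions; the audited skill defines no such list.
AUDIO_SUFFIXES = frozenset({".wav", ".mp3", ".m4a", ".ogg", ".flac"})


def is_pet_audio_candidate(name: str) -> bool:
    """Return True only for local file names that carry an audio extension."""
    # Refuse anything that looks like a URL; remote fetching is out of scope here.
    if "://" in name:
        return False
    return PurePath(name).suffix.lower() in AUDIO_SUFFIXES
```

Under these assumptions, images, videos, extensionless paths, and URLs would be rejected before they reach the upload step.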

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The history-report cloud query feature adds persistent/reporting functionality not described in the core manifest purpose. This can expose prior user data or trigger network retrieval of historical records when the user expects only one-off analysis, creating privacy and scope-creep risks.

Context-Inappropriate Capability

Low

Confidence: 86%
Finding: Automatically saving uploaded attachments to local storage is not clearly necessary for the stated purpose and increases exposure of user-provided media on disk. Even if temporary, local persistence can leak sensitive content through other processes, backups, or misconfigured cleanup.

Intent-Code Divergence

Medium

Confidence: 88%
Finding: The skill says history must not be read from local memory, yet it also states the system will reuse or create a local default user for report association. This contradiction suggests hidden local state and identity linkage, which can lead to cross-session confusion, unintended account mixing, or privacy violations.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The script includes a hidden identity-resolution flow and a listing mode tied to an internal open_id, even though the stated purpose is only pet vocal emotion analysis. This creates an unnecessary access path to user-scoped analysis history and identity context, which broadens the attack surface and could expose another user's records if identity resolution or authorization is weak in shared environments.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The manifest describes a narrowly scoped pet voiceprint skill for recognizing cat and dog sounds and translating them into emotions and behavioral intentions. In contrast, this file accepts arbitrary local or remote video input, submits it for generic 'analysis', polls for report generation, and lists historical analysis reports without any pet-, audio-, cat-, dog-, or emotion-specific logic.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The skill accepts arbitrary http/https URLs and forwards them to the backend as videoUrl without allowlisting, scheme tightening beyond basic prefix checks, or user-facing constraints tied to the pet-audio use case. If the downstream service fetches the URL server-side, this can enable SSRF-style access to internal resources, cloud metadata endpoints, or unexpected third-party content retrieval.
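The kind of allowlisting this finding says is missing could look roughly like the following. This is a minimal sketch, not the skill's code: `ALLOWED_HOSTS`, `media.example.com`, and `is_allowed_media_url` are hypothetical names. It checks only the URL string and does not resolve DNS, so a server-side fetcher would still need its own protections against hostnames that resolve to internal addresses.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would choose its own hosts.
ALLOWED_HOSTS = frozenset({"media.example.com"})


def is_allowed_media_url(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Accept only https URLs whose host is on the allowlist."""
    try:
        parts = urlsplit(url)
        host = parts.hostname  # lowercased by urllib; None when absent
    except ValueError:
        return False  # malformed URL, e.g. a broken IPv6 literal
    if parts.scheme != "https" or not host:
        return False
    try:
        ipaddress.ip_address(host)
        return False  # reject every IP literal, including metadata endpoints
    except ValueError:
        pass  # not an IP literal, continue with the hostname check
    return host in allowed_hosts
```

Rejecting IP literals outright and matching the parsed hostname, rather than a string prefix, avoids tricks such as embedding an allowed name in the userinfo part of the URL.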

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The code implements video analysis and history listing while the manifest claims pet vocal emotion analysis. This kind of capability mismatch is dangerous because it can mislead users and reviewers about what data the skill actually processes, enabling collection or transmission of unrelated and potentially more sensitive media than expected.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The CLI arguments and help text explicitly describe video input, video URLs, and video history, directly contradicting the skill's pet sound analysis description. In context, this increases risk because users may invoke a skill under false assumptions and expose video files or URLs to a backend that they did not expect to use.

Description-Behavior Mismatch

High

Confidence: 96%
Finding: This file exposes a generic network-capable API wrapper with CRUD-style methods, arbitrary URL handling, and pagination helpers that are not scoped to pet vocal emotion analysis. In a skill whose stated purpose is audio/emotion inference for pets, this creates unnecessary capability for broad outbound API access and data manipulation, increasing the risk of hidden data exfiltration, unauthorized service interaction, or repurposing the skill as a generic HTTP client.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The get_user_by_username capability introduces user-account lookup functionality unrelated to pet vocal emotion analysis. Even if not directly exploitable in this file alone, such functionality can enable user enumeration, privacy violations, or pivoting into broader account-centric operations when combined with other components.

Description-Behavior Mismatch

High

Confidence: 92%
Finding: This file implements generic user-account persistence, including identity and account lookup behavior, which does not align with a pet vocal emotion analysis skill. Capability mismatch is a supply-chain red flag because unrelated account-management code expands the attack surface and may enable hidden data collection or user tracking unrelated to the stated function.

Context-Inappropriate Capability

High

Confidence: 95%
Finding: The model stores sensitive identity and authentication-related fields such as username, realname, email, token, and open_token without any demonstrated need for pet-audio emotion recognition. In this skill context, collecting and persisting such data is especially suspicious because it enables unnecessary credential/token retention and privacy exposure if the local database is accessed or exfiltrated.

Description-Behavior Mismatch

High

Confidence: 99%
Finding: This utility file contains open-id resolution, workspace identity harvesting, fallback user creation, and persistent account initialization behavior that is unrelated to pet vocal emotion analysis. That hidden identity/account management expands the skill's privileges and can silently bind a user to backend services without clear consent, making the skill significantly more dangerous in context.

Description-Behavior Mismatch

High

Confidence: 99%
Finding: The generic HTTP helper automatically performs authentication bootstrapping, token reuse, local token caching, user lookup, and remote registration/login before normal requests. For a pet emotion analysis skill, embedding these behaviors in a shared request wrapper creates covert side effects and unauthorized data flows far beyond the declared feature set.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The code derives a workspace path from an environment variable and reads data/smyx-api-key.txt to obtain an internal identity value. Accessing workspace-resident identity material without an obvious user-facing reason or consent can expose internal identifiers and enable unintended impersonation or backend account linkage.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: When no identity is present, the code generates a synthetic open-id, creates a local user record, and persists it for future reuse. This establishes durable account state and can enable silent backend activity under a fabricated or fallback identity, which is especially risky because it is unrelated to the advertised skill purpose.

Vague Triggers

Medium

Confidence: 84%
Finding: The trigger phrases for history-report access are broad enough to match ordinary conversational requests, making unintended cloud queries more likely. In context, this is risky because it can retrieve prior reports or linked account data without a sufficiently explicit, informed user action.

Missing User Warnings

Medium

Confidence: 90%
Finding: The skill instructs automatic local file saving but does not pair that behavior with a clear, prominent user-facing notice at the point of collection. This undermines informed consent around storage and handling of uploaded media, especially where recordings may contain sensitive ambient audio.

Missing User Warnings

Medium

Confidence: 82%
Finding: The code reads the entire local file into memory and sends it to the analysis API with no visible warning, confirmation, or minimization controls in this file. In a skill context, this creates a real data-exposure risk because users may provide sensitive local media that is silently uploaded to an external service.
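A warning, confirmation, and minimization step of the kind this finding describes as absent might be sketched as below. Everything named here is an assumption for illustration: `read_for_upload`, the `confirm` callback, and the 20 MiB cap do not come from the audited code. The size is checked before the file is read, and nothing is returned for upload unless the caller's confirmation callback agrees.

```python
from pathlib import Path
from typing import Callable

# Hypothetical cap; the audited skill shows no size limit.
MAX_UPLOAD_BYTES = 20 * 1024 * 1024


def read_for_upload(
    path: Path,
    confirm: Callable[[str], bool],
    max_bytes: int = MAX_UPLOAD_BYTES,
) -> bytes:
    """Read a file for upload only after a size check and explicit consent."""
    size = path.stat().st_size  # checked before any content is loaded
    if size > max_bytes:
        raise ValueError(f"{path.name} is {size} bytes, over the {max_bytes} byte limit")
    prompt = f"Upload {path.name} ({size} bytes) to the external analysis service?"
    if not confirm(prompt):
        raise PermissionError("user declined the upload")
    return path.read_bytes()
```

In an interactive wrapper, `confirm` could be as simple as a function that prints the prompt and returns True only for an explicit "yes".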

Missing User Warnings

Medium

Confidence: 89%
Finding: The script accepts a hidden API key parameter and passes API-related inputs into an external-analysis workflow without user-facing notice about credential handling. Hidden credential-bearing options reduce transparency and can cause users or wrappers to supply secrets without understanding storage, logging, or transmission risks.

Missing User Warnings

Medium

Confidence: 91%
Finding: The analysis function forwards a local path or remote URL into an API-backed workflow via skill.get_output_analysis without clear disclosure that user-supplied media or references may be transmitted externally. Given the manifest mismatch, this is more dangerous because users expecting pet-audio analysis may unknowingly expose unrelated local or remote content to a backend service.

Missing User Warnings

Low

Confidence: 67%
Finding: The save routine overwrites an arbitrary path with no validation, backup, or error reporting, and all exceptions are silently suppressed. If an attacker can influence the path argument elsewhere in the application, this pattern can enable destructive file overwrite or tampering while concealing failures from operators.
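For contrast with the pattern described in this finding, a save routine can confine writes to one directory, refuse to overwrite, and let errors surface. The sketch below is hypothetical (`save_text` and its parameters are not taken from the audited skill) and assumes the caller supplies a trusted base directory.

```python
from pathlib import Path


def save_text(base_dir: Path, relative_name: str, content: str) -> Path:
    """Write content under base_dir, refusing path escapes and overwrites."""
    base = base_dir.resolve()
    target = (base / relative_name).resolve()
    try:
        # Raises ValueError when target lies outside base (e.g. "../x" or "/etc/x").
        target.relative_to(base)
    except ValueError:
        raise ValueError(f"refusing to write outside {base}: {relative_name!r}") from None
    # Mode "x" raises FileExistsError instead of silently overwriting.
    with open(target, "x", encoding="utf-8") as handle:
        handle.write(content)
    return target
```

Because no exception is swallowed, a failed or refused write is visible to the caller rather than concealed.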

VirusTotal

VirusTotal findings are pending for this skill version.

Static analysis

Detected: suspicious.install_untrusted_source

The install source points to a URL shortener or a raw IP address.

Warn

Code: suspicious.install_untrusted_source
Location: skills/smyx_common/scripts/config-dev.yaml:2