Pet Vocalization Emotion Analysis Skill (宠物叫声情绪解析技能)

Security checks across malware telemetry and agentic risk

Overview

This skill appears to provide pet sound analysis, but it also performs account, token, report-history, and broader media/device-related operations that are not clearly scoped or disclosed.

Install only if you are comfortable sending pet recordings or videos, a user identifier, and report history to the publisher's cloud service, and with the skill creating local account and token storage. Review is warranted until the publisher narrows the package to pet vocal analysis and either removes or clearly documents account registration, token persistence, camera/device helpers, and report-retention behavior.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (25)

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The skill’s stated purpose is pet vocal emotion analysis, but the workflow expands into cloud-backed history report retrieval and report management. That scope expansion increases data processing and retention surface beyond user expectations, creating privacy and consent risk because users may not realize their analysis activity is being queried and managed as persistent remote records.

Description-Behavior Mismatch

Low

Confidence: 90%
Finding: The documentation states that uploaded attachments/audio/video are automatically saved as local files, but this behavior is not disclosed in the skill’s top-level description. Undisclosed local persistence can expose sensitive media to other local processes, backups, or later misuse, especially when users expect transient processing only.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The skill claims original video data is deleted immediately after cloud analysis, but elsewhere says uploaded media is automatically saved locally. This inconsistency is dangerous because it can mislead users about actual retention, undermining informed consent and potentially causing sensitive pet/home recordings to remain stored on disk contrary to the privacy statement.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The documented API performs human video face/health analysis, which materially conflicts with the skill's declared purpose of pet vocal emotion analysis. This kind of capability mismatch is dangerous because it can cause the agent to collect and transmit sensitive user video or health-related data under a misleading pet-audio pretense, creating privacy, consent, and data-misuse risks.

Intent-Code Divergence

High

Confidence: 99%
Finding: The request schema explicitly accepts uploaded videos or public video URLs, directly contradicting the stated pet-voice use case. In this skill context, that contradiction increases risk because users may reasonably believe they are sharing innocuous pet audio while the system is actually designed to ingest richer, potentially identifying video data.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: This file exposes generic CRUD-style record management methods (`page`, `list`, `add`, `edit`, `delete`) that are not clearly tied to the stated pet vocal emotion analysis purpose. In a skill that should primarily submit audio for analysis and fetch results, unrelated management endpoints expand the attack surface and may permit unauthorized data or resource manipulation if the wider platform exposes them.

Context-Inappropriate Capability

High

Confidence: 96%
Finding: The `add`, `edit`, and especially `delete(cameraSn)` methods indicate device or resource administration capability centered on `cameraSn`, which is unrelated to pet sound emotion analysis. This mismatch is dangerous because a user or calling component expecting analysis functionality may unknowingly gain the ability to modify or delete camera-associated resources, creating potential for unauthorized device management or service disruption.

Description-Behavior Mismatch

High

Confidence: 95%
Finding: The implementation exposes generic video analysis behavior that does not match the manifest’s stated pet vocal emotion analysis purpose. This mismatch is dangerous because users and platform reviewers may grant access or trust based on the declared function, while the code actually processes broader media inputs and could send unrelated content to backend services without informed consent.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The history-listing feature expands the skill’s scope beyond the declared pet emotion translation function and introduces access to previously processed user data. Undisclosed retention and retrieval capabilities increase privacy risk, especially if users do not expect the skill to maintain and expose historical records tied to their identifier.

Description-Behavior Mismatch

High

Confidence: 94%
Finding: This file exposes a generic API client with broad CRUD and arbitrary HTTP helper methods that are not scoped to the stated pet-vocal-emotion-analysis purpose. Because callers can supply URLs and request data, the skill can be repurposed as a general network proxy or backend action runner, expanding the attack surface well beyond emotion-analysis functionality.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: The http_post/http_put/http_get/http_delete methods accept caller-controlled URLs and forward requests directly, creating unjustified arbitrary network access. If reachable from higher-level skill logic, this can enable SSRF-style access to internal services, exfiltration to attacker-controlled endpoints, or use of the skill as a generic outbound request primitive.
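
One common mitigation for caller-controlled URLs is to validate every outbound URL against a fixed allowlist before any request is made. The following is a minimal sketch of that idea, not the skill's actual code: the `ALLOWED_HOSTS` value and the `is_allowed_url` helper are hypothetical, and the publisher's real API host is not stated in this report.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the report does not name the publisher's API host.
ALLOWED_HOSTS = {"api.example.com"}


def is_allowed_url(url: str) -> bool:
    """Return True only for HTTPS URLs whose host is on the allowlist."""
    parts = urlsplit(url)
    # parts.hostname is the host after any "user@" prefix, lowercased,
    # so "https://api.example.com@evil.test/" resolves to "evil.test".
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

A helper such as `http_post` could call `is_allowed_url` first and refuse to send the request when it returns False, which would stop the helper from reaching internal addresses or arbitrary third-party endpoints.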

Context-Inappropriate Capability

High

Confidence: 95%
Finding: This file implements generic user/account persistence, token storage, and CRUD operations that are unrelated to a pet vocal emotion analysis skill. In a mismatched skill context, hidden user-data management increases the risk of unnecessary collection, retention, and misuse of sensitive identifiers or tokens without clear user expectation or need.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The code derives a workspace path from environment/current directory and creates a local SQLite database under a data folder, despite the declared skill being about pet-audio emotion interpretation. This broadens filesystem interaction and persistence in a way that is not obviously necessary, creating risk of unexpected local data storage, privacy issues, and harder-to-audit behavior.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: This utility file contains unrelated account/login bootstrap logic, token handling, local persistence, and payment-flow messaging inside a generic HTTP helper for a pet emotion analysis skill. That hidden cross-domain behavior creates an unexpected authentication and billing side effect surface, enabling silent user/account actions and secret handling far beyond the stated skill purpose.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The helper can silently call /sys/phoneLogin with register=1 using a username/mobile identifier, meaning it may auto-create or log into an account without explicit user action. In the context of a pet vocal analysis skill, this is a serious overreach because a benign-seeming feature can trigger identity-bound account operations on external services.
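
A possible way to avoid implicit account creation is to make consent and registration explicit parameters of the login step. The sketch below only builds a request payload and makes no network call. The `register` flag comes from the finding above, while the `mobile` field name, the `ConsentRequired` exception, and the `build_login_payload` function are assumptions made for illustration.

```python
class ConsentRequired(Exception):
    """Raised when an account operation is attempted without explicit consent."""


def build_login_payload(mobile: str, user_consented: bool,
                        allow_register: bool = False) -> dict:
    """Build a login payload; refuse to proceed without explicit consent.

    The "mobile" key is a hypothetical field name; only "register" is
    mentioned in the scan finding.
    """
    if not user_consented:
        raise ConsentRequired("explicit user consent is required before login")
    # register defaults to 0, so a plain login never creates an account.
    return {"mobile": mobile, "register": 1 if allow_register else 0}
```

With this shape, account creation happens only when the caller passes `allow_register=True` after the user has agreed to it.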

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The code stores tokens and user profile data through a local DAO/database even though the skill's stated function is emotion analysis. Persisting credentials locally increases exposure to token theft, replay, and privacy compromise, especially when the storage behavior is not obvious to users or scoped to a dedicated auth component.

Vague Triggers

Medium

Confidence: 88%
Finding: The default trigger condition is broad enough that ordinary mentions of pet sounds plus an uploaded media file may invoke the skill automatically. Overbroad triggering can cause unintended processing of user media and accidental transmission to external services without sufficiently explicit user intent.

Vague Triggers

Medium

Confidence: 91%
Finding: The history-report keywords are broad and ambiguous, so unrelated requests about reports or prior activity may trigger cloud history retrieval automatically. This can expose historical analysis metadata or report links without clear authorization intent, especially in shared-device or ambiguous conversational contexts.

Missing User Warnings

Medium

Confidence: 93%
Finding: The API documentation allows direct video upload and remote URL analysis but provides no privacy notice, retention policy, consent guidance, or data-handling constraints. Because video may contain people, homes, metadata, or health-related inferences, omission of these disclosures creates a meaningful privacy and compliance risk.

Missing User Warnings

Medium

Confidence: 86%
Finding: The skill reads the entire local file and forwards it to an external analysis API without any visible consent flow, warning, or minimization in this code path. That creates a privacy and data-handling risk because users may believe they are using a narrowly scoped pet-voice feature while arbitrary file contents are uploaded to a backend service.
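
Data minimization for this code path could take the form of a pre-upload check that rejects anything other than a confirmed, reasonably sized audio file. The sketch below illustrates the idea under stated assumptions: the suffix list, the 20 MiB limit, and the `is_acceptable_upload` function are all hypothetical and do not come from the skill.

```python
import os

# Assumed values for illustration; the skill does not document any limits.
ALLOWED_SUFFIXES = {".wav", ".mp3", ".m4a", ".ogg"}
MAX_BYTES = 20 * 1024 * 1024  # 20 MiB


def is_acceptable_upload(filename: str, size_bytes: int,
                         user_confirmed: bool) -> bool:
    """Allow an upload only for confirmed, non-empty, size-limited audio files."""
    suffix = os.path.splitext(filename)[1].lower()
    return (
        user_confirmed
        and suffix in ALLOWED_SUFFIXES
        and 0 < size_bytes <= MAX_BYTES
    )
```

A check like this would run before the file is read, so unrelated documents or oversized videos never leave the machine.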

Missing User Warnings

Low

Confidence: 72%
Finding: The code forwards user-supplied remote URLs to the analysis service with no visible disclosure about external fetching or downstream processing. While the backend service likely performs the actual retrieval, this still creates a transparency and privacy issue because users are not informed that the service will access the specified remote resource.

Missing User Warnings

Medium

Confidence: 93%
Finding: The script requires an open_id/user identifier and uses it in network-backed operations without any privacy notice, minimization, or visible consent flow. This is dangerous because identifiers such as username or phone number can be sensitive personal data and may enable tracking, correlation of user activity, or unauthorized exposure if backend handling is weak.

Missing User Warnings

Medium

Confidence: 86%
Finding: The constructor automatically creates tables and performs ALTER TABLE on initialization, causing persistent schema changes without any explicit user/admin action. In a skill whose purpose does not justify database administration, silent filesystem and schema modification is risky because it can alter local state unexpectedly and complicate trust and recovery.
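
The usual alternative to schema changes in a constructor is an explicit migration step that the caller must invoke. The following sketch shows that separation using Python's standard `sqlite3` module. The `ReportStore` class, the `reports` table, and the `table_names` helper are hypothetical names, not taken from the skill.

```python
import sqlite3


class ReportStore:
    """Hypothetical store whose constructor performs no schema changes."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        # Only keep the connection; no CREATE TABLE or ALTER TABLE here.
        self.conn = conn

    def migrate(self) -> None:
        """Create the schema. Callers must invoke this explicitly."""
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS reports "
            "(id INTEGER PRIMARY KEY, summary TEXT)"
        )
        self.conn.commit()


def table_names(conn: sqlite3.Connection) -> set:
    """Return the names of all tables currently in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return {row[0] for row in rows}
```

Constructing `ReportStore` leaves the database untouched, so local state changes only when `migrate()` is called deliberately, for example after the user has been told that local storage will be created.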

Missing User Warnings

Medium

Confidence: 93%
Finding: The helper transmits identifiers and authentication headers such as X-Access-Token, X-Api-Key, Authorization, and pnaUserName to remote services without any visible disclosure or consent mechanism in this file. For a pet emotion analysis skill, silent transfer of identity and auth data is privacy-invasive and broadens the blast radius if the endpoint, logs, or surrounding code are compromised.
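
One way to limit the log-exposure part of this risk is to redact sensitive headers before they are logged or echoed anywhere. The sketch below uses the header names listed in the finding. The `redact_headers` function and the `[REDACTED]` placeholder are assumptions, and redaction does not address the missing disclosure itself.

```python
# Header names taken from the finding, compared case-insensitively.
SENSITIVE_HEADERS = {"x-access-token", "x-api-key", "authorization", "pnausername"}


def redact_headers(headers: dict) -> dict:
    """Return a copy of headers that is safe to log, with secrets masked."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }
```

The original dictionary is left unchanged, so the request itself can still carry the real values while diagnostics see only the masked copy.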

Missing User Warnings

Medium

Confidence: 95%
Finding: The code can create or update local user records and save tokens with no visible user warning, making credential storage an implicit side effect of normal skill use. Hidden local persistence is dangerous because it creates long-lived sensitive artifacts that may be accessed by other components, leaked in backups, or mishandled in later code changes.

VirusTotal

60/60 vendors flagged this skill as clean.
