Security audit

LrshuAI Sfx Generation

Security checks across malware telemetry and agentic risk

Overview

This sound-effect skill can make broader remote model calls than it advertises: it uses an API key and can optionally upload local media files.

Install only after reviewing the Python helper. Use a narrowly scoped TEAM_API_KEY, verify TEAM_BASE_URL before running, and avoid passing private image or video paths. Treat this as a Review item because the code can do more than the SFX-only description says, not because there is evidence of automatic exfiltration or destructive behavior.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (8)

Tainted flow: 'endpoint' from os.getenv (line 39, credential/environment) → requests.post (network output)

Critical

Category: Data Flow
Content:
    print(f"Invoking model: {args.model} ...")
    try:
        response = requests.post(endpoint, headers=headers, json=payload)
        response.raise_for_status()
        result = response.json()
Confidence: 90%
Finding: response = requests.post(endpoint, headers=headers, json=payload)
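One way to narrow this flow is to validate the environment-supplied base URL before any request is built from it. The following is a minimal sketch, not the skill's actual code: TEAM_BASE_URL is the variable named in the Overview, while `ALLOWED_HOSTS`, `api.example.com`, and `validated_base_url` are hypothetical names introduced here for illustration. The audit does not show how the helper derives `endpoint`, so the sketch only covers the base URL check.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the host(s) your team actually trusts.
ALLOWED_HOSTS = frozenset({"api.example.com"})


def validated_base_url(raw: str, allowed_hosts=ALLOWED_HOSTS) -> str:
    """Return the base URL without a trailing slash, or raise ValueError.

    Rejects non-HTTPS schemes, hosts outside the allowlist, and URLs that
    embed credentials in the authority part.
    """
    parsed = urlparse(raw)
    if parsed.scheme != "https":
        raise ValueError(f"refusing non-HTTPS base URL: {raw!r}")
    if parsed.hostname not in allowed_hosts:
        raise ValueError(f"host not on allowlist: {parsed.hostname!r}")
    if parsed.username or parsed.password:
        raise ValueError("base URL must not embed credentials")
    return raw.rstrip("/")


# Possible use inside the helper (assumed, not taken from the audited code):
#   base = validated_base_url(os.getenv("TEAM_BASE_URL", ""))
```

With a check like this, an unset or unexpected TEAM_BASE_URL fails before the API key is attached to any outbound request.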

Tainted flow: 'poll_endpoint' from os.getenv (line 132, credential/environment) → requests.get (network output)

Critical

Category: Data Flow
Content:
    while True:
        time.sleep(3)  # poll every 3 seconds
        poll_resp = requests.get(poll_endpoint, headers=headers)
        poll_resp.raise_for_status()
        poll_data = poll_resp.json()
Confidence: 88%
Finding: poll_resp = requests.get(poll_endpoint, headers=headers)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 92%
Finding: The skill declares only runtime requirements but no explicit permissions, while its documented/runtime behavior implies access to environment variables and outbound network resources. That mismatch weakens policy enforcement and user understanding, making it easier for the skill to exfiltrate secrets such as TEAM_API_KEY or send data to remote services without clear authorization boundaries.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 97%
Finding: The skill is presented as a text-to-sound generator, but the underlying behavior reportedly supports arbitrary model IDs, image/video inputs, asynchronous remote jobs, and a configurable base URL. This broad hidden capability materially expands the attack surface: users or upstream prompts may cause unrelated multimodal processing, local file ingestion, or data transmission to untrusted/custom endpoints under the cover of a narrow audio-only description.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The documentation instructs the agent to execute a system-level Python command directly instead of using a more constrained skill runner. Direct shell/process execution is broader than necessary for the stated task and can bypass sandboxing, policy checks, auditing, or argument validation that a managed invocation path would normally provide.

Description-Behavior Mismatch

High

Confidence: 84%
Finding: The skill is presented as sound-effect generation, but the implementation is a broad multimodal model invoker that accepts arbitrary model IDs and image/video inputs. This capability mismatch increases risk because users and reviewers may authorize the skill expecting narrow audio generation while it can transmit and process much broader content classes.
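A mismatch like this can be reduced by restricting the model argument to a fixed set at the command-line boundary. The sketch below assumes the helper parses arguments with argparse (the flagged code reads `args.model`, which suggests this but does not confirm it). The model IDs, the `--prompt` flag, and `build_parser` are hypothetical placeholders; the audit does not list the real SFX model names.

```python
import argparse

# Hypothetical model IDs; the real SFX model names are not given in this audit.
ALLOWED_MODELS = ("sfx-model-a", "sfx-model-b")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that only accepts allowlisted sound-effect models."""
    parser = argparse.ArgumentParser(
        description="Text-to-sound-effect generation only"
    )
    # `choices` makes argparse reject any model ID outside the allowlist.
    parser.add_argument("--model", choices=ALLOWED_MODELS, default=ALLOWED_MODELS[0])
    parser.add_argument("--prompt", required=True, help="Text description of the sound")
    return parser
```

Because no image or video arguments are defined, a parser like this exits with a usage error if a caller passes them, which keeps the command-line surface aligned with the audio-only description.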

Context-Inappropriate Capability

High

Confidence: 86%
Finding: Image-to-video and generic multimodal execution are not justified by the stated sound-effect purpose, yet the script can read local image/video files, base64-encode them, and send them to a remote API. In this skill context, the extra capabilities materially increase the chance of unintended sensitive file disclosure and policy bypass through overbroad functionality.
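If media input is kept at all, the read-and-encode step can be confined to an explicit directory and a small set of file types. This is a sketch under stated assumptions, not the script's implementation: `encode_media`, `ALLOWED_SUFFIXES`, `MAX_BYTES`, and the `allowed_root` parameter are hypothetical, and `Path.is_relative_to` requires Python 3.9 or later.

```python
import base64
from pathlib import Path

# Hypothetical limits chosen for illustration only.
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".mp4"}
MAX_BYTES = 10 * 1024 * 1024  # 10 MiB


def encode_media(path_str: str, allowed_root: Path) -> str:
    """Base64-encode a media file only if it lives under allowed_root."""
    root = allowed_root.resolve()
    # resolve() collapses ".." segments and symlinks before the containment check.
    path = Path(path_str).resolve()
    if not path.is_relative_to(root):
        raise PermissionError(f"{path} is outside the allowed directory {root}")
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported file type: {path.suffix!r}")
    if not path.is_file():
        raise FileNotFoundError(str(path))
    if path.stat().st_size > MAX_BYTES:
        raise ValueError(f"{path} exceeds {MAX_BYTES} bytes")
    return base64.b64encode(path.read_bytes()).decode("ascii")
```

Resolving the path before the check means a request such as `uploads/../../secret.png` is rejected even though it starts with an allowed directory name.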

Missing User Warnings

Medium

Confidence: 93%
Finding: The script transmits prompts and optional local image/video content to a remote API, but there is no explicit warning, consent prompt, or visible data handling notice before upload. In a skill framed as simple sound-effect generation, that omission makes accidental disclosure of sensitive prompts or local media more likely.
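A visible notice and an explicit confirmation step before upload would address this gap. The sketch below is one possible shape for such a gate; `confirm_upload` and its parameters are hypothetical and do not appear in the audited script. The `ask` parameter defaults to the built-in `input` so the prompt can be replaced in non-interactive runs.

```python
def confirm_upload(endpoint, prompt_text, media_paths, ask=input):
    """Show what will leave the machine and return True only on explicit consent."""
    print(f"The following will be sent to: {endpoint}")
    print(f"  prompt: {prompt_text!r}")
    for path in media_paths:
        print(f"  local file: {path}")
    answer = ask("Proceed with upload? [y/N] ")
    # Anything other than an explicit yes is treated as a refusal.
    return answer.strip().lower() in {"y", "yes"}
```

A caller would invoke this before building the request payload and abort when it returns False, so that an empty or unexpected answer never results in an upload.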

VirusTotal

All 64 of 64 vendors reported this skill as clean.
