
Security audit

LrshuAI Sfx Generation

Security checks across malware telemetry and agentic risk

Overview

This sound-effect skill can run broader remote model calls than it advertises, including using an API key and optionally uploading local media files.

Install only after reviewing the Python helper. Use a narrowly scoped TEAM_API_KEY, verify TEAM_BASE_URL before running, and avoid passing private image or video paths. Treat this as a Review item because the code can do more than the SFX-only description says, not because there is evidence of automatic exfiltration or destructive behavior.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Findings (8)

Tainted flow: 'endpoint' from os.getenv (line 39, credential/environment) → requests.post (network output)

Critical
Category
Data Flow
Content
print(f"Invoking model: {args.model} ...")
    try:
        response = requests.post(endpoint, headers=headers, json=payload)
        response.raise_for_status()
        result = response.json()
Confidence
90% confidence
Finding
response = requests.post(endpoint, headers=headers, json=payload)
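One way to narrow this flow is to validate the environment-supplied base URL before any request is built from it. The sketch below is illustrative only: the `ALLOWED_HOSTS` set, the `api.example.com` host, and the `resolve_base_url` helper are assumptions, not part of the audited script.

```python
import os
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the host your team actually trusts.
ALLOWED_HOSTS = {"api.example.com"}


def resolve_base_url(env=os.environ):
    """Return TEAM_BASE_URL only if it uses HTTPS and its host is allowlisted."""
    raw = env.get("TEAM_BASE_URL", "")
    parsed = urlparse(raw)
    if parsed.scheme != "https":
        raise ValueError(f"TEAM_BASE_URL must use https, got: {raw!r}")
    if parsed.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"TEAM_BASE_URL host not allowed: {parsed.hostname!r}")
    # Strip trailing slashes so callers can append a path consistently.
    return raw.rstrip("/")
```

Calling such a check once at startup, before the API key is attached to any header, would make a misconfigured or attacker-supplied TEAM_BASE_URL fail loudly instead of receiving the credential.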

Tainted flow: 'poll_endpoint' from os.getenv (line 132, credential/environment) → requests.get (network output)

Critical
Category
Data Flow
Content
while True:
            time.sleep(3)  # poll every 3 seconds
            poll_resp = requests.get(poll_endpoint, headers=headers)
            poll_resp.raise_for_status()
            poll_data = poll_resp.json()
Confidence
88% confidence
Finding
poll_resp = requests.get(poll_endpoint, headers=headers)
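The excerpt polls in an unbounded `while True` loop. A bounded variant might look like the sketch below. The `status` field and its terminal values are assumptions, since the report does not show the response schema, and `poll_until_done` is a hypothetical helper name.

```python
import time


def poll_until_done(fetch_status, interval=3.0, timeout=300.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it reports a terminal state or the deadline passes."""
    deadline = clock() + timeout
    while True:
        data = fetch_status()
        # Assumed schema: a "status" key with these terminal values.
        if data.get("status") in {"succeeded", "failed"}:
            return data
        # Stop before sleeping if the next attempt would exceed the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"job not finished after {timeout} seconds")
        sleep(interval)
```

Here `fetch_status` would wrap the existing `requests.get(poll_endpoint, headers=headers)` call, ideally with a per-request `timeout=` argument so that a stalled connection cannot hang the loop either.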

Lp3

Medium
Category
MCP Least Privilege
Confidence
92% confidence
Finding
The skill declares only runtime requirements but no explicit permissions, while its documented/runtime behavior implies access to environment variables and outbound network resources. That mismatch weakens policy enforcement and user understanding, making it easier for the skill to exfiltrate secrets such as TEAM_API_KEY or send data to remote services without clear authorization boundaries.

Tp4

High
Category
MCP Tool Poisoning
Confidence
97% confidence
Finding
The skill is presented as a text-to-sound generator, but the underlying behavior reportedly supports arbitrary model IDs, image/video inputs, asynchronous remote jobs, and a configurable base URL. This broad hidden capability materially expands the attack surface: users or upstream prompts may cause unrelated multimodal processing, local file ingestion, or data transmission to untrusted/custom endpoints under the cover of a narrow audio-only description.

Context-Inappropriate Capability

Medium
Confidence
90% confidence
Finding
The documentation instructs the agent to execute a system-level Python command directly instead of using a more constrained skill runner. Direct shell/process execution is broader than necessary for the stated task and can bypass sandboxing, policy checks, auditing, or argument validation that a managed invocation path would normally provide.
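Where no managed skill runner is available, a thin wrapper can at least narrow what a direct invocation exposes. The following is a minimal sketch under stated assumptions: `minimal_env` and `run_helper` are hypothetical names, and the helper's own command-line arguments are not shown in this report, so they are passed through as an opaque list.

```python
import os
import subprocess
import sys

# Only these variables are forwarded to the helper; everything else is dropped.
FORWARDED_ENV = ("TEAM_API_KEY", "TEAM_BASE_URL")


def minimal_env(source=os.environ, keep=FORWARDED_ENV):
    """Copy only the named variables that are actually set."""
    return {k: source[k] for k in keep if k in source}


def run_helper(argv, timeout=600):
    """Run a Python helper without a shell, with a reduced environment and a timeout."""
    return subprocess.run(
        [sys.executable, *argv],  # argument list, so no shell interpolation
        env=minimal_env(),
        timeout=timeout,
        check=True,
        capture_output=True,
        text=True,
    )
```

Passing an argument list rather than a shell string avoids shell interpolation of the prompt text. Some platforms may need additional variables (for example PATH or SYSTEMROOT) added to the forwarded set.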

Description-Behavior Mismatch

High
Confidence
84% confidence
Finding
The skill is presented as sound-effect generation, but the implementation is a broad multimodal model invoker that accepts arbitrary model IDs and image/video inputs. This capability mismatch increases risk because users and reviewers may authorize the skill expecting narrow audio generation while it can transmit and process much broader content classes.
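A simple way to bring behavior back in line with the description is to reject any model ID that is not explicitly approved for sound-effect generation. The model name below is a placeholder, because the report does not name the real SFX model, and `require_allowed_model` is a hypothetical helper.

```python
# Hypothetical model ID; substitute the actual sound-effect model(s) you intend to allow.
ALLOWED_MODELS = frozenset({"sfx-model-v1"})


def require_allowed_model(model_id, allowed=ALLOWED_MODELS):
    """Return model_id unchanged if it is allowlisted, otherwise raise ValueError."""
    if model_id not in allowed:
        raise ValueError(f"model {model_id!r} is not approved for this skill")
    return model_id
```

Applied to the parsed `args.model` value before the request payload is built, a check like this would turn the arbitrary-model capability into an explicit, reviewable list.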

Context-Inappropriate Capability

High
Confidence
86% confidence
Finding
Image-to-video and generic multimodal execution are not justified by the stated sound-effect purpose, yet the script can read local image/video files, base64-encode them, and send them to a remote API. In this skill context, the extra capabilities materially increase the chance of unintended sensitive file disclosure and policy bypass through overbroad functionality.
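If local media input must remain available, confining it to one explicitly approved directory limits what can be read and uploaded. The sketch below assumes a caller-chosen `allowed_dir`; `check_media_path` is a hypothetical helper and not part of the audited script.

```python
from pathlib import Path


def check_media_path(path, allowed_dir):
    """Return the resolved path if it is a regular file inside allowed_dir, else raise."""
    root = Path(allowed_dir).resolve()
    candidate = Path(path).resolve()  # resolves symlinks and ".." segments
    try:
        candidate.relative_to(root)
    except ValueError:
        raise PermissionError(f"{candidate} is outside {root}") from None
    if not candidate.is_file():
        raise FileNotFoundError(f"{candidate} is not a regular file")
    return candidate
```

Resolving both paths before comparing them means that `..` segments and symlinks pointing outside the approved directory are rejected before any file is opened or base64-encoded.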

Missing User Warnings

Medium
Confidence
93% confidence
Finding
The script transmits prompts and optional local image/video content to a remote API, but there is no explicit warning, consent prompt, or visible data handling notice before upload. In a skill framed as simple sound-effect generation, that omission makes accidental disclosure of sensitive prompts or local media more likely.
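A consent step before upload could be as small as the sketch below, which lists the destination and any local files and proceeds only on an explicit "yes". The function name and wording are illustrative assumptions.

```python
def confirm_upload(destination, prompt_text, media_paths, ask=input):
    """Show what would be sent and return True only on an explicit 'y' or 'yes'."""
    print(f"About to send data to: {destination}")
    print(f"  prompt: {len(prompt_text)} characters")
    for path in media_paths:
        print(f"  local file: {path}")
    answer = ask("Proceed with upload? [y/N] ")
    # Anything other than an explicit yes, including an empty reply, declines.
    return answer.strip().lower() in {"y", "yes"}
```

Defaulting to "no" keeps unattended or agent-driven runs from uploading silently. A non-interactive deployment would need a different mechanism, such as an explicit opt-in flag.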

VirusTotal

64/64 vendors flagged this skill as clean.
