Security audit

Audio Broadcast

Security checks across malware telemetry and agentic risk

Overview

The skill is mostly a real broadcast-device integration, but it ships a populated credential file and uses plain HTTP for high-impact device, schedule, and file operations.

Review before installing. Delete the bundled config.json token, log in with your own account, and use this only with a trusted Xiaoboshu server on a trusted LAN/VPN or behind HTTPS. Require explicit confirmation before broadcasting to all devices, changing volume broadly, deleting files/tasks, uploading sensitive audio, sending sensitive TTS text, or enabling the daily cleanup job.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (16)

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The documented alarm-device features allow configuring external reporting endpoints and callbacks that extend beyond ordinary audio broadcast control into network-integrated signaling behavior. In the context of a broadcast skill, exposing these capabilities increases the risk of SSRF-like misuse, unauthorized outbound communications, or covert data exfiltration if the skill surfaces them without strict scope restriction and validation.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The API reference exposes configuration of reporting IPs, callback URLs, and cacheable remote URLs without clear justification for an audio-broadcast skill. That mismatch in capability versus stated purpose materially raises risk because an attacker or over-permissioned agent could repurpose the skill to contact arbitrary hosts or alter device reporting behavior.

Description-Behavior Mismatch

High

Confidence: 97%
Finding: The collection includes user-administration endpoints such as adduser and token-management operations that materially exceed the stated audio-broadcast purpose. This over-scoped capability increases the blast radius of the skill: if exposed through an agent, it could create or rekey accounts and expand access beyond broadcast control.

Description-Behavior Mismatch

High

Confidence: 96%
Finding: The collection exposes alarm-management, callback, and device-behavior configuration functions unrelated to ordinary audio playback. In the context of a broadcast skill, these capabilities could be abused to alter device behavior, register alarms, or redirect device communications without the user expecting such control.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: The skill documents token reset and token retrieval operations requiring usernames and passwords, which are unnecessary for routine broadcast control. An agent with access to these endpoints could rotate credentials or obtain active tokens, enabling broader account compromise and follow-on misuse of all device APIs.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: The device APIs allow arbitrary reporting IPs and callback URLs to be configured on hardware, including externally directed destinations. That creates a path for covert exfiltration, unauthorized outbound connections, or attacker-controlled callbacks under the cover of a benign broadcast integration.
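One possible mitigation for the finding above is to validate any reporting IP or callback URL against an allowlist before it is sent to a device. The sketch below is illustrative only: the function name `is_allowed_callback` and the `ALLOWED_NETWORKS` value are assumptions and are not part of the audited skill.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical allowlist: only hosts on this LAN range may receive callbacks.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.168.10.0/24")]


def is_allowed_callback(url: str, allowed=ALLOWED_NETWORKS) -> bool:
    """Return True only for http(s) URLs whose host is a literal IP in the allowlist."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        addr = ipaddress.ip_address(parts.hostname)
    except ValueError:
        # Hostnames are rejected here because their DNS answers can change later.
        return False
    return any(addr in net for net in allowed)
```

Rejecting hostnames outright is a deliberately strict choice. A deployment that needs named hosts would have to resolve and pin them instead.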

Intent-Code Divergence

Medium

Confidence: 88%
Finding: The documentation describes one endpoint as token retrieval that does not change the token, yet its name and path closely resemble those of the reset-token operation. Misleading semantics around authentication endpoints can cause accidental token rotation or improper credential handling, which is especially risky for an agent automating actions.

Missing User Warnings

Medium

Confidence: 95%
Finding: The documentation states that login credentials are saved to `config.json` but does not warn about plaintext credential storage, file permissions, rotation, or multi-user exposure. In a skill that controls broadcast devices over LAN/Internet, leaked credentials could allow unauthorized device control, audio playback, task manipulation, and broader operational disruption.

Missing User Warnings

Medium

Confidence: 86%
Finding: The skill recommends automated deletion of server-side TTS files and provides cron-style cleanup instructions, but it does not prominently warn that this is a destructive operation or discuss recovery limitations and verification safeguards. If the matching logic is imperfect or naming overlaps with user-important files, automation could delete needed audio assets and disrupt scheduled broadcasts.

Missing User Warnings

Medium

Confidence: 93%
Finding: The script performs irreversible remote deletions immediately after matching filenames, with no confirmation prompt, dry-run default, or explicit safety interlock. In an administrative environment, a misconfigured host, bad credentials, or overly broad filename pattern could cause unintended bulk data loss.
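A dry-run default is one way to add the missing safety interlock. The following is a minimal sketch, not the skill's actual cleanup script: the `tts_` prefix, the `.mp3` suffix, and the `delete` callback are hypothetical stand-ins for whatever matching rule and delete call the real script uses.

```python
from typing import Callable, Iterable, List


def cleanup_tts_files(
    names: Iterable[str],
    delete: Callable[[str], None],
    prefix: str = "tts_",   # assumed naming convention for generated TTS files
    dry_run: bool = True,   # nothing is deleted unless the caller opts in
) -> List[str]:
    """Return the files that match; delete them only when dry_run is False."""
    matched = [n for n in names if n.startswith(prefix) and n.endswith(".mp3")]
    if not dry_run:
        for name in matched:
            delete(name)
    return matched
```

With this shape, an operator can review the returned list first and rerun with `dry_run=False` only after confirming that it contains no needed audio assets.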

Missing User Warnings

High

Confidence: 99%
Finding: The script sends the account ID and token over plain HTTP, allowing interception or modification by anyone able to observe or tamper with network traffic. Compromised credentials could let an attacker enumerate tasks/files and delete or manipulate broadcast content remotely.
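A client-side guard can make the plain-HTTP exposure an explicit choice rather than a silent default. The sketch below assumes a hypothetical helper `require_https` that is called on the server base URL before any request carrying the account ID or token is built. It does not reflect how the audited script is structured.

```python
from urllib.parse import urlsplit


def require_https(base_url: str, allow_insecure: bool = False) -> str:
    """Return base_url if its scheme is acceptable, otherwise raise ValueError."""
    scheme = urlsplit(base_url).scheme.lower()
    if scheme == "https":
        return base_url
    if scheme == "http" and allow_insecure:
        # Explicit opt-in, e.g. for a trusted LAN or VPN segment.
        return base_url
    raise ValueError(
        f"refusing to send credentials to {base_url!r}: "
        "use HTTPS or pass allow_insecure=True"
    )
```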

Missing User Warnings

Medium

Confidence: 93%
Finding: The script stores host, account ID, token, and username in a local config.json file without any permission hardening, encryption, or user warning. On multi-user systems or insecure working directories, another local user or process could read and reuse the token to control broadcast devices and access account resources.
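On POSIX systems, basic permission hardening for the credential file could look like the sketch below. The function name `save_config` is an assumption for illustration. The mode bits have no equivalent effect on Windows, and this approach restricts who can read the file but does not encrypt the token.

```python
import json
import os


def save_config(path: str, data: dict) -> None:
    """Write data as JSON to path, readable and writable by the owner only."""
    # Create the file with mode 0600 so the token is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(data, fh)
    # The mode passed to os.open only applies to new files, so tighten existing ones too.
    os.chmod(path, 0o600)
```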

Missing User Warnings

Medium

Confidence: 84%
Finding: The upload function transmits arbitrary local audio files to the remote server with no explicit consent prompt or warning about network transfer. In this broadcast-management context, users may supply sensitive local recordings and unintentionally exfiltrate them to the vendor service.
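An explicit consent step before upload might be sketched as follows. The name `confirm_upload` and the prompt wording are hypothetical. The `ask` parameter defaults to the built-in `input` so that an agent wrapper or a test can substitute its own confirmation channel.

```python
from typing import Callable


def confirm_upload(path: str, server: str, ask: Callable[[str], str] = input) -> bool:
    """Ask the user to approve sending a local file to the remote server."""
    answer = ask(f"Upload {path!r} to {server}? Type 'yes' to continue: ")
    # Anything other than an explicit 'yes' is treated as a refusal.
    return answer.strip().lower() == "yes"
```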

Missing User Warnings

Medium

Confidence: 91%
Finding: TTS generation sends user-provided text to an external Edge TTS service, and the resulting audio is then uploaded to the broadcast server, but the user is not clearly warned about either transmission. This can expose sensitive announcement text, names, schedules, or operational content to third parties outside the local environment.

Ssd 2

Medium

Confidence: 91%
Finding: The reporting destination is base64-encoded, which obscures that the device is being instructed to send data to a specific external IP and port. In a skill already over-scoped for broadcast control, this concealment makes risky network reconfiguration less visible to reviewers and operators and can aid covert redirection or exfiltration.

Ssd 2

Medium

Confidence: 91%
Finding: The callback URL is base64-encoded, masking externally directed behavior behind a benign-looking Wi‑Fi configuration action. This reduces transparency and can facilitate hidden callback registration to attacker-controlled infrastructure if the skill is misused.
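Reviewers can make base64-encoded destinations visible by decoding them before a request is approved or logged. The helper below is a review aid under stated assumptions: the name `decode_for_review` is hypothetical, and the example value is constructed here rather than taken from the skill.

```python
import base64
import binascii


def decode_for_review(value: str) -> str:
    """Return the decoded text if value is valid base64 UTF-8, else value unchanged."""
    try:
        raw = base64.b64decode(value, validate=True)
        return raw.decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return value


# Illustrative value only: an encoded callback URL using a documentation IP range.
encoded = base64.b64encode(b"http://203.0.113.7:9000/cb").decode("ascii")
print(decode_for_review(encoded))
```

Short plain strings can happen to be valid base64, so the decoded form is best shown alongside the original value rather than in place of it.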

VirusTotal

64/64 vendors flagged this skill as clean.


Static analysis

No suspicious patterns detected.