xiaoai-bridge

Security checks across malware telemetry and agentic risk

Overview

The bridge behaves as described, but the package is flagged for review because it ships a Xiaomi session file and includes unsafe copy-paste command examples.

Do not install this version as-is. The publisher should remove scripts/.mi.json, revoke the exposed Xiaomi tokens, declare the required Xiaomi credentials in metadata, regenerate the lockfile with HTTPS sources, and replace exec string examples with spawn/execFile argument arrays. After cleanup, use only your own credentials, a distinctive trigger phrase, and private logs.
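One way to keep credentials out of the package is to read them from the environment at startup and fail fast when they are absent. The sketch below is illustrative only: the variable names `MI_USER_ID` and `MI_PASS_TOKEN` and the function `loadCredentials` are hypothetical and do not come from the package.

```javascript
// Hypothetical environment variable names; the real skill may need different fields.
const REQUIRED_ENV = ["MI_USER_ID", "MI_PASS_TOKEN"];

// Read credentials from the environment instead of a bundled session file.
function loadCredentials(env = process.env) {
  const missing = REQUIRED_ENV.filter((key) => !env[key]);
  if (missing.length > 0) {
    // Report only the variable names, never the values.
    throw new Error(`missing required credentials: ${missing.join(", ")}`);
  }
  return { userId: env.MI_USER_ID, passToken: env.MI_PASS_TOKEN };
}
```

Declaring the same variable names in the skill metadata would let users see what the skill needs before installing it.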

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (10)

Context-Inappropriate Capability

Medium
Confidence
98% confidence
Finding
This example builds a shell command with interpolated response text: if the spoken or AI-generated text contains shell metacharacters, quoting breaks and arbitrary commands may execute in the host environment. Because the bridge is specifically designed to turn voice into assistant actions and then speak dynamic output, the input path is plausibly attacker-influenced, making this much more dangerous in context.
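A minimal sketch of the safer pattern, assuming Node's `child_process` module: the text is passed as a separate argument, so no shell ever parses it. The speak command here is a hypothetical stand-in (Node echoing its own argument), not the bridge's actual command.

```javascript
const { execFileSync } = require("node:child_process");

// Hypothetical stand-in for the bridge's speak command: a tiny Node script
// that writes its first argument back to stdout unchanged.
const SPEAK_CMD = process.execPath;
const SPEAK_ARGS = ["-e", "process.stdout.write(process.argv[1])"];

function speakSafely(text) {
  // execFileSync does not start a shell by default, so `text` reaches the child
  // as a single argv entry and metacharacters such as ; " $() are not interpreted.
  // Caveat: text that begins with "-" could still be read as an option by the
  // target program, so a real command may need its own "--" separator.
  return execFileSync(SPEAK_CMD, [...SPEAK_ARGS, text], { encoding: "utf8" });
}
```

Calling `speakSafely('hi"; echo pwned; "$(id)')` hands the whole string to the child as data; nothing after the quote runs as a command.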

Context-Inappropriate Capability

Medium
Confidence
99% confidence
Finding
The documented speakViaXiaoAi function again uses exec with a composed command string containing untrusted text. Any caller passing attacker-controlled or model-generated content can trigger command injection, which can lead to arbitrary code execution under the privileges of the process running the skill.

Context-Inappropriate Capability

Medium
Confidence
99% confidence
Finding
The full integration example repeats the same unsafe pattern in a more realistic workflow, interpolating AI-produced response text into a shell command. Since AI output may include quotes, command separators, or payloads derived from user speech, this creates a direct bridge from voice/model content to host command execution.
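For the integration path, a hedged sketch might combine light cleanup of the model's reply with `spawnSync` and an argument array. The names `prepareForSpeech` and `speakResponse`, the 200-character cap, and the echo-style speaker command are all assumptions made for illustration.

```javascript
const { spawnSync } = require("node:child_process");

const MAX_SPOKEN_CHARS = 200; // assumed limit, not taken from the package

function prepareForSpeech(aiText) {
  // Replace ASCII control characters and cap the length. This is hygiene only;
  // the injection defence is the argument array used below.
  return String(aiText)
    .replace(/[\u0000-\u001f\u007f]/g, " ")
    .trim()
    .slice(0, MAX_SPOKEN_CHARS);
}

function speakResponse(aiText) {
  const text = prepareForSpeech(aiText);
  // Hypothetical speaker command: Node echoing its argument back.
  const result = spawnSync(
    process.execPath,
    ["-e", "process.stdout.write(process.argv[1])", text],
    { encoding: "utf8", shell: false }
  );
  if (result.status !== 0) {
    throw new Error(`speak command failed: ${result.stderr}`);
  }
  return result.stdout;
}
```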

Vague Triggers

Medium
Confidence
86% confidence
Finding
Allowing an empty trigger prefix causes the bridge to process every captured voice message, greatly expanding the activation surface and increasing the chance of accidental or unauthorized command handling. In a continuously listening, cloud-backed voice bridge, broad activation materially increases privacy and abuse risk.
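A small guard can refuse empty or very short prefixes at configuration time. This is a sketch under assumptions: the function names and the minimum length of 3 are hypothetical, and the threshold should be tuned to the language of the trigger phrase.

```javascript
const MIN_TRIGGER_LENGTH = 3; // assumed threshold; adjust for the trigger language

function validateTriggerPrefix(prefix) {
  const trimmed = typeof prefix === "string" ? prefix.trim() : "";
  if (trimmed.length < MIN_TRIGGER_LENGTH) {
    throw new Error(
      `trigger prefix must be at least ${MIN_TRIGGER_LENGTH} characters`
    );
  }
  return trimmed;
}

// Returns the command that follows the trigger, or null when the utterance
// does not start with the trigger and should be ignored.
function extractCommand(utterance, prefix) {
  const trigger = validateTriggerPrefix(prefix);
  const text = String(utterance).trim();
  if (!text.toLowerCase().startsWith(trigger.toLowerCase())) return null;
  return text.slice(trigger.length).trim();
}
```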

Missing User Warnings

Medium
Confidence
90% confidence
Finding
The skill description emphasizes convenience features but does not prominently warn that it performs continuous polling of cloud-backed voice transcripts and requires sensitive Xiaomi account credentials. Users may underestimate the privacy and account-security implications, especially given the always-on monitoring behavior and device control pathway.

Natural-Language Policy Violations

Medium
Confidence
98% confidence
Finding
The lockfile pins package tarballs to an unsecured HTTP mirror (`mirrors.tencentyun.com`), which exposes dependency downloads to tampering via man-in-the-middle attacks. Although integrity hashes provide some protection, using plain HTTP for the package source weakens supply-chain trust, can leak metadata, and may still enable downgrade or mirror-substitution risks in some install workflows and tooling.

Natural-Language Policy Violations

Medium
Confidence
93% confidence
Finding
The lockfile repeatedly routes dependency resolution through a region-specific Tencent Cloud mirror without any visible user opt-in or justification. This creates avoidable supply-chain trust and availability risk because builds depend on a third-party mirror outside the default npm trust path, which is more concerning in a voice-command bridge skill that may run with access to home automation or assistant integrations.
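Both lockfile findings can be checked mechanically before publishing. The sketch below assumes a lockfile with a top-level `packages` map whose entries carry a `resolved` URL (the lockfileVersion 2 and 3 layout); the allowlist and the function name `auditLockfile` are illustrative choices.

```javascript
const ALLOWED_HOSTS = new Set(["registry.npmjs.org"]); // assumed allowlist

// Returns one finding per problem: non-HTTPS sources and unexpected hosts.
function auditLockfile(lock) {
  const problems = [];
  for (const [name, entry] of Object.entries(lock.packages ?? {})) {
    // The root project and workspace links have no tarball URL to check.
    if (entry.link || typeof entry.resolved !== "string") continue;
    let url;
    try {
      url = new URL(entry.resolved);
    } catch {
      problems.push({ name, reason: "unparseable source" });
      continue;
    }
    if (url.protocol !== "https:") {
      problems.push({ name, reason: "non-HTTPS source" });
    }
    if (!ALLOWED_HOSTS.has(url.hostname)) {
      problems.push({ name, reason: `unexpected host ${url.hostname}` });
    }
  }
  return problems;
}
```

In practice the input would come from `JSON.parse` of the lockfile contents, and a non-empty result would fail the publish step.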

Missing User Warnings

Medium
Confidence
93% confidence
Finding
The script fetches recent voice conversation records and prints their contents directly to stdout, which can expose sensitive spoken queries, account activity, or personal data in terminals, CI logs, shell history capture, or shared debugging sessions. In the context of a voice-assistant bridge, these transcripts are especially privacy-sensitive because they may include home automation commands, names, schedules, or other intimate household information.

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The script emits captured voice contents and timestamps directly to stdout as structured JSON, which can expose private spoken content to parent processes, logs, process supervisors, shells, or monitoring systems without any consent notice or minimization. In this skill’s context, the whole purpose is to bridge smart-speaker voice input into another automation system, so sensitive household commands or personal speech may be unintentionally disclosed beyond the original device boundary.
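One possible mitigation for the stdout findings is to emit only metadata by default and include the transcript text only on explicit opt-in. The record shape `{ text, timestamp }` and the function `toLogLine` are assumptions; the actual scripts may use different field names.

```javascript
// Hypothetical record shape: { text: string, timestamp: number }.
function toLogLine(record, { verbose = false } = {}) {
  const text = typeof record.text === "string" ? record.text : "";
  // By default, log when something was said and how long it was, not what.
  const entry = { timestamp: record.timestamp, chars: text.length };
  // Include the transcript only when the operator explicitly opts in.
  if (verbose) entry.text = text;
  return JSON.stringify(entry);
}
```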

Ssd 3

Medium
Confidence
84% confidence
Finding
The documentation exposes both normalized text and full original utterances and encourages downstream handling without minimization guidance, which can propagate sensitive voice-derived content into logs, agents, or third-party systems. In this skill's context, captured speech may contain personal data, home activity, credentials, or other sensitive requests, so unrestricted reuse increases privacy and data-leak risk.
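Minimization before handing speech to downstream systems could look like the sketch below, which forwards only the normalized text and masks long digit runs. The field names `text`, `originalText`, and `timestamp` are hypothetical, and the digit mask is a rough heuristic rather than a complete redaction scheme.

```javascript
// Hypothetical event shape: { text, originalText, timestamp }.
function minimizeForDownstream(event) {
  const normalized = String(event.text ?? "");
  return {
    // Mask runs of six or more digits (phone numbers, one-time codes).
    text: normalized.replace(/\d{6,}/g, "[digits]"),
    timestamp: event.timestamp,
    // The full original utterance is deliberately not forwarded.
  };
}
```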

VirusTotal

66/66 vendors flagged this skill as clean.
