Security audit

VoiceTrust

Security checks across malware telemetry and agentic risk

Overview

VoiceTrust is a disclosed local voice-verification skill with setup-time model downloads and local owner enrollment storage, not a hidden data-exfiltration or destructive package.

Install only if you are comfortable running a local ML voice-verification runtime and storing owner voice embeddings on disk. Keep runtime/data/owners and runtime/data/voiceprints private, avoid syncing them to cloud backups, and run the model downloader only when you accept its documented GitHub mirror and SHA-256 verification flow.

SkillSpector

By NVIDIA

Vulnerability Patterns

Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (16)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 82%
Finding: The skill manifest presents a lightweight interpretation skill, but the package appears to exercise file, network, shell, and write capabilities without declaring them. Undeclared capabilities are dangerous because they prevent accurate risk assessment and sandboxing, and they can expose local audio, enrollment data, or fetched model assets to unintended handling paths.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 93%
Finding: The documented purpose says the skill interprets VoiceTrust outputs, but the underlying behavior reportedly performs direct audio analysis, speaker verification, enrollment, persistent voiceprint storage, remote model download, and command-related tooling. This mismatch is high risk because operators may authorize or trust the skill as a passive formatter while it actually processes sensitive biometric data, writes persistent identity artifacts, and reaches out to remote sources.

Description-Behavior Mismatch

High

Confidence: 95%
Finding: This file goes beyond interpreting VoiceTrust results and exposes active biometric enrollment, loading, saving, and persistent owner-profile management. In the context of a skill described as result interpretation, this creates unexpected collection and retention of sensitive biometric data, increasing privacy, compliance, and misuse risk if called by higher-level agents without explicit consent and controls.

Context-Inappropriate Capability

High

Confidence: 92%
Finding: The pipeline initializes an OwnerProfileStore rooted in a local data directory, enabling persistent storage of speaker profiles despite the skill being framed as an interpretation layer. Persisting biometric artifacts without strong justification, disclosure, and governance is dangerous because voiceprints are sensitive identifiers that can be reused, exfiltrated, or retained longer than users expect.

Intent-Code Divergence

Medium

Confidence: 80%
Finding: The module docstring presents the code as focused on trust scoring and metadata, but the implementation also supports enrollment and persistence of voiceprints and owner profiles. That mismatch is security-relevant because reviewers, integrators, or policy layers may underestimate the module's ability to collect and retain biometrics, leading to unsafe deployment assumptions.

Intent-Code Divergence

Medium

Confidence: 98%
Finding: The module-level docstring claims anti-spoofing support via a pre-trained model, but AntiSpoofingDetector is a disabled placeholder that never initializes a model and returns neutral fallback values with is_synthetic set to False. In a voice owner-verification skill, this can cause downstream systems or operators to assume deepfake detection is active when it is not, enabling spoofed or synthetic audio to pass command-gating decisions under a false sense of security.
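One way to avoid the false sense of security described in this finding is to make a disabled detector report that no check ran, and to have the command gate fail closed on that state. The sketch below is illustrative only: `SpoofStatus`, `SpoofResult`, `PlaceholderSpoofDetector`, and `allow_command` are hypothetical names and do not come from the VoiceTrust code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SpoofStatus(Enum):
    """Three explicit states so 'not checked' is never confused with 'genuine'."""
    NOT_EVALUATED = "not_evaluated"
    GENUINE = "genuine"
    SYNTHETIC = "synthetic"


@dataclass(frozen=True)
class SpoofResult:
    status: SpoofStatus
    score: Optional[float] = None  # None when no model produced a score


class PlaceholderSpoofDetector:
    """Hypothetical stand-in for a detector that has no model loaded."""

    def analyze(self, audio: bytes) -> SpoofResult:
        # No model is available, so report that explicitly instead of
        # returning a neutral "not synthetic" answer.
        return SpoofResult(status=SpoofStatus.NOT_EVALUATED)


def allow_command(speaker_verified: bool, spoof: SpoofResult) -> bool:
    """Fail closed: only an explicit GENUINE verdict lets a command through."""
    return speaker_verified and spoof.status is SpoofStatus.GENUINE
```

With this shape, a caller that receives `NOT_EVALUATED` has to decide deliberately whether to proceed, rather than inheriting a silent pass.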

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The skill is described as interpreting VoiceTrust results, but this script adds bootstrap behavior that downloads and writes model assets at runtime. That expands the trust boundary and attack surface beyond the declared purpose, creating supply-chain and unexpected network/file-system side effects if the script is invoked in environments that only expected local interpretation logic.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The script performs network downloads from a GitHub raw mirror and writes the results into runtime assets, which is not necessary for a skill whose stated function is to interpret VoiceTrust results. Even though SHA-256 verification is present, the capability still introduces remote dependency, egress, and supply-chain risk, and may violate least-privilege or operator expectations for an interpretation-only skill.
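For reference, a SHA-256 check of a downloaded asset against a pinned digest typically looks like the sketch below. This is a generic illustration, not the skill's actual downloader: the function names `sha256_of_file` and `verify_download` are hypothetical, and it assumes the expected digest is shipped with the package rather than fetched from the same mirror as the asset.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_hex: str) -> bool:
    """Compare the file digest to a pinned value; delete the file on mismatch."""
    actual = sha256_of_file(path)
    if hmac.compare_digest(actual, expected_hex.lower()):
        return True
    # A mismatched asset is removed so it cannot be loaded later by mistake.
    path.unlink(missing_ok=True)
    return False
```

A check like this confirms that the bytes match the pinned digest, but it does not remove the network egress or the dependency on the remote host that this finding highlights.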

Missing User Warnings

Medium

Confidence: 88%
Finding: The README instructs users to collect and enroll 3 to 5 owner voice samples and notes storage under `data/owners/`, but it provides no warning that voiceprints and voice recordings are biometric/sensitive personal data. In a voice-verification skill, this omission increases the risk of improper collection, retention, sharing, or insecure local handling of biometric data, which can create privacy, compliance, and misuse exposure even if the runtime is otherwise functioning as intended.

Missing User Warnings

Medium

Confidence: 88%
Finding: Enrollment and owner-sample append operations write biometric-derived data to persistent storage without any indication in this file of user disclosure, consent checks, or safety interlocks. Silent persistence of voice biometrics is risky because users and calling systems may treat the skill as a read-only interpreter while it actually creates retained identity artifacts on disk.

Missing User Warnings

Medium

Confidence: 84%
Finding: The save/load voiceprint methods directly operate on persistent biometric files, but this file provides no guardrails around who may invoke them, where files may be stored, or whether the user has agreed to this handling. In an agent skill context, that increases the chance of unauthorized persistence, relocation, or reuse of sensitive voiceprint data.

Missing User Warnings

Medium

Confidence: 87%
Finding: save_voiceprint writes a speaker embedding to an arbitrary filesystem path without any consent, disclosure, access control, or protection for biometric data. Voiceprints are sensitive identifiers; if stored insecurely or in shared locations, they can be copied, correlated, or reused for unauthorized profiling or authentication-related abuse.
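A possible mitigation for the arbitrary-path concern is to confine writes to a single private directory and create files with owner-only permissions. The following is a minimal sketch under stated assumptions: `save_voiceprint_confined` is a hypothetical helper, not the skill's `save_voiceprint`, the embedding is modeled as a plain sequence of floats stored as JSON, and the permission bits only take effect on POSIX systems.

```python
import json
import os
from pathlib import Path
from typing import Sequence


def save_voiceprint_confined(base_dir: Path, name: str,
                             embedding: Sequence[float]) -> Path:
    """Write an embedding under base_dir only, with owner-only permissions."""
    base = base_dir.resolve()
    target = (base / f"{name}.json").resolve()
    # Reject names that escape base_dir ("../x") or add subdirectories ("a/b").
    if target.parent != base:
        raise ValueError(f"voiceprint name escapes storage directory: {name!r}")
    base.mkdir(parents=True, exist_ok=True, mode=0o700)
    # 0o600 applies when the file is newly created; an existing file keeps
    # whatever mode it already has.
    fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump({"embedding": list(embedding)}, handle)
    return target
```

Path confinement and file permissions address where the data lands and who can read it; they do not replace the consent and disclosure steps this finding calls for.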

Missing User Warnings

Medium

Confidence: 84%
Finding: load_voiceprint reads biometric embeddings from disk with no provenance validation, disclosure, or integrity protection. In this context, an attacker who can replace or supply a crafted voiceprint file could enroll a fraudulent identity or tamper with verification behavior, while the silent access to biometric data also creates privacy and compliance risk.
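The integrity gap described here could be narrowed by storing an HMAC tag alongside each voiceprint and refusing to load files whose tag does not verify. The sketch below is an assumption-laden illustration: `write_signed` and `load_verified` are hypothetical helpers, the embedding is stored as JSON rather than in whatever format the skill uses, and the secret key is assumed to be kept somewhere an attacker who can replace the file cannot read.

```python
import hashlib
import hmac
import json
from pathlib import Path
from typing import List, Sequence

TAG_SIZE = 32  # length in bytes of an HMAC-SHA256 tag


def write_signed(path: Path, embedding: Sequence[float], key: bytes) -> None:
    """Store the embedding as JSON, prefixed with an HMAC-SHA256 tag."""
    payload = json.dumps({"embedding": list(embedding)}).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    path.write_bytes(tag + payload)


def load_verified(path: Path, key: bytes) -> List[float]:
    """Return the embedding only if the stored tag matches the payload."""
    blob = path.read_bytes()
    tag, payload = blob[:TAG_SIZE], blob[TAG_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # The tag is checked before the payload is parsed at all.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("voiceprint failed integrity check")
    return json.loads(payload)["embedding"]
```

Using a plain-data format such as JSON also avoids executing code during deserialization, which matters when the file's provenance is uncertain.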

Known Vulnerable Dependency: torch (10 advisories): CVE-2025-2953 (PyTorch susceptible to local Denial of Service); CVE-2022-45907 (PyTorch vulnerable to arbitrary code execution); CVE-2025-32434 (PyTorch: `torch.load` with `weights_only=True` leads to remote code execution); and 7 more

Critical

Category: Supply Chain
Confidence: 84%
Finding: torch

Known Vulnerable Dependency: pyyaml (8 advisories): CVE-2019-20477 (Deserialization of Untrusted Data in PyYAML); CVE-2020-1747 (Improper Input Validation in PyYAML); CVE-2020-14343 (Improper Input Validation in PyYAML); and 5 more

Critical

Category: Supply Chain
Confidence: 90%
Finding: pyyaml

Known Vulnerable Dependency: black (3 advisories): CVE-2026-32274 (Black: Arbitrary file writes from unsanitized user input in cache file name); CVE-2024-21503 (Black vulnerable to Regular Expression Denial of Service (ReDoS)); CVE-2024-21503 (Versions of the package black before 24.3.0 are vulnerable to Regular Expression)

High

Category: Supply Chain
Confidence: 88%
Finding: black

VirusTotal

65/65 vendors flagged this skill as clean.
