Local Voice (FluidAudio TTS/STT)

Security checks across static analysis, malware telemetry, and agentic risk

Overview

The skill implements a coherent local TTS/STT daemon and shows no evidence of exfiltration or destructive behavior. Users should note, however, that setup creates a persistent localhost service, builds third-party code, and may log short transcript snippets locally.

The skill appears suitable as a local Apple Silicon voice service if you are comfortable building third-party Swift dependencies and running a persistent localhost daemon. Before installing, make sure you know how to stop and remove the LaunchAgent, and consider disabling transcript logging if you will process private audio.

Static analysis

No static analysis findings were reported for this release.

VirusTotal

66/66 vendors flagged this skill as clean.


Risk analysis

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

ASI10: Rogue Agents
Low
What this means

The local voice server can continue running in the background and listening on its localhost port after setup.

Why it was flagged

The setup script creates a LaunchAgent that starts at load/login and is kept alive, so the voice daemon persists beyond a single command.

Skill content
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
Recommendation

Install only if you want an always-on local voice service, and keep track of the LaunchAgent so you can unload or remove it when no longer needed.
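
As a minimal sketch of what unloading and removing the agent could look like, the Swift script below builds a `launchctl bootout` command for the current user and, in `removeAgent`, runs it and deletes the plist. The label `com.example.local-voice` is hypothetical (the review does not state the real one), and the plist path follows the usual `~/Library/LaunchAgents/<label>.plist` convention, which the setup script may not use. As written, the script only prints the command as a dry run.

```swift
import Foundation

/// Arguments for `launchctl bootout`, which stops a per-user agent and unloads it.
func bootoutArguments(label: String, uid: uid_t) -> [String] {
    ["bootout", "gui/\(uid)/\(label)"]
}

/// Conventional per-user plist location; the setup script may use a different file name.
func agentPlistURL(label: String) -> URL {
    FileManager.default.homeDirectoryForCurrentUser
        .appendingPathComponent("Library/LaunchAgents")
        .appendingPathComponent("\(label).plist")
}

/// Unloads the agent, then deletes its plist so it does not return at the next login.
func removeAgent(label: String) throws {
    let launchctl = Process()
    launchctl.executableURL = URL(fileURLWithPath: "/bin/launchctl")
    launchctl.arguments = bootoutArguments(label: label, uid: getuid())
    try launchctl.run()
    launchctl.waitUntilExit()

    let plist = agentPlistURL(label: label)
    if FileManager.default.fileExists(atPath: plist.path) {
        try FileManager.default.removeItem(at: plist)
    }
}

// Hypothetical label; substitute the Label value from the installed plist.
let label = "com.example.local-voice"
// Dry run: print the command instead of executing it.
print("launchctl " + bootoutArguments(label: label, uid: getuid()).joined(separator: " "))
```

Calling `try removeAgent(label: label)` with the real label would perform the removal. Because `KeepAlive` is set, killing the daemon process alone is not enough: launchd restarts it until the agent is unloaded.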

ASI04: Agentic Supply Chain Vulnerabilities
Low
What this means

Running the build may download and compile third-party code in addition to the reviewed skill files.

Why it was flagged

The build pulls third-party Swift packages from GitHub using `from:` version ranges, which accept any later release below the next major version, rather than fully pinned revisions. This is common for Swift packages but still expands the trusted supply chain.

Skill content
.package(url: "https://github.com/FluidInference/FluidAudio.git", from: "0.12.0"),
.package(url: "https://github.com/hummingbird-project/hummingbird.git", from: "2.0.0")
Recommendation

Review or pin dependency versions if reproducibility is important, and run setup only from a trusted copy of the skill.
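
One way to pin the dependencies is to replace `from:` with `exact:` in the package manifest. The sketch below is an assumption about how such a manifest might look, not the skill's actual `Package.swift`: the package and target name `LocalVoice`, the `platforms` entry, and the product names are hypothetical, and the version numbers simply reuse the lower bounds from the excerpt above. Substitute the versions you actually reviewed.

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "LocalVoice", // hypothetical package name
    platforms: [.macOS(.v14)], // assumed minimum platform
    dependencies: [
        // `exact:` accepts only this one release, whereas `from:` allows
        // any later release below the next major version.
        .package(url: "https://github.com/FluidInference/FluidAudio.git", exact: "0.12.0"),
        .package(url: "https://github.com/hummingbird-project/hummingbird.git", exact: "2.0.0"),
    ],
    targets: [
        .executableTarget(
            name: "LocalVoice", // hypothetical target name
            dependencies: [
                // Product names are assumed to match the upstream packages.
                .product(name: "FluidAudio", package: "FluidAudio"),
                .product(name: "Hummingbird", package: "hummingbird"),
            ]
        )
    ]
)
```

Swift Package Manager also records the resolved versions in `Package.resolved`; keeping that file alongside the manifest gives repeatable builds even when version ranges are used. Pinning only the direct dependencies does not by itself pin their transitive dependencies.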

ASI06: Memory and Context Poisoning
Low
What this means

Short snippets of transcribed speech may remain in local logs, which could matter if speech contains private information.

Why it was flagged

The STT endpoint logs the first part of transcription results; when run under the provided LaunchAgent, stdout is directed to local log files.

Skill content
print("👂 STT: \(String(format: "%.3f", elapsed))s -> \"\(text.prefix(50))...\"")
Recommendation

Avoid sending sensitive audio unless you are comfortable with local logging, or remove/redact transcript logging before running the daemon.
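
A minimal sketch of redacted logging follows, assuming the log line is built in one place. The helper `sttLogLine` and the environment variable `LOCAL_VOICE_LOG_TRANSCRIPTS` are hypothetical and are not defined by the skill. By default the helper logs only the timing and the transcript length, and it includes the 50-character prefix only when logging is explicitly enabled.

```swift
import Foundation

// Hypothetical opt-in switch; the skill does not define this variable.
let logTranscripts =
    ProcessInfo.processInfo.environment["LOCAL_VOICE_LOG_TRANSCRIPTS"] == "1"

/// Builds the STT log line. Transcript text is included only when `includeText`
/// is true; otherwise only the elapsed time and character count are logged.
func sttLogLine(elapsed: Double, text: String, includeText: Bool) -> String {
    let timing = String(format: "%.3f", elapsed)
    if includeText {
        // Same shape as the original log line: first 50 characters of the transcript.
        return "👂 STT: \(timing)s -> \"\(text.prefix(50))...\""
    }
    return "👂 STT: \(timing)s -> [\(text.count) chars, transcript redacted]"
}

print(sttLogLine(elapsed: 0.4213, text: "call me at five", includeText: logTranscripts))
```

Defaulting to redaction means a daemon started by the LaunchAgent writes no speech content to its log files unless the user opts in.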