Local Voice (FluidAudio TTS/STT)
Pass
Audited by ClawScan on May 1, 2026.
Overview
The skill coherently implements a local TTS/STT daemon, with no evidence of exfiltration or destructive behavior. Users should note, however, that setup creates a persistent localhost service, builds third-party code, and may log short transcript snippets locally.
This appears suitable for a local Apple Silicon voice service if you are comfortable building third-party Swift dependencies and running a persistent localhost daemon. Before installing, understand how to stop/remove the LaunchAgent and consider disabling transcript logging if you will process private audio.
Findings (3)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
The local voice server can continue running in the background and listening on its localhost port after setup.
The setup script creates a LaunchAgent that starts at load/login and is kept alive, so the voice daemon persists beyond a single command.
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
Install only if you want an always-on local voice service, and keep track of the LaunchAgent so you can unload or remove it when no longer needed.
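As a rough illustration of the unload-and-remove step, the sketch below stops a per-user LaunchAgent with launchctl bootout and deletes its plist. The label com.example.local-voice and the plist location under ~/Library/LaunchAgents are assumptions made for the example, not values taken from the skill; check the plist that the setup script actually installs. The function defaults to a dry run that only prints what it would do.

```swift
import Foundation

// Hypothetical label: substitute the Label string from the installed plist.
let agentLabel = "com.example.local-voice"

/// Builds the launchctl arguments that stop a per-user (gui domain) LaunchAgent.
func bootoutArguments(label: String, uid: UInt32) -> [String] {
    return ["bootout", "gui/\(uid)/\(label)"]
}

/// Stops the agent and deletes its plist. With dryRun set, it only prints the plan.
func removeAgent(label: String, dryRun: Bool = true) throws {
    // Assumed location and file name; the setup script may use a different one.
    let plist = FileManager.default.homeDirectoryForCurrentUser
        .appendingPathComponent("Library/LaunchAgents/\(label).plist")
    let args = bootoutArguments(label: label, uid: getuid())

    if dryRun {
        print("/bin/launchctl " + args.joined(separator: " "))
        print("rm " + plist.path)
        return
    }

    let launchctl = Process()
    launchctl.executableURL = URL(fileURLWithPath: "/bin/launchctl")
    launchctl.arguments = args
    try launchctl.run()
    launchctl.waitUntilExit()

    // Remove the plist so the agent is not loaded again at the next login.
    try? FileManager.default.removeItem(at: plist)
}

try removeAgent(label: agentLabel)
```

Passing dryRun: false performs the teardown. Because the agent is configured with KeepAlive, stopping it through launchctl rather than killing the process is what keeps it from being restarted.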
Running the build may download and compile third-party code in addition to the reviewed skill files.
The build pulls third-party Swift packages from GitHub using version ranges rather than fully pinned revisions; this is common for Swift packages but still expands the trusted supply chain.
.package(url: "https://github.com/FluidInference/FluidAudio.git", from: "0.12.0"),
.package(url: "https://github.com/hummingbird-project/hummingbird.git", from: "2.0.0")
Review or pin dependency versions if reproducibility is important, and run setup only from a trusted copy of the skill.
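One way to pin the dependencies is to swap the from: ranges for exact: requirements in Package.swift. The manifest below is a minimal sketch, not the skill's actual manifest: the package and target name LocalVoice, the tools version, the platform requirement, and the product names are assumptions. The exact versions simply reuse the lower bounds shown in the finding and should be replaced with whichever releases you have reviewed.

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "LocalVoice", // hypothetical package name
    platforms: [.macOS(.v14)], // assumed minimum; check each dependency's requirement
    dependencies: [
        // exact: accepts only this version, where from: accepts any later
        // release up to the next major version.
        .package(url: "https://github.com/FluidInference/FluidAudio.git", exact: "0.12.0"),
        .package(url: "https://github.com/hummingbird-project/hummingbird.git", exact: "2.0.0"),
    ],
    targets: [
        .executableTarget(
            name: "LocalVoice", // hypothetical target name
            dependencies: [
                // Product names are assumed to match the library names.
                .product(name: "FluidAudio", package: "FluidAudio"),
                .product(name: "Hummingbird", package: "hummingbird"),
            ]
        )
    ]
)
```

Swift Package Manager also accepts a revision: requirement that takes a commit hash, which pins more tightly than a version tag. Keeping the generated Package.resolved file alongside the manifest records the resolved versions of transitive dependencies as well.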
Short snippets of transcribed speech may remain in local logs, which could matter if speech contains private information.
The STT endpoint logs the first part of transcription results; when run under the provided LaunchAgent, stdout is directed to local log files.
print("👂 STT: \(String(format: "%.3f", elapsed))s -> \"\(text.prefix(50))...\"")
Avoid sending sensitive audio unless you are comfortable with local logging, or remove/redact transcript logging before running the daemon.
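A redacted variant of the log line could look like the sketch below. The helper sttLogLine and its logTranscripts flag are hypothetical names introduced for illustration and are not part of the skill. By default it logs timing and transcript length only, and it reproduces the original 50-character preview when the flag is set.

```swift
import Foundation

/// Formats the STT log line. Transcript text is included only on request.
/// sttLogLine and logTranscripts are hypothetical, not part of the skill.
func sttLogLine(elapsed: Double, text: String, logTranscripts: Bool = false) -> String {
    let timing = String(format: "%.3f", elapsed)
    if logTranscripts {
        // Same preview as the original log statement: first 50 characters.
        return "👂 STT: \(timing)s -> \"\(text.prefix(50))...\""
    }
    // Default: keep the timing, drop the content.
    return "👂 STT: \(timing)s -> [\(text.count) chars redacted]"
}

print(sttLogLine(elapsed: 0.4217, text: "hello"))
// prints: 👂 STT: 0.422s -> [5 chars redacted]
```

Defaulting to redaction means a LaunchAgent that redirects stdout to a log file retains no speech content unless someone opts in.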
