Fish Audio S2 Pro TTS
Pass. Audited by ClawScan on May 17, 2026.
Overview
This documentation-only TTS/voice-cloning skill is internally coherent, but users should verify the external software sources, avoid exposing its server publicly, and manage stored voice data carefully.
This skill appears benign as documentation for running Fish Audio S2 Pro. Before installing, verify the external package/container/model sources, run setup in an isolated environment, keep the API bound to localhost unless you add access controls, and only upload voice samples with proper consent.
Findings (3)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Installing or running the referenced software means trusting external package, container, and model sources.
The setup pulls executable packages, container images, and model files from external registries; this is expected for this TTS model, but those external artifacts are outside the reviewed skill contents and are not pinned here.
pip install fish-speech ... docker pull fishaudio/fish-speech ... hf download fishaudio/s2-pro --local-dir checkpoints/s2-pro
Use an isolated environment, prefer official Fish Audio sources, pin versions where possible, and review the external package and container image before running them.
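As a minimal sketch of the pinning advice, the Python check below flags unpinned references before anything is installed. The helper names `image_is_pinned` and `requirement_is_pinned` are hypothetical and not part of Fish Audio's tooling. The sketch treats only a `@sha256:` digest as a pinned image, since tags can be re-pointed, and only an exact `==` version as a pinned package.

```python
import re

# A digest-pinned image reference ends in "@sha256:" plus 64 hex characters.
_DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")


def image_is_pinned(ref: str) -> bool:
    """Return True only for digest-pinned container image references."""
    return bool(_DIGEST.search(ref))


def requirement_is_pinned(spec: str) -> bool:
    """Return True for an exact '==' pin without wildcards, e.g. 'name==1.2.3'."""
    name, sep, version = spec.partition("==")
    return bool(sep) and bool(name.strip()) and bool(version.strip()) and "*" not in version


if __name__ == "__main__":
    # References exactly as they appear in the reviewed setup commands.
    print(image_is_pinned("fishaudio/fish-speech"))
    print(requirement_is_pinned("fish-speech"))
```

Under these assumptions, `fishaudio/fish-speech` and `fish-speech` as written in the setup commands are both reported as unpinned.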
If exposed on a network, other users may be able to invoke the TTS service or consume local compute resources.
Binding to 0.0.0.0 can expose the TTS API beyond the local machine if the user runs this command; the docs do not show authentication or firewall controls.
python tools/api_server.py --llama-checkpoint-path checkpoints/s2-pro --decoder-checkpoint-path checkpoints/s2-pro/codec.pth --listen 0.0.0.0:8080
Bind to localhost unless remote access is required, and add firewall rules, authentication, or a trusted reverse proxy before exposing the server.
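One way to guard against accidental exposure is to validate the listen address before launching the server. The sketch below is illustrative only: `is_loopback_only` is a hypothetical helper, and it assumes the `host:port` form shown in the command above.

```python
import ipaddress


def is_loopback_only(listen: str) -> bool:
    """Return True if a host:port listen string binds only to a loopback address."""
    host, _, _port = listen.rpartition(":")
    host = host.strip("[]")  # allow bracketed IPv6 such as [::1]:8080
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Empty hosts and unknown hostnames are not provably local.
        return False


if __name__ == "__main__":
    for candidate in ("0.0.0.0:8080", "127.0.0.1:8080"):
        print(candidate, is_loopback_only(candidate))
```

Under this check, `0.0.0.0:8080` is rejected and `127.0.0.1:8080` is accepted. Assuming `--listen` accepts any `host:port` value, the latter would keep the API reachable from the local machine only.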
Uploaded voice profiles may remain on disk across sessions and could be reused for future synthesis if not deleted.
Voice samples or derived speaker profiles are sensitive personal data and are documented as being stored persistently for later reuse.
audio_sample=@voice.wav ... After upload: "voice": "my_voice". Persisted to `~/.cache/vllm-omni/speakers/*.safetensors`.
Upload only voices you are authorized to use, keep the cache directory protected, and delete stored voice profiles when they are no longer needed.
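The cleanup advice can be sketched in Python as follows. The cache path comes from the evidence above; the function names are hypothetical, and because the mapping between a file name and the uploaded voice name is not documented here, the sketch handles profiles by path rather than by voice name.

```python
from pathlib import Path

# Cache location quoted in the finding; adjust if the deployment differs.
SPEAKER_CACHE = Path.home() / ".cache" / "vllm-omni" / "speakers"


def list_profiles(cache_dir: Path = SPEAKER_CACHE) -> list[Path]:
    """Return stored speaker profile files, or an empty list if none exist."""
    if not cache_dir.is_dir():
        return []
    return sorted(cache_dir.glob("*.safetensors"))


def lock_down(cache_dir: Path = SPEAKER_CACHE) -> None:
    """Restrict the cache directory to its owner (POSIX; limited effect on Windows)."""
    if cache_dir.is_dir():
        cache_dir.chmod(0o700)


def delete_profiles(paths: list[Path]) -> int:
    """Delete the given profile files and return how many were removed."""
    removed = 0
    for path in paths:
        # Only touch regular .safetensors files; skip anything else.
        if path.suffix == ".safetensors" and path.is_file():
            path.unlink()
            removed += 1
    return removed


if __name__ == "__main__":
    lock_down()
    for profile in list_profiles():
        print(profile)
```

Running the script only lists profiles and tightens directory permissions; deletion requires an explicit call such as `delete_profiles(list_profiles())`, which keeps removal a deliberate step.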
