Polyphone TTS
Pass. Audited by ClawScan on May 1, 2026.
Overview
This instruction-only skill is coherent for its stated purpose of fixing Chinese TTS pronunciation, but users should be aware that it sends text to SenseAudio using an API key and writes local audio and response files.
This skill appears benign and purpose-aligned. Before installing or using it, make sure you trust SenseAudio with the text you synthesize, protect your API key, and only use cloned voices you have permission to use.
Findings (3)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Running the workflow will contact an external service and create local output files.
The skill instructs use of local command-line tools to call the SenseAudio API and write decoded audio locally; this is central to the TTS purpose and is disclosed.
curl -s -X POST https://api.senseaudio.cn/v1/t2a_v2 ... jq -r '.data.audio' response.json | xxd -r -p > output.mp3
Review the text, voice ID, and output filenames before running the command, especially if the text is sensitive.
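Part of that review can be scripted before anything is decoded to disk. The following is a minimal sketch, assuming (as the quoted command implies) that `.data.audio` in response.json holds hex-encoded audio. The helper names `is_hex_payload` and `safe_output_path` are hypothetical and are not part of the skill.

```shell
# Hypothetical helper: succeed only if the argument is a non-empty,
# even-length string made up entirely of hex digits.
# A missing field read with `jq -r` prints the literal text "null",
# which this check rejects.
is_hex_payload() {
  case "$1" in
    ''|*[!0-9a-fA-F]*) return 1 ;;
    *) [ $(( ${#1} % 2 )) -eq 0 ] ;;
  esac
}

# Hypothetical helper: succeed only if the output path is non-empty
# and nothing exists there yet, so an existing file is never overwritten.
safe_output_path() {
  [ -n "$1" ] && [ ! -e "$1" ]
}
```

One way to use the helpers with the command quoted above: read the field with hex=$(jq -r '.data.audio' response.json), then run printf '%s' "$hex" | xxd -r -p > output.mp3 only if both is_hex_payload "$hex" and safe_output_path output.mp3 succeed.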
The skill can use the user's SenseAudio account quota or permissions to synthesize speech with a provided cloned voice.
The skill requires a SenseAudio API key and a cloned voice ID to access the TTS service; this is expected for the integration and is clearly documented.
requires: env: - SENSEAUDIO_API_KEY ... Authorization: Bearer $SENSEAUDIO_API_KEY ... "voice_id": "<CLONED_VOICE_ID>"
Use an appropriately scoped API key if available, keep it secret, and only use cloned voice IDs you are authorized to use.
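Keeping the key secret includes keeping it out of terminal output and logs. The following is a minimal sketch, assuming the key is supplied through the `SENSEAUDIO_API_KEY` environment variable named in the skill's requirements. The function names `require_api_key` and `mask_key` are hypothetical and are not part of the skill.

```shell
# Hypothetical helper: succeed only if SENSEAUDIO_API_KEY is set and
# non-empty. The value itself is never printed.
require_api_key() {
  if [ -z "${SENSEAUDIO_API_KEY:-}" ]; then
    echo "SENSEAUDIO_API_KEY is not set" >&2
    return 1
  fi
}

# Hypothetical helper: print a masked form of a secret for log lines.
# Only the last four characters stay visible; values shorter than four
# characters are masked entirely.
mask_key() {
  printf '****%s\n' "${1#"${1%????}"}"
}
```

Calling require_api_key before the curl request stops the run early when the variable is missing, and mask_key "$SENSEAUDIO_API_KEY" gives a value that is safer to write to a log than the key itself.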
Text submitted for synthesis and related pronunciation annotations are shared with the external TTS provider.
The API request sends the user's text, voice identifier, and pronunciation dictionary to the SenseAudio provider; this external data flow is necessary for the stated TTS function.
"text": "<TEXT>", ... "voice_id": "<CLONED_VOICE_ID>", ... "dictionary": <DICTIONARY_ARRAY>
Avoid submitting confidential or regulated text unless SenseAudio's handling and retention policies are acceptable for your use case.
