WeChat Voice

v0.1.0

WeChat voice parsing skill designed for the WeChat clawbot. It recognizes WeChat SILK voice messages, decodes them to WAV, and replies with a transcription produced by a local Whisper model. Intended for WeChat voice messages, voice-to-text, parsing voice attachments, and "what does this voice message say" scenarios.

Security Scan
VirusTotal: Pending
OpenClaw: Benign (high confidence)
Purpose & Capability
Name/description (WeChat SILK decode + local Whisper transcription) align with the included script and runtime notes. The script inspects file headers, decodes SILK via pysilk, falls back to ffmpeg, and transcribes with faster-whisper — all coherent with the declared purpose and expected local dependencies.
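The header inspection step can be sketched with the stdlib alone. This is a hedged sketch: the function name is mine, but the magic bytes are the documented SILK v3 signature, which WeChat files prefix with a single 0x02 byte.

```python
SILK_MAGIC = b"#!SILK_V3"

def looks_like_wechat_silk(path: str) -> bool:
    """Return True if the file carries a SILK v3 header (plain or WeChat variant)."""
    with open(path, "rb") as f:
        head = f.read(len(SILK_MAGIC) + 1)
    # Standard SILK v3 starts with the magic; WeChat prepends one 0x02 byte.
    return head.startswith(SILK_MAGIC) or head[1:].startswith(SILK_MAGIC)
```

A file that fails this check would fall through to the ffmpeg path rather than pysilk.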
Instruction Scope
Instructions correctly limit actions to inspecting and decoding a provided audio file and transcribing it; they explicitly require read access to the local attachment path. One important operational note: faster-whisper's WhisperModel('base') will typically fetch model weights from the network on first run (and the pip installs download packages), so the skill may perform large external downloads and use significant local CPU and disk when first invoked.
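That first-run download can be made explicit and controllable: faster-whisper's WhisperModel accepts download_root and local_files_only parameters that pin where weights live and whether the network may be used. The wrapper below is a sketch under those real parameters; the function name and defaults are my assumptions.

```python
def transcribe_local(wav_path: str, model_dir: str = "/opt/whisper-models") -> str:
    """Transcribe a WAV file with faster-whisper, keeping weights in model_dir."""
    from faster_whisper import WhisperModel  # deferred: heavy optional dependency

    model = WhisperModel(
        "base",
        device="cpu",
        compute_type="int8",        # smaller memory footprint for CPU-only runs
        download_root=model_dir,    # cache weights in a known, auditable location
        # local_files_only=True,    # uncomment after the first download to forbid network
    )
    segments, _info = model.transcribe(wav_path)
    return "".join(seg.text for seg in segments)
```

Enabling local_files_only after the initial fetch turns a silent download into a fast, visible failure.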
Install Mechanism
There is no automated install spec (the skill is instruction-only). The SKILL.md recommends user-level pip installs ('silk-python', 'faster-whisper') and expects ffmpeg and python3 to be present. This is low technical risk, but the steps involve network downloads from PyPI and, implicitly, model weight fetches; no code is fetched from an arbitrary URL and no archive extraction occurs in the install flow.
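A minimal preflight can confirm those dependencies without installing anything. This is a sketch (the helper name is mine); it maps the importable module names to the pip package names the SKILL.md recommends.

```python
import importlib.util
import shutil

def missing_dependencies() -> list:
    """Return the names of dependencies the skill expects but cannot find."""
    missing = []
    if shutil.which("ffmpeg") is None:              # fallback decoder on PATH
        missing.append("ffmpeg")
    for module, pip_name in (("pysilk", "silk-python"),
                             ("faster_whisper", "faster-whisper")):
        if importlib.util.find_spec(module) is None:
            missing.append(pip_name)                # pip install --user <pip_name>
    return missing
```

Running this before invoking the skill surfaces missing pieces early instead of mid-transcription.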
Credentials
The skill does not request environment variables, credentials, or unusual config paths. It only needs read access to the provided audio file and permission to write a temporary WAV (default /tmp/wechat-voice-decoded.wav), which is proportionate to its function.
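One caveat on the fixed default: predictable names in shared /tmp can be pre-created by other local users. If that matters in your environment, the stdlib pattern below creates a per-run path instead; the helper name is hypothetical, and whether the skill exposes an override for its output path is not documented here.

```python
import os
import tempfile

def make_output_wav_path(prefix: str = "wechat-voice-") -> str:
    """Create a unique, securely-owned temp file and return its path."""
    fd, path = tempfile.mkstemp(prefix=prefix, suffix=".wav")
    os.close(fd)        # the decoder process will reopen and write the WAV here
    return path
```

mkstemp creates the file with restrictive permissions and a unique name, avoiding the symlink/pre-creation pitfalls of a fixed path.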
Persistence & Privilege
'always' is false, so the skill is user-invocable rather than forced into every agent run. It does not modify other skills or request system-wide privileges.
Assessment
This skill appears to do what it says: read a provided audio file, decode SILK audio, and transcribe locally. Before installing or running it, consider:
(1) Dependencies: you need python3, ffmpeg, and pip-installed packages (silk-python/pysilk and faster-whisper), which are downloaded from the network.
(2) Model weights: the Whisper 'base' model will likely be downloaded the first time it runs and can be large and CPU/disk intensive.
(3) Resource use: transcription runs locally and may be slow on CPU.
(4) File access: the script reads the supplied audio path and writes a temporary WAV (default /tmp/wechat-voice-decoded.wav); only pass audio files you trust.
If you want tighter control, preinstall the packages and the model offline, or run the skill in a sandboxed environment with restricted network access and disk quotas.
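The "preinstall, then run offline" suggestion can be enforced at run time through huggingface_hub's standard environment switches, which faster-whisper's model downloads honor. HF_HOME and HF_HUB_OFFLINE are real variables; the helper name and the cache directory default are my assumptions.

```python
import os

def force_offline_model_cache(cache_dir: str = "/opt/whisper-models") -> dict:
    """Point the model cache at cache_dir and forbid further network fetches."""
    env = {
        "HF_HOME": cache_dir,      # where huggingface_hub keeps downloaded weights
        "HF_HUB_OFFLINE": "1",     # fail fast instead of silently downloading
    }
    os.environ.update(env)
    return env
```

Call this (or export the same variables in the shell) after a one-time online download, and any later attempt to fetch weights becomes an immediate, visible error.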


Version: latest (vk972tn9fggwqabezzp3ktas91183cypf)

License

MIT-0
Free to use, modify, and redistribute. No attribution required.
