Install
openclaw skills install wechat-voice-decode
A WeChat voice parsing skill designed for the WeChat clawbot. It identifies WeChat SILK voice messages, decodes them to WAV, transcribes them with a local Whisper model, and replies based on what was said. It suits WeChat voice messages, speech-to-text, voice attachment parsing, and requests such as "what does this voice message say?"
1. Inspect the first bytes of the attachment (look for #!SILK_V3).
2. If the header contains #!SILK_V3, treat it as WeChat SILK.
3. Run scripts/transcribe_wechat_voice.py on the attachment.
4. If the result is NO_SEGMENTS, tell the user the clip appears blank, too short, too quiet, or unclear.

python3 /root/.openclaw/workspace/skills/wechat-voice/scripts/transcribe_wechat_voice.py <audio_path>
Optional WAV output path:
python3 /root/.openclaw/workspace/skills/wechat-voice/scripts/transcribe_wechat_voice.py <audio_path> /tmp/wechat-voice.wav
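The header check from the workflow above can be sketched in a few lines of Python. This is only an illustration, not the skill's own code: the function names and the 16-byte search window are assumptions. The sketch searches near the start of the file instead of requiring the marker at offset 0, in case a file carries a prefix byte before #!SILK_V3.

```python
SILK_MARKER = b"#!SILK_V3"


def looks_like_wechat_silk(data: bytes, search_window: int = 16) -> bool:
    """Return True if the SILK v3 marker appears within the first bytes.

    search_window is an illustrative choice, not a value taken from the skill.
    """
    return SILK_MARKER in data[:search_window]


def file_looks_like_wechat_silk(path: str) -> bool:
    """Read only the start of the file and apply the same check."""
    with open(path, "rb") as f:
        return looks_like_wechat_silk(f.read(16))
```

A caller would run file_looks_like_wechat_silk(audio_path) first and only hand the attachment to the transcription script when it returns True.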
This skill is text-only and ClawHub-friendly, but it expects common local runtimes and Python packages to be available.
- python3
- ffmpeg

Install locally with:
python3 -m pip install --user silk-python faster-whisper
Use silk-python to decode WeChat SILK audio in Python.
Use faster-whisper for local CPU transcription.
Prefer faster-whisper over openai-whisper in this environment because it avoids a heavy PyTorch/CUDA installation chain and works well on CPU.
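Before running the skill, it can help to confirm that the runtimes and packages listed above are present. The following is a minimal sketch using only the standard library. The function name is hypothetical, and the import names pysilk (for silk-python) and faster_whisper (for faster-whisper) are assumptions that should be checked against the installed packages.

```python
import importlib.util
import shutil


def missing_requirements(
    binaries=("python3", "ffmpeg"),
    modules=("pysilk", "faster_whisper"),  # assumed import names
):
    """Return a list of requirements that could not be found locally."""
    # Executables are looked up on PATH.
    missing = [f"binary:{name}" for name in binaries if shutil.which(name) is None]
    # Python packages are located without importing them.
    missing += [
        f"module:{name}"
        for name in modules
        if importlib.util.find_spec(name) is None
    ]
    return missing
```

An empty list means everything was found; otherwise each entry names the missing binary or module.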
The script prints exactly one of these:
- the transcribed text
- NO_SEGMENTS if no usable speech is detected

Treat NO_SEGMENTS as a valid outcome, not as a crash.
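One way to handle the script's output in calling code is sketched below. It is a hedged illustration, not part of the skill: the function names and result labels are hypothetical, and it assumes that any output other than NO_SEGMENTS is the transcript.

```python
import subprocess

SCRIPT = "/root/.openclaw/workspace/skills/wechat-voice/scripts/transcribe_wechat_voice.py"
NO_SEGMENTS = "NO_SEGMENTS"


def interpret_output(stdout: str):
    """Map the script's printed output to a (kind, message) pair."""
    text = stdout.strip()
    if text == NO_SEGMENTS:
        # A valid outcome: report it to the user instead of raising.
        return ("no_speech", "The clip appears blank, too short, too quiet, or unclear.")
    # Assumption: anything else is the transcribed text.
    return ("transcript", text)


def transcribe(audio_path: str):
    """Run the transcription script and interpret what it printed."""
    result = subprocess.run(
        ["python3", SCRIPT, audio_path],
        capture_output=True,
        text=True,
        check=True,  # a non-zero exit is a real failure, unlike NO_SEGMENTS
    )
    return interpret_output(result.stdout)
```

Keeping interpret_output separate from the subprocess call makes the NO_SEGMENTS branch easy to test without any audio file.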
Check these in order:
1. Does the file header contain #!SILK_V3?
2. Did the script return NO_SEGMENTS?

For byte inspection, use a quick Python snippet such as:
python3 - <<'PY'
from pathlib import Path
p = Path('/path/to/audio')
b = p.read_bytes()[:64]
print('size=', p.stat().st_size)
print('hex=', b.hex())
print('ascii=', ''.join(chr(x) if 32 <= x < 127 else '.' for x in b))
PY
- scripts/transcribe_wechat_voice.py: Decode/transcribe entry point.
- references/notes.md: Environment-specific notes and maintenance hints.