Qwen ASR (C-based Offline)
Offline Chinese and mixed Chinese-English speech-to-text recognition in pure C without Python or FFmpeg dependencies, suitable for edge devices.
MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
Scanner: OpenClaw · Verdict: Suspicious (high confidence)
Purpose & Capability
The skill's stated purpose (offline C ASR suitable for edge devices) is plausible, but the package metadata lists no required binaries or environment variables, while SKILL.md and run.sh clearly rely on external artifacts: a cloned repository (~/.openclaw/workspace/qwen-asr), a compiled qwen_asr binary (built with `make blas`), and model files (~1.7–4.5GB). SKILL.md also inconsistently claims 'no FFmpeg dependencies' while the runner invokes ffmpeg for preprocessing. These mismatches are disproportionate to the simple description and suggest sloppy or inconsistent packaging.
Instruction Scope
The included run.sh performs several runtime actions outside a narrow 'just run local binary' scope: it expects a repo in $HOME/.openclaw/workspace (or instructs the user to git clone it), may invoke make to compile code on the host, calls ffmpeg to transcode audio, and requires the user to run a model download script (which needs network and TTY). The script does not exfiltrate credentials or data, but it does rely on network access (for model download) and mutates the user's workspace. The SKILL.md's claim of fully offline inference is only true after a one-time download step, which is not automated.
Install Mechanism
There is no formal install spec (instruction-only), which minimizes automated risk, but the runner expects the user to clone and build upstream code and to have system tools (git, make, C compiler, ffmpeg, BLAS) available. The model download is left to the user (manual git/./download_model.sh), reducing silent remote downloads, but building unknown code on-device is a real risk if you don't trust the upstream repository.
Credentials
The skill requests no environment variables or credentials, and the run script does not attempt to read secret env vars or other unrelated config files. The lack of credential requests is appropriate for an offline ASR skill; the only external requirement is the model download (no credentials shown).
Persistence & Privilege
The skill is not force-enabled (always:false) and does not modify system-wide configs. It operates on files under $HOME/.openclaw/workspace and temporary WAV files; it does not request elevated privileges or persist credentials. Autonomous invocation is allowed by platform default but is not combined with additional concerning privileges here.
What to consider before installing
Key points before installing/using:
- The packaging is inconsistent: SKILL.md says 'no FFmpeg' but run.sh calls ffmpeg. Expect to need system tools: git, make, a C toolchain, BLAS (OpenBLAS/MKL), and ffmpeg.
- The skill does not auto-download models: you must run the upstream download_model.sh manually (this requires internet and TTY). After that, inference can be offline. Verify the model source and license before downloading (~1.7–4.5GB each).
- The runner compiles and executes code from ~/.openclaw/workspace/qwen-asr. Only proceed if you trust the upstream repository (https://github.com/antirez/qwen-asr) — inspect its code and the download script (download_model.sh) yourself.
- Run builds in a sandbox/container or non-privileged account if possible, and do not run as root. Check disk space and memory requirements before use.
- Because the metadata omitted required binaries and contradicted the README, treat this as a poorly packaged integration rather than clearly malicious — but verify upstream sources and inspect scripts before running them.

Like a lobster shell, security has layers — review code before you run it.
Current version: v1.0.0
Tags: ai · asr · c · latest · offline · speech
License
MIT-0
Free to use, modify, and redistribute. No attribution required.
SKILL.md
qwen-asr: Offline Chinese Speech Recognition (Pure C)
Chinese speech-to-text using the qwen3-asr-0.6b model from antirez/qwen-asr, with no Python/GIL/FFmpeg dependencies; suitable for edge deployment.
Dependencies
| Platform | Dependency | Notes |
|---|---|---|
| macOS | Accelerate.framework | Ships with the system; linked automatically |
| Linux | OpenBLAS or Intel MKL | Must be installed manually |
Typical usage
# Transcribe audio (automatically preprocessed to 16kHz/mono/WAV)
.skill qwen-asr --audio /path/to/audio.wav
# Choose a model (small = 0.6B, large = 1.7B)
.skill qwen-asr --audio /path/to/audio.wav --model large
# Set the number of threads
.skill qwen-asr --audio /path/to/audio.wav --threads 4
Output
[中文] 现在已经可以用了吗? ("Is it ready to use now?")
Mixed Chinese/English speech is supported (the model's training corpus is bilingual).
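The `.skill` wrapper above presumably shells out to the compiled qwen_asr binary under the workspace path mentioned in the scan report. A hypothetical sketch of how such a wrapper might assemble the command line; the binary path and the pass-through flag names are assumptions taken from this listing, not verified against the upstream CLI:

```c
/* Hypothetical: build the qwen_asr command line a wrapper like run.sh
 * might execute. Path and flag names are assumptions from this listing. */
#include <stdio.h>

int build_cmd(char *buf, size_t n, const char *home,
              const char *wav, const char *model, int threads) {
    /* e.g. /home/user/.openclaw/workspace/qwen-asr/qwen_asr --audio ... */
    return snprintf(buf, n,
        "%s/.openclaw/workspace/qwen-asr/qwen_asr"
        " --audio %s --model %s --threads %d",
        home, wav, model, threads);
}
```

Checking snprintf's return value against the buffer size (as the caller should) guards against truncated paths being executed.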
Model sizes
| Model | Size | Recommended for |
|---|---|---|
| qwen3-asr-0.6b | ~1.7GB | Recommended: low latency, edge devices |
| qwen3-asr-1.7b | ~4.5GB | Higher accuracy (needs ≥4GB RAM) |
Notes
- Audio must be 16kHz/mono/16-bit PCM WAV (the script converts non-conforming audio automatically)
- The first run downloads the model (~1.7GB); later runs reuse it
- Only .ogg/.mp3/.wav → .wav preprocessing is supported (handled via FFmpeg)
- Inference is fully offline and needs no network (except for the model download step)
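The WAV requirement above can be verified before handing a file to the binary. A minimal sketch in C, assuming the canonical 44-byte RIFF/WAVE header layout (real files may carry extra chunks and would need a proper chunk parser):

```c
/* Sketch: check that a WAV header describes the 16 kHz/mono/16-bit PCM
 * format the binary expects. Field offsets follow the canonical 44-byte
 * RIFF/WAVE header; all multi-byte fields are little-endian. */
#include <stdint.h>
#include <string.h>

static uint32_t le32(const uint8_t *p) {
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}
static uint16_t le16(const uint8_t *p) {
    return (uint16_t)(p[0] | p[1] << 8);
}

/* Returns 1 if the header describes 16 kHz / mono / 16-bit PCM. */
int wav_is_compatible(const uint8_t *hdr, size_t len) {
    if (len < 44) return 0;
    if (memcmp(hdr, "RIFF", 4) || memcmp(hdr + 8, "WAVE", 4)) return 0;
    if (le16(hdr + 20) != 1)     return 0;  /* audio format: PCM */
    if (le16(hdr + 22) != 1)     return 0;  /* channels: mono */
    if (le32(hdr + 24) != 16000) return 0;  /* sample rate: 16 kHz */
    if (le16(hdr + 34) != 16)    return 0;  /* bits per sample */
    return 1;
}
```

Files that fail this check are the ones the runner reportedly transcodes with ffmpeg before inference.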
Author
- GitHub: @antirez
- Skill packaging: OpenClaw Agent
License
MIT (qwen-asr) + Alibaba Cloud Qwen3 ASR Model License
