Qwen ASR Skill
v1.3.0 · Provides high-accuracy speech-to-text conversion supporting 22 Chinese dialects and 30 languages, with automatic language detection, running on CPU.
⭐ 0 · 277 · 2 current · 2 all-time
by Shuai YUAN (@yszheda)
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan (OpenClaw)
Verdict: Benign · medium confidence

Purpose & Capability
Name/description (Qwen ASR, CPU-side dialect support) matches the code: a Node.js HTTP wrapper that invokes a Python asr.py using qwen-asr and torch. Declared package dependencies and Python requirements align with running a local ASR model. No unrelated cloud credentials or surprising binaries are requested.
Instruction Scope
SKILL.md instructions (git clone, npm install, pip install, npm start) map to the provided Node + Python code. The skill runs a local web server, accepts uploaded audio or base64, invokes the Python script, and deletes uploaded files. The SKILL.md claims model weights are downloaded from Hugging Face on first run; the code uses from_pretrained, which performs that download. There is no instruction to read unrelated user files, but the code does load environment variables (via dotenv) if a .env file is present.
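As a rough illustration of the base64 upload mode described above, a client could build its request body as shown below. This is a sketch under assumptions: the JSON field names ("audio_base64", "language") and the request shape are guesses for illustration, not taken from the skill's actual HTTP API.

```python
import base64
import json

def build_asr_payload(audio_bytes: bytes, language: str = "auto") -> str:
    """Encode raw audio as base64 and wrap it in a JSON request body.

    Field names here are illustrative guesses; check the skill's
    actual HTTP API before relying on them.
    """
    return json.dumps({
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
        "language": language,
    })

payload = build_asr_payload(b"\x00\x01fake-audio")
body = json.loads(payload)
# Round-trip check: decoding the field recovers the original bytes.
assert base64.b64decode(body["audio_base64"]) == b"\x00\x01fake-audio"
```

Base64 inflates the payload by roughly a third, which is why the file-upload path may be preferable for long recordings.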
Install Mechanism
There is no automatic download/install spec in the registry; the README asks the user to run npm and pip installs. Model weights are retrieved at runtime via Hugging Face from_pretrained (a common, known source). No obscure download URLs or archive extraction from untrusted hosts were found in the code.
Credentials
The skill declares no required env vars or credentials. It reads standard process.env values (MODEL_NAME, DEVICE, CACHE_DIR, PYTHON_PATH, etc.) and uses dotenv if a .env file exists — reasonable for configuration but means a local .env could alter behavior. If the chosen model is private, Hugging Face authentication (HF_TOKEN) might be needed even though it isn't declared. No requests for unrelated service credentials were found.
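The environment-driven configuration noted above could be read on the Python side roughly like this. The variable names (MODEL_NAME, DEVICE, CACHE_DIR, HF_TOKEN) come from the report; the default values shown are assumptions, not taken from asr.py.

```python
import os

def load_asr_config(env=None) -> dict:
    """Collect the skill's configuration from environment variables.

    Defaults here are illustrative; the actual asr.py may differ.
    """
    if env is None:
        env = os.environ
    return {
        "model_name": env.get("MODEL_NAME", "qwen-asr-default"),  # assumed default
        "device": env.get("DEVICE", "cpu"),                       # CPU per the skill's description
        "cache_dir": env.get("CACHE_DIR", os.path.expanduser("~/.cache/huggingface")),
        "hf_token": env.get("HF_TOKEN"),  # only needed for private HF models
    }

cfg = load_asr_config({"MODEL_NAME": "my-model", "DEVICE": "cpu"})
```

Passing an explicit dict instead of os.environ makes the configuration easy to audit and test, which matters here since a local .env can silently alter behavior.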
Persistence & Privilege
always:false and user-invocable:true. The skill runs as a local web service and does not request persistent platform-wide privileges or modify other skills. It executes a local Python script (asr.py) via python-shell, which is expected for this design.
Assessment
This skill appears coherent with its stated purpose, but review it and take precautions before installing:

1) The skill downloads ~6 GB of model weights from Hugging Face on first run; make sure you have the disk, CPU, and bandwidth capacity.
2) It runs a local web server that accepts file uploads. Uploaded audio is stored temporarily in an uploads/ directory and then deleted; run it in an isolated environment or container if you are cautious.
3) The code uses dotenv and standard environment variables. Check any local .env before starting to avoid unintentionally exposing secrets or changing behavior, and be aware that a private Hugging Face model would require an HF token not declared in the SKILL.md.
4) The repository origin is listed as a placeholder in SKILL.md; verify the source URL and review the code before deployment.

For higher assurance, run the skill inside a sandbox (container or VM) and inspect network traffic during the first model download.

Like a lobster shell, security has layers: review code before you run it.
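Because the first run pulls roughly 6 GB of weights, a simple pre-flight check can confirm there is room on the filesystem holding the model cache. This is a sketch, not part of the skill; the 6 GB figure is the approximate size stated in this report.

```python
import shutil

REQUIRED_GB = 6  # approximate model weight size per the scan report

def has_space_for_model(path: str = ".", required_gb: float = REQUIRED_GB) -> bool:
    """Return True if the filesystem holding `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3

# Point this at the model cache directory (e.g. CACHE_DIR) before first start.
print(has_space_for_model("."))
```

Running this against the cache directory before npm start avoids a failed, partially downloaded first run.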
Tags: asr · chinese · dialect · latest · light · minimal · qwen · speech-recognition
