Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Qwen3 Audio

v0.1.1

High-performance audio library for Apple Silicon with text-to-speech (TTS) and speech-to-text (STT).

by noah @darknoah
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal
Suspicious
OpenClaw
Benign
medium confidence
Purpose & Capability
The name/description (Qwen3 audio TTS/STT) match the included code and SKILL.md: the script wraps mlx-audio models for TTS, voice cloning, and STT. Declared dependency (mlx-audio) and the run commands in SKILL.md align with the stated functionality.
Instruction Scope
Instructions require a local Python .venv and running the provided script; they reference the local voices/ directory and read/write audio and text files (expected for this purpose). They also direct the agent to verify the env-check-list and to run 'uv' commands. The SKILL.md and script do not ask for unrelated secrets or to read arbitrary system files, but they do permit reading/writing files you point at and will download or access model artifacts over the network.
Install Mechanism
There is no formal install spec; the included script performs runtime installation via os.system("uv add mlx-audio --prerelease=allow") if mlx-audio is missing. This is coherent with the pyproject dependency, but it means installation happens dynamically at runtime and will execute package installation commands on the host (moderate risk). Model files may be pulled from Hugging Face or a mirror.
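One way to review the dependency before anything is installed is to check for it without running the installer. The sketch below is not the skill's code; the helper name `missing_dependency_command` and the import name `mlx_audio` are assumptions made for illustration. It only reports the command that would be needed and leaves running it to the reviewer.

```python
import importlib.util
import shutil


def missing_dependency_command(module_name="mlx_audio", package="mlx-audio"):
    """Return the install command as an argument list, or None if the
    module is already importable. Nothing is installed here.

    module_name is an assumed import name for the mlx-audio package.
    """
    if importlib.util.find_spec(module_name) is not None:
        return None
    # Same command the scan report quotes, kept as a list so it can be
    # passed to subprocess.run without going through a shell.
    return ["uv", "add", package, "--prerelease=allow"]


command = missing_dependency_command()
if command is None:
    print("mlx_audio is already importable; no install needed")
else:
    print("Not installed. Review the package, then run:", " ".join(command))
    print("uv found on PATH:", shutil.which("uv") is not None)
```

Keeping the check separate from the install means the package can be inspected, or the command run inside a container, before any third-party code reaches the host.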
Credentials
The skill declares no required environment variables or credentials and the code does not request secrets. It may set HF_ENDPOINT/HF_HUB_OFFLINE for model hub access. Network access to model hubs and a public test audio URL is expected for fetching models; no unrelated credentials are requested.
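The availability check and the resulting environment settings could look roughly like the sketch below. It is an illustration under assumptions, not the skill's actual code: the function names `endpoint_reachable` and `hub_environment` are hypothetical, and treating any HTTP response (even an error status) as "reachable" is a choice made here. `HF_ENDPOINT` and `HF_HUB_OFFLINE` are the variables named in the report.

```python
import http.client
import urllib.error
import urllib.request


def endpoint_reachable(url, timeout=3.0):
    """Send a HEAD request and report whether the host answered at all."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied, just not with a success status.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return False


def hub_environment(endpoint, reachable):
    """Return the environment variables to set for the model hub.

    Reachable: point the hub client at the endpoint.
    Unreachable: ask the hub client to work from the local cache only.
    """
    if reachable:
        return {"HF_ENDPOINT": endpoint}
    return {"HF_HUB_OFFLINE": "1"}
```

A caller would pass the result of `endpoint_reachable(endpoint)` into `hub_environment(endpoint, ...)` and apply the returned mapping with `os.environ.update(...)` before importing the model library. Setting `HF_HUB_OFFLINE=1` by hand is also a simple way to confirm that nothing is fetched once the weights are cached.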
Persistence & Privilege
The skill is not always-enabled and uses normal agent invocation. It does not modify other skills or global agent settings; it stores voice profiles in a local voices/ directory (expected).
Scan Findings in Context
[os.system-invocation-uv-add] expected: The script calls os.system("uv add mlx-audio --prerelease=allow") to install its runtime dependency if missing. This matches the need to have mlx-audio available but installs third-party code at runtime and should be reviewed before execution.
[network-access-hf-check] expected: The script performs a HEAD request to a Hugging Face endpoint (or mirror) to check availability and may set HF_HUB_OFFLINE. Network access and model downloads are expected for fetching model weights; confirm whether processing is local or uses remote inference endpoints in the installed mlx-audio implementation.
[remote-test-audio-url] expected: SKILL.md references a public audio URL (qianwen-res.oss-cn-beijing.aliyuncs.com). This is likely an example/test asset and is consistent with multimedia samples, but it indicates network references to external storage.
[temp-dir-write-and-delete] expected: The script writes temporary audio chunk files to a temp directory and deletes them. Temporary filesystem writes are necessary for chunked STT processing, but they will create files on disk during execution.
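The temporary-chunk pattern in the last finding can be sketched with the standard library alone. This is a minimal illustration, not the skill's implementation: `split_wav` and `handle_chunk` are hypothetical names, the input is assumed to be an uncompressed WAV file, and `handle_chunk` stands in for whatever STT call consumes each chunk.

```python
import tempfile
import wave
from pathlib import Path


def split_wav(source, chunk_seconds, handle_chunk):
    """Write fixed-length chunks of a WAV file into a temporary directory,
    call handle_chunk(path) on each, and return the collected results.

    The directory and every chunk file are deleted when the function
    returns, including when handle_chunk raises.
    """
    results = []
    with tempfile.TemporaryDirectory(prefix="stt_chunks_") as tmp:
        with wave.open(str(source), "rb") as reader:
            params = reader.getparams()
            frames_per_chunk = max(1, int(reader.getframerate() * chunk_seconds))
            index = 0
            while True:
                frames = reader.readframes(frames_per_chunk)
                if not frames:
                    break
                chunk_path = Path(tmp) / f"chunk_{index:04d}.wav"
                with wave.open(str(chunk_path), "wb") as writer:
                    # Copy channel count, sample width, and rate; the frame
                    # count in the header is corrected when the file closes.
                    writer.setparams(params)
                    writer.writeframes(frames)
                results.append(handle_chunk(chunk_path))
                index += 1
    return results
```

Because the chunks live inside a `TemporaryDirectory` context, the only lasting output is whatever `handle_chunk` returns, which matches the write-then-delete behaviour the scan describes.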
Assessment
This skill appears to do what it claims (local TTS/STT using the mlx-audio package), but it will: (1) install a third-party Python package at runtime via the 'uv' tool; (2) access network endpoints to check for and download model data (Hugging Face or a mirror) and reference an external test audio URL; and (3) read and write audio and text files, including creating a voices/ folder and temporary chunk files. Before installing or using it: review the mlx-audio package and its reputation; run the skill in an isolated environment (VM or container) to limit the blast radius; confirm whether model inference happens locally or on a remote endpoint (if remote, audio might be transmitted to external servers); and vet any models you download for license and privacy implications. Because the skill's source and homepage are unknown, exercise extra caution and avoid providing sensitive audio unless you are confident about where processing occurs.

Like a lobster shell, security has layers — review code before you run it.

latest vk978sq7wmv5wjkxyk6kpt3zbgn828czr
