Aliyun Speech Transcriber
v0.1.0
Transcribe publicly accessible audio or video URLs with Aliyun speech services. Use when the user wants speech-to-text via Aliyun DashScope, needs transcript...
Security Scan
OpenClaw
Benign
medium confidence
Purpose & Capability
The skill's name, description, SKILL.md, and included script all align: they submit public media URLs to Aliyun DashScope and return transcript JSON/plain text. The declared required environment variable (ASR_DASHSCOPE_API_KEY with a DASHSCOPE_API_KEY fallback) matches the code. One incongruity: registry metadata lists no required binaries, but the runtime instructions and included file require running 'node scripts/transcribe.js' (i.e., Node.js must be available).
Instruction Scope
The SKILL.md directs the agent to run the bundled Node script which only interacts with DashScope endpoints and the transcription result URLs. However, the script will fetch any transcription_url returned by DashScope and include that content in the printed JSON. If DashScope (or a malicious intermediary) returned a URL pointing at an internal endpoint or other unintended resource, the script would fetch and expose that content in the transcript output. The SKILL.md does include a safety rule to only send URLs the user intends to transcribe, but there is an inherent risk in following provider-supplied result URLs without additional validation.
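One way to reduce the risk described above is to check a provider-supplied result URL against a host allowlist before fetching it. The following is a minimal sketch, not the skill's actual code: the function name `isAllowedResultUrl` and the `.aliyuncs.com` suffix are assumptions for illustration, and the real hosts used for result files would need to be confirmed against actual DashScope responses.

```javascript
// Hypothetical allowlist: the host suffixes that result files are served
// from are an assumption and must be confirmed before relying on this.
const ALLOWED_RESULT_HOST_SUFFIXES = [".aliyuncs.com"];

// Returns true only for HTTPS URLs whose hostname ends with an allowed suffix.
function isAllowedResultUrl(rawUrl, allowedSuffixes = ALLOWED_RESULT_HOST_SUFFIXES) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:") return false;
  const host = url.hostname.toLowerCase();
  return allowedSuffixes.some((suffix) => host.endsWith(suffix));
}
```

A caller would skip (or report) any transcription_url for which this returns false instead of fetching it. Suffix matching with a leading dot rejects look-alike hosts such as evil-aliyuncs.com, and IP-literal hosts never match.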
Install Mechanism
No install spec (instruction-only with an included script). All code is provided in the package, so nothing is downloaded from unknown external URLs at install time. This is low installation risk.
Credentials
Only Aliyun DashScope API key environment variables are required (ASR_DASHSCOPE_API_KEY or DASHSCOPE_API_KEY); optional vars control model, language hints, and polling/timeouts. There are no unrelated credentials or broad access requests.
Persistence & Privilege
The skill does not request permanent presence (always:false) and uses normal agent invocation. It does not modify other skills or system configs. This is proportionate for the stated function.
Assessment
This skill appears to do what it says: submit public media URLs to Aliyun DashScope and return transcripts. Before installing: ensure Node.js is available in the environment (the package instructs running node, but 'required binaries' was left empty in metadata); keep your ASR_DASHSCOPE_API_KEY secret (do not hardcode it); only transcribe URLs you control or explicitly trust. Be aware the script will fetch any transcription_url the provider returns and include that content in its output. If an unexpected or malicious URL is returned, it could cause the agent to retrieve and expose unintended data. If you need stronger guarantees, ask the author for URL validation (restrict to known storage hosts) or to avoid automatically fetching provider-supplied result URLs.
scripts/transcribe.js:33
Environment variable access combined with network send.
Confirmed safe by external scanners
Static analysis detected API credential-access patterns, but both VirusTotal and OpenClaw confirmed this skill is safe. These patterns are common in legitimate API integration skills.
Like a lobster shell, security has layers: review code before you run it.
Runtime requirements
🎤 Clawdis
Env: ASR_DASHSCOPE_API_KEY
latest
Aliyun Speech Transcriber
Use this skill to turn externally accessible media URLs into transcript results.
Current scope
The current implementation focuses on DashScope file transcription using the paraformer-v2 model, aligned with the existing Java service pattern.
Required environment variables
ASR_DASHSCOPE_API_KEY
Fallback supported:
DASHSCOPE_API_KEY
Optional:
- ALIYUN_SPEECH_MODEL - defaults to paraformer-v2
- ALIYUN_SPEECH_LANG_HINTS - defaults to zh,en
- ALIYUN_SPEECH_POLL_SECONDS - defaults to 5
- ALIYUN_SPEECH_TIMEOUT_SECONDS - defaults to 1800
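The variables above could be resolved along these lines. This is a sketch under stated assumptions, not the bundled script: the function names `loadConfig` and `positiveNumber` and the shape of the returned object are hypothetical; only the variable names, the fallback order, and the defaults come from this document.

```javascript
// Parses a numeric setting, falling back when it is unset, empty, or not positive.
function positiveNumber(value, fallback) {
  const n = Number(value);
  return Number.isFinite(n) && n > 0 ? n : fallback;
}

// Hypothetical helper: resolves the API key (primary first, then fallback)
// and applies the documented defaults for the optional variables.
function loadConfig(env = process.env) {
  const apiKey = env.ASR_DASHSCOPE_API_KEY || env.DASHSCOPE_API_KEY;
  if (!apiKey) {
    // Fail fast rather than sending an unauthenticated request.
    throw new Error("Missing ASR_DASHSCOPE_API_KEY (or DASHSCOPE_API_KEY fallback)");
  }
  return {
    apiKey,
    model: env.ALIYUN_SPEECH_MODEL || "paraformer-v2",
    languageHints: (env.ALIYUN_SPEECH_LANG_HINTS || "zh,en")
      .split(",")
      .map((hint) => hint.trim())
      .filter(Boolean),
    pollSeconds: positiveNumber(env.ALIYUN_SPEECH_POLL_SECONDS, 5),
    timeoutSeconds: positiveNumber(env.ALIYUN_SPEECH_TIMEOUT_SECONDS, 1800),
  };
}
```

Passing `env` as a parameter keeps the helper testable without touching the real process environment.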
Inputs
Pass one or more externally accessible URLs:
node scripts/transcribe.js --file-url "https://example.com/audio.mp3"
Multiple files:
node scripts/transcribe.js --file-url "https://a.com/1.mp3" --file-url "https://a.com/2.mp3"
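For illustration, repeated `--file-url` flags like those above might be collected as follows. This is a hedged sketch rather than the script's actual parser: `parseFileUrls` is a hypothetical name, and restricting input to http/https URLs is an assumption based on the requirement that URLs be externally accessible.

```javascript
// Hypothetical parser: collects every value that follows a --file-url flag.
function parseFileUrls(argv) {
  const urls = [];
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] !== "--file-url") continue;
    const value = argv[i + 1];
    if (value === undefined || value.startsWith("--")) {
      throw new Error("--file-url requires a value");
    }
    const parsed = new URL(value); // throws on malformed URLs
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
      throw new Error(`Unsupported URL scheme: ${parsed.protocol}`);
    }
    urls.push(value);
    i++; // skip the value that was just consumed
  }
  if (urls.length === 0) {
    throw new Error("At least one --file-url is required");
  }
  return urls;
}
```

In a script this would typically be called as `parseFileUrls(process.argv.slice(2))`.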
Output
The script returns JSON with:
- success
- provider
- engine
- taskId
- requestId
- results
- text
text is a best-effort plain-text extraction from the final JSON result.
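A best-effort extraction of this kind could look like the sketch below. It assumes each result JSON carries a `transcripts` array whose entries have a `text` field and, optionally, a `sentences` array with per-sentence `text`; that shape and the name `extractPlainText` are assumptions for illustration and are not confirmed by this document.

```javascript
// Hypothetical extractor: joins transcript text, tolerating missing fields.
function extractPlainText(result) {
  const transcripts = Array.isArray(result?.transcripts) ? result.transcripts : [];
  const parts = transcripts.map((transcript) => {
    // Prefer the transcript-level text when it is present and non-empty.
    if (typeof transcript?.text === "string" && transcript.text.trim() !== "") {
      return transcript.text.trim();
    }
    // Otherwise fall back to concatenating sentence-level text.
    const sentences = Array.isArray(transcript?.sentences) ? transcript.sentences : [];
    return sentences
      .map((sentence) => (typeof sentence?.text === "string" ? sentence.text.trim() : ""))
      .filter(Boolean)
      .join(" ");
  });
  return parts.filter(Boolean).join("\n");
}
```

Returning an empty string for unrecognized input, rather than throwing, keeps the structured results usable even when plain-text extraction fails.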
Chaining from Qiniu
Typical workflow:
- Use qiniu-upload to upload a local file.
- Prefer a signed private URL if the domain is not anonymously readable.
- Pass the returned URL into this skill.
Safety rules
- Never hardcode Aliyun credentials.
- Fail fast if DASHSCOPE_API_KEY is missing.
- Only send URLs the user intends to transcribe.
