Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Azure Speech TTS

Azure Speech TTS skill for generating local audio files from text or SSML with Azure Speech. Use when the user asks to use Azure Speech / Azure TTS / Microso...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (high confidence)
Purpose & Capability
The name/description, README, SKILL.md, and the included script all consistently implement Azure Speech Text-to-Speech (text/SSML → local audio file). Network calls target Microsoft Cognitive Services endpoints and the options (voice, format, SSML, output file) match the stated purpose.
Instruction Scope
Runtime instructions and the script operate within the stated scope: they read text/SSML input, optionally read files/stdin, construct SSML, call Azure token and synthesis endpoints, and write local audio and optional SSML files. There are no hidden external endpoints, token exfiltration code, or instructions to read unrelated system files in the provided code and documentation.
Install Mechanism
No install spec is present (instruction-only install). A Python script is included and uses only the standard library (urllib, pathlib, etc.), so there is no package download or archive extraction risk in the manifest.
Credentials
SKILL.md and the script require AZURE_SPEECH_KEY and AZURE_SPEECH_REGION (and optionally AZURE_SPEECH_VOICE and AZURE_SPEECH_FORMAT), but the registry metadata lists no required environment variables and no primary credential. This omission is a significant inconsistency: the skill cannot function without subscription credentials, so the metadata should declare them explicitly.
Persistence & Privilege
The skill is not always-enabled and does not request elevated or persistent platform privileges. It writes output under a local download/ directory only and does not modify other skills or system-wide configuration.
What to consider before installing
This skill appears to be a legitimate Azure Speech TTS helper, but the package metadata does not declare the required Azure credentials (AZURE_SPEECH_KEY and AZURE_SPEECH_REGION). Before installing or running it:

  1. Confirm the source/publisher and whether the missing credential declaration is intentional.
  2. Never paste secrets into config.json; use environment variables as documented.
  3. Run with --dry-run first to inspect the generated SSML.
  4. Provide only a limited, rotatable Azure Speech key and a region you control, and rotate the key afterwards if you supplied an existing secret.
  5. If you need stricter assurance, run the script in an isolated environment (for example, a container) and inspect the full script. The scan found only Microsoft Cognitive Services endpoints, but you should verify that no other network endpoints are present.

If you plan to publish or reuse this skill, ask the publisher to update the registry metadata to declare the required environment variables and primary credential so that the credential requirements are visible up front.

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.1
latest: vk97bjwvrfghptmp03ppmf9yjg183ygx9

SKILL.md

Azure Speech TTS

Use Azure Speech to turn text or SSML into a local audio file under download/.

What this skill does

  • Synthesize plain text into speech
  • Synthesize full SSML payloads directly
  • Choose voice, output format, rate, pitch, style, and role
  • Save the result as a local audio file and print a JSON summary

Configuration

This skill uses a small default config file plus environment variables.

Default config file

File:

  • config.json

Default values:

  • default_voice: zh-CN-Yunqi:DragonHDOmniLatestNeural
  • default_format: mp3
  • default_output_dir: download
  • default_timeout_seconds: 60

Secret values

Set these in the local shell environment:

  • AZURE_SPEECH_KEY
  • AZURE_SPEECH_REGION
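
A caller may want to fail early when these variables are absent. The following is a minimal sketch of such a check; `missing_secrets` and `REQUIRED_VARS` are hypothetical names for illustration and are not taken from the skill's script.

```python
import os

# The two secrets the skill documents as required.
REQUIRED_VARS = ("AZURE_SPEECH_KEY", "AZURE_SPEECH_REGION")


def missing_secrets(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_secrets()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real shell environment.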

Optional environment overrides

  • AZURE_SPEECH_VOICE
  • AZURE_SPEECH_FORMAT

Precedence

Use this order:

  1. CLI flag
  2. Environment variable
  3. config.json
  4. Built-in fallback
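
The precedence order above can be sketched as a small lookup chain. This is an illustration only, not the script's actual code: `resolve_setting`, `load_config`, and `BUILTIN_DEFAULTS` are hypothetical names, and the sketch assumes the environment and config key naming shown in this document (`AZURE_SPEECH_<NAME>` and `default_<name>`).

```python
import json
import os
from pathlib import Path

# Assumed built-in fallbacks, taken from the defaults listed in this document.
BUILTIN_DEFAULTS = {
    "voice": "zh-CN-Yunqi:DragonHDOmniLatestNeural",
    "format": "mp3",
}


def load_config(path="config.json"):
    """Load config.json if it exists; otherwise return an empty dict."""
    p = Path(path)
    return json.loads(p.read_text(encoding="utf-8")) if p.is_file() else {}


def resolve_setting(name, cli_value=None, env=None, config=None):
    """Return the first value found: CLI flag, env var, config.json, built-in."""
    env = os.environ if env is None else env
    config = {} if config is None else config
    if cli_value:
        return cli_value
    env_value = env.get(f"AZURE_SPEECH_{name.upper()}")
    if env_value:
        return env_value
    config_value = config.get(f"default_{name}")
    if config_value:
        return config_value
    return BUILTIN_DEFAULTS[name]
```

For example, `resolve_setting("voice", cli_value=None, config=load_config())` would return the environment override when `AZURE_SPEECH_VOICE` is set and fall back to `config.json` otherwise.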

Quick start

python3 scripts/azure_tts.py \
  --text "你好,这是一段测试语音。" \
  --voice zh-CN-Yunqi:DragonHDOmniLatestNeural \
  --format mp3 \
  --output download/test.mp3

For SSML:

python3 scripts/azure_tts.py \
  --ssml-file temp/input.ssml \
  --format wav \
  --output download/test.wav

Workflow

  1. Decide whether the input is plain text or full SSML.
  2. Use --text / --text-file for normal narration.
  3. Use --ssml / --ssml-file only when the payload already contains a complete <speak> document.
  4. Pick the voice and output format, or let config.json supply the defaults.
  5. Run scripts/azure_tts.py.
  6. Return the generated audio path to the user.

Rules

  • Prefer plain text unless the user needs pauses, emphasis, multi-voice content, or expressive styling.
  • --ssml input must include a full <speak> root element.
  • Default voice is zh-CN-Yunqi:DragonHDOmniLatestNeural if nothing else is set.
  • Default output folder is download/.
  • If the user does not specify a format, use the default MP3 output.
  • Do not put secrets in config.json.
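
The rule that --ssml input needs a full <speak> root can be checked before calling the script. The sketch below uses only the Python standard library; `has_speak_root` is a hypothetical helper, and the skill's own validation (if any) may differ.

```python
import xml.etree.ElementTree as ET


def has_speak_root(ssml: str) -> bool:
    """Return True if the payload is well-formed XML whose root is <speak>."""
    try:
        root = ET.fromstring(ssml)
    except ET.ParseError:
        return False
    # ElementTree reports namespaced tags as "{namespace}speak".
    return root.tag.rsplit("}", 1)[-1] == "speak"
```

A payload that fails this check is better sent through --text or --text-file, which lets the script build the surrounding SSML itself.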

Common formats

See references/azure-speech-cheatsheet.md for the format map and examples.

Short aliases supported by the script:

  • mp3
  • wav
  • pcm
  • ogg
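
One plausible way to expand these aliases is a lookup table of Azure output format names. Only the mp3 entry is confirmed by the sample output in this document; the wav, pcm, and ogg values below are assumptions chosen from Azure's published format names, and the script's real map lives in references/azure-speech-cheatsheet.md.

```python
# Assumed alias map; only "mp3" matches a value shown in this document.
FORMAT_ALIASES = {
    "mp3": "audio-24khz-48kbitrate-mono-mp3",
    "wav": "riff-24khz-16bit-mono-pcm",   # assumption
    "pcm": "raw-24khz-16bit-mono-pcm",    # assumption
    "ogg": "ogg-24khz-16bit-mono-opus",   # assumption
}


def resolve_format(name: str) -> str:
    """Expand a short alias; pass full Azure format names through unchanged."""
    return FORMAT_ALIASES.get(name.strip().lower(), name)
```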

Useful options

  • --voice: Azure voice name, for example en-US-AriaNeural
  • --language: SSML xml:lang for plain-text mode
  • --rate: speaking rate, for example +10%
  • --pitch: pitch adjustment, for example +2st
  • --style: expressive style such as cheerful, sad, chat
  • --style-degree: strength of the expressive style
  • --role: voice role when supported
  • --save-ssml: write the generated SSML to a file for inspection
  • --dry-run: print the generated SSML without calling Azure
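
To show how these options might map onto SSML in plain-text mode, here is a hedged sketch. `build_ssml` is a hypothetical function, not the script's implementation; it assumes the usual Azure SSML layout with `prosody` for rate and pitch and `mstts:express-as` for style, style degree, and role. Comparing against the output of --dry-run is the reliable way to see what the script really generates.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice, language="zh-CN", rate=None, pitch=None,
               style=None, style_degree=None, role=None):
    """Wrap plain text in a complete <speak> document (illustrative only)."""
    body = escape(text)  # escape &, <, > so the text is safe inside XML

    # Optional prosody wrapper for rate and pitch.
    prosody = [(k, v) for k, v in (("rate", rate), ("pitch", pitch)) if v]
    if prosody:
        attrs = "".join(f" {k}={quoteattr(v)}" for k, v in prosody)
        body = f"<prosody{attrs}>{body}</prosody>"

    # Optional expressive wrapper for style, style degree, and role.
    if style or role:
        attrs = ""
        if style:
            attrs += f" style={quoteattr(style)}"
        if style and style_degree is not None:
            attrs += f" styledegree={quoteattr(str(style_degree))}"
        if role:
            attrs += f" role={quoteattr(role)}"
        body = f"<mstts:express-as{attrs}>{body}</mstts:express-as>"

    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f"xml:lang={quoteattr(language)}>"
        f"<voice name={quoteattr(voice)}>{body}</voice></speak>"
    )
```

Escaping the text matters because characters such as `<` or `&` in plain narration would otherwise break the XML document.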

Output

The helper script writes the audio file and prints JSON like:

{
  "ok": true,
  "output_path": "download/test.mp3",
  "format": "audio-24khz-48kbitrate-mono-mp3",
  "voice": "zh-CN-Yunqi:DragonHDOmniLatestNeural",
  "language": "zh-CN",
  "bytes": 123456
}

Use the printed output_path as the deliverable path.
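
A caller could read the deliverable path from that JSON as follows. This is a sketch under the assumption that the script prints only the JSON summary to standard output; `parse_summary` is a hypothetical helper name.

```python
import json


def parse_summary(stdout: str) -> str:
    """Return output_path from the script's JSON summary, or raise on failure."""
    summary = json.loads(stdout)
    if not summary.get("ok"):
        raise RuntimeError(f"synthesis failed: {summary}")
    return summary["output_path"]


# Example wiring (not run here), assuming the CLI shown in Quick start:
#   import subprocess
#   result = subprocess.run(
#       ["python3", "scripts/azure_tts.py", "--text", "Hello", "--format", "mp3"],
#       capture_output=True, text=True, check=True,
#   )
#   path = parse_summary(result.stdout)
```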
