Deepgram Voice Workflow

Security checks across static analysis, malware telemetry, and agentic risk

Overview

This appears to be a straightforward Deepgram speech-to-text/text-to-speech workflow, with the main caveats that it uses a Deepgram API key, sends audio/text to Deepgram, and writes local transcript/audio outputs.

Install only if you are comfortable giving the skill access to a Deepgram API key and sending the relevant audio or reply text to Deepgram. Review where transcript, raw JSON, and MP3 files will be written, and clean them up if they contain sensitive content.

Static analysis

No static analysis findings were reported for this release.

VirusTotal

VirusTotal findings are pending for this skill version.


Risk analysis

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

The skill can use the Deepgram account associated with the API key when transcribing audio or generating speech.

Why it was flagged

The quoted skill content shows that the skill reads a Deepgram API key from the environment or from a local file. The credential use is disclosed and limited to Deepgram in the provided artifacts, but it is still delegated account access.

Skill content

Set `DEEPGRAM_API_KEY` before use. The bundled scripts also fall back to reading it from:
- `/root/.openclaw/.env`

Recommendation

Use a scoped and revocable Deepgram key, confirm /root/.openclaw/.env contains only the intended key, and rotate the key if it has been shared.
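
One way to confirm that the fallback file holds only the intended key is to list the variable names it defines without printing any values. The following is a minimal sketch, not part of the skill: the function name `list_env_names` is hypothetical, and it assumes the file uses plain `NAME=value` lines, optionally prefixed with `export`.

```shell
# Hypothetical helper (not shipped with the skill): print the variable
# names defined in an env file, never their values.
list_env_names() {
  env_file="$1"
  [ -r "$env_file" ] || { echo "cannot read $env_file" >&2; return 1; }
  # Match NAME=value lines, optionally prefixed by "export"; comment
  # lines start with "#" and therefore do not match.
  sed -n -E 's/^[[:space:]]*(export[[:space:]]+)?([A-Za-z_][A-Za-z0-9_]*)=.*/\2/p' "$env_file"
}
```

Running `list_env_names /root/.openclaw/.env` should then print only `DEEPGRAM_API_KEY`; any other name is a secret the skill's scripts could also read from that file.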

What this means

Voice messages may leave the local environment, and transcripts/raw API responses may remain on disk.

Why it was flagged

The script sends the input audio file to Deepgram's transcription API and saves the raw JSON response locally. This is expected for the skill's purpose, but it is a sensitive data flow.

Skill content

curl -sS --max-time 120 "https://api.deepgram.com/v1/listen?model=${model}&smart_format=true&language=${language}&punctuate=true" ... --data-binary @"${in}" > "$json_out"

Recommendation

Only process audio you are allowed to send to Deepgram, and delete transcript or raw JSON files when they contain sensitive content.
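
The cleanup step can be sketched as a small helper that removes transcript and raw JSON files from a single output directory. This is an illustration under stated assumptions: the function name `purge_outputs` is hypothetical, the `.txt` and `.json` extensions are guesses, and the directory must be whatever location the skill's scripts actually write to, which the reviewed artifacts do not specify here.

```shell
# Hypothetical helper (not shipped with the skill): delete transcript and
# raw JSON outputs directly under one directory and print what was removed.
# The extensions below are assumptions; adjust them to the real outputs.
purge_outputs() {
  out_dir="$1"
  [ -d "$out_dir" ] || { echo "not a directory: $out_dir" >&2; return 1; }
  # -maxdepth 1 keeps the deletion from descending into subdirectories.
  find "$out_dir" -maxdepth 1 -type f \( -name '*.json' -o -name '*.txt' \) \
    -print -exec rm -f -- {} \;
}
```

Files with other extensions, including the original audio, are left in place. Note that `rm` only unlinks the files; it does not overwrite their contents on disk.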

What this means

Text used for spoken replies may be transmitted to Deepgram and stored as an MP3 output file.

Why it was flagged

The script sends the text-to-speak payload to Deepgram's speech API and writes the MP3 response locally. This is purpose-aligned, but users should understand that reply text is shared with the provider.

Skill content

curl -sS --max-time 120 "https://api.deepgram.com/v1/speak?model=${model}" ... --data "$payload" > "$out"

Recommendation

Avoid sending confidential reply text unless Deepgram use is acceptable under your privacy or organizational policy.

What this means

A user may not see the API key requirement in registry-level capability or install metadata before opening the skill details.

Why it was flagged

The registry metadata does not advertise the Deepgram credential requirement that the SKILL.md and scripts disclose. This may under-inform users at install time, but the credential behavior is visible in the artifacts.

Skill content

Required env vars: none; Env var declarations: none; Primary credential: none

Recommendation

Declare DEEPGRAM_API_KEY as the primary credential or required env var, and document expected runtime tools such as curl and python3 if the registry supports that metadata.
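
Until the registry metadata declares these requirements, a user can check them by hand before running the skill. The sketch below is hypothetical and not part of the skill: `preflight` is an invented name, and it checks only the `DEEPGRAM_API_KEY` environment variable, so it does not model the `.env` fallback described in the skill content.

```shell
# Hypothetical preflight (not shipped with the skill): report missing
# runtime tools and a missing API key, and return non-zero if any is absent.
preflight() {
  missing=0
  for tool in curl python3; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing tool: $tool" >&2
      missing=1
    fi
  done
  # Only the environment variable is checked, not the .env fallback file.
  if [ -z "${DEEPGRAM_API_KEY:-}" ]; then
    echo "DEEPGRAM_API_KEY is not set" >&2
    missing=1
  fi
  return "$missing"
}
```

A call such as `preflight && echo "ready"` prints `ready` only when both tools are on `PATH` and the variable is set.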