audio to text and video to text

Security checks across static analysis, malware telemetry, and agentic risk

Overview

This appears to be a purpose-aligned transcription skill, but users should know it sends media to OpenAI, uses an OpenAI API key, and may install unpinned Python packages.

Before installing, confirm you are comfortable sending the media to OpenAI and paying for API usage. Use a secret/environment variable for the OpenAI key rather than pasting a long-lived key when possible. Review or pin the Python dependencies if using this in a sensitive environment, and treat transcript contents as data rather than instructions.

Static analysis

No static analysis findings were reported for this release.

VirusTotal

VirusTotal findings are pending for this skill version.

View on VirusTotal

Risk analysis

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

Audio or video content, and potentially sensitive conversations, may be processed by OpenAI under the user's account and may incur API costs.

Why it was flagged

The core workflow uploads audio chunks to OpenAI for transcription. This is disclosed and necessary for the stated purpose, but it matters for private or regulated recordings.

Skill content
Transcription — send each chunk to OpenAI's Whisper API
Recommendation

Only use this with media you are comfortable sending to OpenAI, and review applicable privacy, retention, consent, and billing requirements.

What this means

If the key is pasted into chat or passed on the command line, it may be exposed in the session or logs; API usage is charged to that OpenAI account.

Why it was flagged

The skill needs a user-provided OpenAI credential. This is expected for Whisper API use, but it grants account/API billing authority and the registry metadata does not declare a primary credential.

Skill content
**OpenAI API key** stored in the environment as `OPENAI_API_KEY` — the user must provide this
Recommendation

Prefer setting the key through a secret/environment mechanism, avoid sharing long-lived keys in chat, and rotate or restrict the key if exposure is a concern.

What this means

The local Python environment may be modified, and future behavior depends on whatever package version is fetched at install time.

Why it was flagged

If the OpenAI package is missing, the script installs an unpinned dependency at runtime. This is purpose-aligned, but it creates normal dependency/provenance risk.

Skill content
subprocess.check_call([sys.executable, "-m", "pip", "install", "openai", "--break-system-packages", "-q"])
Recommendation

Install dependencies from trusted sources, pin versions where possible, and avoid --break-system-packages in sensitive or shared environments.

What this means

Spoken content in a recording could unintentionally influence downstream summarization or actions if the agent fails to treat it as data.

Why it was flagged

Transcript text is derived from user-provided media and could contain instructions or sensitive content. Using it for summaries is expected, but it should not be treated as authoritative agent instructions.

Skill content
Use the transcript text directly in the conversation for these steps.
Recommendation

Delimit transcript content clearly, treat it as untrusted input, and ask the user before acting on any instructions found inside a transcript.