audio to text and video to text
Security checks across static analysis, malware telemetry, and agentic risk
Overview
This appears to be a purpose-aligned transcription skill, but users should know it sends media to OpenAI, uses an OpenAI API key, and may install unpinned Python packages.
Before installing, confirm you are comfortable sending the media to OpenAI and paying for API usage. Use a secret/environment variable for the OpenAI key rather than pasting a long-lived key when possible. Review or pin the Python dependencies if using this in a sensitive environment, and treat transcript contents as data rather than instructions.
Static analysis
No static analysis findings were reported for this release.
VirusTotal
VirusTotal findings are pending for this skill version.
Risk analysis
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Audio or video content, and potentially sensitive conversations, may be processed by OpenAI under the user's account and may incur API costs.
The core workflow uploads audio chunks to OpenAI for transcription. This is disclosed and necessary for the stated purpose, but it matters for private or regulated recordings.
Transcription — send each chunk to OpenAI's Whisper API
Only use this with media you are comfortable sending to OpenAI, and review applicable privacy, retention, consent, and billing requirements.
If the key is pasted into chat or passed on the command line, it may be exposed in the session or logs; API usage is charged to that OpenAI account.
The skill needs a user-provided OpenAI credential. This is expected for Whisper API use, but it grants account/API billing authority and the registry metadata does not declare a primary credential.
**OpenAI API key** stored in the environment as `OPENAI_API_KEY` — the user must provide this
Prefer setting the key through a secret/environment mechanism, avoid sharing long-lived keys in chat, and rotate or restrict the key if exposure is a concern.
The local Python environment may be modified, and future behavior depends on whatever package version is fetched at install time.
If the OpenAI package is missing, the script installs an unpinned dependency at runtime. This is purpose-aligned, but it creates normal dependency/provenance risk.
subprocess.check_call([sys.executable, "-m", "pip", "install", "openai", "--break-system-packages", "-q"])
Install dependencies from trusted sources, pin versions where possible, and avoid --break-system-packages in sensitive or shared environments.
Spoken content in a recording could unintentionally influence downstream summarization or actions if the agent fails to treat it as data.
Transcript text is derived from user-provided media and could contain instructions or sensitive content. Using it for summaries is expected, but it should not be treated as authoritative agent instructions.
Use the transcript text directly in the conversation for these steps.
Delimit transcript content clearly, treat it as untrusted input, and ask the user before acting on any instructions found inside a transcript.
