Transcribe Video To Text
Review
Audited by ClawScan on May 10, 2026.
Overview
The skill is advertised as video-to-text transcription, but its instructions also enable broad cloud video editing, rendering, upload, and export workflows that users may not expect.
Review this skill before installing if you only want text transcription. It appears to be a broader NemoVideo cloud editing/rendering integration, so confirm you are comfortable uploading videos to nemovideo.ai, using a NEMO_TOKEN or starter token, and potentially triggering edit/export workflows or credit usage.
Findings (4)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
A user expecting only a text transcript could unknowingly enter a broader video rendering or editing workflow.
The artifact presents the skill as text transcription but also tells the agent to deliver rendered MP4 video output, creating a material mismatch in user expectations.
"transcribe the spoken dialogue into a text document" ... "you've got a MP4 file ready to download. The whole thing runs at 1080p by default."
Align the description and instructions with the actual behavior, and clearly disclose when the skill will render, edit, export, or produce video rather than text.
The agent could consume credits, modify session state, upload media, or trigger exports in ways that are not clearly limited to transcription.
The instructions give the agent broad authority to run backend video editing/rendering workflows and translate backend UI-like messages into API actions, beyond the narrow transcription purpose.
"Everything else (generate, edit, add BGM…) | → §3.1 SSE" and "Backend says ... 'click [button]' ... Execute via API"
Limit the skill to transcription-specific actions, and require explicit user confirmation before uploads, edits, renders, exports, or credit-consuming operations.
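As an illustration of that recommendation, the following is a minimal sketch of a confirmation gate an agent wrapper could apply before acting. The action names (`transcribe`, `upload`, `edit`, `render`, `export`) and the `is_allowed` helper are hypothetical and are not taken from the skill or the NemoVideo API; the review does not list the real action identifiers.

```python
from typing import Callable

# Hypothetical action names for illustration only; the skill's real API
# actions are not enumerated in this review.
TRANSCRIPTION_ACTIONS = {"transcribe"}
SENSITIVE_ACTIONS = {"upload", "edit", "render", "export"}


def is_allowed(action: str, confirm: Callable[[str], bool]) -> bool:
    """Return True only if the action may proceed.

    Transcription-only actions pass directly, sensitive actions need an
    explicit yes from the user, and anything unrecognized is denied.
    """
    if action in TRANSCRIPTION_ACTIONS:
        return True
    if action in SENSITIVE_ACTIONS:
        return confirm(f"Allow '{action}'? It may upload media or use credits.")
    return False  # deny by default
```

In this sketch `confirm` is any callable that asks the user and returns a boolean, for example a prompt in the chat interface. Denying unknown actions by default means that a backend message such as "click [button]" cannot be turned into an API call unless the action was anticipated.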
The agent can authenticate to NemoVideo using the configured token or an automatically acquired starter token.
The skill requires a service credential for NemoVideo API access; this is expected for a cloud integration but gives the agent delegated authority for that service.
Required env vars: NEMO_TOKEN ... Primary credential: NEMO_TOKEN
Use a token intended only for this service, monitor credit usage, and revoke or rotate the token if you no longer use the skill.
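One way to keep credential use explicit is to accept only a token the user has configured, rather than falling back to an automatically acquired starter token. The sketch below assumes only that the credential is supplied through the `NEMO_TOKEN` environment variable, as the skill metadata states; the `load_nemo_token` helper is hypothetical.

```python
import os
from typing import Mapping, Optional


def load_nemo_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the user-configured NEMO_TOKEN, or None if it is unset or blank.

    This helper never requests a starter token on its own; the caller
    decides what to do when no token is configured.
    """
    token = env.get("NEMO_TOKEN", "").strip()
    return token or None
```

A caller that receives `None` can stop and ask the user to provide a token, which keeps any delegated authority tied to a credential the user chose and can later revoke.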
Uploaded videos and spoken content leave the local environment for cloud processing.
The skill sends user media and session requests to an external cloud provider, which is purpose-aligned but sensitive because videos may contain private audio or images.
"All calls go to `https://mega-api-prod.nemovideo.ai`" and "Upload — `POST /api/upload-video/nemo_agent/me/<sid>`"
Upload only files you are comfortable sending to NemoVideo, and review the provider’s privacy and retention terms before using sensitive media.
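Before any media leaves the local environment, a wrapper could also verify that the request targets the host named in the skill's instructions. The following sketch assumes the quoted base URL `https://mega-api-prod.nemovideo.ai` is the only intended destination; the `is_expected_endpoint` helper is hypothetical.

```python
from urllib.parse import urlparse

# Host quoted from the skill's instructions; treating it as the only
# permitted destination is an assumption of this sketch.
EXPECTED_HOST = "mega-api-prod.nemovideo.ai"


def is_expected_endpoint(url: str) -> bool:
    """Return True if the URL uses HTTPS and targets exactly the expected host."""
    parts = urlparse(url)
    return parts.scheme == "https" and parts.hostname == EXPECTED_HOST
```

Comparing the parsed hostname for equality, rather than checking for a substring of the URL, rejects lookalike hosts that merely begin with the expected name.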
