Speech Video
Review
Audited by ClawScan on May 16, 2026.
Overview
Speech Video appears coherent and benign, but it works by using a NemoVideo bearer token and uploading selected media to a third-party cloud service for rendering.
This skill is reasonable to use if you are comfortable with a third-party cloud service processing your speech recordings and videos. Review what you upload, avoid highly sensitive media unless you trust the provider, and keep your NEMO_TOKEN private.
Findings (5)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Users have less independent information for verifying who maintains the skill or the service integration.
The skill does not provide a source repository or homepage. This is a provenance gap, although the artifact set shows no install script or local code to execute.
Source: unknown
Homepage: none
Use this only if you trust the registry entry and the NemoVideo service; avoid providing highly sensitive recordings unless you are comfortable with the provider.
Opening or invoking the skill may contact NemoVideo and create a service session even before a file is uploaded.
The skill tells the agent to make a remote API connection and create a session automatically on invocation. This is disclosed and aligned with the cloud-rendering purpose, but it is still network activity before the substantive user task.
Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".
Only invoke the skill when you are comfortable contacting the NemoVideo backend; the skill could improve transparency by clearly saying when a remote session is created.
The skill can act within the NemoVideo session associated with the provided or anonymous token, including checking credits and starting render/export jobs.
The skill uses a bearer token for NemoVideo API access. This is expected for the service, and the artifacts do not show token logging, unrelated use, or hardcoded credentials.
Every API call needs `Authorization: Bearer <NEMO_TOKEN>`
Use a token intended for this service, keep it private, and monitor any credit or usage impact.
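As an illustration of keeping the token private, the sketch below reads `NEMO_TOKEN` from the environment, builds the documented `Authorization: Bearer <NEMO_TOKEN>` header, and masks the value before anything is logged. The helper names `build_auth_headers` and `redact` are hypothetical and do not come from the skill's artifacts.

```python
import os


def build_auth_headers(env=None):
    """Return the documented bearer header, read from NEMO_TOKEN.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    token = env.get("NEMO_TOKEN", "").strip()
    if not token:
        # Fail early instead of sending an empty or malformed header.
        raise RuntimeError("NEMO_TOKEN is not set")
    return {"Authorization": f"Bearer {token}"}


def redact(headers):
    """Return a copy of the headers with the token masked, safe to log."""
    return {
        name: "Bearer ***" if name.lower() == "authorization" else value
        for name, value in headers.items()
    }
```

Logging `redact(headers)` rather than `headers` keeps the token out of terminal output and log files.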
Audio, video, or linked media supplied to the skill will be transferred to a third-party cloud service for processing.
The workflow sends user-selected local files or URLs to the NemoVideo backend for processing. That is central to speech-to-video generation, but uploaded recordings may contain private content.
**Upload**: POST `/api/upload-video/nemo_agent/me/<sid>` — file: multipart `-F "files=@/path"`, or URL: `{"urls":["<url>"],"source_type":"url"}`
Do not upload confidential recordings unless you trust NemoVideo's handling of the data and have permission from anyone recorded.
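To make concrete what leaves the machine in the URL variant of the upload, here is a minimal sketch that builds the request without sending it, so the payload can be inspected first. The path and JSON body follow the quoted SKILL.md line. The base URL is a placeholder because the artifacts quoted here do not state the host, and the `Content-Type` header and the function name `build_url_upload_request` are assumptions.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Placeholder host: the real NemoVideo backend address is not given in this review.
BASE_URL = "https://nemovideo.example"


def build_url_upload_request(session_id, media_urls, token, base_url=BASE_URL):
    """Build (but do not send) the POST that registers media URLs for a session."""
    # Path taken from the skill's documented upload endpoint; <sid> is the session id.
    path = f"/api/upload-video/nemo_agent/me/{quote(session_id, safe='')}"
    # Body shape taken from the documented URL-upload payload.
    body = json.dumps({"urls": list(media_urls), "source_type": "url"}).encode("utf-8")
    return Request(
        base_url + path,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumed: the source shows a JSON body but does not name the content type.
            "Content-Type": "application/json",
        },
    )
```

Inspecting `req.full_url` and `req.data` before calling `urllib.request.urlopen(req)` shows exactly which media links would be shared with the provider.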
A render may continue briefly on the provider side even if the local UI is closed before completion.
The artifacts disclose that cloud render jobs can continue or become orphaned after the local client/tab is closed. This is purpose-aligned server-side processing, not hidden persistence.
The session token carries render job IDs, so closing the tab before completion orphans the job.
Wait for render jobs to finish when possible, and avoid starting unnecessary exports.
