Speech Video

Review

Audited by ClawScan on May 16, 2026.

Overview

Speech Video appears coherent and benign, but it works by using a NemoVideo bearer token and uploading selected media to a third-party cloud service for rendering.

This skill is reasonable to use if you are comfortable with a third-party cloud service processing your speech recordings and videos. Review what you upload, avoid highly sensitive media unless you trust the provider, and keep your NEMO_TOKEN private.

Findings (5)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Low

#ASI04: Agentic Supply Chain Vulnerabilities

What this means

Users have less independent information for verifying who maintains the skill or the service integration.

Why it was flagged

The skill does not provide a source repository or homepage. This is a provenance gap, although the artifact set shows no install script or local code to execute.

Skill content

Source: unknown
Homepage: none

Recommendation

Use this only if you trust the registry entry and the NemoVideo service; avoid providing highly sensitive recordings unless you are comfortable with the provider.

Low

#ASI02: Tool Misuse and Exploitation

What this means

Opening or invoking the skill may contact NemoVideo and create a service session even before a file is uploaded.

Why it was flagged

The skill tells the agent to make a remote API connection and create a session automatically on invocation. This is disclosed and aligned with the cloud-rendering purpose, but it is still network activity before the substantive user task.

Skill content

Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".

Recommendation

Only invoke the skill when you are comfortable contacting the NemoVideo backend; the skill could improve transparency by clearly saying when a remote session is created.

Low

#ASI03: Identity and Privilege Abuse

What this means

The skill can act within the NemoVideo session associated with the provided or anonymous token, including checking credits and starting render/export jobs.

Why it was flagged

The skill uses a bearer token for NemoVideo API access. This is expected for the service, and the artifacts do not show token logging, unrelated use, or hardcoded credentials.

Skill content

Every API call needs `Authorization: Bearer <NEMO_TOKEN>`

Recommendation

Use a token intended for this service, keep it private, and monitor any credit or usage impact.
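
As an illustration of the recommendation above, here is a minimal Python sketch of one way a client could read the token from the environment and keep it out of logs. The helper names `build_auth_headers` and `redact` are hypothetical and do not come from the skill; only the `NEMO_TOKEN` variable and the `Authorization: Bearer` header format are taken from the reviewed artifacts.

```python
import os


def build_auth_headers(env=os.environ):
    """Return request headers carrying the bearer token read from NEMO_TOKEN.

    Hypothetical helper: the skill only states that every API call needs
    an `Authorization: Bearer <NEMO_TOKEN>` header.
    """
    token = env.get("NEMO_TOKEN", "").strip()
    if not token:
        # Fail early instead of sending an empty or malformed header.
        raise RuntimeError("NEMO_TOKEN is not set")
    return {"Authorization": f"Bearer {token}"}


def redact(headers):
    """Return a copy of the headers that is safe to log (token masked)."""
    return {
        name: ("Bearer ***" if name.lower() == "authorization" else value)
        for name, value in headers.items()
    }
```

Logging `redact(headers)` rather than the raw headers keeps the token out of transcripts and debug output.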

Medium

#ASI07: Insecure Inter-Agent Communication

What this means

Audio, video, or linked media supplied to the skill will be transferred to a third-party cloud service for processing.

Why it was flagged

The workflow sends user-selected local files or URLs to the NemoVideo backend for processing. That is central to speech-to-video generation, but uploaded recordings may contain private content.

Skill content

**Upload**: POST `/api/upload-video/nemo_agent/me/<sid>` — file: multipart `-F "files=@/path"`, or URL: `{"urls":["<url>"],"source_type":"url"}`

Recommendation

Do not upload confidential recordings unless you trust NemoVideo's handling of the data and have permission from anyone recorded.
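
To make clear what leaves the machine in the URL upload case, the following Python sketch builds the request path and JSON body quoted above without sending anything. The function name `build_url_upload` and the URL validation step are assumptions added for illustration; the path template and the `urls` / `source_type` fields are the ones shown in the skill content. The backend host is not stated in this review, so it is deliberately left out.

```python
import json
from urllib.parse import quote, urlparse

# Path template as quoted in the skill content; <sid> is the session ID.
UPLOAD_PATH = "/api/upload-video/nemo_agent/me/{sid}"


def build_url_upload(sid, media_urls):
    """Return (path, json_body) for a URL-based upload request.

    Hypothetical helper: it only assembles the request so a user or agent
    can inspect exactly which media links would be shared with the provider.
    """
    for url in media_urls:
        parsed = urlparse(url)
        # Reject anything that is not a plain web link (e.g. file:// paths).
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not a fetchable URL: {url!r}")
    path = UPLOAD_PATH.format(sid=quote(sid, safe=""))
    body = json.dumps({"urls": list(media_urls), "source_type": "url"})
    return path, body
```

Reviewing the returned body before it is sent is a simple way to confirm that only the intended recordings are handed to the third-party service.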

Low

#ASI10: Rogue Agents

What this means

A render may continue briefly on the provider side even if the local UI is closed before completion.

Why it was flagged

The artifacts disclose that cloud render jobs can continue or become orphaned after the local client/tab is closed. This is purpose-aligned server-side processing, not hidden persistence.

Skill content

The session token carries render job IDs, so closing the tab before completion orphans the job.

Recommendation

Wait for render jobs to finish when possible, and avoid starting unnecessary exports.
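
One way to follow this recommendation is to block until the job reports a terminal state instead of closing the session early. The Python sketch below is generic: `get_status` is a hypothetical callable supplied by the caller, and the status strings `"done"` and `"failed"` are assumptions, since the reviewed artifacts do not document the status endpoint or its values.

```python
import time


def wait_for_render(get_status, timeout_s=600.0, interval_s=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until the render finishes or the timeout elapses.

    get_status is a caller-supplied function returning a status string.
    The terminal values "done" and "failed" are assumed, not documented.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in ("done", "failed"):
            return status
        if clock() >= deadline:
            raise TimeoutError("render did not finish before the timeout")
        # Pause between polls to avoid hammering the backend.
        sleep(interval_s)
```

Keeping the session open until this returns avoids leaving an orphaned job on the provider side.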