Youtube Video Caption Generator
Pass
Audited by ClawScan on Apr 30, 2026.
Overview
This instruction-only skill is mainly a disclosed cloud video-captioning workflow, but users should understand that videos, prompts, and a service token are sent to a third-party backend.
Before installing, be comfortable with sending your video files, URLs, and editing prompts to the NemoVideo cloud backend. Protect NEMO_TOKEN like a password, and use the skill only with media you intend to process through that third-party service.
Findings (7)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
The remote service can guide the agent's next editing or export steps within the video workflow.
The skill tells the agent to translate backend-provided GUI-style messages into API actions. This is part of the intended workflow, but it makes backend text operationally influential.
Backend Response Translation ... "click [button]" / "点击" | Execute via API
Use the skill for its intended video-captioning workflow and review important actions such as uploads and exports.
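One way to review important actions is to flag backend messages that combine a GUI-style instruction with a sensitive step before the agent turns them into API calls. The sketch below is illustrative only: the function name `needs_review` and the `SENSITIVE` keyword list are assumptions made for this example and do not come from SKILL.md.

```python
import re

# GUI-style phrasing quoted in the finding: "click [button]" / "点击".
GUI_HINT = re.compile(r"click|点击", re.IGNORECASE)

# Assumption: steps a user would likely want to confirm first.
SENSITIVE = ("upload", "export", "render", "download")


def needs_review(backend_text):
    """Return True when backend text asks for a GUI action on a sensitive step."""
    text = backend_text.lower()
    has_gui_hint = GUI_HINT.search(text) is not None
    has_sensitive_step = any(word in text for word in SENSITIVE)
    return has_gui_hint and has_sensitive_step
```

An agent could pause and ask the user whenever `needs_review` returns True, and continue silently otherwise.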
Files or URLs you provide may be uploaded to the NemoVideo backend and processed in the cloud.
The skill uses network API operations to upload media, send messages, render, poll status, and download results. These operations are central to the stated caption-generation purpose.
Upload: POST `/api/upload-video/nemo_agent/me/<sid>` ... Export ... POST `/api/render/proxy/lambda`
Only provide videos or URLs you intend to send to the service, and confirm that the rendered output is what you expect.
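The recommendation above can be approximated with an explicit approval list that is checked before any upload URL is built. This is a minimal sketch under stated assumptions: the audit does not give the backend host, so `BASE_URL` is a placeholder, and `plan_upload` and `approved_paths` are hypothetical names. Only the upload path itself is taken from the quoted evidence.

```python
from urllib.parse import quote, urljoin

# Placeholder host: the real backend address is not given in the audit.
BASE_URL = "https://backend.example.invalid"


def upload_url(session_id, base_url=BASE_URL):
    """Build the upload endpoint quoted in the finding for one session ID."""
    sid = quote(session_id, safe="")  # escape anything unsafe in a path segment
    return urljoin(base_url, f"/api/upload-video/nemo_agent/me/{sid}")


def plan_upload(path, session_id, approved_paths, base_url=BASE_URL):
    """Return the upload URL only if the user explicitly approved this file."""
    if path not in approved_paths:
        raise PermissionError(f"{path} was not approved for upload")
    return upload_url(session_id, base_url)
```

The sketch performs no network request; it only shows where a confirmation check could sit in front of the upload step.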
The backend session and requests are authorized using NEMO_TOKEN or an anonymous token created for this service.
The skill requires a service token or creates an anonymous starter token for the backend. This is expected for the integrated rendering service, and no hardcoded token or unrelated credential use is shown.
If `NEMO_TOKEN` is in the environment, use it directly ... Otherwise, acquire a free starter token ... Include `Authorization: Bearer <NEMO_TOKEN>`
Treat NEMO_TOKEN as a credential, avoid sharing it, and revoke or rotate it if you no longer trust the service.
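As an illustration of treating the token as a credential, the sketch below reads `NEMO_TOKEN` from the environment, builds the `Authorization: Bearer` header quoted above, and masks the token before it is logged. The helper names `auth_headers` and `redact` are assumptions for this example. The anonymous starter-token request is not shown in the audit, so the sketch stops instead of guessing at it.

```python
import os


def auth_headers(env=None):
    """Build the bearer header from NEMO_TOKEN; raise if the token is missing."""
    env = os.environ if env is None else env
    token = env.get("NEMO_TOKEN", "").strip()
    if not token:
        # The skill would acquire a starter token at this point; that call
        # is not documented in the audit, so it is not reproduced here.
        raise RuntimeError("NEMO_TOKEN is not set")
    return {"Authorization": f"Bearer {token}"}


def redact(token, keep=4):
    """Mask a token for logs, keeping only the last `keep` characters."""
    if len(token) <= keep:
        return "*" * len(token)
    return "*" * (len(token) - keep) + token[-keep:]
```

Logging `redact(token)` rather than the raw value keeps the credential out of chat transcripts and log files.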
Users have limited registry-provided information for independently verifying the publisher or service provider.
The package has limited provenance information. There is no installable code in the artifact, but the skill depends on a named external backend service.
Source: unknown; Homepage: none
Verify that you are comfortable using the NemoVideo backend before uploading private or unreleased videos.
The service may retain state about your video project during the session, and that state can influence later export or status actions.
The skill retrieves and relies on backend session state for the video draft and generated media. This is expected for cloud rendering, but it is persistent task context.
Session state: GET `/api/state/nemo_agent/me/<sid>/latest` — key fields: `data.state.draft`, `data.state.video_infos`, `data.state.generated_media`
Keep projects and sessions scoped to the videos you intend to process, especially for private media.
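To see what a session currently holds, the key fields named in the evidence can be pulled out of a parsed state response. This is a sketch only: the shape of each field is not described in the audit, so the values are returned untouched, and `summarize_state` is a hypothetical helper name.

```python
def summarize_state(payload):
    """Extract the three key fields from a parsed session-state response.

    `payload` is the decoded JSON body of
    GET /api/state/nemo_agent/me/<sid>/latest. Missing keys yield None.
    """
    state = (payload.get("data") or {}).get("state") or {}
    return {
        "draft": state.get("draft"),
        "video_infos": state.get("video_infos"),
        "generated_media": state.get("generated_media"),
    }
```

Checking this summary before an export makes it easier to confirm that the session contains only the media you meant to process.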
Your prompts and project context are sent to the backend agent that performs the captioning/editing workflow.
The skill communicates with an external backend agent over SSE using a bearer token and session ID. This is disclosed and purpose-aligned, but it is an inter-service agent workflow.
Send message (SSE): POST `/run_sse` — body `{"app_name":"nemo_agent","user_id":"me","session_id":"<sid>"...}` with `Accept: text/event-stream`
Avoid sending confidential video content or prompts unless you are comfortable with the backend service handling them.
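For readers unfamiliar with `text/event-stream`, the sketch below shows how a client might split such a stream into events, following the standard server-sent events framing (`event:` and `data:` fields, with a blank line ending each event). The contents of the NemoVideo events are not documented in the audit, so the parser treats `data` as opaque text. `parse_sse` is a hypothetical helper, not part of the skill.

```python
def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of decoded SSE lines."""
    event, data = "message", []  # "message" is the default event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; skip events with no data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is dropped
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
    # An unterminated event at end of stream is discarded.
```

For example, `list(parse_sse(["data: hello", ""]))` returns `[("message", "hello")]`.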
The chat may not display every technical backend step, even though a third-party session is being established.
The skill instructs the agent not to show connection details during setup. This may keep the conversation simple, but users should still understand that a backend token/session is being created and used.
Tell the user you're ready. Keep the technical details out of the chat.
The skill should keep user-facing explanations concise while still being transparent that files are processed by a cloud backend.
