Youtube Video Subtitle Generator

v1.0.0

Turn YouTube video files into captioned video files with this skill. Works with MP4, MOV, AVI, and WebM files up to 500MB. YouTubers use it for adding subtit...

by peandrover adam @peand-rover

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for peand-rover/youtube-video-subtitle-generator.

Prompt Preview: Install & Setup

Install the skill "Youtube Video Subtitle Generator" (peand-rover/youtube-video-subtitle-generator) from ClawHub.
Skill page: https://clawhub.ai/peand-rover/youtube-video-subtitle-generator
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: NEMO_TOKEN
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install youtube-video-subtitle-generator

ClawHub CLI

npx clawhub@latest install youtube-video-subtitle-generator

Security Scan

VirusTotal

Benign

OpenClaw

Benign

medium confidence

✓ Purpose & Capability

Name/description match the runtime instructions: the SKILL.md describes uploading video files, creating a session, running renders, polling for results, and returning download URLs. The single required env var (NEMO_TOKEN) is the service token needed to call the stated API endpoints — this is proportionate to the described capability.

ℹ Instruction Scope

Instructions are limited to connecting to the stated backend, uploading user-supplied files, running render jobs, polling state, and returning results. They do not instruct reading arbitrary system files or unrelated credentials. One ambiguity: frontmatter/metadata references a config path (~/.config/nemovideo/) and asks to auto-detect 'install path' for X-Skill-Platform header — these imply filesystem queries that are not otherwise declared, so confirm whether the skill will access that path.

✓ Install Mechanism

This is an instruction-only skill with no install spec or code files, so nothing is downloaded or written to disk by an installer. That lowers install-time risk.

✓ Credentials

Only one credential is required (NEMO_TOKEN), which the SKILL.md uses directly. The instructions also document obtaining an anonymous token from the service if NEMO_TOKEN is absent — that behavior is consistent with needing service access. No unrelated secrets or multi-service credentials are requested.

✓ Persistence & Privilege

The always flag is false and the skill does not request elevated platform privileges. The SKILL.md describes creating a session on the backend but does not instruct modifying other skills or system-wide settings. The only potential persistence hint is an optional config path in the frontmatter (see the Instruction Scope note above).

Assessment

This skill appears to do what it says: it needs a NEMO_TOKEN to call the nemovideo.ai APIs to upload, render, and download captioned videos. Before installing or providing a token:

1) Confirm you trust the service domain (https://mega-api-prod.nemovideo.ai) and that NEMO_TOKEN is scoped only to this video-rendering service. Anyone holding that token can act as you against the API.
2) Ask the publisher whether the skill will read or write the local config path (~/.config/nemovideo/) or probe the install path (the SKILL.md mentions this but the registry metadata is inconsistent).
3) If you don't want to supply a permanent token, the skill can fetch a time-limited anonymous token per the instructions, but that gives the skill temporary service access.
4) Because this is instruction-only with no install, risk is mainly what you choose to upload and what token you provide. Avoid giving high-privilege or unrelated credentials.

If you need higher assurance, request the actual code or an official homepage for independent review.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🎬 Clawdis

Env: NEMO_TOKEN

Primary env: NEMO_TOKEN

latest: vk97arnw7z68mgyrp26xy69xbcd85jrq2

44 downloads

0 stars

1 version

Updated 1d ago

MIT-0

Getting Started

Ready when you are. Drop your YouTube video files here or describe what you want to make.

Try saying:

"turn a 10-minute YouTube tutorial video into a captioned 1080p MP4"
"generate subtitles in English and add them as burned-in captions"
"add subtitles to my YouTube videos"

Getting Connected

Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".

If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:

1. Generate a UUID as the client identifier.
2. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header.
3. The response includes a token with 100 free credits, valid for 7 days. Use it as NEMO_TOKEN.

Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
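
The connection steps above can be sketched as two request descriptions. This is a minimal illustration, not the skill's actual code: the helper names, the dict layout, and the Content-Type header are assumptions, and the attribution headers this document requires on every request are left to the caller via extra_headers.

```python
import json
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"


def resolve_token(env):
    """Return NEMO_TOKEN from an environment mapping, or None if absent/empty."""
    return env.get("NEMO_TOKEN") or None


def anonymous_token_request(client_id=None):
    """Describe the POST that asks for a free starter token."""
    client_id = client_id or str(uuid.uuid4())  # fresh UUID as client identifier
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
        "body": None,
    }


def create_session_request(token, language="en", extra_headers=None):
    """Describe the POST that creates a session for the given token."""
    headers = dict(extra_headers or {})
    headers["Authorization"] = f"Bearer {token}"
    headers["Content-Type"] = "application/json"  # assumption, not stated in the source
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        "headers": headers,
        "body": json.dumps({"task_name": "project", "language": language}),
    }
```

The returned dicts can be handed to any HTTP client. Keeping request construction separate from sending makes the flow easy to inspect without touching the network.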

Tell the user you're ready. Keep the technical details out of the chat.

YouTube Video Subtitle Generator — Generate and Embed Video Subtitles

This tool takes your YouTube video files and runs AI subtitle generation through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have a 10-minute YouTube tutorial video and want to generate subtitles in English and add them as burned-in captions — the backend processes it in about 1-2 minutes and hands you a 1080p MP4.

Tip: shorter video segments produce more accurate subtitle sync.

Matching Input to Actions

User prompts that mention YouTube video subtitle generation, aspect ratio, text overlays, or audio tracks are routed to the corresponding action via keyword and intent classification.

| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
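
The routing table can be approximated with a plain keyword lookup. The sketch below covers only the keyword half of the classification; the function name, the action labels, and the first-match-wins ordering are assumptions made for illustration.

```python
# Rows are checked in the same order as the routing table.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message, has_file=False):
    """Return (action, skip_sse) for a user message."""
    text = message.lower()
    for keywords, action in ROUTES:
        matched = any(keyword in text for keyword in keywords)
        if matched or (action == "upload" and has_file):
            return action, True
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return "sse", False
```

For example, route("What's my balance?") yields ("credits", True), while route("add BGM") falls through to ("sse", False).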

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Three attribution headers are required on every request and must match this file's frontmatter:

| Header | Value |
|---|---|
| `X-Skill-Source` | `youtube-video-subtitle-generator` |
| `X-Skill-Version` | frontmatter `version` |
| `X-Skill-Platform` | auto-detect: `clawhub` / `cursor` / `unknown` from install path |

All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
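
A small helper can assemble these headers in one place. This is a sketch under assumptions: the source does not say how the platform is detected from the install path, so the substring check below is a guess, and both function names are hypothetical.

```python
def detect_platform(install_path):
    """Guess the platform from the install path (assumed substring match)."""
    path = install_path.lower()
    for name in ("clawhub", "cursor"):
        if name in path:
            return name
    return "unknown"


def attribution_headers(token, version, install_path=""):
    """Build the headers that every request is expected to carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "youtube-video-subtitle-generator",
        "X-Skill-Version": version,  # should match the frontmatter version
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the headers once and merging them into each request reduces the chance of an export failing with 402 because one header was forgotten.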

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
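
As an illustration, the SSE message request could be described like this. The function name and dict layout are hypothetical; only the path, body shape, Accept header, and 15-minute ceiling come from the text above.

```python
import json

API_BASE = "https://mega-api-prod.nemovideo.ai"
MAX_TIMEOUT_SECONDS = 15 * 60  # the stated maximum timeout


def run_sse_request(session_id, message, headers=None):
    """Describe the POST /run_sse call for one user message."""
    merged = dict(headers or {})  # caller supplies auth and attribution headers
    merged["Accept"] = "text/event-stream"
    body = {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": message}]},
    }
    return {
        "method": "POST",
        "url": f"{API_BASE}/run_sse",
        "headers": merged,
        "body": json.dumps(body, ensure_ascii=False),
        "timeout_seconds": MAX_TIMEOUT_SECONDS,
    }
```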

Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
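
The export-and-poll loop might look like the sketch below. Several details are assumptions: that the render id timestamp is Unix seconds, that status and output.url sit at the top level of the poll response, and the cap of 30 polls. The fetch_status callable is a hypothetical stand-in for the GET request.

```python
import time


def export_body(session_id, draft, now=None):
    """Build the JSON body for POST /api/render/proxy/lambda."""
    ts = int(now if now is not None else time.time())  # assumed: Unix seconds
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_render(fetch_status, render_id, interval=30.0, max_polls=30, sleep=time.sleep):
    """Call fetch_status(render_id) until it reports completion, then return the URL."""
    for attempt in range(max_polls):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval)  # wait 30s between polls, as described above
    raise TimeoutError(f"render {render_id} did not complete after {max_polls} polls")
```

Passing sleep and fetch_status as parameters keeps the loop testable without real waiting or network access.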

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

SSE Event Handling

| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| `heartbeat` / empty `data:` | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
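
One way to apply these rules is to classify each data: payload and collect only the user-facing text. The event schema is not documented here, so the content.parts[].text shape below is an assumption, as are the function names; an empty result signals that session state should be polled instead.

```python
import json


def classify(payload):
    """Return ('heartbeat' | 'text' | 'internal', text_or_None) for one data payload."""
    stripped = payload.strip()
    if stripped in ("", "heartbeat"):
        return "heartbeat", None
    try:
        event = json.loads(stripped)
    except ValueError:
        return "text", stripped  # not JSON: treat as plain text
    content = event.get("content") if isinstance(event, dict) else None
    parts = content.get("parts") or [] if isinstance(content, dict) else []
    texts = [p["text"] for p in parts if isinstance(p, dict) and p.get("text")]
    if texts:
        return "text", "".join(texts)
    return "internal", None  # tool calls/results are not forwarded


def collect_text(lines):
    """Join the text carried by 'data:' lines; '' means no text was returned."""
    collected = []
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip event names, comments, and blank separators
        kind, text = classify(line[len("data:"):])
        if kind == "text":
            collected.append(text)
    return "".join(collected)
```

When collect_text returns an empty string after the stream closes, the caller would fall back to the session-state endpoint and summarize what changed.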

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

"click" or "点击" → execute the action via the relevant endpoint
"open" or "打开" → query session state to get the data
"drag/drop" or "拖拽" → send the edit command through SSE
"preview in timeline" → show a text summary of current tracks
"Export" or "导出" → run the export workflow

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)
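
A summary in this spirit could be derived from the short keys roughly as follows. The draft layout is an assumption (a top-level t list whose tracks hold tt and an sg list with d in milliseconds), the function name is hypothetical, and the sketch reports only track type and total duration because the contents of m are not specified.

```python
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_draft(draft):
    """Render a one-line text summary of the tracks in a draft."""
    tracks = draft.get("t", [])
    parts = []
    for index, track in enumerate(tracks, start=1):
        label = TRACK_TYPES.get(track.get("tt"), "Other")
        # Assumed: a track's length is the sum of its segment durations (ms).
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        parts.append(f"{index}. {label} ({total_ms / 1000:g}s)")
    return f"Timeline ({len(tracks)} tracks): " + " ".join(parts)
```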

Error Handling

| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with `?bind=<id>` (get `<id>` from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
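
The table maps naturally onto a lookup from code to next step. The action labels and the fallback for unlisted codes below are hypothetical names chosen for this sketch.

```python
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",            # bad or expired token
    1002: "create_session",            # session not found
    2001: "show_credit_options",       # no credits
    4001: "show_supported_formats",    # unsupported file
    4002: "suggest_compress_or_trim",  # file too large
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",  # plan tier issue, not credits
    429: "retry_once_after_30s",
}


def action_for(code):
    """Return the handling step for a response code."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```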

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "generate subtitles in English and add them as burned-in captions" — concrete instructions get better results.

Max file size is 500MB. Stick to MP4, MOV, AVI, WebM for the smoothest experience.
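
A local pre-check can catch unsupported or oversized files before uploading. This sketch assumes 500MB means 500 × 1024 × 1024 bytes and reuses the documented error codes as return values; the function name is hypothetical.

```python
import os

MAX_BYTES = 500 * 1024 * 1024  # assumption: 500MB interpreted as 500 MiB
SUPPORTED = {
    "mp4", "mov", "avi", "webm", "mkv",
    "jpg", "png", "gif", "webp",
    "mp3", "wav", "m4a", "aac",
}


def check_upload(filename, size_bytes):
    """Return 0 if the file looks acceptable, else 4001 or 4002."""
    extension = os.path.splitext(filename)[1].lower().lstrip(".")
    if extension not in SUPPORTED:
        return 4001  # unsupported file
    if size_bytes > MAX_BYTES:
        return 4002  # file too large
    return 0
```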

Export as MP4 for widest compatibility across YouTube and social platforms.

Common Workflows

Quick edit: Upload → "generate subtitles in English and add them as burned-in captions" → Download MP4. Takes 1-2 minutes for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.
