{"skill":{"slug":"subtitles-from-video","displayName":"Subtitles From Video","summary":"Skip the learning curve of professional editing software. Describe what you want — generate subtitles in English and burn them into the video — and get capti...","description":"---\nname: subtitles-from-video\nversion: \"1.0.0\"\ndisplayName: \"Subtitles From Video — Generate and Embed Video Subtitles\"\ndescription: >\n  Skip the learning curve of professional editing software. Describe what you want — generate subtitles in English and burn them into the video — and get captioned videos back in 30-60 seconds. Upload MP4, MOV, AVI, WebM files up to 500MB, and the AI handles subtitle generation automatically. Ideal for YouTubers, content creators, and educators who need accurate captions quickly without manual transcription.\nmetadata: {\"openclaw\": {\"emoji\": \"🎬\", \"requires\": {\"env\": [\"NEMO_TOKEN\"], \"configPaths\": [\"~/.config/nemovideo/\"]}, \"primaryEnv\": \"NEMO_TOKEN\", \"variant\": \"greeting_v2\"}}\n---\n\n## Getting Started\n\n> Ready when you are. Drop your video files here or describe what you want to make.\n\n**Try saying:**\n- \"turn a 3-minute interview recorded on a smartphone into a captioned 1080p MP4\"\n- \"generate subtitles in English and burn them into the video\"\n- \"add subtitles to my YouTube or social media videos\"\n\n### Quick Start Setup\n\nThis skill connects to a cloud processing backend. On first use, set up the connection automatically and let the user know (\"Connecting...\").\n\n**Token check**: Look for `NEMO_TOKEN` in the environment. If found, skip to session creation. 
Otherwise:\n- Generate a UUID as client identifier\n- POST `https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token` with `X-Client-Id` header\n- Extract `data.token` from the response — this is your NEMO_TOKEN (100 free credits, 7-day expiry)\n\n**Session**: POST `https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent` with Bearer auth and body `{\"task_name\":\"project\"}`. Keep the returned `session_id` for all operations.\n\nLet the user know with a brief \"Ready!\" when setup is complete. Don't expose tokens or raw API output.\n\n# Subtitles From Video — Generate and Embed Video Subtitles\n\nSend me your video files and describe the result you want. The subtitle generation runs on remote GPU nodes — nothing to install on your machine.\n\nA quick example: upload a 3-minute interview recorded on a smartphone, type \"generate subtitles in English and burn them into the video\", and you'll get a 1080p MP4 back in roughly 30-60 seconds. All rendering happens server-side.\n\nWorth noting: shorter clips under 5 minutes produce the most accurate subtitle sync.\n\n## Matching Input to Actions\n\nUser prompts referencing subtitles from video, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.\n\n| User says... | Action | Skip SSE? |\n|-------------|--------|----------|\n| \"export\" / \"导出\" / \"download\" / \"send me the video\" | → §3.5 Export | ✅ |\n| \"credits\" / \"积分\" / \"balance\" / \"余额\" | → §3.3 Credits | ✅ |\n| \"status\" / \"状态\" / \"show tracks\" | → §3.4 State | ✅ |\n| \"upload\" / \"上传\" / user sends file | → §3.2 Upload | ✅ |\n| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |\n\n## Cloud Render Pipeline Details\n\nEach export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. 
The session token carries render job IDs, so closing the tab before completion orphans the job.\n\nHeaders are derived from this file's YAML frontmatter. `X-Skill-Source` is `subtitles-from-video`, `X-Skill-Version` comes from the `version` field, and `X-Skill-Platform` is detected from the install path (`~/.clawhub/` = `clawhub`, `~/.cursor/skills/` = `cursor`, otherwise `unknown`).\n\n**All requests** must include: `Authorization: Bearer <NEMO_TOKEN>`, `X-Skill-Source`, `X-Skill-Version`, `X-Skill-Platform`. Missing attribution headers will cause export to fail with 402.\n\n**API base**: `https://mega-api-prod.nemovideo.ai`\n\n**Create session**: POST `/api/tasks/me/with-session/nemo_agent` — body `{\"task_name\":\"project\",\"language\":\"<lang>\"}` — returns `task_id`, `session_id`.\n\n**Send message (SSE)**: POST `/run_sse` — body `{\"app_name\":\"nemo_agent\",\"user_id\":\"me\",\"session_id\":\"<sid>\",\"new_message\":{\"parts\":[{\"text\":\"<msg>\"}]}}` with `Accept: text/event-stream`. Max timeout: 15 minutes.\n\n**Upload**: POST `/api/upload-video/nemo_agent/me/<sid>` — file: multipart `-F \"files=@/path\"`, or URL: `{\"urls\":[\"<url>\"],\"source_type\":\"url\"}`\n\n**Credits**: GET `/api/credits/balance/simple` — returns `available`, `frozen`, `total`\n\n**Session state**: GET `/api/state/nemo_agent/me/<sid>/latest` — key fields: `data.state.draft`, `data.state.video_infos`, `data.state.generated_media`\n\n**Export** (free, no credits): POST `/api/render/proxy/lambda` — body `{\"id\":\"render_<ts>\",\"sessionId\":\"<sid>\",\"draft\":<json>,\"output\":{\"format\":\"mp4\",\"quality\":\"high\"}}`. Poll GET `/api/render/proxy/lambda/<id>` every 30s until `status` = `completed`. Download URL at `output.url`.\n\nSupported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.\n\n### Reading the SSE Stream\n\nText events go straight to the user (after GUI translation). Tool calls stay internal. 
Heartbeats and empty `data:` lines mean the backend is still working — show \"⏳ Still working...\" every 2 minutes.\n\nAbout 30% of edit operations close the stream without any text. When that happens, poll `/api/state` to confirm the timeline changed, then tell the user what was updated.\n\n### Translating GUI Instructions\n\nThe backend responds as if there's a visual interface. Map its instructions to API calls:\n\n- \"click\" or \"点击\" → execute the action via the relevant endpoint\n- \"open\" or \"打开\" → query session state to get the data\n- \"drag/drop\" or \"拖拽\" → send the edit command through SSE\n- \"preview in timeline\" → show a text summary of current tracks\n- \"Export\" or \"导出\" → run the export workflow\n\n**Draft field mapping**: `t`=tracks, `tt`=track type (0=video, 1=audio, 7=text), `sg`=segments, `d`=duration(ms), `m`=metadata.\n\n```\nTimeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: \"Urban Dreams\" (0-3s)\n```\n\n### Error Handling\n\n| Code | Meaning | Action |\n|------|---------|--------|\n| 0 | Success | Continue |\n| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |\n| 1002 | Session not found | New session §3.0 |\n| 2001 | No credits | Anonymous: show registration URL with `?bind=<id>` (get `<id>` from create-session or state response when needed). Registered: \"Top up credits in your account\" |\n| 4001 | Unsupported file | Show supported formats |\n| 4002 | File too large | Suggest compress/trim |\n| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |\n| 402 | Free plan export blocked | Subscription tier issue, NOT credits. \"Register or upgrade your plan to unlock export.\" |\n| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |\n\n## Common Workflows\n\n**Quick edit**: Upload → \"generate subtitles in English and burn them into the video\" → Download MP4. 
Takes 30-60 seconds for a 30-second clip.\n\n**Batch style**: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.\n\n**Iterative**: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.\n\n## Tips and Tricks\n\nThe backend processes faster when you're specific. Instead of \"make it look better\", try \"generate subtitles in English and burn them into the video\" — concrete instructions get better results.\n\nMax file size is 500MB. Stick to MP4, MOV, AVI, WebM for the smoothest experience.\n\nExport as MP4 for widest compatibility across platforms and devices.\n","tags":{"latest":"1.0.0"},"stats":{"comments":0,"downloads":318,"installsAllTime":0,"installsCurrent":0,"stars":0,"versions":1},"createdAt":1777831572456,"updatedAt":1778492838254},"latestVersion":{"version":"1.0.0","createdAt":1777831572456,"changelog":"Initial release of subtitles-from-video: generate and embed subtitles in videos easily via cloud processing.\n\n- Upload MP4, MOV, AVI, or WebM files up to 500MB for automatic English subtitle generation and burned-in captions.\n- No manual transcription or complex software required; returns captioned video in 30-60 seconds.\n- Seamless setup: automatic backend token creation and session handling (100 free credits, 7-day expiry).\n- Integrated action handling for upload, export, credits, and session status with user-friendly responses and error messages.\n- Supports batch and iterative workflows; draft timeline can be previewed and refined before export.\n- Clear documentation on supported formats, limits, error handling, and best practices for accurate, fast 
results.","license":"MIT-0"},"metadata":{"setup":[{"key":"NEMO_TOKEN","required":true}],"os":null,"systems":null},"owner":{"handle":"mhogan2013-9","userId":"s1734kgyh0j16a86enc51zqp6184cy81","displayName":"mhogan2013-9","image":"https://avatars.githubusercontent.com/u/208827708?v=4"},"moderation":{"isSuspicious":false,"isMalwareBlocked":false,"verdict":"clean","reasonCodes":["review.llm_review"],"summary":"Review: review.llm_review","engineVersion":"v2.4.24","updatedAt":1780090737994}}