Podcast Video Recording Software

v1.0.0

Convert raw audio/video into polished podcast videos with this skill. Works with MP4, MOV, WAV, MP3 files up to 500MB. Podcasters use it for converting podca...

by @mory128

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for mory128/podcast-video-recording-software.

Install the skill "Podcast Video Recording Software" (mory128/podcast-video-recording-software) from ClawHub.
Skill page: https://clawhub.ai/mory128/podcast-video-recording-software
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: NEMO_TOKEN
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install podcast-video-recording-software

ClawHub CLI

npx clawhub@latest install podcast-video-recording-software

Security Scan

VirusTotal

Benign

OpenClaw

Benign

medium confidence

✓

Purpose & Capability

Name/description (podcast -> polished video) align with the actual behavior: uploading media, creating render jobs, polling export endpoints. Requesting a single service credential (NEMO_TOKEN) and a config path under ~/.config/nemovideo/ is consistent with a cloud-rendering service client.

ℹ

Instruction Scope

The SKILL.md instructs the agent to obtain an anonymous token (if none provided), create a session, upload user media, and poll render status — all appropriate for the stated purpose. It explicitly tells the agent to not show raw API responses or token values. Important privacy implication: user media files are sent to https://mega-api-prod.nemovideo.ai, which is expected but should be known to the user before uploading sensitive content.

✓

Install Mechanism

No install spec and no code files (instruction-only), so nothing is written to disk by an installer. This is the lowest-risk install model and is coherent with the skill being a thin client for a cloud API.

ℹ

Credentials

Only one credential (NEMO_TOKEN) is required and is the declared primaryEnv — appropriate for a hosted API. The SKILL.md also references a config path (~/.config/nemovideo/) for session storage, which is reasonable but gives the skill a location to persist session tokens/IDs; consider whether you want that stored locally. Headers include X-Skill-Platform which may reveal install-path-derived metadata.

✓

Persistence & Privilege

always:false (not force-included) and model invocation not disabled (normal). The skill does not request system-wide privileges or modifications to other skills. Session persistence via the stated config path is expected for a client that resumes or polls jobs.

Assessment

This skill is coherent with its stated purpose, but it uploads your audio/video to a third-party backend (mega-api-prod.nemovideo.ai) and will create or use a NEMO_TOKEN that grants render credits. Before installing or using: (1) confirm you trust the service (unknown homepage/source here), (2) avoid uploading sensitive or confidential media unless you accept the service's privacy/retention policies, (3) if you prefer control, provide your own NEMO_TOKEN (instead of letting the skill obtain an anonymous token) and review or remove any session/token files under ~/.config/nemovideo/ after use, and (4) verify the domain and terms if possible. Because this is instruction-only with no shipped code, there is nothing else to audit locally — lack of source/homepage reduces confidence, so proceed cautiously.

Runtime requirements

🎙️ Clawdis

Env: NEMO_TOKEN

Primary env: NEMO_TOKEN

latest vk97486sdtnrca0hq3d7s2bpnp184p03m

82 downloads

0 stars

1 version

Updated 2w ago

MIT-0

Getting Started

Share your raw audio/video and I'll get started on AI podcast video creation. Or just tell me what you're thinking.

Try saying:

"convert my raw audio/video"
"export 1080p MP4"
"turn my podcast audio into a"

First-Time Connection

When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").

Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 2.

1. Obtain a free token: Generate a random UUID as client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to that UUID. The response data.token is your NEMO_TOKEN, which comes with 100 free credits and is valid for 7 days.
2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.
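
The connection steps above can be sketched as request builders. This is a minimal sketch, not the skill's own client: the function names are hypothetical, the requests are built but never sent, and the attribution headers this document asks for on every request are left out for brevity.

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def resolve_token(env=os.environ):
    """Return NEMO_TOKEN if it is set, else None (a token must then be requested)."""
    return env.get("NEMO_TOKEN") or None


def build_token_request(client_id: str) -> urllib.request.Request:
    """Build the anonymous-token request; X-Client-Id carries the random UUID."""
    return urllib.request.Request(
        f"{BASE_URL}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token: str, language: str) -> urllib.request.Request:
    """Build the create-session request with the JSON body described above."""
    body = json.dumps({"task_name": "project", "language": language}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Only request a new token when none is present in the environment.
    if resolve_token() is None:
        request = build_token_request(str(uuid.uuid4()))
        print(request.get_method(), request.full_url)
```

Passing either request to urllib.request.urlopen would perform the call. How data.token and session_id are read from the responses depends on the actual response shape, which this sketch does not assume.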

Keep setup communication brief. Don't display raw API responses or token values to the user.

Podcast Video Recording Software — Convert Podcasts Into Shareable Videos

Drop your raw audio/video in the chat and tell me what you need. I'll handle the AI podcast video creation on cloud GPUs — you don't need anything installed locally.

Here's a typical use: you send a 42-minute podcast audio recording with a guest interview, ask to "turn my podcast audio into a video with waveform animation, captions, and speaker labels", and about 1-3 minutes later you've got an MP4 file ready to download. The whole thing runs at 1080p by default.

One thing worth knowing — uploading a clean audio track separately from video gives the AI better caption accuracy.

Matching Input to Actions

User prompts referencing podcast video recording software, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

User says...	Action	Skip SSE?
"export" / "导出" / "download" / "send me the video"	→ §3.5 Export	✅
"credits" / "积分" / "balance" / "余额"	→ §3.3 Credits	✅
"status" / "状态" / "show tracks"	→ §3.4 State	✅
"upload" / "上传" / user sends file	→ §3.2 Upload	✅
Everything else (generate, edit, add BGM…)	→ §3.1 SSE	❌
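
The table above could be approximated with plain keyword matching. This is a simplified sketch: the text also mentions intent classification, which is not modelled here, and the names ROUTES and route, the action labels, and the choice to check attachments first are all assumptions.

```python
from typing import Tuple

# Keyword groups copied from the routing table, checked in table order.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message: str, has_attachment: bool = False) -> Tuple[str, bool]:
    """Return (action, skip_sse) for a user message."""
    if has_attachment:
        # "user sends file" row: go straight to upload.
        return "upload", True
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action, True
    # Everything else (generate, edit, add BGM...) goes through the SSE chat.
    return "sse", False
```

For example, route("export 1080p MP4") selects the export action and skips SSE, while route("add BGM") falls through to the SSE chat.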

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

All calls go to https://mega-api-prod.nemovideo.ai. The main endpoints:

Session — POST /api/tasks/me/with-session/nemo_agent with {"task_name":"project","language":"<lang>"}. Gives you a session_id.
Chat (SSE) — POST /run_sse with session_id and your message in new_message.parts[0].text. Set Accept: text/event-stream. Up to 15 min.
Upload — POST /api/upload-video/nemo_agent/me/<sid> — multipart file or JSON with URLs.
Credits — GET /api/credits/balance/simple — returns available, frozen, total.
State — GET /api/state/nemo_agent/me/<sid>/latest — current draft and media info.
Export — POST /api/render/proxy/lambda with render ID and draft JSON. Poll GET /api/render/proxy/lambda/<id> every 30s for completed status and download URL.
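
The export polling step can be sketched as a loop around an injected status call. The response field names "status" and "url", the helper name poll_render, and the attempt limit are assumptions; only the 30-second interval and the "completed" status come from the description above.

```python
import time
from typing import Callable, Optional


def poll_render(
    fetch_status: Callable[[], dict],
    interval_s: float = 30.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll until the render reports completion, then return the download URL.

    fetch_status stands in for GET /api/render/proxy/lambda/<id>. It must
    return the decoded JSON payload as a dict. Returns None if the render
    has not completed after max_attempts polls.
    """
    for attempt in range(max_attempts):
        payload = fetch_status()
        if payload.get("status") == "completed":  # assumed field name
            return payload.get("url")  # assumed field name
        if attempt < max_attempts - 1:
            sleep(interval_s)
    return None
```

Injecting fetch_status and sleep keeps the loop testable without network access or real waiting.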

Formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Headers are derived from this file's YAML frontmatter. X-Skill-Source is podcast-video-recording-software, X-Skill-Version comes from the version field, and X-Skill-Platform is detected from the install path (~/.clawhub/ = clawhub, ~/.cursor/skills/ = cursor, otherwise unknown).

Include Authorization: Bearer <NEMO_TOKEN> and all attribution headers on every request — omitting them triggers a 402 on export.
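
A minimal sketch of how the request headers might be assembled. The function names are hypothetical, and matching the install path by substring is an assumption about how "detected from the install path" works.

```python
def detect_platform(install_path: str) -> str:
    """Map an install path to the X-Skill-Platform value."""
    # Normalise separators and add a trailing slash so directory paths match.
    normalized = install_path.replace("\\", "/") + "/"
    if "/.clawhub/" in normalized:
        return "clawhub"
    if "/.cursor/skills/" in normalized:
        return "cursor"
    return "unknown"


def attribution_headers(token: str, version: str, install_path: str) -> dict:
    """Build the Authorization and attribution headers sent on every request."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "podcast-video-recording-software",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Here version is expected to come from the frontmatter's version field, and token from NEMO_TOKEN.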

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
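
The short draft keys could be expanded into a text summary along these lines. The nesting is an assumption, since the real draft layout is not documented here: this sketch treats t as a list of tracks, each with tt and a list sg of segments that carry d in milliseconds. The name summarize_draft is hypothetical.

```python
# Track type codes from the draft field mapping.
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft: dict) -> list:
    """Return one summary line per track in the (assumed) draft structure."""
    lines = []
    for index, track in enumerate(draft.get("t", []), start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        # d is a duration in milliseconds; total it across the track's segments.
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s")
    return lines
```

A draft with one 10-second video segment, one 10-second audio segment, and one 3-second text segment yields three lines, one per track.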

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

"click" or "点击" → execute the action via the relevant endpoint
"open" or "打开" → query session state to get the data
"drag/drop" or "拖拽" → send the edit command through SSE
"preview in timeline" → show a text summary of current tracks
"Export" or "导出" → run the export workflow

SSE Event Handling

Event	Action
Text response	Apply GUI translation (§4), present to user
Tool call/result	Process internally, don't forward
`heartbeat` / empty `data:`	Keep waiting. Every 2 min: "⏳ Still working..."
Stream closes	Process final response

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
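
A minimal sketch of the stream handling, assuming the backend follows standard text/event-stream framing (event: and data: fields, with a blank line ending each event). The helper names and the exact heartbeat event name are assumptions based on the table above.

```python
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from text/event-stream lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data or event != "message":
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # SSE comment line, ignored
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Drop the single optional space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)


def is_keepalive(event: str, data: str) -> bool:
    """True for heartbeat events and empty data lines: keep waiting."""
    return event == "heartbeat" or data.strip() == ""
```

If the stream closes without any non-keepalive text event, that is the case where polling the session state is needed to confirm the edit.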

Error Handling

Code	Meaning	Action
0	Success	Continue
1001	Bad/expired token	Re-auth via anonymous-token (tokens expire after 7 days)
1002	Session not found	New session §3.0
2001	No credits	Anonymous: show registration URL with `?bind=<id>` (get `<id>` from create-session or state response when needed). Registered: "Top up credits in your account"
4001	Unsupported file	Show supported formats
4002	File too large	Suggest compress/trim
400	Missing X-Client-Id	Generate Client-Id and retry (see §1)
402	Free plan export blocked	Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export."
429	Rate limit (1 token/client/7 days)	Retry in 30s once
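
The table above can be kept as a lookup so each code maps to exactly one follow-up. The action labels and the fallback for unlisted codes are hypothetical. Like the table, this sketch puts application codes and HTTP statuses in one namespace.

```python
# Code -> follow-up action, mirroring the error table row by row.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate_with_anonymous_token",
    1002: "create_new_session",
    2001: "prompt_for_credits",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",
    429: "retry_once_after_30s",
}


def next_action(code: int) -> str:
    """Return the follow-up action for a response code."""
    # Codes not listed in the table fall back to a generic report (an assumption).
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```

Keeping 402 separate from 2001 preserves the distinction the table draws between a subscription tier block and running out of credits.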

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "turn my podcast audio into a video with waveform animation, captions, and speaker labels" — concrete instructions get better results.

Max file size is 500MB. Stick to MP4, MOV, WAV, MP3 for the smoothest experience.
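
A pre-upload check could catch both limits before any bytes are sent. This is a sketch under assumptions: 500MB is read as 500 * 1024 * 1024 bytes, the extension list is the full Formats line from this document, and the name check_upload is hypothetical. The returned codes are the 4001 and 4002 entries from the error table.

```python
import os
from typing import Optional

SUPPORTED_EXTENSIONS = {
    "mp4", "mov", "avi", "webm", "mkv", "jpg", "png",
    "gif", "webp", "mp3", "wav", "m4a", "aac",
}
MAX_BYTES = 500 * 1024 * 1024  # assumption: 500MB counted in binary megabytes


def check_upload(filename: str, size_bytes: int) -> Optional[int]:
    """Return 4001 (unsupported), 4002 (too large), or None if the file looks fine."""
    extension = os.path.splitext(filename)[1].lower().lstrip(".")
    if extension not in SUPPORTED_EXTENSIONS:
        return 4001
    if size_bytes > MAX_BYTES:
        return 4002
    return None
```

Calling check_upload(path, os.path.getsize(path)) before uploading lets the agent suggest compressing or trimming without spending a request.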

Export as MP4 with H.264 codec for broadest compatibility across YouTube, Spotify, and social platforms.

Common Workflows

Quick edit: Upload → "turn my podcast audio into a video with waveform animation, captions, and speaker labels" → Download MP4. Takes 1-3 minutes for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.
