Podcast Video Recording Software

v1.0.0

Convert raw audio/video into polished podcast videos with this skill. Works with MP4, MOV, WAV, and MP3 files up to 500MB. Podcasters use it for converting podca...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for mory128/podcast-video-recording-software.

Install the skill "Podcast Video Recording Software" (mory128/podcast-video-recording-software) from ClawHub.
Skill page: https://clawhub.ai/mory128/podcast-video-recording-software
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: NEMO_TOKEN
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install podcast-video-recording-software

ClawHub CLI

npx clawhub@latest install podcast-video-recording-software

Security Scan

VirusTotal: Benign
OpenClaw: Benign (medium confidence)

Purpose & Capability
Name/description (podcast -> polished video) align with the actual behavior: uploading media, creating render jobs, polling export endpoints. Requesting a single service credential (NEMO_TOKEN) and a config path under ~/.config/nemovideo/ is consistent with a cloud-rendering service client.
Instruction Scope
The SKILL.md instructs the agent to obtain an anonymous token (if none provided), create a session, upload user media, and poll render status — all appropriate for the stated purpose. It explicitly tells the agent to not show raw API responses or token values. Important privacy implication: user media files are sent to https://mega-api-prod.nemovideo.ai, which is expected but should be known to the user before uploading sensitive content.
Install Mechanism
No install spec and no code files (instruction-only), so nothing is written to disk by an installer. This is the lowest-risk install model and is coherent with the skill being a thin client for a cloud API.
Credentials
Only one credential (NEMO_TOKEN) is required and is the declared primaryEnv — appropriate for a hosted API. The SKILL.md also references a config path (~/.config/nemovideo/) for session storage, which is reasonable but gives the skill a location to persist session tokens/IDs; consider whether you want that stored locally. Headers include X-Skill-Platform which may reveal install-path-derived metadata.
Persistence & Privilege
always:false (not force-included) and model invocation not disabled (normal). The skill does not request system-wide privileges or modifications to other skills. Session persistence via the stated config path is expected for a client that resumes or polls jobs.
Assessment
This skill is coherent with its stated purpose, but it uploads your audio/video to a third-party backend (mega-api-prod.nemovideo.ai) and will create or use a NEMO_TOKEN that grants render credits. Before installing or using: (1) confirm you trust the service (unknown homepage/source here), (2) avoid uploading sensitive or confidential media unless you accept the service's privacy/retention policies, (3) if you prefer control, provide your own NEMO_TOKEN (instead of letting the skill obtain an anonymous token) and review or remove any session/token files under ~/.config/nemovideo/ after use, and (4) verify the domain and terms if possible. Because this is instruction-only with no shipped code, there is nothing else to audit locally — lack of source/homepage reduces confidence, so proceed cautiously.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🎙️ Clawdis
Env: NEMO_TOKEN
Primary env: NEMO_TOKEN
latest: vk97486sdtnrca0hq3d7s2bpnp184p03m
82 downloads
0 stars
1 version
Updated 2w ago
v1.0.0
MIT-0

Getting Started

Share your raw audio/video and I'll get started on AI podcast video creation. Or just tell me what you're thinking.

Try saying:

  • "convert my raw audio/video"
  • "export 1080p MP4"
  • "turn my podcast audio into a"

First-Time Connection

When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").

Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 2.

  1. Obtain a free token: Generate a random UUID as client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to that UUID. The response data.token is your NEMO_TOKEN — 100 free credits, valid 7 days.
  2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.

Keep setup communication brief. Don't display raw API responses or token values to the user.
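
The two setup steps above can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the skill's actual client: the function names (`build_token_request`, `build_session_request`, `resolve_token`, `send_json`) are hypothetical, and the attribution headers described later in this document are left out for brevity.

```python
import json
import urllib.request
import uuid

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def build_token_request(client_id: str) -> urllib.request.Request:
    """Step 1: anonymous-token request with the random UUID in X-Client-Id."""
    return urllib.request.Request(
        f"{BASE_URL}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token: str, language: str) -> urllib.request.Request:
    """Step 2: create-session request with the bearer token and JSON body."""
    body = json.dumps({"task_name": "project", "language": language})
    return urllib.request.Request(
        f"{BASE_URL}/api/tasks/me/with-session/nemo_agent",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def resolve_token(env, fetch_anonymous_token) -> str:
    """Use NEMO_TOKEN when it is set; otherwise obtain an anonymous token."""
    token = env.get("NEMO_TOKEN")
    if token:
        return token
    # fetch_anonymous_token is a caller-supplied function (hypothetical name)
    # that receives the generated client id and returns the token string.
    return fetch_anonymous_token(str(uuid.uuid4()))


def send_json(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Example wiring (performs real network calls, so it is left as a comment):
#   token = resolve_token(
#       os.environ,
#       lambda cid: send_json(build_token_request(cid))["data"]["token"],
#   )
#   session = send_json(build_session_request(token, "en"))
```

Keeping request construction separate from sending makes it easy to inspect the headers and body without contacting the backend, and keeps the token out of anything shown to the user.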

Podcast Video Recording Software — Convert Podcasts Into Shareable Videos

Drop your raw audio/video in the chat and tell me what you need. I'll handle the AI podcast video creation on cloud GPUs — you don't need anything installed locally.

Here's a typical use: you send a 42-minute podcast audio recording with a guest interview, ask me to "turn my podcast audio into a video with waveform animation, captions, and speaker labels", and about 1-3 minutes later you've got an MP4 file ready to download. The whole thing runs at 1080p by default.

One thing worth knowing — uploading a clean audio track separately from video gives the AI better caption accuracy.

Matching Input to Actions

User prompts referencing podcast video recording software, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | |
| "status" / "状态" / "show tracks" | → §3.4 State | |
| "upload" / "上传" / user sends file | → §3.2 Upload | |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | |
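
The keyword half of this routing can be sketched as a simple lookup. This is an illustration only: it uses plain substring matching rather than real intent classification, the action names are hypothetical labels, and checking for an attached file first is an assumption about precedence that the table does not state.

```python
# Keyword groups in the same order as the routing table.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_attachment: bool = False) -> str:
    """Return the action label for a user message."""
    if has_attachment:
        # Assumption: a sent file always means upload.
        return "upload"
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    # Everything else (generate, edit, add BGM...) goes through SSE chat.
    return "sse"
```

For example, `route("export 1080p MP4")` returns `"export"`, while `route("add BGM")` falls through to `"sse"`.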

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

All calls go to https://mega-api-prod.nemovideo.ai. The main endpoints:

  1. Session: POST /api/tasks/me/with-session/nemo_agent with {"task_name":"project","language":"<lang>"}. Gives you a session_id.
  2. Chat (SSE): POST /run_sse with session_id and your message in new_message.parts[0].text. Set Accept: text/event-stream. Up to 15 min.
  3. Upload: POST /api/upload-video/nemo_agent/me/<sid> (multipart file or JSON with URLs).
  4. Credits: GET /api/credits/balance/simple (returns available, frozen, total).
  5. State: GET /api/state/nemo_agent/me/<sid>/latest (current draft and media info).
  6. Export: POST /api/render/proxy/lambda with render ID and draft JSON. Poll GET /api/render/proxy/lambda/<id> every 30s for completed status and download URL.
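
The export polling in step 6 can be sketched as a small loop. This is a hedged illustration: the response is assumed to be a JSON object whose `status` field equals `"completed"` when the render is done, which the text above implies but does not spell out, and `fetch_status`, `max_attempts`, and `sleep` are hypothetical parameters introduced so the loop can be exercised without a network.

```python
import time


def poll_render(fetch_status, interval_s=30.0, max_attempts=40, sleep=time.sleep):
    """Call fetch_status() every interval_s seconds until the render completes.

    fetch_status is a caller-supplied function that performs
    GET /api/render/proxy/lambda/<id> and returns the decoded JSON as a dict.
    """
    for attempt in range(max_attempts):
        status = fetch_status()
        # Assumption: a finished render reports status == "completed".
        if status.get("status") == "completed":
            return status
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError("render did not complete within the polling budget")
```

Injecting `sleep` keeps the 30-second interval as the default while letting a caller, or a test, substitute a stub.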

Formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Headers are derived from this file's YAML frontmatter. X-Skill-Source is podcast-video-recording-software, X-Skill-Version comes from the version field, and X-Skill-Platform is detected from the install path (~/.clawhub/ = clawhub, ~/.cursor/skills/ = cursor, otherwise unknown).

Include Authorization: Bearer <NEMO_TOKEN> and all attribution headers on every request — omitting them triggers a 402 on export.
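
A minimal sketch of assembling these headers, assuming the platform is detected by looking for the directory names in the install path. The helper names `detect_platform` and `request_headers` are hypothetical, and the version string is expected to be read from the frontmatter by the caller.

```python
def detect_platform(install_path: str) -> str:
    """Map the install path to the X-Skill-Platform value."""
    # Normalise separators and ensure a trailing slash so a bare
    # ".../.clawhub" path matches the same way as ".../.clawhub/skills/x".
    normalized = install_path.replace("\\", "/").rstrip("/") + "/"
    if "/.clawhub/" in normalized:
        return "clawhub"
    if "/.cursor/skills/" in normalized:
        return "cursor"
    return "unknown"


def request_headers(token: str, version: str, install_path: str) -> dict:
    """Build the Authorization and attribution headers for one request."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "podcast-video-recording-software",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the headers in one place makes it harder to forget them on a single call, which matters because a missing header surfaces only later as a 402 on export.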

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
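
To make a draft easier to read, the short keys can be expanded recursively. The sketch below is an assumption-heavy illustration: the long key names and the nesting used in the example (tracks containing segments) are guesses for demonstration, since the draft schema is not documented here beyond the mapping above.

```python
FIELD_NAMES = {
    "t": "tracks",
    "tt": "track_type",
    "sg": "segments",
    "d": "duration_ms",
    "m": "metadata",
}
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def expand_draft(node):
    """Return a copy of the draft with short keys replaced by readable names."""
    if isinstance(node, list):
        return [expand_draft(item) for item in node]
    if isinstance(node, dict):
        expanded = {}
        for key, value in node.items():
            name = FIELD_NAMES.get(key, key)  # unknown keys pass through
            if key == "tt" and isinstance(value, int):
                expanded[name] = TRACK_TYPES.get(value, value)
            else:
                expanded[name] = expand_draft(value)
        return expanded
    return node
```

With a hypothetical draft such as `{"t": [{"tt": 7, "sg": [{"d": 3000}]}]}`, the result is `{"tracks": [{"track_type": "text", "segments": [{"duration_ms": 3000}]}]}`.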

Timeline (3 tracks):

  1. Video: city timelapse (0-10s)
  2. BGM: Lo-fi (0-10s, 35%)
  3. Title: "Urban Dreams" (0-3s)

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

  • "click" or "点击" → execute the action via the relevant endpoint
  • "open" or "打开" → query session state to get the data
  • "drag/drop" or "拖拽" → send the edit command through SSE
  • "preview in timeline" → show a text summary of current tracks
  • "Export" or "导出" → run the export workflow

SSE Event Handling

| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
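
One way to detect the "no text" case is to read the stream generically and check whether any payload arrived. The sketch below follows the standard Server-Sent Events line format (`data:` fields, blank-line dispatch, `:` comment lines); it does not interpret the payloads, because their structure is not documented here, so telling text responses apart from tool calls is left to the caller. The function name is hypothetical.

```python
def read_sse_payloads(lines):
    """Collect non-empty data payloads from an iterable of SSE lines."""
    payloads = []
    buffer = []

    def flush():
        payload = "\n".join(buffer).strip()
        if payload:
            payloads.append(payload)
        buffer.clear()

    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # comment line, commonly used as a heartbeat
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one optional leading space.
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "":
            flush()  # a blank line ends the current event
    flush()  # the stream closed: process whatever is left
    return payloads
```

If the returned list is empty once the stream closes, that is the cue to poll session state and summarize the changes instead.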

Error Handling

| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
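
The table above translates naturally into a lookup. In this sketch the action labels are hypothetical names for the behaviours listed, and the fallback for codes not in the table is an assumption.

```python
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",            # bad or expired token
    1002: "create_session",            # session not found
    2001: "show_credit_options",       # no credits
    4001: "show_supported_formats",    # unsupported file
    4002: "suggest_compress_or_trim",  # file too large
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",  # plan tier issue, not credits
    429: "retry_once_after_30s",
}


def next_action(code: int) -> str:
    """Return the action label for a response code."""
    # Assumption: codes missing from the table are reported, not retried.
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```

Keeping 402 and 2001 as separate entries reflects the distinction the table draws between a plan restriction and an empty credit balance.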

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "turn my podcast audio into a video with waveform animation, captions, and speaker labels" — concrete instructions get better results.

Max file size is 500MB. Stick to MP4, MOV, WAV, MP3 for the smoothest experience.

Export as MP4 with H.264 codec for broadest compatibility across YouTube, Spotify, and social platforms.

Common Workflows

Quick edit: Upload → "turn my podcast audio into a video with waveform animation, captions, and speaker labels" → Download MP4. Takes 1-3 minutes for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.
