Ai Image To Video Openai

v1.0.0

Turn a single product photo or illustration into 1080p animated video clips just by typing what you need. Whether it's converting static images into short AI...

Security Scan
VirusTotal: Benign
OpenClaw: Benign (medium confidence)
Purpose & Capability
The skill claims to convert images into short videos and all runtime instructions target a nemo-video backend (mega-api-prod.nemovideo.ai) and require a NEMO_TOKEN — this is coherent with the stated purpose. However, the skill name includes 'Openai' which is misleading because no OpenAI endpoints or credentials are referenced. The SKILL.md frontmatter also mentions a config path (~/.config/nemovideo/) even though the registry metadata listed none; minor metadata inconsistency.
Instruction Scope
The SKILL.md contains detailed API workflows (anonymous token acquisition, session creation, SSE messaging, file upload, export polling). All actions are limited to the nemovideo API and to handling user-provided media and session state; it does not instruct the agent to read unrelated files or unrelated credentials. It does, however, instruct the agent to automatically obtain and store tokens and session IDs for future calls, and to upload user images to the remote service (expected for this functionality).
Install Mechanism
No install spec and no code files are present (instruction-only). That minimizes disk-write/execute risk; runtime activity is limited to HTTP calls described in SKILL.md.
Credentials
Only one environment variable is declared (NEMO_TOKEN) and it directly maps to the remote service authentication described in the instructions. The skill also describes how to obtain an anonymous token if none is present. There are no unrelated secret env vars requested.
Persistence & Privilege
The skill asks the agent to store the NEMO_TOKEN and session_id for subsequent calls (normal for session-based APIs). always:false (default) is set. Be aware that sessions may keep render jobs active server-side if the user disconnects; the skill does not request elevated system privileges or to modify other skills.
Assessment
This skill appears to be an instruction-only connector to the 'nemovideo' cloud rendering API; it requires uploading your images and holding a NEMO_TOKEN. Before installing:

  1. Confirm you trust mega-api-prod.nemovideo.ai (images and any sensitive visual data will be uploaded to that service).
  2. Ask where and how the skill will store the anonymous token and session (environment, config file?) and ensure that storage is acceptable.
  3. Note that the skill name mentions 'Openai' but the skill does not use OpenAI; treat that as a labeling inconsistency.
  4. Avoid sending sensitive or private images until you verify the service's privacy policy and retention rules.
  5. If you need stronger assurance, ask the maintainer to clarify the token storage location and to provide a homepage or official project link for audit.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🎬 Clawdis
Env: NEMO_TOKEN
Primary env: NEMO_TOKEN
latest: vk971nabv87c3v83bsmgtafsvvd8564ha
17 downloads
0 stars
1 version
Updated 3h ago
MIT-0

Getting Started

Send me your images and I'll handle the AI video creation. Or just describe what you're after.

Try saying:

  • "convert a single product photo or illustration into a 1080p MP4"
  • "convert this image into a short animated video clip"
  • "convert static images into short AI-generated videos for marketers, content creators, and social media managers"

First-Time Connection

When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").

Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 2.

  1. Obtain a free token: Generate a random UUID as client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to that UUID. The response data.token is your NEMO_TOKEN — 100 free credits, valid 7 days.
  2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.

Keep setup communication brief. Don't display raw API responses or token values to the user.
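
The two setup steps above could be sketched as follows. This is a minimal illustration using only the Python standard library, not the skill's own implementation. The helper names (`build_token_request`, `build_session_request`, `send`, `connect`) are hypothetical, and the exact location of `session_id` in the session response is an assumption.

```python
import json
import os
import uuid
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_token_request(client_id: str) -> urllib.request.Request:
    """Step 1: request an anonymous token, identified by X-Client-Id."""
    return urllib.request.Request(
        f"{API_BASE}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token: str, language: str) -> urllib.request.Request:
    """Step 2: create a session; the response carries session_id."""
    body = json.dumps({"task_name": "project", "language": language}).encode()
    return urllib.request.Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON reply (performs network I/O)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def connect(language: str = "en") -> tuple[str, str]:
    """Return (token, session_id), reusing NEMO_TOKEN when it is already set."""
    token = os.environ.get("NEMO_TOKEN")
    if not token:
        reply = send(build_token_request(str(uuid.uuid4())))
        token = reply["data"]["token"]
    session = send(build_session_request(token, language))
    # Assumption: session_id may sit at the top level or under "data".
    session_id = session.get("session_id") or session.get("data", {}).get("session_id")
    return token, session_id
```

Keeping request construction separate from `send` means the headers and body can be inspected without touching the network, and neither the token nor the raw response is ever printed.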

AI Image to Video — Convert Images into Video Clips

Send me your images and describe the result you want. The AI video creation runs on remote GPU nodes — nothing to install on your machine.

A quick example: upload a single product photo or illustration, type "convert this image into a short animated video clip", and you'll get a 1080p MP4 back in roughly 30-60 seconds. All rendering happens server-side.

Worth noting: high-contrast images with clear subjects tend to produce smoother motion results.

Matching Input to Actions

User prompts referencing ai image to video openai, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

| User says... | Action | Skip SSE? |
| --- | --- | --- |
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | |
| "status" / "状态" / "show tracks" | → §3.4 State | |
| "upload" / "上传" / user sends file | → §3.2 Upload | |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | |
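
The keyword half of this routing could look roughly like the sketch below. It is a simplification: the skill also mentions intent classification, which is not modeled here, and the `ROUTES` table, the `route` function, and the action labels are hypothetical names chosen for illustration.

```python
# Keyword groups copied from the routing table; action labels are illustrative.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_file: bool = False) -> str:
    """Return the action for a user message; anything unmatched goes to SSE."""
    if has_file:
        return "upload"
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    return "sse"
```

Plain substring matching is deliberately naive (for example, "uploaded" would also match "upload"), so a real router would likely combine it with the intent step.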

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Every API call needs Authorization: Bearer <NEMO_TOKEN> plus the three attribution headers X-Skill-Source, X-Skill-Version, and X-Skill-Platform. If any header is missing, exports return 402.

Headers are derived from this file's YAML frontmatter. X-Skill-Source is ai-image-to-video-openai, X-Skill-Version comes from the version field, and X-Skill-Platform is detected from the install path (~/.clawhub/ = clawhub, ~/.cursor/skills/ = cursor, otherwise unknown).
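
As a rough sketch of the header rules just described, the platform can be derived from the install path and combined with the bearer token. The function names `detect_platform` and `attribution_headers` are hypothetical, and the version is passed in by the caller rather than parsed from the frontmatter.

```python
from pathlib import Path


def detect_platform(install_path: str) -> str:
    """Map the skill's install path to an X-Skill-Platform value."""
    home = Path.home()
    path = Path(install_path).expanduser()
    if path.is_relative_to(home / ".clawhub"):
        return "clawhub"
    if path.is_relative_to(home / ".cursor" / "skills"):
        return "cursor"
    return "unknown"


def attribution_headers(token: str, version: str, install_path: str) -> dict:
    """Build the Authorization header plus the three attribution headers."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "ai-image-to-video-openai",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

`Path.is_relative_to` requires Python 3.9 or later; on older versions a string prefix comparison would be needed instead.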

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.

Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
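
The export-then-poll flow might be sketched like this. The HTTP call is injected as `fetch_status`, a hypothetical callable that performs GET /api/render/proxy/lambda/<id> and returns the decoded JSON. The cap of 30 polls and the assumption that `status` and `output.url` sit at the top level of the reply are illustrative, not documented behavior. Failure statuses are not described in the source, so they are not handled here.

```python
import time
from typing import Callable


def build_export_body(session_id: str, draft: dict, timestamp: int) -> dict:
    """Assemble the POST /api/render/proxy/lambda body for an MP4 export."""
    return {
        "id": f"render_{timestamp}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def wait_for_export(
    fetch_status: Callable[[str], dict],
    render_id: str,
    interval: float = 30.0,
    max_polls: int = 30,  # assumption: give up after roughly 15 minutes
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the render job until it completes, then return the download URL."""
    for attempt in range(max_polls):
        job = fetch_status(render_id)
        if job.get("status") == "completed":
            return job["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval)
    raise TimeoutError(f"render {render_id} did not complete after {max_polls} polls")
```

Passing `sleep` as a parameter keeps the 30-second interval out of tests and lets a caller substitute its own scheduler.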

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Error Codes

  • 0 — success, continue normally
  • 1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
  • 1002 — session not found; create a new one
  • 2001 — out of credits; anonymous users get a registration link with ?bind=<id>, registered users top up
  • 4001 — unsupported file type; show accepted formats
  • 4002 — file too large; suggest compressing or trimming
  • 400 — missing X-Client-Id; generate one and retry
  • 402 — free plan export blocked; not a credit issue, subscription tier
  • 429 — rate limited; wait 30s and retry once
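
One way to keep these codes manageable is a single lookup table, as in the sketch below. The action labels and the `next_step` function are hypothetical. The table mixes application codes (such as 1001) with HTTP statuses (such as 429) only because the list above does the same.

```python
# Code -> follow-up action, mirroring the error list; labels are illustrative.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reacquire_token",
    1002: "create_session",
    2001: "prompt_registration_or_top_up",
    4001: "show_accepted_formats",
    4002: "suggest_compressing_or_trimming",
    400: "generate_client_id_and_retry",
    402: "explain_subscription_tier",
    429: "wait_30s_and_retry_once",
}


def next_step(code: int) -> str:
    """Return the follow-up action for a code, or a fallback for unknown codes."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```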

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

| Backend says | You do |
| --- | --- |
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |

Reading the SSE Stream

Text events go straight to the user (after GUI translation). Tool calls stay internal. Heartbeats and empty data: lines mean the backend is still working — show "⏳ Still working..." every 2 minutes.

About 30% of edit operations close the stream without any text. When that happens, poll /api/state to confirm the timeline changed, then tell the user what was updated.
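
The stream handling described above could be approximated as follows. This sketch assumes events arrive as `data: <json>` lines and that user-facing text sits under `content.parts[].text`, mirroring the request's `parts` shape. Neither detail is confirmed by this document, and both function names are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_payloads(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON payloads from data: lines, skipping heartbeats."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # comments, event names, blank separators
        payload = line[len("data:"):].strip()
        if not payload:
            continue  # empty data: line, backend still working
        try:
            yield json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore payloads that are not JSON


def extract_text(payloads: Iterable[dict]) -> list[str]:
    """Collect user-facing text; parts without text (tool calls) stay internal."""
    texts = []
    for payload in payloads:
        # Assumption: text lives under content.parts[].text.
        parts = (payload.get("content") or {}).get("parts") or []
        texts.extend(part["text"] for part in parts if part.get("text"))
    return texts
```

If `extract_text` returns an empty list once the stream has closed, that corresponds to the silent-close case, and the caller would fall back to polling the session state.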

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks):

  1. Video: city timelapse (0-10s)
  2. BGM: Lo-fi (0-10s, 35%)
  3. Title: "Urban Dreams" (0-3s)
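
A track summary of this kind could be derived from the draft using the field mapping above. The sketch assumes the draft is a dict whose `t` list holds tracks, each with a `tt` type code and an `sg` list of segments carrying `d` in milliseconds. The exact nesting is an assumption, and `summarize_draft` is a hypothetical helper that reports only track type and total duration.

```python
# Track type codes from the draft field mapping.
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft: dict) -> list[str]:
    """Return one summary line per track: position, type, total duration."""
    lines = []
    for index, track in enumerate(draft.get("t", []), start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        # Sum segment durations (d is in milliseconds).
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        lines.append(f"{index}. {kind}: {total_ms / 1000:g}s")
    return lines
```

Names such as the clip title or BGM volume would presumably come from the `m` metadata field, whose contents are not specified here.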

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "convert this image into a short animated video clip" — concrete instructions get better results.

Max file size is 200MB. Stick to JPG, PNG, WEBP, GIF for the smoothest experience.

Export as MP4 for widest compatibility across social platforms.

Common Workflows

Quick edit: Upload → "convert this image into a short animated video clip" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.
