Photo To Video Ai Online

v1.0.0

Convert images into animated photo videos with this skill. Works with JPG, PNG, WEBP, and HEIC files up to 200MB. Social media creators use it for turning photo...


Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for francemichaell-15/photo-to-video-ai-online.

Install the skill "Photo To Video Ai Online" (francemichaell-15/photo-to-video-ai-online) from ClawHub.
Skill page: https://clawhub.ai/francemichaell-15/photo-to-video-ai-online
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: NEMO_TOKEN
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install photo-to-video-ai-online

ClawHub CLI


npx clawhub@latest install photo-to-video-ai-online

Security Scan

VirusTotal: Benign
OpenClaw: Benign (medium confidence)

Purpose & Capability
Name/description match the instructions: the SKILL.md documents uploading images, creating a session, queuing cloud GPU render jobs, and returning MP4s. The single declared credential (NEMO_TOKEN) is consistent with authenticating to the nemovideo.ai backend that the skill calls.
Instruction Scope
Instructions are focused on contacting mega-api-prod.nemovideo.ai endpoints, uploading files, managing sessions, and polling render status. They do instruct the agent to detect an install path (~/.clawhub/, ~/.cursor/skills/) to populate an attribution header and reference a local config path (~/.config/nemovideo/) in the YAML metadata; these filesystem checks are limited but worth noting as they touch local paths for attribution/config detection.
Install Mechanism
No install spec and no code files (instruction-only). That minimizes write-to-disk and arbitrary code install risk.
Credentials
The skill requires a single API token (NEMO_TOKEN), which is appropriate for a remote service integration. The instructions also describe creating an anonymous token when NEMO_TOKEN is absent. Slight inconsistency: registry metadata showed no required config paths, but the SKILL.md YAML frontmatter lists configPaths ("~/.config/nemovideo/") — this mismatch should be clarified because it implies the skill may read a local config directory.
Persistence & Privilege
The skill is not always-enabled and is user-invocable; it does not request persistent installation or elevated platform privileges in the provided instructions.
Assessment
This skill appears to do what it says: it will upload images to nemovideo.ai and return rendered MP4s. Before installing or using it, consider:

  (1) Privacy: your images are uploaded to an external service, so do not send sensitive images you wouldn't want shared.
  (2) Token scope: only provide a NEMO_TOKEN scoped for this service; prefer using the anonymous-token fallback if you don't want to supply a persistent token.
  (3) Metadata mismatch: the SKILL.md references a local config path (~/.config/nemovideo/) and checks install paths for attribution headers while the registry lists no config paths. Ask the publisher to confirm whether the skill reads that directory and what it stores.
  (4) Provenance: verify the service domain (mega-api-prod.nemovideo.ai) and the skill provenance (source is unknown); if you need stricter guarantees, request source code or a trusted publisher.

If these points are acceptable, the skill is internally coherent for its stated purpose.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

🖼️ Clawdis
Env: NEMO_TOKEN
Primary env: NEMO_TOKEN
latest: vk97djme7v7bhhs1x2fdzph8eah84nv2f
90 downloads
0 stars
1 version
Updated 2w ago
v1.0.0
MIT-0

Getting Started

Send me your images and I'll handle the AI video creation. Or just describe what you're after.

Try saying:

  • "convert five vacation photos in JPG format into a 1080p MP4"
  • "turn these photos into a slideshow video with smooth transitions"
  • "turning photo collections into shareable videos for social media creators"

Getting Connected

Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".

If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:

  • Generate a UUID as client identifier
  • POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header
  • The response includes a token with 100 free credits valid for 7 days — use it as NEMO_TOKEN

Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
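The connection steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the skill's own implementation. The response field names `token` and `session_id` are assumed to sit at the top level of the JSON replies, and the helper names (`build_anonymous_token_request`, `build_session_request`, `connect`) are hypothetical. The optional `extra_headers` argument is a placeholder for the attribution headers the skill expects on every request.

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "https://mega-api-prod.nemovideo.ai"


def build_anonymous_token_request(client_id, extra_headers=None):
    """Build the POST that asks for a free starter token for this client."""
    headers = {"X-Client-Id": client_id, **(extra_headers or {})}
    return urllib.request.Request(
        f"{BASE_URL}/api/auth/anonymous-token",
        data=b"",
        headers=headers,
        method="POST",
    )


def build_session_request(token, language="en", extra_headers=None):
    """Build the POST that creates a session with the documented body."""
    body = json.dumps({"task_name": "project", "language": language})
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        **(extra_headers or {}),
    }
    return urllib.request.Request(
        f"{BASE_URL}/api/tasks/me/with-session/nemo_agent",
        data=body.encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send_json(request, timeout=30):
    """Send a prepared request and decode the JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def connect():
    """Use NEMO_TOKEN if set, otherwise fetch an anonymous token, then open a session."""
    token = os.environ.get("NEMO_TOKEN")
    if not token:
        client_id = str(uuid.uuid4())
        reply = send_json(build_anonymous_token_request(client_id))
        token = reply["token"]  # assumed field name
    session = send_json(build_session_request(token))
    return {"token": token, "session_id": session["session_id"]}  # assumed field name
```

Splitting request construction from sending keeps the network call in one place, so the request shapes can be inspected without contacting the backend.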

Tell the user you're ready. Keep the technical details out of the chat.

Photo to Video AI Online — Convert Photos into Shareable Videos

Drop your images in the chat and tell me what you need. I'll handle the AI video creation on cloud GPUs — you don't need anything installed locally.

Here's a typical use: you send five vacation photos in JPG format, ask to "turn these photos into a slideshow video with smooth transitions", and about 30-60 seconds later you have an MP4 file ready to download. Output is 1080p by default.

One thing worth knowing — using 5-10 photos gives the best pacing for short social clips.

Matching Input to Actions

User prompts referencing photo to video ai online, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | |
| "status" / "状态" / "show tracks" | → §3.4 State | |
| "upload" / "上传" / user sends file | → §3.2 Upload | |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | |
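A keyword router for the mapping above might look like the sketch below. It is only an approximation: the skill describes "keyword and intent classification", while this example does plain substring matching, and the action names (`export`, `credits`, `state`, `upload`, `sse`) are labels chosen for illustration.

```python
# Keyword groups in the same order as the routing table.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message, has_attachment=False):
    """Pick an action for a user message; anything unmatched goes to SSE."""
    if has_attachment:
        return "upload"  # the user sent a file
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action
    return "sse"  # generate, edit, add BGM, and everything else
```

For example, `route("Please export it")` selects the export action, while `route("add BGM to the slideshow")` falls through to the SSE path.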

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Base URL: https://mega-api-prod.nemovideo.ai

| Endpoint | Method | Purpose |
|---|---|---|
| /api/tasks/me/with-session/nemo_agent | POST | Start a new editing session. Body: {"task_name":"project","language":"<lang>"}. Returns session_id. |
| /run_sse | POST | Send a user message. Body includes app_name, session_id, new_message. Stream response with Accept: text/event-stream. Timeout: 15 min. |
| /api/upload-video/nemo_agent/me/<sid> | POST | Upload a file (multipart) or URL. |
| /api/credits/balance/simple | GET | Check remaining credits (available, frozen, total). |
| /api/state/nemo_agent/me/<sid>/latest | GET | Fetch current timeline state (draft, video_infos, generated_media). |
| /api/render/proxy/lambda | POST | Start export. Body: {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll status every 30s. |
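To make the export row concrete, here is a hedged sketch of building the request body and polling every 30 seconds. The page does not say which endpoint reports render status or what the status payload looks like, so `fetch_status` is a caller-supplied function and the `done` field is hypothetical. Using whole seconds for `<ts>` is also an assumption.

```python
import time


def build_export_body(session_id, draft, now=None):
    """Assemble the documented body for POST /api/render/proxy/lambda."""
    timestamp = int(now if now is not None else time.time())  # assumed: seconds
    return {
        "id": f"render_{timestamp}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_until_done(fetch_status, interval_s=30.0, max_polls=30, sleep=time.sleep):
    """Call fetch_status() every interval_s seconds until it reports completion.

    fetch_status is supplied by the caller and returns a dict; the "done"
    key checked here is a hypothetical field, not a documented one.
    """
    for attempt in range(max_polls):
        status = fetch_status()
        if status.get("done"):
            return status
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError("render did not finish within the polling budget")
```

Passing `sleep` as a parameter lets the loop be exercised without real waiting.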

Accepted file types: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
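A client-side pre-check against this list could be sketched as below. The extension set is copied from the line above; the 200MB cap comes from the skill description, and treating it as 200 × 1024 × 1024 bytes is an assumption. HEIC is named elsewhere on this page but is not in the accepted list, so this sketch treats it as unlisted. The function name and return labels are hypothetical.

```python
from pathlib import Path

ACCEPTED_EXTENSIONS = {
    "mp4", "mov", "avi", "webm", "mkv",
    "jpg", "png", "gif", "webp",
    "mp3", "wav", "m4a", "aac",
}
MAX_BYTES = 200 * 1024 * 1024  # assumed interpretation of "200MB"


def check_upload(filename, size_bytes):
    """Return "ok", "unsupported", or "too_large" before attempting an upload."""
    extension = Path(filename).suffix.lower().lstrip(".")
    if extension not in ACCEPTED_EXTENSIONS:
        return "unsupported"
    if size_bytes > MAX_BYTES:
        return "too_large"
    return "ok"
```

Checking locally avoids spending a round trip on a file the backend would reject as unsupported or too large.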

Headers are derived from this file's YAML frontmatter. X-Skill-Source is photo-to-video-ai-online, X-Skill-Version comes from the version field, and X-Skill-Platform is detected from the install path (~/.clawhub/ = clawhub, ~/.cursor/skills/ = cursor, otherwise unknown).

Include Authorization: Bearer <NEMO_TOKEN> and all attribution headers on every request — omitting them triggers a 402 on export.
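The header rules above could be assembled roughly as follows. This is a sketch: `detect_platform` and `build_headers` are hypothetical names, and the caller is assumed to already know the skill file's path and the version string from the frontmatter.

```python
from pathlib import Path


def detect_platform(skill_path):
    """Map the install location to the X-Skill-Platform value."""
    path = Path(skill_path).expanduser()
    home = Path.home()
    if home / ".clawhub" in path.parents:
        return "clawhub"
    if home / ".cursor" / "skills" in path.parents:
        return "cursor"
    return "unknown"


def build_headers(token, version, skill_path):
    """Return the authorization and attribution headers sent on every request."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "photo-to-video-ai-online",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(skill_path),
    }
```

Building the full header set in one function makes it harder to forget an attribution header on a single call.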

Error Handling

| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
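One way to act on these codes is a simple lookup plus a single delayed retry for 429, sketched below. The action labels are invented for illustration, and the sketch assumes application codes and HTTP statuses both reach the caller as a single integer, which the page does not confirm. The `call` argument is a hypothetical function returning a `(code, payload)` pair.

```python
import time

# Action labels are illustrative; the codes come from the table above.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",
    1002: "new_session",
    2001: "prompt_for_credits",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",
    429: "retry_once_after_30s",
}


def next_action(code):
    """Look up what to do for a response code; unknown codes are reported."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")


def call_with_rate_limit_retry(call, wait_s=30.0, sleep=time.sleep):
    """Run call(); on a 429, wait once and try a second and final time."""
    code, payload = call()
    if code == 429:
        sleep(wait_s)
        code, payload = call()
    return code, payload
```

Keeping the retry to one attempt mirrors the "Retry in 30s once" instruction and avoids looping against the rate limit.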

SSE Event Handling

| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
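The event handling above can be illustrated with a small stream reader. The wire format of the backend's events is not documented on this page, so this sketch assumes each `data:` line carries a JSON object and that user-facing text lives under a `text` key. Both are assumptions, as is the literal `heartbeat` payload.

```python
import json


def read_stream(lines):
    """Collect user-facing text from SSE lines.

    Returns (text_chunks, needs_state_poll). The second value is True when
    the stream closed without any text, which is the cue to poll session state.
    """
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # SSE comments, event names, blank separators
        payload = line[len("data:"):].strip()
        if not payload or payload == "heartbeat":
            continue  # keep waiting
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip payloads this sketch cannot interpret
        text = event.get("text") if isinstance(event, dict) else None
        if text:
            chunks.append(text)  # tool calls and results are not forwarded
    return chunks, not chunks
```

When the second return value is true, the caller would fetch the latest session state and summarize the change itself.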

Translating GUI Instructions

The backend responds as if there's a visual interface. Map its instructions to API calls:

  • "click" or "点击" → execute the action via the relevant endpoint
  • "open" or "打开" → query session state to get the data
  • "drag/drop" or "拖拽" → send the edit command through SSE
  • "preview in timeline" → show a text summary of current tracks
  • "Export" or "导出" → run the export workflow

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)
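A text summary like the one above could be derived from the draft using the field mapping. The exact nesting of the draft is not shown on this page, so this sketch assumes a top-level `t` list of tracks, each with a `tt` type and an `sg` list of segments that carry `d` in milliseconds. The output is deliberately simpler than the example summary, because clip names and volume levels are not part of the documented mapping.

```python
TRACK_TYPES = {0: "Video", 1: "Audio", 7: "Text"}


def summarize_draft(draft):
    """Render a short per-track summary from an abbreviated draft dict."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        label = TRACK_TYPES.get(track.get("tt"), "Unknown")
        segments = track.get("sg", [])
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{index}. {label}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return "\n".join(lines)
```

A summary of this kind is what the "preview in timeline" mapping would present in place of a visual timeline.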

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "turn these photos into a slideshow video with smooth transitions" — concrete instructions get better results.

Max file size is 200MB. Stick to JPG, PNG, WEBP, HEIC for the smoothest experience.

Export as MP4 for widest compatibility across all platforms and devices.

Common Workflows

Quick edit: Upload → "turn these photos into a slideshow video with smooth transitions" → Download MP4. Takes 30-60 seconds for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.
