Install
openclaw skills install video-gen-script
How to generate video scripts for the Video Generator from user prompts.
This skill provides instructions on how to transform a user's storytelling prompt into a valid input-scripts.json entry for the video generator.
Every script must be an object within the main array in input/input-scripts.json.
{
  "id": "unique-id",
  "title": "Display Title",
  "orientation": "landscape" | "portrait",
  "voice": "en-US-JennyNeural" | "en-US-GuyNeural",
  "script": "The actual narrative content..."
}
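As a concrete illustration of the shape above, here is a minimal Python sketch that appends one entry to the array in input/input-scripts.json. The helper name `add_script`, the sample `id` and `title`, and the duplicate-id check are assumptions made for this example; they are not part of the video generator itself.

```python
import json
from pathlib import Path


def add_script(entry: dict, path: Path) -> list:
    """Append entry to the JSON array stored at path (hypothetical helper)."""
    # Start from an empty array if the file does not exist yet.
    scripts = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    if not isinstance(scripts, list):
        raise ValueError("the top level of the file must be an array")
    # Assumption: ids should be unique, since each id is described as a unique slug.
    if any(isinstance(s, dict) and s.get("id") == entry["id"] for s in scripts):
        raise ValueError(f"duplicate id: {entry['id']}")
    scripts.append(entry)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        json.dumps(scripts, indent=2, ensure_ascii=False) + "\n", encoding="utf-8"
    )
    return scripts


# One concrete entry: each field holds a single value, not a list of alternatives.
entry = {
    "id": "city-of-tomorrow",
    "title": "City of Tomorrow",
    "orientation": "landscape",
    "voice": "en-US-JennyNeural",
    "script": "[Visual: futuristic city neon night] The city never sleeps.",
}
```

Calling `add_script(entry, Path("input/input-scripts.json"))` would then write the entry into the main array.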
To ensure high-quality, relevant stock footage, use "Director Mode" tags. Place them at the start of the sentence or block they describe.
- Syntax: [Visual: Descriptive Query]
- Be specific: instead of [Visual: nature], use [Visual: green forest sunlight rays].
- Each tag applies until the next [Visual: ] tag appears.

Example:
"[Visual: futuristic city neon night] The city never sleeps. [Visual: robotic arm assembly] High-tech manufacturing is the backbone of the economy."
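To make the tag placement concrete, the following Python sketch splits a script into (query, narration) pairs. It assumes each tag covers the text up to the next tag; `split_segments` and the regular expression are illustrative only, and the generator's own parser may behave differently.

```python
import re
from typing import List, Optional, Tuple

# Matches a Director Mode tag and captures the search query inside it.
VISUAL_TAG = re.compile(r"\[Visual:\s*([^\]]*)\]")


def split_segments(script: str) -> List[Tuple[Optional[str], str]]:
    """Pair each run of narration with the tag that precedes it (hypothetical helper).

    The query is None for narration that appears before the first tag.
    """
    segments = []
    query = None
    pos = 0
    for match in VISUAL_TAG.finditer(script):
        text = script[pos:match.start()].strip()
        if text:
            segments.append((query, text))
        query = match.group(1).strip()
        pos = match.end()
    tail = script[pos:].strip()
    if tail:
        segments.append((query, tail))
    return segments
```

Running it on the example above yields two segments, one per tag, which is a quick way to check that every sentence has the footage query you intended.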
You can choose from several high-quality neural voices. Specify one in the voice field of your JSON entry.
| Gender | Voice ID | Style/Region |
|---|---|---|
| 👨 Male | en-US-GuyNeural | Deep, Authoritative (Recommended) |
| 👨 Male | en-US-ChristopherNeural | Calm, Steady |
| 👨 Male | en-GB-RyanNeural | British Accent |
| 👨 Male | en-IN-PrabhatNeural | Indian Accent |
| 👩 Female | en-US-JennyNeural | Warm, Professional (Recommended) |
| 👩 Female | en-US-AriaNeural | Friendly, Helpful |
| 👩 Female | en-US-SaraNeural | Cheerful, Bright |
| 👩 Female | en-GB-SoniaNeural | British Accent |
| Key | Type | Description |
|---|---|---|
| id | String | Unique slug for the video (used for the folder name). |
| title | String | The main title displayed in the video. |
| orientation | String | landscape (16:9) or portrait (9:16). |
| voice | String | Use one of the Voice IDs from the table above. |
| showText | Boolean | (Optional) Set to false to hide captions. |
| defaultVideo | String | (Optional) Local filename for fallback (in input-assests/). |
| script | String | The content to be spoken, including [Visual: ...] tags. |
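The field reference can be turned into a quick pre-flight check. The sketch below is one possible validator, not something shipped with the generator: `validate_entry` is a hypothetical name, and treating every key not marked "(Optional)" as required is an assumption drawn from the table.

```python
from typing import Any, Dict, List

# Voice IDs copied from the voice table.
VOICE_IDS = {
    "en-US-GuyNeural", "en-US-ChristopherNeural", "en-GB-RyanNeural",
    "en-IN-PrabhatNeural", "en-US-JennyNeural", "en-US-AriaNeural",
    "en-US-SaraNeural", "en-GB-SoniaNeural",
}
ORIENTATIONS = {"landscape", "portrait"}
# Assumption: keys not marked "(Optional)" in the table are required.
REQUIRED_KEYS = {"id": str, "title": str, "orientation": str, "voice": str, "script": str}
OPTIONAL_KEYS = {"showText": bool, "defaultVideo": str}


def validate_entry(entry: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the entry looks valid."""
    problems: List[str] = []
    for key, expected in REQUIRED_KEYS.items():
        if key not in entry:
            problems.append(f"missing required key: {key}")
        elif not isinstance(entry[key], expected):
            problems.append(f"{key} must be {expected.__name__}")
    for key, expected in OPTIONAL_KEYS.items():
        if key in entry and not isinstance(entry[key], expected):
            problems.append(f"{key} must be {expected.__name__}")
    orientation = entry.get("orientation")
    if isinstance(orientation, str) and orientation not in ORIENTATIONS:
        problems.append(f"unknown orientation: {orientation}")
    voice = entry.get("voice")
    if isinstance(voice, str) and voice not in VOICE_IDS:
        problems.append(f"unknown voice: {voice}")
    return problems
```

Returning a list of messages rather than raising on the first error lets you report every problem in an entry at once.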
The script text is exactly what will be spoken. Do NOT include instructions like (Scene 1) in the script text, as the TTS will read them aloud.
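A simple heuristic can catch stray stage directions before a script is submitted. The sketch below flags parentheticals that begin with a few common direction words; the keyword list and the name `find_stage_directions` are assumptions chosen for illustration, so expect both false negatives and the occasional false positive.

```python
import re
from typing import List

# Heuristic only: parentheticals that start with a typical direction keyword.
STAGE_DIRECTION = re.compile(
    r"\((?:scene|shot|cut|pause|music|sfx)\b[^)]*\)", re.IGNORECASE
)


def find_stage_directions(script: str) -> List[str]:
    """Return parenthetical instructions that the TTS would otherwise read aloud."""
    return STAGE_DIRECTION.findall(script)
```

For instance, `find_stage_directions("(Scene 1) The city never sleeps.")` reports `(Scene 1)`, while an ordinary aside such as `(population 8 million)` is left alone.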