Visla AI Video Creation

Creates AI-generated videos from text scripts, URLs, or PPT/PDF documents using Visla. Use when the user asks to generate a video, turn a webpage into a video, or convert a PPT/PDF into a video, or when the user asks to check Visla account credits/balance.

Audits: Pass

ClawScan: Pass. Agentic behavior and permission review.
Static analysis: Pass. Pattern checks against bundled files.
VirusTotal: Pass. Multi-engine malware detections and file reputation.

Install

openclaw skills install visla

Visla Video Generation

Version: 260501-1423

Create AI-generated videos from text scripts, web URLs, or documents (PPT/PDF) using Visla's OpenAPI.

Before You Start

Credentials (NEVER output API keys/secrets in responses):

Check if ~/.config/visla/.credentials exists (do NOT read it yet).
If the file exists, use a choice-based confirmation to ask the user: "Found saved credentials. Allow reading ~/.config/visla/.credentials?" Options: Allow / No
If the user selects Allow: proceed with the command.
If the user selects No, or the file does not exist: Ask the user to provide credentials via one of:
- Environment variables (VISLA_API_KEY, VISLA_API_SECRET)
- CLI arguments (--key, --secret)
- Direct input of API key and secret
If provided credentials fail with VISLA_CLI_ERROR_CODE=missing_credentials or VISLA_CLI_ERROR_CODE=auth_failed, ask the user to re-enter valid credentials.

Only process local files (scripts/docs) explicitly provided by the user, and remind users to avoid uploading sensitive data.

Tell the user that this is a one-time setup: once credentials are configured, they won't need to do this again.
Tell the user to get the API Key and Secret from https://www.visla.us/visla-api
Do not repeat the secrets back in the response.

Credential validity check (practical):

If credentials exist but running the account command fails with VISLA_CLI_ERROR_CODE=missing_credentials or VISLA_CLI_ERROR_CODE=auth_failed, treat the credentials as invalid and ask the user to provide valid ones.
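
The validity check above can be sketched in Python. This is a minimal illustration, not the CLI's own logic: it assumes the error code appears in the CLI output as the literal token VISLA_CLI_ERROR_CODE=<code>, and the helper names are hypothetical.

```python
import re
from typing import Optional

# Error codes named in this document that indicate unusable credentials.
CREDENTIAL_ERROR_CODES = {"missing_credentials", "auth_failed"}

# Assumption: the code is printed as VISLA_CLI_ERROR_CODE=<code> somewhere in the output.
_ERROR_CODE_RE = re.compile(r"VISLA_CLI_ERROR_CODE=([A-Za-z0-9_]+)")


def extract_error_code(cli_output: str) -> Optional[str]:
    """Return the first VISLA_CLI_ERROR_CODE value found in the output, if any."""
    match = _ERROR_CODE_RE.search(cli_output)
    return match.group(1) if match else None


def needs_new_credentials(cli_output: str) -> bool:
    """True when the output reports a credential-related error code."""
    return extract_error_code(cli_output) in CREDENTIAL_ERROR_CODES
```

When needs_new_credentials returns True for the output of the account command, the agent would ask the user for new credentials instead of retrying.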

Credentials file format for ~/.config/visla/.credentials (bash/zsh syntax):

export VISLA_API_KEY="your_key"
export VISLA_API_SECRET="your_secret"

For PowerShell (temporary session):

$env:VISLA_API_KEY = "your_key"
$env:VISLA_API_SECRET = "your_secret"
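
For illustration, the export-style lines shown above could be read without sourcing the file in a shell. The sketch below is not the CLI's actual parser. It assumes one export statement per line, and the function names are hypothetical. The mask helper reflects the rule that secrets are never echoed back.

```python
import re

# Matches lines such as: export VISLA_API_KEY="your_key"
_EXPORT_RE = re.compile(r"^\s*export\s+(VISLA_API_KEY|VISLA_API_SECRET)=(.*)$")


def parse_credentials(text: str) -> dict:
    """Extract VISLA_API_KEY / VISLA_API_SECRET from export-style lines."""
    found = {}
    for line in text.splitlines():
        match = _EXPORT_RE.match(line)
        if match:
            name, raw = match.groups()
            # Drop surrounding whitespace and quotes around the value.
            found[name] = raw.strip().strip("\"'")
    return found


def mask(value: str) -> str:
    """Return a display-safe placeholder so the real value is never printed."""
    return "*" * 8 if value else "(empty)"
```

Only call parse_credentials on the file contents after the user has selected Allow.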

Scripts: scripts/visla_cli.py (Python), scripts/visla_cli.sh (Bash)

Platform Execution

Default strategy:

Prefer Bash on macOS when its dependencies are available (the Bash CLI avoids Python SSL-stack issues on some macOS setups).
Prefer Python when a well-configured Python environment is already in use, or when the Bash dependencies are missing.
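
One way to express this strategy in code is sketched below. It is an illustration under assumptions: the caller decides whether the Bash dependencies are available (this document does not list them), the function name is hypothetical, and Windows always falls back to Python here, which is the conservative reading of the Windows note in this section.

```python
from typing import List


def choose_cli(system: str, bash_ready: bool) -> List[str]:
    """Pick the CLI entry point for a platform.

    system is a value such as platform.system() returns ("Darwin", "Windows", "Linux").
    bash_ready is True only when the caller has verified the Bash dependencies.
    """
    if system == "Darwin" and bash_ready:
        # macOS: the Bash CLI avoids Python SSL-stack issues on some setups.
        return ["./scripts/visla_cli.sh"]
    if system == "Windows":
        # Windows: this sketch always uses the Python CLI.
        return ["python", "scripts/visla_cli.py"]
    # Everything else, or macOS without Bash dependencies: Python CLI.
    return ["python3", "scripts/visla_cli.py"]
```

A caller would typically pass platform.system() as the first argument and append the command name (for example account) to the returned list.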

Bash (recommended on macOS; also works on Linux-like environments):

# With user consent, you may source ~/.config/visla/.credentials
export VISLA_API_KEY="your_key"
export VISLA_API_SECRET="your_secret"
./scripts/visla_cli.sh <command>

Python (cross-platform):

python3 scripts/visla_cli.py --key "your_key" --secret "your_secret" <command>
# Or, credentials are auto-detected from ~/.config/visla/.credentials (with user consent):
python3 scripts/visla_cli.py <command>

Windows native (PowerShell/CMD without Bash; Python):

# PowerShell
$env:VISLA_API_KEY = "your_key"
$env:VISLA_API_SECRET = "your_secret"
python scripts/visla_cli.py <command>

Windows note:

The agent should prefer running the Python CLI on Windows unless it has verified a Bash environment (WSL/Git Bash) is available.
For simple scripts, pass directly: python scripts/visla_cli.py script "Scene 1: ..."
For multi-line or complex scripts, use stdin with - (recommended, no temp files):
```
@"
Scene 1: ...
Scene 2: ...
"@ | python scripts/visla_cli.py script -
```
If the Python Launcher is installed, py -3 scripts/visla_cli.py <command> may work better than python.
Credentials:
- The Python CLI auto-detects ~/.config/visla/.credentials when present.
- On Windows the default path is typically: %USERPROFILE%\.config\visla\.credentials.
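
The choice between passing a script directly and piping it through stdin can also be made from Python. The sketch below is a hedged illustration: the helper names are hypothetical, and it treats only multi-line text as "complex", which is narrower than the guidance above.

```python
import subprocess
from typing import List, Optional, Tuple


def build_script_invocation(cli: List[str], script_text: str) -> Tuple[List[str], Optional[str]]:
    """Return (argv, stdin_text) for the script command."""
    if "\n" in script_text.strip():
        # Multi-line script: pass "-" and feed the text on stdin (no temp file).
        return cli + ["script", "-"], script_text
    # Single-line script: pass it directly as an argument.
    return cli + ["script", script_text], None


def run_script(cli: List[str], script_text: str) -> subprocess.CompletedProcess:
    """Run the script command, using stdin only when the text is multi-line."""
    argv, stdin_text = build_script_invocation(cli, script_text)
    return subprocess.run(argv, input=stdin_text, text=True, capture_output=True)
```

Because the arguments are passed as a list, no shell quoting is involved, which sidesteps the PowerShell here-string syntax entirely.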

Note: do not print credentials. Prefer environment variables or auto-detected credentials with explicit user consent.

Commands

| Command | Description |
| --- | --- |
| `/visla script <script-or-@file>` | Create video from a script (text or a local file) |
| `/visla url <URL>` | Create video from web page URL |
| `/visla doc <file>` | Create video from document (PPT/PDF) |
| `/visla idea <text-or-@file>` | Create video from an idea |
| `/visla visual <file> [file ...]` | Create video from visual resources (images/videos), supports multiple files |
| `/visla speech <file> [file ...]` | Create video from speech (audio/video file), supports multiple files |
| `/visla account` | Show account info and credit balance |
| `/visla avatar` | List available AI avatars |
| `/visla voice` | List available AI voices |

Important: For avatar and voice commands:

Run the full CLI command (./scripts/visla_cli.sh avatar or ./scripts/visla_cli.sh voice).
You may filter the output before presenting to the user:
- For avatar: remove Thumbnail: lines
- For voice: remove URL: lines
Categorize and format avatar results as follows:
- Group avatars by gender category (Female, Male, Neutral, Dynamic)
- List each avatar name with (n) where n = number of looks
- For each look, show: Look Name (lookUuid)
- Format: - AvatarName (n): Look1 (uuid), Look2 (uuid), ...
- Example:
```
**Female (16):**
- Emma (5): Blue Dress (1000145), Patterned Dress (1000146), Black Blazer (1000147), Light Gray Blazer (1000148), Emerald Green Pantsuit (1000149)
```
Categorize voice results by language/region (e.g., System, US English, Chinese, Japanese, French, etc.)
You must NOT omit any items from the list. The user must see all available avatars/voices, even if the list is long.
Agents must use the exact ID from the listing when configuring videos.
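
The avatar formatting rules above can be sketched as a small Python function. The input structure (a list of dicts with name, gender, and looks) is hypothetical, since the raw CLI output format is not shown here. The number after each category name is assumed to be the count of avatars in that category.

```python
from collections import defaultdict

# Category order taken from the grouping rule above.
CATEGORY_ORDER = ["Female", "Male", "Neutral", "Dynamic"]


def format_avatars(avatars: list) -> str:
    """Group avatars by gender category and list every look with its ID."""
    groups = defaultdict(list)
    for avatar in avatars:
        groups[avatar["gender"]].append(avatar)
    # Known categories first, then any unexpected ones so nothing is omitted.
    ordered = CATEGORY_ORDER + sorted(set(groups) - set(CATEGORY_ORDER))
    lines = []
    for category in ordered:
        members = groups.get(category, [])
        if not members:
            continue
        lines.append(f"**{category} ({len(members)}):**")
        for avatar in members:
            looks = ", ".join(f"{look['name']} ({look['id']})" for look in avatar["looks"])
            lines.append(f"- {avatar['name']} ({len(avatar['looks'])}): {looks}")
    return "\n".join(lines)
```

The function never truncates its input, in line with the rule that every avatar must be shown.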

Optional Parameters

| Parameter | Description |
| --- | --- |
| `-c, --config <file>` | Path to JSON config file with video options |
| `--avatar <id>` | Avatar ID to use for the video (get list from `avatar` command) |
| `--voice <id>` | Voice ID to use for the video (get list from `voice` command) |

visual command specific

| Parameter | Description |
| --- | --- |
| `--script, -s <text>` | Script or description text (or @filename) |
| `--style <style>` | Video style: `montage`, `storytelling` (default), `explainer` |

speech command specific

| Parameter | Description |
| --- | --- |
| `--function <func>` | Speech to video function: `SPEECH_TO_VIDEO_SUMMARY` or `SPEECH_TO_VIDEO_FULL_LENGTH` |

All other options (aspect_ratio, video_pace, burn_subtitles, footage_options, bgm_options, etc.) can be set in the config file.

Cleanup: After video creation completes, delete the config file unless it's intended for reuse.
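
A context manager is one way to make the cleanup step hard to forget. The sketch below writes the options to a temporary JSON file and deletes it when the block exits. The function name and file prefix are hypothetical.

```python
import json
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_config(options: dict):
    """Write options to a temporary JSON file and delete it afterwards."""
    fd, path = tempfile.mkstemp(suffix=".json", prefix="visla_config_")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(options, handle, indent=2)
        yield path  # pass this path to -c / --config
    finally:
        # Runs even if video creation fails, so no config file is left behind.
        if os.path.exists(path):
            os.remove(path)
```

For a config that is meant to be reused, write it to a normal file instead and skip this helper.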

Config File Format (JSON)

All video options can be stored in a JSON config file (nested structure matches API request body):

{
  "video_title": "My Video",
  "video_description": "Video description",
  "project_function": "SPEECH_TO_VIDEO_SUMMARY",
  "script_text_mode": "ai_rewrite",
  "doc_usage": "page_by_page_walkthrough",
  "speaker_notes_verbatim": false,
  "target_video": {
    "aspect_ratio": "16:9",
    "video_pace": "fast",
    "burn_subtitles": false,
    "video_duration_in_seconds": 60
  },
  "avatar_options": {
    "use_avatar": false,
    "look_id": 12345,
    "avatar_layout": "smart_composition",
    "enable_auto_wallpaper": true,
    "enable_in_preview": true
  },
  "voice_options": {
    "use_voice": false,
    "voice_id": 1
  },
  "footage_options": {
    "enable_footage": true,
    "use_free_stocks": true,
    "use_premium_stocks": true,
    "use_premium_stocks_getty": true,
    "use_private_stocks": true,
    "private_stock_ids": 123456
  },
  "bgm_options": {
    "enable_bgm": true,
    "use_free_stocks": true,
    "use_premium_stocks": true
  }
}

Note: avatar_options.avatar_layout accepts only: host_only, host_pip, smart_composition.
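
A small pre-flight check can catch an invalid layout before the config is sent. This is a sketch with a hypothetical function name. It validates only the avatar_layout constraint stated in the note above and nothing else in the config.

```python
# Allowed values as listed in the note above.
ALLOWED_AVATAR_LAYOUTS = {"host_only", "host_pip", "smart_composition"}


def check_avatar_layout(config: dict) -> None:
    """Raise ValueError if avatar_options.avatar_layout is set to an unknown value."""
    layout = config.get("avatar_options", {}).get("avatar_layout")
    if layout is not None and layout not in ALLOWED_AVATAR_LAYOUTS:
        allowed = ", ".join(sorted(ALLOWED_AVATAR_LAYOUTS))
        raise ValueError(f"avatar_layout must be one of: {allowed} (got {layout!r})")
```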

CLI arguments (avatar, voice) override config file values.

Source of truth for the exact CLI surface: run scripts/visla_cli.sh --help or python3 scripts/visla_cli.py --help.

Script Format

**Scene 1** (0-10 sec):
**Visual:** A futuristic calendar flipping to 2025 with digital patterns.
**Narrator:** "AI is evolving rapidly! Here are 3 game-changing AI trends."

**Scene 2** (10-25 sec):
**Visual:** Text: "Trend #1: Generative AI Everywhere." Show tools like ChatGPT.
**Narrator:** "Generative AI is dominating industries—creating content and images."
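
When a script is assembled programmatically, the scene layout above can be generated from structured data. The sketch below is illustrative. The function names and the (start, end, visual, narration) tuple layout are assumptions, and it only mirrors the example format shown here.

```python
def format_scene(number: int, start: int, end: int, visual: str, narration: str) -> str:
    """Render one scene in the layout used by the example above."""
    return (
        f"**Scene {number}** ({start}-{end} sec):\n"
        f"**Visual:** {visual}\n"
        f'**Narrator:** "{narration}"'
    )


def build_script(scenes: list) -> str:
    """Join scenes given as (start, end, visual, narration), separated by blank lines."""
    return "\n\n".join(
        format_scene(number, *scene) for number, scene in enumerate(scenes, start=1)
    )
```

The result is multi-line text, so it is best passed to the script command via stdin or an @file reference.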

Workflow

The script, url, doc, idea, visual, and speech commands execute the complete flow automatically:

Create project
Poll until generation completes (may take a few minutes)
Auto-export and return download link

Execution Instructions:

Inform user that video generation takes some time
Report progress status periodically during polling

Timeout Guidance

This workflow typically takes 3-10 minutes but can take up to ~30 minutes in the worst case. Set the task/command timeout to at least 30 minutes (Windows defaults are often ~10 minutes and need to be increased). If you cannot change the timeout, warn the user up front and, on timeout, ask whether to continue or switch to a step-by-step run.
If a timeout occurs, the CLI returns the project_uuid in its output. Inform the user that they can check the project status manually and continue later using the Visla web interface or API.
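
When the CLI is launched from Python, the 30-minute budget can be set explicitly. The sketch below is a minimal illustration with hypothetical names. It keeps whatever output was captured before the timeout so the agent can look for a project_uuid in it. Whether the ID has been printed by that point is not guaranteed here.

```python
import subprocess
import sys
from typing import List, Tuple

DEFAULT_TIMEOUT_S = 30 * 60  # the worst-case duration mentioned above


def _as_text(data) -> str:
    """Partial output on timeout may be bytes or None; normalize it to str."""
    if data is None:
        return ""
    return data.decode("utf-8", errors="replace") if isinstance(data, bytes) else data


def run_with_timeout(argv: List[str], timeout_s: float = DEFAULT_TIMEOUT_S) -> Tuple[bool, str]:
    """Run argv and return (timed_out, captured stdout)."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
        return False, done.stdout
    except subprocess.TimeoutExpired as exc:
        # The child process is stopped; keep any output captured so far.
        return True, _as_text(exc.stdout)
```

A call such as run_with_timeout(["python3", "scripts/visla_cli.py", "url", "https://blog.example.com/article"]) would then either return the full output or signal that the user should be asked how to proceed.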

Examples

/visla script @myscript.txt
/visla script "Scene 1: ..."
/visla url https://blog.example.com/article
/visla doc presentation.pptx
/visla idea "Create a video about machine learning"
/visla idea @my_idea.txt
/visla visual image.jpg
/visla visual photo1.jpg photo2.jpg photo3.jpg
/visla visual image.jpg --script "Description of the images..."
/visla visual image.jpg --style montage
/visla speech interview.m4a
/visla speech podcast.mp3 audio1.mp3 audio2.mp3
/visla speech podcast.mp3 --function SPEECH_TO_VIDEO_SUMMARY
/visla account
/visla avatar
/visla voice

# With config file
/visla script "Scene 1: Hello" -c config.json

# With avatar/voice (CLI overrides config)
/visla script "Scene 1: Hello" --avatar avatar_123 --voice voice_456
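
The slash commands above can be translated into a CLI invocation by splitting them the way a shell would. This sketch assumes that everything after /visla maps one-to-one onto CLI arguments, which matches the examples in this document but is not confirmed for every option. The function name is hypothetical.

```python
import shlex
from typing import List, Sequence


def to_cli_argv(
    slash_command: str,
    cli: Sequence[str] = ("python3", "scripts/visla_cli.py"),
) -> List[str]:
    """Turn '/visla <command> ...' into an argv list for the CLI."""
    parts = shlex.split(slash_command)  # honours quotes such as "Scene 1: Hello"
    if not parts or parts[0] != "/visla":
        raise ValueError("expected a command starting with /visla")
    return list(cli) + parts[1:]
```

Building an argv list rather than a command string keeps quoted script text intact as a single argument.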

Supported Document Formats

PowerPoint: .pptx, .ppt
PDF: .pdf

Supported Media Formats

Visual Resources (visual command)

Images: .jpg, .jpeg, .png, .gif, .webp
Videos: .mp4, .mov, .avi, .mkv

Audio/Speech (speech command)

Audio: .mp3, .wav, .m4a, .aac, .flac
Videos: .mp4, .mov, .avi, .mkv
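
The format lists above imply that some files fit more than one command: video files are accepted by both visual and speech. The lookup below is a sketch with hypothetical names that returns every command whose supported extensions include a given file's extension.

```python
import os
from typing import List

# Extensions copied from the supported-format lists in this document.
SUPPORTED = {
    "doc": {".pptx", ".ppt", ".pdf"},
    "visual": {".jpg", ".jpeg", ".png", ".gif", ".webp", ".mp4", ".mov", ".avi", ".mkv"},
    "speech": {".mp3", ".wav", ".m4a", ".aac", ".flac", ".mp4", ".mov", ".avi", ".mkv"},
}


def commands_for(path: str) -> List[str]:
    """Return the commands that accept this file, based on its extension only."""
    ext = os.path.splitext(path)[1].lower()
    return [command for command, extensions in SUPPORTED.items() if ext in extensions]
```

When more than one command is returned, the agent should ask the user which one they intend. An empty list means the file type is not supported.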

Output Format

Start: Display "Visla Skill v260501-1423" when skill begins
End: Display "Visla Skill v260501-1423 completed" when skill finishes

Security

The CLI scripts enforce the following safety measures to prevent unauthorized file access:

Path traversal: Paths containing .. are rejected.
System directories: Access to /etc/, /proc/, /sys/, /dev/, /run/, /var/log/ (and Windows equivalents) is denied.
Text file extension restriction: The @file syntax in script, idea, and visual --script commands only accepts .txt, .md, .srt, .vtt, .csv files.
Document/media file validation: The doc, visual, and speech commands validate file extensions against supported formats before upload.
Credentials: The Python CLI auto-detects ~/.config/visla/.credentials only. No arbitrary credential file paths are accepted.
User consent: The agent must ask for user consent before accessing local files, as specified in the "Before You Start" section.
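
The path and extension rules above can be mirrored on the agent side before a file is handed to the CLI. The sketch below is an approximation, not the CLI's actual implementation. It uses hypothetical function names, applies the ".." rule as a plain substring check, and covers only the POSIX directories listed above (Windows equivalents are omitted).

```python
import os

DENIED_PREFIXES = ("/etc/", "/proc/", "/sys/", "/dev/", "/run/", "/var/log/")
TEXT_EXTENSIONS = {".txt", ".md", ".srt", ".vtt", ".csv"}


def is_path_allowed(path: str) -> bool:
    """Reject paths containing '..' or pointing into a denied system directory."""
    normalized = path.replace("\\", "/")
    if ".." in normalized:
        return False
    return not normalized.startswith(DENIED_PREFIXES)


def is_text_reference_allowed(argument: str) -> bool:
    """Check an argument that may use the @file syntax; literal text always passes."""
    if not argument.startswith("@"):
        return True
    path = argument[1:]
    extension = os.path.splitext(path)[1].lower()
    return is_path_allowed(path) and extension in TEXT_EXTENSIONS
```

These checks complement the CLI's own enforcement and do not replace the user-consent step.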