Visla AI Video Creation

Creates AI-generated videos from text scripts, URLs, or PPT/PDF documents using Visla. Use when the user asks to generate a video, turn a webpage into a video, or convert a PPT/PDF into a video, or when the user asks to check Visla account credits/balance.

Audits

Pass

Install

openclaw skills install visla

Visla Video Generation

Version: 260501-1423

Create AI-generated videos from text scripts, web URLs, or documents (PPT/PDF) using Visla's OpenAPI.

Before You Start

Credentials (NEVER output API keys/secrets in responses):

IMPORTANT: Never output API keys/secrets in responses.

  1. Check if ~/.config/visla/.credentials exists (do NOT read it yet).
  2. If the file exists, use a choice-based confirmation to ask the user: "Found saved credentials. Allow reading ~/.config/visla/.credentials?" Options: Allow / No
  3. If the user selects Allow: proceed with the command.
  4. If the user selects No, or the file does not exist: Ask the user to provide credentials via one of:
    • Environment variables (VISLA_API_KEY, VISLA_API_SECRET)
    • CLI arguments (--key, --secret)
    • Direct input of API key and secret
  5. If provided credentials fail with VISLA_CLI_ERROR_CODE=missing_credentials or VISLA_CLI_ERROR_CODE=auth_failed, ask the user to re-enter valid credentials.

Only process local files (scripts/docs) explicitly provided by the user, and remind users to avoid uploading sensitive data.

  • Tell the user: this is a one-time setup (once configured, they won't need to do this again)
  • Tell the user: get API Key and Secret from https://www.visla.us/visla-api
  • Do not repeat the secrets back in the response.

Credential validity check (practical):

  • If credentials exist but running account fails with VISLA_CLI_ERROR_CODE=missing_credentials or VISLA_CLI_ERROR_CODE=auth_failed, treat credentials as invalid and ask the user to provide real ones.

File format (bash/zsh):

export VISLA_API_KEY="your_key"
export VISLA_API_SECRET="your_secret"

For PowerShell (temporary session):

$env:VISLA_API_KEY = "your_key"
$env:VISLA_API_SECRET = "your_secret"

Scripts: scripts/visla_cli.py (Python), scripts/visla_cli.sh (Bash)

Platform Execution

Default strategy:

  • Prefer Bash on macOS when dependencies are available (the Bash CLI avoids Python SSL-stack issues on some macOS setups).
  • Prefer Python when you're already using a well-configured Python (or when Bash dependencies are missing).

Bash (recommended on macOS; also works on Linux-like environments):

# With user consent, you may source ~/.config/visla/.credentials
export VISLA_API_KEY="your_key"
export VISLA_API_SECRET="your_secret"
./scripts/visla_cli.sh <command>

Python (cross-platform):

python3 scripts/visla_cli.py --key "your_key" --secret "your_secret" <command>
# Or, credentials are auto-detected from ~/.config/visla/.credentials (with user consent):
python3 scripts/visla_cli.py <command>

Windows native (PowerShell/CMD without Bash; Python):

# PowerShell
$env:VISLA_API_KEY = "your_key"
$env:VISLA_API_SECRET = "your_secret"
python scripts/visla_cli.py <command>

Windows note:

  • The agent should prefer running the Python CLI on Windows unless it has verified a Bash environment (WSL/Git Bash) is available.
  • For simple scripts, pass directly: python scripts/visla_cli.py script "Scene 1: ..."
  • For multi-line or complex scripts, use stdin with - (recommended, no temp files):
    @"
    Scene 1: ...
    Scene 2: ...
    "@ | python scripts/visla_cli.py script -
    
  • If you have Python Launcher installed, py -3 scripts/visla_cli.py <command> may work better than python.
  • Credentials:
    • The Python CLI auto-detects ~/.config/visla/.credentials when present.
    • On Windows the default path is typically: %USERPROFILE%\\.config\\visla\\.credentials.

Note: do not print credentials. Prefer environment variables or auto-detected credentials with explicit user consent.

Commands

CommandDescription
/visla script <script-or-@file>Create video from a script (text or a local file)
/visla url <URL>Create video from web page URL
/visla doc <file>Create video from document (PPT/PDF)
/visla idea <text-or-@file>Create video from an idea
/visla visual <file> [file ...]Create video from visual resources (images/videos), supports multiple files
/visla speech <file> [file ...]Create video from speech (audio/video file), supports multiple files
/visla accountShow account info and credit balance
/visla avatarList available AI avatars
/visla voiceList available AI voices

Important: For avatar and voice commands:

  • Run the full CLI command (./visla_cli.sh avatar or ./visla_cli.sh voice).
  • You may filter the output before presenting to the user:
    • For avatar: remove Thumbnail: lines
    • For voice: remove URL: lines
  • Categorize and format avatar results as follows:
    • Group avatars by gender category (Female, Male, Neutral, Dynamic)
    • List each avatar name with (n) where n = number of looks
    • For each look, show: Look Name (lookUuid)
    • Format: - AvatarName (n): Look1 (uuid), Look2 (uuid), ...
    • Example:
      **Female (16):**
      - Emma (5): Blue Dress (1000145), Patterned Dress (1000146), Black Blazer (1000147), Light Gray Blazer (1000148), Emerald Green Pantsuit (1000149)
      
  • Categorize voice results by language/region (e.g., System, US English, Chinese, Japanese, French, etc.)
  • You must NOT omit any items from the list. The user must see all available avatars/voices, even if the list is long.
  • Agents must use the exact ID from the listing when configuring videos.

Optional Parameters

ParameterDescription
-c, --config <file>Path to JSON config file with video options
--avatar <id>Avatar ID to use for the video (get list from avatar command)
--voice <id>Voice ID to use for the video (get list from voice command)

visual command specific

ParameterDescription
--script, -s <text>Script or description text (or @filename)
--style <style>Video style: montage, storytelling (default), explainer

speech command specific

ParameterDescription
--function <func>Speech to video function: SPEECH_TO_VIDEO_SUMMARY or SPEECH_TO_VIDEO_FULL_LENGTH

All other options (aspect_ratio, pace, burn_subtitles, footage_options, bgm_options, etc.) can be set in the config file.

Cleanup: After video creation completes, delete the config file unless it's intended for reuse.

Config File Format (JSON)

All video options can be stored in a JSON config file (nested structure matches API request body):

{
  "video_title": "My Video",
  "video_description": "Video description",
  "project_function": "SPEECH_TO_VIDEO_SUMMARY",
  "script_text_mode": "ai_rewrite",
  "doc_usage": "page_by_page_walkthrough",
  "speaker_notes_verbatim": false,
  "target_video": {
    "aspect_ratio": "16:9",
    "video_pace": "fast",
    "burn_subtitles": false,
    "video_duration_in_seconds": 60
  },
  "avatar_options": {
    "use_avatar": false,
    "look_id": 12345,
    "avatar_layout": "smart_composition",
    "enable_auto_wallpaper": true,
    "enable_in_preview": true
  },
  "voice_options": {
    "use_voice": false,
    "voice_id": 1
  },
  "footage_options": {
    "enable_footage": true,
    "use_free_stocks": true,
    "use_premium_stocks": true,
    "use_premium_stocks_getty": true,
    "use_private_stocks": true,
    "private_stock_ids": 123456
  },
  "bgm_options": {
    "enable_bgm": true,
    "use_free_stocks": true,
    "use_premium_stocks": true
  }
}

Note: avatar_options.avatar_layout accepts only: host_only, host_pip, smart_composition.

CLI arguments (avatar, voice) override config file values.

Source of truth for the exact CLI surface: run scripts/visla_cli.sh --help or python3 scripts/visla_cli.py --help.

Script Format

**Scene 1** (0-10 sec):
**Visual:** A futuristic calendar flipping to 2025 with digital patterns.
**Narrator:** "AI is evolving rapidly! Here are 3 game-changing AI trends."

**Scene 2** (10-25 sec):
**Visual:** Text: "Trend #1: Generative AI Everywhere." Show tools like ChatGPT.
**Narrator:** "Generative AI is dominating industries—creating content and images."

Workflow

The script, url, doc, idea, visual, and speech commands execute the complete flow automatically:

  1. Create project
  2. Poll until generation completes (may take a few minutes)
  3. Auto-export and return download link

Execution Instructions:

  • Inform user that video generation takes some time
  • Report progress status periodically during polling

Timeout Guidance

  • This workflow typically takes 3-10 minutes, but can take up to ~30 minutes in the worst case. Set the task/command timeout to >= 30 minutes (Windows defaults are often ~10 minutes and need to be increased). If you cannot change the timeout, warn the user up front and, on timeout, ask whether to continue or switch to a step-by-step run.
  • If timeout occurs, the CLI returns project_uuid in the output. Inform the user they can manually check project status and continue later using the Visla web interface or API.

Examples

/visla script @myscript.txt
/visla script "Scene 1: ..."
/visla url https://blog.example.com/article
/visla doc presentation.pptx
/visla idea "Create a video about machine learning"
/visla idea @my_idea.txt
/visla visual image.jpg
/visla visual photo1.jpg photo2.jpg photo3.jpg
/visla visual image.jpg --script "Description of the images..."
/visla visual image.jpg --style montage
/visla speech interview.m4a
/visla speech podcast.mp3 audio1.mp3 audio2.mp3
/visla speech podcast.mp3 --function SPEECH_TO_VIDEO_SUMMARY
/visla account
/visla avatar
/visla voice

# With config file
/visla script "Scene 1: Hello" -c config.json

# With avatar/voice (CLI overrides config)
/visla script "Scene 1: Hello" --avatar avatar_123 --voice voice_456

Supported Document Formats

  • PowerPoint: .pptx, .ppt
  • PDF: .pdf

Supported Media Formats

Visual Resources (visual command)

  • Images: .jpg, .jpeg, .png, .gif, .webp
  • Videos: .mp4, .mov, .avi, .mkv

Audio/Speech (speech command)

  • Audio: .mp3, .wav, .m4a, .aac, .flac
  • Videos: .mp4, .mov, .avi, .mkv

Output Format

  • Start: Display "Visla Skill v260501-1423" when skill begins
  • End: Display "Visla Skill v260501-1423 completed" when skill finishes

Security

The CLI scripts enforce the following safety measures to prevent unauthorized file access:

  • Path traversal: Paths containing .. are rejected.
  • System directories: Access to /etc/, /proc/, /sys/, /dev/, /run/, /var/log/ (and Windows equivalents) is denied.
  • Text file extension restriction: The @file syntax in script, idea, and visual --script commands only accepts .txt, .md, .srt, .vtt, .csv files.
  • Document/media file validation: The doc, visual, and speech commands validate file extensions against supported formats before upload.
  • Credentials: The Python CLI auto-detects ~/.config/visla/.credentials only. No arbitrary credential file paths are accepted.
  • User consent: The agent must ask for user consent before accessing local files, as specified in the "Before You Start" section.