Install
openclaw skills install ponyflash
Generate images, videos, speech audio, and music using the PonyFlash Python SDK. Also handle local media editing with FFmpeg, including clip, concat, transcode, extract audio, frame capture, subtitle capability checks, and ASS subtitle prep. Use when the user asks to create, generate, produce, edit, trim, merge, concatenate, transcode, subtitle, or render AI-generated media content.
This skill now contains two capability families:
Cloud generation via PonyFlash Python SDK
Local media editing via FFmpeg toolchain
ffmpeg / ffprobe support.
Before doing anything, classify the request:
Only do this section when the request needs PonyFlash cloud capabilities.
The FIRST time this skill is activated for a cloud generation task, tell the user the following in your own words:
playbooks/ directory.
rk_)
On subsequent SDK activations, check whether PONYFLASH_API_KEY is set in the environment. If not, ask the user for the key again.
Once received, set it up:
export PONYFLASH_API_KEY="rk_xxx"
Then install the SDK:
pip install ponyflash
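The environment check described above can be sketched in Python. The helper name `api_key_status` is hypothetical, and the `rk_` prefix test is an assumption drawn from the `rk_xxx` example, not a documented rule.

```python
import os


def api_key_status(env=None):
    """Return 'missing', 'unexpected_format', or 'ok' for PONYFLASH_API_KEY."""
    env = os.environ if env is None else env
    key = (env.get("PONYFLASH_API_KEY") or "").strip()
    if not key:
        return "missing"  # ask the user for the key again
    if not key.startswith("rk_"):  # assumption: keys look like "rk_xxx"
        return "unexpected_format"
    return "ok"


print(api_key_status())
```

A result of "ok" only means the variable is present and looks plausible. The credits call is still the real verification.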
Always verify the key works before any generation task:
from ponyflash import PonyFlash
pony_flash = PonyFlash(api_key="<key from user>")
balance = pony_flash.account.credits()
print(f"Balance: {balance.balance} {balance.currency}")
If verification fails:
Only do this section when the request needs local editing, subtitle, or export work.
bash "{baseDir}/scripts/check_ffmpeg.sh"
bash "{baseDir}/scripts/check_ffmpeg.sh" --require-subtitles-filter
If ffmpeg / ffprobe or required filters are missing:
| Capability | Resource | Description |
|---|---|---|
| Image generation | pony_flash.images | Text-to-image, image editing with mask/reference images |
| Video generation | pony_flash.video | Text-to-video, first-frame-to-video, OmniHuman, Motion Transfer |
| Speech synthesis | pony_flash.speech | Text-to-speech with voice cloning, emotion control, speed, pitch |
| Music generation | pony_flash.music | Text-to-music with lyrics, style, instrumental mode, continuation |
| Model listing | pony_flash.models | List available models, get model details and supported modes |
| File management | pony_flash.files | Upload, list, get, delete files |
| Account | pony_flash.account | Check credit balance, get recharge link |
| Local media editing | scripts/media_ops.sh | Clip, concat, transcode, extract audio, frame capture |
| FFmpeg environment checks | scripts/check_ffmpeg.sh | Detect ffmpeg / ffprobe and subtitle capabilities |
| Subtitle font prep | scripts/ensure_subtitle_fonts.sh | Keep a reusable local copy of the default subtitle font when explicitly requested |
| ASS subtitle prep | scripts/build_ass_subtitles.py | Adaptive ASS subtitle generation with pre-wrapping |
The playbooks/ directory contains Creative Playbooks — step-by-step production workflow guides for specific content types. Playbooks act as a director layer: they tell you what to create and in what order, while this SKILL.md tells you how to execute generation and editing.
playbooks/ and follow its workflow.
Once a playbook is loaded:
When the user asks to create a new playbook, generate a markdown file in playbooks/ following this template:
---
name: Playbook Name
description: One-line summary of what this playbook produces
tags: [keyword1, keyword2, keyword3]
difficulty: beginner | intermediate | advanced
estimated_credits: credit range estimate
output_format: format description (e.g., "vertical 9:16 MP4")
---
# Playbook Name
## Use Cases
When to use this playbook.
## Workflow
### Step 1: Asset Preparation
What the user needs to provide; how to generate missing assets.
### Step 2: Visual Content Generation
Which models to use, recommended parameters, prompt guidance.
### Step 3: Voice / Music
Speech synthesis + background music guidance.
### Step 4: Editing / Assembly
How to assemble, trim, subtitle, transcode, and export with the local FFmpeg workflow.
### Step 5: Output / Optimization
Render settings, format recommendations.
## Prompt Templates
Reusable prompt examples for this content type.
## Notes
Best practices, common pitfalls.
After creating the file, update playbooks/INDEX.md to include the new playbook.
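As a rough sketch, the front matter of the template can be produced by a small helper. The function name `render_front_matter` is hypothetical. Restricting `difficulty` to the three values in the template is an assumption about how strictly the field should be validated.

```python
ALLOWED_DIFFICULTY = {"beginner", "intermediate", "advanced"}


def render_front_matter(name, description, tags, difficulty,
                        estimated_credits, output_format):
    """Render the playbook front matter block as a string."""
    if difficulty not in ALLOWED_DIFFICULTY:
        raise ValueError(
            f"difficulty must be one of {sorted(ALLOWED_DIFFICULTY)}"
        )
    lines = [
        "---",
        f"name: {name}",
        f"description: {description}",
        f"tags: [{', '.join(tags)}]",
        f"difficulty: {difficulty}",
        f"estimated_credits: {estimated_credits}",
        f"output_format: {output_format}",
        "---",
    ]
    return "\n".join(lines) + "\n"
```

Values are written verbatim, so a description containing a colon or quotes would need manual quoting before the file is saved.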
from ponyflash import PonyFlash
pony_flash = PonyFlash(api_key="rk_xxx")
Reads PONYFLASH_API_KEY from environment if api_key is omitted.
All file parameters accept any of these types:
| Input type | Example | Behavior |
|---|---|---|
| URL string | "https://example.com/photo.jpg" | Passed directly to API |
| file_id string | "file_abc123" | Passed directly to API |
| Path object | Path("photo.jpg") | Auto-uploaded via presigned URL |
| open() file | open("photo.jpg", "rb") | Auto-uploaded via presigned URL |
| bytes | image_bytes | Auto-uploaded via presigned URL |
| (filename, bytes) tuple | ("photo.jpg", data) | Auto-uploaded with filename |
Temp uploads are cleaned up automatically after generate() completes.
Plain local string paths such as "./photo.jpg" are not supported. For local files, always use Path(...) or open(..., "rb").
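Because plain string paths are rejected, it can help to normalize user-supplied values before passing them to the SDK. The sketch below is illustrative only. `normalize_file_input` is a hypothetical helper, and treating every string that starts with `file_` as a file ID is an assumption based on the `file_abc123` example.

```python
from pathlib import Path


def normalize_file_input(value):
    """Wrap plain local string paths in Path; pass everything else through."""
    if not isinstance(value, str):
        # Path, open() file, bytes, or (filename, bytes) tuple: leave as-is.
        return value
    if value.startswith(("http://", "https://")):
        return value  # URL string, passed directly to the API
    if value.startswith("file_"):
        return value  # assumption: looks like a file_id such as "file_abc123"
    return Path(value)  # local path given as a plain string


print(normalize_file_input("./photo.jpg"))
```

A local file whose name itself begins with `file_` would be misread as a file ID under this heuristic, so pass such files as `Path(...)` explicitly.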
Generation object fields: request_id, status, outputs, usage, error.
Convenience properties:
gen.url: first output URL (or None)
gen.urls: list of all output URLs
gen.credits: credits consumed
gen = pony_flash.images.generate(
model="nano-banana-pro",
prompt="A sunset over mountains",
resolution="2K",
aspect_ratio="16:9",
)
print(gen.url)
gen = pony_flash.video.generate(
model="veo-3.1-fast",
prompt="A timelapse of a city at night",
duration=4,
resolution="720p",
aspect_ratio="16:9",
generate_audio=False,
)
print(gen.url)
gen = pony_flash.speech.generate(
model="speech-2.8-hd",
input="Hello, welcome to PonyFlash!",
voice="English_Graceful_Lady",
)
print(gen.url)
gen = pony_flash.music.generate(
model="music-2.5",
prompt="An upbeat electronic dance track",
duration=30,
)
print(gen.url)
page = pony_flash.models.list()
for model in page.items:
    print(f"{model.id} ({model.type})")
balance = pony_flash.account.credits()
print(f"Balance: {balance.balance} {balance.currency}")
No PonyFlash API key is needed for local editing, but local FFmpeg capability checks are mandatory.
Use the local FFmpeg workflow when the user asks to:
bash "{baseDir}/scripts/check_ffmpeg.sh"
bash "{baseDir}/scripts/check_ffmpeg.sh" --require-subtitles-filter
bash "{baseDir}/scripts/media_ops.sh" help
taskDir="$(mktemp -d "${TMPDIR:-/tmp}/ponyflash-task.XXXXXX")"
Use this directory for:
.srt / .ass files;
Validate outputs after execution.
After the task finishes, delete the temporary task workspace unless the user explicitly asked to keep intermediate artifacts.
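When the surrounding automation is written in Python, the same create-then-clean-up lifecycle can be sketched with a context manager. `task_workspace` and its `keep` flag are hypothetical names. The directory prefix mirrors the `mktemp` template above.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def task_workspace(keep=False):
    """Create a temporary task directory and remove it on exit unless keep=True."""
    # mkdtemp places the directory under the system temp dir (honors TMPDIR).
    task_dir = Path(tempfile.mkdtemp(prefix="ponyflash-task."))
    try:
        yield task_dir
    finally:
        if not keep:
            shutil.rmtree(task_dir, ignore_errors=True)


with task_workspace() as task_dir:
    (task_dir / "subtitles.srt").write_text("placeholder", encoding="utf-8")
```

Passing `keep=True` corresponds to the case where the user explicitly asked to keep intermediate artifacts.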
basic: requires ffmpeg + ffprobe + libx264 + aac
full: basic plus subtitles filter support
bash "{baseDir}/scripts/media_ops.sh" probe --input "input.mp4"
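The basic and full capability levels can be expressed as a small pure function. This is only a sketch of the rule, not a replacement for check_ffmpeg.sh. The function name and the "none" label are hypothetical. Collecting the tool, encoder, and filter names (for example from `ffmpeg -encoders` and `ffmpeg -filters`) is left to the caller.

```python
def capability_level(tools, encoders, filters):
    """Classify a machine as 'none', 'basic', or 'full'.

    tools, encoders, and filters are iterables of names found on the machine.
    """
    has_basic = (
        {"ffmpeg", "ffprobe"} <= set(tools)
        and {"libx264", "aac"} <= set(encoders)
    )
    if not has_basic:
        return "none"  # hypothetical label for "basic requirements not met"
    return "full" if "subtitles" in set(filters) else "basic"


print(capability_level({"ffmpeg", "ffprobe"}, {"libx264", "aac"}, {"drawtext"}))
```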
bash "{baseDir}/scripts/media_ops.sh" clip --input "$taskDir/input.mp4" --output "$taskDir/clip.mp4" --start "00:00:05" --duration "8"
Fast copy mode only when the user explicitly wants speed / near-lossless slicing:
bash "{baseDir}/scripts/media_ops.sh" clip --mode copy --input "$taskDir/input.mp4" --output "$taskDir/clip.mp4" --start "00:00:05" --duration "8"
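If the start offset arrives as a number of seconds, it can be converted to the HH:MM:SS form used for `--start` in the commands above. `to_timestamp` is a hypothetical helper. It truncates fractional seconds, which is an assumption about the precision needed.

```python
def to_timestamp(seconds):
    """Format a non-negative number of seconds as HH:MM:SS (fractions dropped)."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


print(to_timestamp(5))
```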
bash "{baseDir}/scripts/media_ops.sh" concat --input "$taskDir/part1.mp4" --input "$taskDir/part2.mp4" --output "$taskDir/merged.mp4"
Fallback to reencode if copy concat fails:
bash "{baseDir}/scripts/media_ops.sh" concat --mode reencode --input "$taskDir/part1.mp4" --input "$taskDir/part2.mp4" --output "$taskDir/merged.mp4"
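The copy-then-reencode fallback can be sketched as a short Python wrapper around the two commands above. The helper names are hypothetical, and the sketch assumes media_ops.sh exits with a non-zero status when copy concat fails.

```python
import subprocess


def concat_commands(base_dir, inputs, output):
    """Build the default (copy) concat command and its reencode fallback."""
    script = f"{base_dir}/scripts/media_ops.sh"
    io_args = [arg for path in inputs for arg in ("--input", path)]
    io_args += ["--output", output]
    return [
        ["bash", script, "concat", *io_args],
        ["bash", script, "concat", "--mode", "reencode", *io_args],
    ]


def run_first_success(commands, run=subprocess.run):
    """Run commands in order and return the first one that exits with 0."""
    for argv in commands:
        if run(argv).returncode == 0:  # assumption: non-zero means failure
            return argv
    raise RuntimeError("all concat attempts failed")
```

The `run` parameter exists so the fallback order can be exercised with a stub instead of real FFmpeg calls.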
bash "{baseDir}/scripts/media_ops.sh" extract-audio --input "$taskDir/input.mp4" --output "$taskDir/audio.m4a"
bash "{baseDir}/scripts/media_ops.sh" transcode --input "$taskDir/input.mov" --output "$taskDir/output.mp4"
bash "{baseDir}/scripts/media_ops.sh" frame --input "$taskDir/input.mp4" --output "$taskDir/cover.jpg" --time "00:00:03"
For .srt / .ass burn-in:
bash "{baseDir}/scripts/check_ffmpeg.sh" --require-subtitles-filter
If subtitle style is unspecified, the agent should use the default subtitle workflow, which stages its runtime font temporarily and cleans it up after export.
Preferred stable entrypoint:
bash "{baseDir}/scripts/media_ops.sh" subtitle-burn --input "$taskDir/input.mp4" --subtitle-file "$taskDir/subtitles.srt" --output "final-output.mp4"
If the task needs adaptive line wrapping or controlled subtitle layout, or if you need to understand the underlying steps:
python3 "{baseDir}/scripts/build_ass_subtitles.py" --help
Default burn pattern:
ffprobe -hide_banner -v error -select_streams v:0 -show_entries stream=width,height -of csv=p=0:s=x "input.mp4"
bash "{baseDir}/scripts/media_ops.sh" subtitle-burn --input "$taskDir/input.mp4" --subtitle-file "$taskDir/subtitles.srt" --output "final-output.mp4"
This path keeps only final-output.mp4 by default and removes temporary ASS files and staged fonts. If the user did not explicitly request any staged files, the agent should also delete $taskDir after moving or confirming the final deliverable.
bash "{baseDir}/scripts/ensure_subtitle_fonts.sh"
python3 "{baseDir}/scripts/build_ass_subtitles.py" \
--subtitle-file "$taskDir/subtitles.srt" \
--output-ass "$taskDir/subtitles.ass" \
--video-width 1920 \
--video-height 1080 \
--latin-font-file "$HOME/.cache/ponyflash/fonts/NotoSansCJKsc-Regular.otf" \
--cjk-font-file "$HOME/.cache/ponyflash/fonts/NotoSansCJKsc-Regular.otf"
ffmpeg -i "$taskDir/input.mp4" \
-vf "subtitles=$taskDir/subtitles.ass:fontsdir=$HOME/.cache/ponyflash/fonts" \
-c:v libx264 -preset medium -crf 18 -c:a aac -b:a 192k -movflags +faststart "final-output.mp4"
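The ffprobe call above prints the video size as WIDTHxHEIGHT (for example 1920x1080), which can replace the hard-coded `--video-width` and `--video-height` values. The parsing sketch below uses hypothetical helper names. It tolerates a trailing separator as a precaution, not because the source documents that behavior.

```python
def parse_dimensions(ffprobe_output):
    """Parse 'WIDTHxHEIGHT' text from ffprobe into a (width, height) tuple."""
    lines = ffprobe_output.strip().splitlines()
    if not lines:
        raise ValueError("ffprobe returned no video stream dimensions")
    # Drop empty fields so a trailing separator ("1920x1080x") is tolerated.
    fields = [field for field in lines[0].strip().split("x") if field]
    if len(fields) < 2 or not (fields[0].isdigit() and fields[1].isdigit()):
        raise ValueError(f"unexpected ffprobe output: {lines[0]!r}")
    return int(fields[0]), int(fields[1])


def size_args(ffprobe_output):
    """Turn ffprobe output into arguments for build_ass_subtitles.py."""
    width, height = parse_dimensions(ffprobe_output)
    return ["--video-width", str(width), "--video-height", str(height)]


print(size_args("1920x1080\n"))
```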
Default subtitle references:
{baseDir}/assets/fonts.md
{baseDir}/assets/subtitle-style.md
{baseDir}/reference/operations.md
{baseDir}/reference/examples.md
Clip defaults to reencode; use copy only when the user explicitly wants speed / minimal loss.
Concat defaults to copy; fallback to reencode if source parameters differ.
Extracted audio defaults to .m4a.
Transcode defaults to .mp4 + libx264 + aac.
Burn subtitles with the subtitles filter; use drawtext only as a plain text fallback.
Keep intermediate files in taskDir and delete taskDir at the end of the task.
If ffmpeg or ffprobe is missing, pause the task, help the user install FFmpeg if needed, and rerun the checks first.
If subtitles is missing, do not claim the machine can burn .srt / .ass.
If only drawtext exists, explain that this is text overlay fallback, not full subtitle burn-in.
reencode.
ffmpeg command.
CRITICAL: You MUST actually send generated files to the user; never just print a file path as text.
Always save generated files (rendered videos, downloaded media, etc.) to your current working directory (e.g., ./output.mp4), NOT to /tmp/ or other system directories. Many agent platforms restrict file-sending to the workspace directory only. Saving to /tmp/ will cause file delivery to fail silently.
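One way to enforce the working-directory rule is to resolve every output path against the workspace and reject anything that lands outside it. This is a minimal sketch. `workspace_output_path` is a hypothetical helper, not part of the skill's scripts.

```python
from pathlib import Path


def workspace_output_path(filename, workspace=None):
    """Resolve filename inside the workspace; refuse paths that escape it."""
    root = Path(workspace or Path.cwd()).resolve()
    target = (root / filename).resolve()
    # An absolute filename or "../" segments can move target outside root.
    if root not in target.parents:
        raise ValueError(f"{filename!r} is outside the workspace {str(root)!r}")
    return target


print(workspace_output_path("output.mp4"))
```

With the default workspace, an absolute path such as /tmp/output.mp4 is rejected unless the working directory itself sits under /tmp.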
gen.url is already a downloadable URL. Send it to the user, and also use your platform's file-sending capability to send the file directly in the conversation.
timeline.render("output.mp4")): save the output to your working directory, then use your platform's file-sending tool to send the actual file to the user as an attachment. Do NOT just send the file path as a text message.
from ponyflash import (
PonyFlash,
InsufficientCreditsError,
RateLimitError,
GenerationFailedError,
AuthenticationError,
)
pony_flash = PonyFlash()
try:
    gen = pony_flash.images.generate(model="nano-banana-pro", prompt="A cat")
except AuthenticationError:
    # The key is missing or rejected; point the user at the key page.
    print("Invalid or missing API key.")
    print("Get your API key at: https://api.ponyflash.com/api-key")
except InsufficientCreditsError as e:
    print(f"Not enough credits. Balance: {e.balance}, required: {e.required}")
    print("Top up credits at: https://api.ponyflash.com/usage")
except RateLimitError:
    print("Rate limited; wait and retry")
except GenerationFailedError as e:
    print(f"Generation failed: {e.generation.error.code}")
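For the "wait and retry" case, a generic exponential backoff wrapper is one option. The sketch below is stdlib-only and takes the exception type as a parameter, so it makes no claim about how the SDK behaves internally. `retry_on` and its defaults are hypothetical.

```python
import time


def retry_on(exc_type, fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on exc_type wait base_delay * 2**attempt and try again."""
    for attempt in range(attempts):
        try:
            return fn()
        except exc_type:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)
```

With the SDK imported, a call might look like `retry_on(RateLimitError, lambda: pony_flash.images.generate(model=..., prompt=...))`. Other exceptions propagate immediately.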
For advanced PonyFlash SDK usage: See examples/advanced.md
For FFmpeg task patterns:
For complete method signatures, parameter types, and return type fields:
For all available models and their specific parameters, capabilities, and examples: See reference/models/INDEX.md