Slideshow Video

v0.2.2

Generate SEO/GEO-friendly TikTok-style slideshow videos with AI-powered visuals. Combines GPT Image 2 for stunning image generation, automated caption creati...

by @x-rayluan

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for x-rayluan/slideshow-video.

Install the skill "Slideshow Video" (x-rayluan/slideshow-video) from ClawHub.
Skill page: https://clawhub.ai/x-rayluan/slideshow-video
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install slideshow-video

ClawHub CLI

npx clawhub@latest install slideshow-video

Security Scan

Capability signals

Crypto · Requires wallet · Can make purchases · Requires sensitive credentials

These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.

VirusTotal

Benign

OpenClaw

Suspicious

medium confidence

Purpose & Capability

The code (Python scripts) implements the described slideshow pipeline (image resolution, slide PNG generation, ffmpeg export, per-line audio sync). However the README contains Node/npm usage and .mjs examples that do not match the shipped Python scripts, and the image-generation path relies on a local OpenClaw tool-call rather than directly requiring an OPENAI_API_KEY — a design choice that is plausible but not documented in the skill metadata. These documentation/code mismatches are surprising.

Instruction Scope

Runtime instructions and scripts perform network operations (download/caching of remote images, Pinterest HTML requests and API-like calls), call external binaries (ffmpeg/ffprobe), and may POST prompts to a local OpenClaw API endpoint if an OpenClaw session key is present. The SKILL.md and top-level metadata do not clearly state the environment variables the scripts will read (OPENCLAW_BASE_URL, OPENCLAW_SESSION_KEY). The resolve_images code scrapes Pinterest pages and crafts authenticated-like requests using cookies extracted from the site — this conducts broader network activity than a simple local-only slideshow generator and has privacy / TOS implications.

Install Mechanism

There is no automated install spec in the registry (instruction-only), and the repository ships Python scripts only. The SKILL.md asks to pip-install Pillow and have ffmpeg available — reasonable and minimal for the task. But the README also includes npm/node commands and .mjs usage which do not match the included Python implementation, indicating outdated or inconsistent packaging instructions.

Credentials

The registry lists no required environment variables, but resolve_images.py reads OPENCLAW_BASE_URL and OPENCLAW_SESSION_KEY to call a local OpenClaw / tool-call API for GPT Image 2 generation. README also suggests adding OPENAI_API_KEY and ELEVENLABS_API_KEY to an .env, but those keys are not directly consumed by the shipped Python code (they would instead be used by the OpenClaw session/tool). This mismatch means required credentials are not declared and the skill may rely on a local agent/session to perform third-party API calls — potentially surprising to users.

Persistence & Privilege

The skill does not request always:true, does not modify other skills, and does not install background daemons. It writes cached images and generated outputs to disk under specified output/cache folders (expected for this tool). The notable privilege is conditional: if given an OpenClaw session key, it can invoke a local session's tool endpoint which could broaden what the skill does — but that is a consequence of the session key being present, not the skill forcibly persisting or elevating itself.

What to consider before installing

This skill largely implements a legitimate slideshow pipeline, but there are important mismatches and things to check before installing:

- Undeclared env vars: the Python code will read OPENCLAW_BASE_URL and OPENCLAW_SESSION_KEY to request GPT Image 2 via a local OpenClaw session if you choose the --image-source openai/kie flow. The skill metadata claims no required env vars, which is inaccurate. Do not set a session key for your agent unless you understand and trust the skill and want it to be able to invoke your local session's tools.
- Documentation mismatch: README contains Node/npm/.mjs examples that don't match the provided Python scripts. Expect to run the provided Python scripts (they require Pillow and ffmpeg). Confirm which implementation you will run.
- Network activity and scraping: resolve_images.py downloads remote images, caches them to disk, and has code that scrapes Pinterest pages and crafts requests using cookies. This is expected for remote image sourcing but has privacy and terms-of-service implications; review licensing for any remotely sourced images before publishing.
- Data flow: when image generation is enabled via the OpenClaw session, your prompts and other content will be POSTed to the OpenClaw API endpoint. That local session/tool may then call external providers (OpenAI, KIE, ElevenLabs) using credentials stored in the OpenClaw environment, so be deliberate about what credentials are exposed to that session.

What would reduce concern: the registry metadata or SKILL.md explicitly listing the environment variables it uses (OPENCLAW_BASE_URL, OPENCLAW_SESSION_KEY and any provider keys), or adding code paths that accept direct provider keys with clear user consent.

If you are unsure, run the scripts in an isolated environment, inspect the code (it is included), or avoid giving this skill an OpenClaw session key and use only unsplash/pinterest/local images.

Like a lobster shell, security has layers — review code before you run it.

MIT-0

Slideshow Video

Generate a repeatable short-form slideshow pipeline from local images, remote image URLs, or lightweight image queries and a JSON project file. This skill covers query resolution, PNG slide generation, MP4 export, optional background music, remote image caching, sentence-level sync exports, and a simple project wrapper that saves output metadata for downstream scheduling.

Image queries can resolve in three ways:

stock-image lookup via Pinterest or Unsplash
native GPT image generation via openai/gpt-image-2
Kie-hosted GPT image generation via kie/gpt-image-2-text-to-image

Quick start

Prepare 5 to 8 local images, remote image URLs, or image queries for one slideshow.
Copy references/pipeline.example.json to a working JSON file and replace the image sources and copy.
Run the full pipeline:

python3 ~/.openclaw/skills/slideshow-video/scripts/run_pipeline.py your-project.json --output-root build --overwrite

To process a directory of project files, use:

python3 ~/.openclaw/skills/slideshow-video/scripts/batch_pipeline.py /path/to/projects --output-root build --overwrite

Review the generated slides and MP4 on a phone-sized canvas.
Use summary.json for caption and hashtag handoff into your posting workflow.
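If you drive the pipeline from another Python script, the single-project command above can be assembled as an argument list. This is a minimal sketch: the helper name build_pipeline_command is hypothetical, and the install path and flags are simply the ones shown in the commands above.

```python
import shlex
from pathlib import Path

# Install location copied from the quick-start commands; adjust if yours differs.
SCRIPTS_DIR = Path("~/.openclaw/skills/slideshow-video/scripts").expanduser()


def build_pipeline_command(project_file, output_root="build", overwrite=True):
    """Return the argv list for one run_pipeline.py call (hypothetical helper)."""
    argv = [
        "python3",
        str(SCRIPTS_DIR / "run_pipeline.py"),
        str(project_file),
        "--output-root",
        str(output_root),
    ]
    if overwrite:
        argv.append("--overwrite")
    return argv


# Print the command rather than running it, so the sketch has no side effects.
print(shlex.join(build_pipeline_command("your-project.json")))
```

Passing the returned list to subprocess.run(argv, check=True) would execute it; keeping the command as a list avoids shell-quoting problems with project paths that contain spaces.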

Core resources

scripts/resolve_images.py: resolve imageQuery values into usable remote image URLs or generated local image files
scripts/generate_slides.py: generate 1080x1920 PNG slides from local images, remote image URLs, and text blocks
scripts/export_mp4.py: convert ordered slide PNGs into an H.264 vertical MP4, with optional background music
scripts/export_sync_mp4.py: export a voice-synced MP4 from slide PNGs plus per-line audio files, holding each slide for that line's measured duration
scripts/run_pipeline.py: run one project and emit summary.json
scripts/batch_pipeline.py: run multiple JSON project files from a directory
references/pipeline.example.json: starter project file with slide, caption, hashtag, and video settings
references/slides-config.example.json: simpler slide-only config when you do not need project metadata
references/workflow.md: structure, command examples, shorts sync workflow, and practical caveats

Project JSON format

At the top level, use:

slug: identifier used for the output folder and the MP4 filename
caption: final post caption
hashtags: list of hashtags
defaultImageQuery: optional fallback query for image sourcing
video: export options
audio: optional background music options
slides: the slide array

Inside video:

enabled: set false to skip MP4 export
secondsPerSlide: hold time per slide
fps: output FPS, usually 30
zoom: enable a light Ken Burns style zoom
fade: optional fade-in duration per slide

Inside audio:

path: local audio file
url: remote audio URL if ffmpeg can read it in your environment
volume: optional background music volume multiplier, defaults to about 0.22

For shorts that need strict voice sync, keep the project JSON focused on slide images plus on-screen text, then generate one audio file per spoken line outside the project JSON and export with scripts/export_sync_mp4.py.

Each slide accepts:

imagePath: local source image
imageUrl: remote source image
imageQuery: short sourcing query such as minimal finance desk
overlay: optional black overlay opacity from 0 to 255
blur: optional Gaussian blur radius
brightness: optional brightness multiplier, for example 0.9
output: optional output filename
text: array of text blocks

Each text block accepts:

text: required displayed text
size: font size in pixels
bold: boolean shortcut for heavier font selection
weight: optional string, bold also works
x: horizontal anchor, defaults to center
y: vertical anchor
align: left, center, or right
maxWidth: wrapping width in pixels
color: hex color, defaults to white
lineSpacing: defaults to 1.2
shadow: defaults to true
strokeWidth and strokeFill: optional text outline
fontPath: optional absolute or local font path

Dependencies

Install Pillow for slide generation:

python3 -m pip install pillow

Install ffmpeg for MP4 export if it is not already present.
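A quick preflight can confirm both dependencies before a long run. This is a sketch rather than part of the skill: check_dependencies is a hypothetical helper, and including ffprobe is an assumption based on the security scan's note that the scripts call ffmpeg/ffprobe.

```python
import importlib.util
import shutil


def check_dependencies():
    """Report which of the pipeline's dependencies are discoverable."""
    return {
        # Pillow installs under the import name "PIL".
        "pillow": importlib.util.find_spec("PIL") is not None,
        # shutil.which looks the binary up on PATH, like the shell would.
        "ffmpeg": shutil.which("ffmpeg") is not None,
        "ffprobe": shutil.which("ffprobe") is not None,
    }


for name, found in check_dependencies().items():
    print(f"{name}: {'ok' if found else 'missing'}")
```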

Remote images are downloaded and cached automatically when you use imageUrl or when imagePath is itself an http/https URL.

When a slide has only imageQuery, the pipeline first resolves it into either a remote image URL or a generated local image file, writes resolved-project.json, then continues normally. Review resolved images before posting: query-based sourcing favors convenience over quality, and model-generated imagery should also be checked for brand fit.

GPT Image 2 support

Use imageQuery with --image-source openai or --image-source kie on run_pipeline.py (the flag is --source when calling resolve_images.py directly) when you want the slideshow pipeline to generate slide art instead of searching the web.

Examples:

python3 ~/.openclaw/skills/slideshow-video/scripts/resolve_images.py project.json --source openai --image-size 1024x1536 --output build/resolved-project.json
python3 ~/.openclaw/skills/slideshow-video/scripts/resolve_images.py project.json --source kie --image-size 1024x1536 --output build/resolved-project.json
python3 ~/.openclaw/skills/slideshow-video/scripts/run_pipeline.py project.json --image-source kie --image-size 1024x1536 --output-root build --overwrite

Notes:

openai maps to openai/gpt-image-2
kie maps to kie/gpt-image-2-text-to-image
GPT image resolution requires an active OpenClaw session runtime so resolve_images.py can call the image_generate tool through the local session API
generated slide assets are written into the resolved project as imagePath values

Good defaults

Keep slide 1 to one strong hook and one supporting line.
Start hooks around 84 to 96 px.
Start body lines around 48 to 60 px.
Keep most text blocks within 820 to 940 px max width.
Use one visual subject per slide when possible.
Start with 3 seconds per slide and zoom: true for a more alive MP4.
Start background music around 0.18 to 0.25 volume so it does not overpower on-screen text.
For TikTok-native shorts, shorten on-screen text until each slide only carries one core idea.
For voice-led shorts, prefer one spoken sentence per slide and use synced export instead of fixed secondsPerSlide.
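The size and width ranges above lend themselves to a simple lint over text blocks. The sketch below is not part of the skill: check_text_block is a hypothetical helper, and the ranges are starting points from the list above rather than hard limits.

```python
def check_text_block(block, is_hook=False):
    """Return warnings for a text block outside the suggested defaults."""
    warnings = []

    # Hooks start around 84 to 96 px, body lines around 48 to 60 px.
    low, high = (84, 96) if is_hook else (48, 60)
    size = block.get("size")
    if size is not None and not low <= size <= high:
        warnings.append(f"size {size} outside suggested {low}-{high} px")

    # Most text blocks should wrap within 820 to 940 px.
    max_width = block.get("maxWidth")
    if max_width is not None and not 820 <= max_width <= 940:
        warnings.append(f"maxWidth {max_width} outside suggested 820-940 px")

    return warnings


print(check_text_block({"text": "Hook", "size": 120, "maxWidth": 1000}, is_hook=True))
```

Blocks that omit size or maxWidth are skipped rather than flagged, since the skill's own defaults for those fields are not stated here.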

Editing guidance

Adjust readability in this order:

raise overlay
reduce maxWidth
lower font size slightly
move the y positions away from busy background areas
add strokeWidth if the image is still noisy

If the MP4 feels too static, enable zoom. If it feels too synthetic, disable it and keep the PNG slideshow output instead.

Shorts sync workflow

Use this when voice, image, and on-screen text must stay aligned.

Write one spoken sentence per target slide.
Generate one numbered audio file per sentence, for example line_01.mp3, line_02.mp3.
Build slide PNGs with matching numbered order.
Export with scripts/export_sync_mp4.py so each slide duration is based on the matching line audio length.
Keep captions shorter than the spoken line. Treat the slide text as reinforcement, not a transcript.

Example:

python3 ~/.openclaw/skills/slideshow-video/scripts/generate_slides.py project.json --output-dir build/slides --cache-dir build/cache
python3 ~/.openclaw/skills/slideshow-video/scripts/export_sync_mp4.py build/slides ./line-audio build/post-sync.mp4 --overwrite

The sync export also writes <output>.sync.json with per-slide measured durations.
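Because the synced export relies on slides and audio lines sharing the same order, it can help to verify the pairing before exporting. The sketch below assumes zero-padded names such as line_01.mp3 (from the workflow above) and that slide PNGs sort in the intended order; pair_slides_with_audio is a hypothetical helper, not something export_sync_mp4.py provides.

```python
from pathlib import Path


def pair_slides_with_audio(slides_dir, audio_dir):
    """Pair slide PNGs with line audio files by sorted filename order."""
    # Lexicographic sorting matches numeric order only for zero-padded names.
    slides = sorted(Path(slides_dir).glob("*.png"))
    lines = sorted(Path(audio_dir).glob("line_*.mp3"))
    if len(slides) != len(lines):
        raise ValueError(
            f"{len(slides)} slides but {len(lines)} audio lines; "
            "each slide needs exactly one spoken line"
        )
    return list(zip(slides, lines))
```

Calling pair_slides_with_audio("build/slides", "./line-audio") before the export surfaces a missing or extra audio file as an immediate error instead of a misaligned video.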

Output expectations

The pipeline writes:

build/<slug>/resolved-project.json
build/<slug>/slides/*.png
build/<slug>/<slug>.mp4
build/<slug>/summary.json
build/<slug>/cache/* for downloaded remote images

summary.json includes audio metadata when present.
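For downstream scheduling scripts, the layout above can be derived from the slug. This is a minimal sketch assuming the default output root of build used in the examples; expected_outputs is a hypothetical helper and the dictionary keys are illustrative names.

```python
from pathlib import Path


def expected_outputs(slug, output_root="build"):
    """Return the paths the pipeline is documented to write for a slug."""
    base = Path(output_root) / slug
    return {
        "resolved_project": base / "resolved-project.json",
        "slides_dir": base / "slides",
        "video": base / f"{slug}.mp4",
        "summary": base / "summary.json",
        "cache_dir": base / "cache",
    }


for name, path in expected_outputs("demo-post").items():
    print(f"{name}: {path}")
```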

Keep generated outputs outside the skill folder unless you are intentionally updating bundled examples.
