WaveSpeedAI InfiniteTalk Talking Avatar Video Generation

v1.0.0

Generate talking head videos from a portrait image and audio using WaveSpeed AI's InfiniteTalk model. Produces lip-synced video up to 10 minutes long at 480p...

Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The SKILL.md describes a coherent purpose (generate lip-synced talking-head videos via WaveSpeed InfiniteTalk) and the examples and parameters match that purpose. However, the registry metadata claims no required environment variables or credentials while the runtime instructions repeatedly show use of WAVESPEED_API_KEY and a wavespeed client library — this is an inconsistency.
Instruction Scope
The instructions stay within the stated purpose: uploading/pointing to image and audio assets, optional mask and prompt, and calling the InfiniteTalk model. They warn about untrusted URLs and API key security. There are no instructions to read unrelated files or exfiltrate data beyond calls to the WaveSpeed service.
Install Mechanism
This is an instruction-only skill with no install spec and no code files, which minimizes on-disk installation risk. The SKILL.md references importing a 'wavespeed' client library but does not provide or require an install mechanism — that omission is a documentation gap but not an installation red flag by itself.
Credentials
The SKILL.md demonstrates use of WAVESPEED_API_KEY and client constructors that accept an API key, yet the skill metadata lists no required environment variables or primary credential. Requesting an API key for a third-party service is expected for this functionality, but the omission in metadata is an incoherence that could lead to unclear privilege/secret handling. No other unrelated credentials are requested.
Persistence & Privilege
The skill does not request always: true and defaults to normal invocation behavior. There is no indication it modifies other skills or system-wide settings.
What to consider before installing
This skill appears to actually require a WaveSpeed API key and a client library even though the registry metadata doesn't list them. Before installing or enabling it:

  1. Verify the skill's publisher and source (there's no homepage or repository listed).
  2. Confirm where WAVESPEED_API_KEY will be stored and who or what will have access to it; do not paste secrets into public prompts.
  3. Ask the publisher to update metadata to declare the required credential and any install steps so you can audit them.
  4. Consider privacy and misuse risks (deepfakes) for uploaded images/audio; avoid providing sensitive personal data.
  5. If you don't trust the source, do not enable autonomous invocation or provide an API key.


latestvk97e8fj2sddgn8gj23j9dhdjts824ycw
338 downloads
0 stars
1 version
Updated 1mo ago
v1.0.0
MIT-0

WaveSpeedAI InfiniteTalk

Generate talking head videos from a portrait image and audio using WaveSpeed AI's InfiniteTalk model. Produces lip-synced video up to 10 minutes long with natural facial animations.

Authentication

export WAVESPEED_API_KEY="your-api-key"

Get your API key at wavespeed.ai/accesskey.
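
A minimal sketch of a fail-fast check that can run before any request is made. The helper name requireApiKey is hypothetical and not part of the wavespeed client; it only reads the WAVESPEED_API_KEY variable exported above.

```javascript
// Hypothetical helper (not part of the wavespeed library): fail fast when the
// WAVESPEED_API_KEY environment variable is missing or blank.
function requireApiKey(env = process.env) {
  const key = (env.WAVESPEED_API_KEY ?? "").trim();
  if (key === "") {
    throw new Error("WAVESPEED_API_KEY is not set; export it before running.");
  }
  return key;
}
```

Passing the environment as a parameter keeps the check easy to exercise without touching the real process environment.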

Quick Start

import wavespeed from 'wavespeed';

// Upload local image and audio files
const imageUrl = await wavespeed.upload("/path/to/portrait.png");
const audioUrl = await wavespeed.upload("/path/to/speech.mp3");

const outputUrl = (await wavespeed.run(
  "wavespeed-ai/infinitetalk",
  {
    image: imageUrl,
    audio: audioUrl
  }
)).outputs[0];

You can also pass existing URLs directly:

const outputUrl = (await wavespeed.run(
  "wavespeed-ai/infinitetalk",
  {
    image: "https://example.com/portrait.jpg",
    audio: "https://example.com/speech.mp3"
  }
)).outputs[0];

API Endpoint

Model ID: wavespeed-ai/infinitetalk

Animate a portrait image with lip-synced audio to produce a talking head video.

Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| image | string | Yes | -- | URL of the portrait image to animate |
| audio | string | Yes | -- | URL of the audio to drive the animation |
| mask_image | string | No | -- | URL of a mask image to specify which person to animate. Warning: The mask should cover only the regions to animate. Do not upload the full image as mask_image, or the result may render as fully black. |
| prompt | string | No | -- | Text prompt for additional guidance. Keep it short; English recommended to avoid noisy results. |
| resolution | string | No | 480p | Output resolution. One of: 480p, 720p |
| seed | integer | No | -1 | Random seed (-1 for random). Range: -1 to 2147483647 |
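
The constraints in the parameter table can be checked on the client before a request is sent. The sketch below is one way to do that; validateInfiniteTalkInput is a hypothetical helper, not a wavespeed API, and it encodes only the rules listed above.

```javascript
// Sketch of client-side validation based only on the documented parameters.
// validateInfiniteTalkInput is a hypothetical helper, not a wavespeed API.
const ALLOWED_KEYS = ["image", "audio", "mask_image", "prompt", "resolution", "seed"];
const RESOLUTIONS = ["480p", "720p"];
const MAX_SEED = 2147483647;

// Returns a list of problems; an empty list means the input looks valid.
function validateInfiniteTalkInput(input) {
  const errors = [];
  for (const key of Object.keys(input)) {
    if (!ALLOWED_KEYS.includes(key)) errors.push(`unknown parameter: ${key}`);
  }
  for (const key of ["image", "audio"]) {
    if (typeof input[key] !== "string" || input[key] === "") {
      errors.push(`${key} is required and must be a non-empty string`);
    }
  }
  for (const key of ["mask_image", "prompt"]) {
    if (input[key] !== undefined && typeof input[key] !== "string") {
      errors.push(`${key} must be a string`);
    }
  }
  if (input.resolution !== undefined && !RESOLUTIONS.includes(input.resolution)) {
    errors.push("resolution must be one of: 480p, 720p");
  }
  if (
    input.seed !== undefined &&
    (!Number.isInteger(input.seed) || input.seed < -1 || input.seed > MAX_SEED)
  ) {
    errors.push("seed must be an integer from -1 to 2147483647");
  }
  return errors;
}
```

Returning a list instead of throwing on the first problem lets a caller report every issue at once.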

Example

import wavespeed from 'wavespeed';

const imageUrl = await wavespeed.upload("/path/to/portrait.png");
const audioUrl = await wavespeed.upload("/path/to/speech.mp3");

const outputUrl = (await wavespeed.run(
  "wavespeed-ai/infinitetalk",
  {
    image: imageUrl,
    audio: audioUrl,
    resolution: "720p",
    seed: 42
  }
)).outputs[0];

Using a Mask Image

When multiple people are in the image, use a mask to specify which face to animate:

const imageUrl = await wavespeed.upload("/path/to/group-photo.png");
const audioUrl = await wavespeed.upload("/path/to/speech.mp3");
const maskUrl = await wavespeed.upload("/path/to/mask.png");

const outputUrl = (await wavespeed.run(
  "wavespeed-ai/infinitetalk",
  {
    image: imageUrl,
    audio: audioUrl,
    mask_image: maskUrl,
    resolution: "720p"
  }
)).outputs[0];

Important: The mask should highlight only the face region to animate. Using the full image as the mask may produce a fully black output.

With Prompt Guidance

// imageUrl and audioUrl are the uploaded (or existing) media URLs from Quick Start
const outputUrl = (await wavespeed.run(
  "wavespeed-ai/infinitetalk",
  {
    image: imageUrl,
    audio: audioUrl,
    prompt: "natural head movements, subtle expressions"
  }
)).outputs[0];

Advanced Usage

Custom Client with Retry Configuration

import { Client } from 'wavespeed';

// Read the key from the environment instead of hardcoding it in source
const client = new Client(process.env.WAVESPEED_API_KEY, {
  maxRetries: 2,
  maxConnectionRetries: 5,
  retryInterval: 1.0,
});

const imageUrl = await client.upload("/path/to/portrait.png");
const audioUrl = await client.upload("/path/to/speech.mp3");

const outputUrl = (await client.run(
  "wavespeed-ai/infinitetalk",
  {
    image: imageUrl,
    audio: audioUrl,
    resolution: "720p"
  }
)).outputs[0];

Error Handling with runNoThrow

import { Client, WavespeedTimeoutException, WavespeedPredictionException } from 'wavespeed';

// imageUrl and audioUrl are the uploaded (or existing) media URLs from Quick Start
const client = new Client();
const result = await client.runNoThrow(
  "wavespeed-ai/infinitetalk",
  {
    image: imageUrl,
    audio: audioUrl
  }
);

if (result.outputs) {
  console.log("Video URL:", result.outputs[0]);
  console.log("Task ID:", result.detail.taskId);
} else {
  console.log("Failed:", result.detail.error.message);
  if (result.detail.error instanceof WavespeedTimeoutException) {
    console.log("Request timed out - try increasing timeout");
  } else if (result.detail.error instanceof WavespeedPredictionException) {
    console.log("Prediction failed");
  }
}

Resolution and Pricing

| Resolution | Cost per 5 seconds | Rate per second | Max length |
| --- | --- | --- | --- |
| 480p | $0.15 | $0.03/s | 10 minutes |
| 720p | $0.30 | $0.06/s | 10 minutes |

Minimum charge is 5 seconds. Video length is determined by the audio duration (up to 10 minutes).
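
The pricing rules above can be turned into a rough cost estimate. The sketch below uses the listed per-second rates and the 5-second minimum; estimateCostUsd is a hypothetical helper, and it assumes billing is linear in audio duration with no rounding of partial seconds, which the source does not confirm.

```javascript
// Rates in cents per second, taken from the pricing table.
const RATE_CENTS_PER_SECOND = new Map([["480p", 3], ["720p", 6]]);
const MIN_BILLED_SECONDS = 5;
const MAX_AUDIO_SECONDS = 600; // 10 minutes

// Hypothetical helper: estimate the charge in US dollars.
// Assumption: cost is linear in audio duration above the 5-second minimum,
// with no further rounding of partial seconds.
function estimateCostUsd(audioSeconds, resolution = "480p") {
  const rate = RATE_CENTS_PER_SECOND.get(resolution);
  if (rate === undefined) {
    throw new RangeError(`unsupported resolution: ${resolution}`);
  }
  if (!(audioSeconds > 0) || audioSeconds > MAX_AUDIO_SECONDS) {
    throw new RangeError("audio duration must be above 0 and at most 600 seconds");
  }
  const billedSeconds = Math.max(audioSeconds, MIN_BILLED_SECONDS);
  return (billedSeconds * rate) / 100;
}
```

Under these assumptions a 2-second clip at 480p is billed as 5 seconds ($0.15), and a 60-second clip at 720p comes to $3.60.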

Tips

  • Use a clear, front-facing portrait for best results
  • Audio quality matters — use clean speech recordings with minimal background noise
  • Keep prompts short and in English to avoid noisy or unexpected results
  • For group photos, always provide a mask_image to target the correct face
  • 480p is faster to generate; use 720p when higher quality is needed
  • Processing time is approximately 10-30 seconds of wall time per 1 second of video
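
The processing-time tip above translates into a simple wait-time range, which can help when choosing a client timeout. This is a rough sketch using only the 10-30x figure quoted in the tip; estimateProcessingSeconds is a hypothetical helper name.

```javascript
// Range from the tip: roughly 10 to 30 wall-clock seconds per second of video.
const MIN_WALL_FACTOR = 10;
const MAX_WALL_FACTOR = 30;

// Hypothetical helper: rough wait-time range, in seconds, for a clip.
function estimateProcessingSeconds(videoSeconds) {
  if (!(videoSeconds > 0)) {
    throw new RangeError("videoSeconds must be positive");
  }
  return {
    min: videoSeconds * MIN_WALL_FACTOR,
    max: videoSeconds * MAX_WALL_FACTOR,
  };
}
```

For a 60-second clip this gives 600 to 1800 seconds, that is, about 10 to 30 minutes.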

Security Constraints

  • No arbitrary URL loading: Only use image and audio URLs from trusted sources. Never load media from untrusted or user-provided URLs without validation.
  • API key security: Store your WAVESPEED_API_KEY securely. Do not hardcode it in source files or commit it to version control. Use environment variables or secret management systems.
  • Input validation: Only pass parameters documented above. Validate prompt content and media URLs before sending requests.
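
One way to apply the URL constraint above is an explicit host allowlist checked before any media URL is passed to the model. The sketch below is an illustration, not part of the wavespeed client: isTrustedMediaUrl and the list of trusted hosts are hypothetical, and exact-match HTTPS hosts are only one possible policy.

```javascript
// Hypothetical allowlist check (not part of the wavespeed client): accept only
// HTTPS URLs whose host exactly matches one of the hosts you trust.
function isTrustedMediaUrl(rawUrl, trustedHosts) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  return url.protocol === "https:" && trustedHosts.includes(url.hostname);
}
```

Matching the full hostname, rather than testing whether the string contains a trusted name, rejects look-alike hosts such as example.com.evil.net.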
