Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Voice.ai: Creator Voiceover Forge

v0.1.3

Turn scripts into publishable voiceovers with Voice.ai TTS, including segments, chapters, captions, and video muxing.

by Nick Gill (@gizmogremlin)
Security Scan
VirusTotal: Suspicious
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The skill's name/description (voiceover pipeline using Voice.ai TTS) aligns with the code and instructions: it chunks scripts, calls a TTS API, stitches audio, and optionally muxes into video. Requesting a VOICE_AI_API_KEY is expected for this purpose. However, registry metadata at the top lists "Required env vars: none" and "Primary credential: none" while SKILL.md and the code require VOICE_AI_API_KEY; that mismatch is unexplained and reduces trust in the metadata.
Instruction Scope
SKILL.md and the code instruct the agent to read a script file, optional templates in the skill directory, optional .env file, and (if requested) a local video file; the only network transmission described is sending script text to the Voice.ai API for TTS. Nothing in SKILL.md or the visible source asks the agent to read unrelated system files or exfiltrate other data.
Install Mechanism
There is no external install step; the package includes a bundled Node.js CLI (voiceai-vo.cjs). No downloads from arbitrary URLs or remote installers are present. The skill requires Node.js 20+ to run the bundled file; ffmpeg is optional and local. This is a low-risk install mechanism as delivered.
Credentials
Functionality requires a single API key (VOICE_AI_API_KEY / alternate VOICEAI_API_KEY), which is proportionate. However, the public registry metadata claims no required env vars while SKILL.md and code declare and read VOICE_AI_API_KEY (and an alternate VOICEAI_API_KEY), creating an inconsistency. Also the skill references a base URL of https://dev.voice.ai (a dev/staging domain) and TROUBLESHOOTING warns that 'real API not yet configured' and suggests using --mock; that indicates the endpoints may be placeholders. Do not supply a production API key until you confirm the real endpoint and the publisher's identity.
Persistence & Privilege
The skill is not marked always:true and does not request system-wide privileges. It does not modify other skills or agent-wide settings in the files shown. It runs as a local Node process when invoked; normal for a CLI-style skill.
What to consider before installing
What to check before installing or running:

  • Confirm the API key requirement: SKILL.md and the code require VOICE_AI_API_KEY (or VOICEAI_API_KEY). The registry metadata incorrectly lists no required env vars; treat the SKILL.md/code as authoritative.
  • Do not paste a production VOICE_AI_API_KEY until you verify the service endpoint and publisher. The code points to https://dev.voice.ai and TROUBLESHOOTING warns that production endpoints may be placeholders; use --mock to test locally without sending data.
  • Verify the publisher/source: homepage is missing and owner id is an opaque string. README references a GitHub repo (gizmoGremlin); inspect that upstream repo or contact the author to confirm authenticity before trusting real credentials.
  • Review the bundled binary (voiceai-vo.cjs) or run it in an isolated environment/container. Running with --mock first lets you exercise the pipeline without network calls.
  • Legal/privacy note: the voice catalog includes names that imply celebrity/character voices; ensure you’re comfortable with any potential voice-mimicry/licensing issues for your use case.

If you want higher confidence: ask the publisher for a canonical homepage or GitHub link, verify the API base (production vs dev), and run the bundled CLI in --mock mode to validate local behavior before providing secrets.


License: MIT-0

Voice.ai Creator Voiceover Pipeline

This skill follows the Agent Skills specification.

Turn any script into a publish-ready voiceover — complete with numbered segments, a stitched master, YouTube chapters, SRT captions, and a beautiful review page. Optionally, replace the audio track on an existing video.

Built for creators who want studio-quality voiceovers without the studio. Powered by Voice.ai.


When to use this skill

| Scenario | Why it fits |
|---|---|
| YouTube long-form | Full narration with chapter markers and captions |
| YouTube Shorts | Quick hooks with the shortform template |
| Podcasts | Consistent host voice, intro/outro templates |
| Course content | Professional narration for educational videos |
| Quick iteration | Smart caching: edit one section and only that segment re-renders |
| Video audio replacement | Drop AI voiceover onto screen recordings or B-roll |

The one-command workflow

Have a script and a video? Turn them into a finished video with AI voiceover in one shot:

node voiceai-vo.cjs build \
  --input my-script.md \
  --voice oliver \
  --title "My Video" \
  --video ./my-recording.mp4 \
  --mux

This renders the voiceover, stitches the master audio, and drops it onto your video — all in one command. Output:

  • out/my-video/muxed.mp4 — your video with the new voiceover
  • out/my-video/master.wav — the standalone audio
  • out/my-video/review.html — listen and review each segment
  • out/my-video/chapters.txt — YouTube-ready chapter timestamps
  • out/my-video/captions.srt — SRT captions

Use --sync pad if the audio is shorter than the video, or --sync trim if the audio is longer and should be cut to match.


Requirements

  • Node.js 20+ — runtime (no npm install needed — the CLI is a single bundled file)
  • VOICE_AI_API_KEY — set as environment variable or in a .env file in the skill root. Get a key at voice.ai/dashboard.
  • ffmpeg (optional) — needed for master stitching, MP3 encoding, loudness normalization, and video muxing. The pipeline still produces individual segments, the review page, chapters, and captions without it.

Configuration

The skill looks for the API key in the following order:

  1. Environment variable VOICE_AI_API_KEY
  2. Environment variable VOICEAI_API_KEY (alternate)
  3. .env file in the skill root

To create the .env file:

echo 'VOICE_AI_API_KEY=your-key-here' > .env

Use --mock on any command to run the full pipeline without an API key (produces placeholder audio).
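
The lookup order above can be sketched as follows. This is a minimal illustration in Python, not the CLI's actual code: the function names, the simple KEY=value parser, and the assumption that the .env file is only checked for VOICE_AI_API_KEY are all hypothetical.

```python
import os
from pathlib import Path


def parse_env_file(path):
    """Parse simple KEY=value lines; blank lines and # comments are ignored."""
    values = {}
    if not path.is_file():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_api_key(environ=None, env_file=".env"):
    """Return the first key found, following the documented order, else None."""
    environ = os.environ if environ is None else environ
    # 1 and 2: the primary and alternate environment variables.
    for name in ("VOICE_AI_API_KEY", "VOICEAI_API_KEY"):
        if environ.get(name):
            return environ[name]
    # 3: the .env file (assumed here to hold VOICE_AI_API_KEY only).
    return parse_env_file(Path(env_file)).get("VOICE_AI_API_KEY")
```

Checking the environment first means a key exported in the shell takes precedence over one stored on disk, which is convenient for temporarily testing a different key.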


Commands

build — Generate a voiceover from a script

node voiceai-vo.cjs build \
  --input <script.md or script.txt> \
  --voice <voice-alias-or-uuid> \
  --title "My Project" \
  [--template youtube|podcast|shortform] \
  [--language en] \
  [--video input.mp4 --mux --sync shortest] \
  [--force] [--mock]

What it does:

  1. Reads the script and splits it into segments (by ## headings for .md, or by sentence boundaries for .txt)
  2. Optionally prepends/appends template intro/outro segments
  3. Renders each segment via Voice.ai TTS as a numbered WAV file
  4. Stitches a master audio file (if ffmpeg is available)
  5. Generates chapters, captions, a review page, and metadata files
  6. Optionally muxes the voiceover into an existing video
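
Step 1 can be illustrated with a short Python sketch of heading-based segmentation. It is an assumption about how such a splitter might work, not the CLI's implementation: the function names, the "Intro" title for text before the first heading, and the slug rule are hypothetical.

```python
import re


def split_by_headings(markdown_text):
    """Split a Markdown script into (title, body) segments at '## ' headings."""
    segments = []
    title, body = "Intro", []  # hypothetical title for text before the first heading
    for line in markdown_text.splitlines():
        match = re.match(r"^##\s+(.*)$", line)
        if match:
            # Close the previous segment, skipping ones with no spoken text.
            if any(part.strip() for part in body):
                segments.append((title, "\n".join(body).strip()))
            title, body = match.group(1).strip(), []
        else:
            body.append(line)
    if any(part.strip() for part in body):
        segments.append((title, "\n".join(body).strip()))
    return segments


def segment_filename(index, title):
    """Build a numbered name such as 001-intro.wav from a segment title."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{index:03d}-{slug}.wav"
```

Note that this sketch drops headings that have no body text and treats deeper headings (### and below) as part of the enclosing segment.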

Full options:

| Option | Description |
|---|---|
| -i, --input <path> | Script file (.txt or .md); required |
| -v, --voice <id> | Voice alias or UUID; required |
| -t, --title <title> | Project title (defaults to filename) |
| --template <name> | youtube, podcast, or shortform |
| --mode <mode> | headings or auto (default: headings for .md) |
| --max-chars <n> | Max characters per auto-chunk (default: 1500) |
| --language <code> | Language code (default: en) |
| --video <path> | Input video for muxing |
| --mux | Enable video muxing (requires --video) |
| --sync <policy> | shortest, pad, or trim (default: shortest) |
| --force | Re-render all segments (ignore cache) |
| --mock | Mock mode: no API calls, placeholder audio |
| -o, --out <dir> | Custom output directory |

replace-audio — Swap the audio track on a video

node voiceai-vo.cjs replace-audio \
  --video ./input.mp4 \
  --audio ./out/my-project/master.wav \
  [--out ./out/my-project/muxed.mp4] \
  [--sync shortest|pad|trim]

Requires ffmpeg. If ffmpeg is not installed, the command generates helper shell/PowerShell scripts instead.

| Sync policy | Behavior |
|---|---|
| shortest (default) | Output ends when the shorter track ends |
| pad | Pad audio with silence to match video duration |
| trim | Trim audio to match video duration |

Video stream is copied without re-encoding (-c:v copy). Audio is encoded as AAC. A mux report is saved alongside the output.
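
One way the three sync policies could map onto ffmpeg options is sketched below in Python. The source confirms only -c:v copy and AAC audio; the stream mapping, the use of the apad filter for pad, and the use of -t for trim are assumptions, and build_mux_args is a hypothetical helper, not part of the CLI.

```python
def build_mux_args(video, audio, out, sync="shortest", video_duration=None):
    """Return an ffmpeg argument list that replaces a video's audio track."""
    args = [
        "ffmpeg", "-y",
        "-i", video,        # input 0: source video
        "-i", audio,        # input 1: new voiceover
        "-map", "0:v:0",    # keep the first video stream
        "-map", "1:a:0",    # take audio from the voiceover
        "-c:v", "copy",     # copy video without re-encoding
        "-c:a", "aac",      # encode audio as AAC
    ]
    if sync == "shortest":
        args += ["-shortest"]
    elif sync == "pad":
        # apad appends silence indefinitely; -shortest then stops at the video's end.
        args += ["-af", "apad", "-shortest"]
    elif sync == "trim":
        if video_duration is None:
            raise ValueError("trim needs the video duration in seconds")
        args += ["-t", f"{video_duration:.3f}"]
    else:
        raise ValueError(f"unknown sync policy: {sync}")
    return args + [out]
```

The resulting list can be passed to subprocess.run. Inspect the mux report the real command writes to see what it actually executed.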

Privacy: Video processing is entirely local. Only script text is sent to Voice.ai for TTS.

voices — List available voices

node voiceai-vo.cjs voices [--limit 20] [--query "deep"] [--mock]

Available voices

Use short aliases or full UUIDs with --voice:

| Alias | Voice | Gender | Style |
|---|---|---|---|
| ellie | Ellie | F | Youthful, vibrant vlogger |
| oliver | Oliver | M | Friendly British |
| lilith | Lilith | F | Soft, feminine |
| smooth | Smooth Calm Voice | M | Deep, smooth narrator |
| corpse | Corpse Husband | M | Deep, distinctive |
| skadi | Skadi | F | Anime character |
| zhongli | Zhongli | M | Deep, authoritative |
| flora | Flora | F | Cheerful, high pitch |
| chief | Master Chief | M | Heroic, commanding |

The voices command also returns any additional voices available on the API. Voice list is cached for 10 minutes.


Build outputs

After a build, the output directory contains:

out/<title-slug>/
  segments/           # Numbered WAV files (001-intro.wav, 002-section.wav, …)
  master.wav          # Stitched audio (requires ffmpeg)
  master.mp3          # MP3 encode (requires ffmpeg)
  manifest.json       # Build metadata: voice, template, segment list, hashes
  timeline.json       # Segment durations and start times
  review.html         # Interactive review page with audio players
  chapters.txt        # YouTube-friendly chapter timestamps
  captions.srt        # SRT captions using segment boundaries
  description.txt     # YouTube description with chapters + Voice.ai credit
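
To show how chapters.txt and captions.srt can both be derived from segment start times and durations, here is a minimal Python sketch. The timeline structure (a list of dicts with title, start, duration, and text) is a hypothetical stand-in; the real timeline.json schema is not documented here.

```python
def chapter_stamp(seconds):
    """Format seconds as M:SS, or H:MM:SS once past an hour."""
    total = int(seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def srt_stamp(seconds):
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, rest = divmod(millis, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_chapters(timeline):
    """One 'timestamp title' line per segment, in YouTube chapter style."""
    return "\n".join(f"{chapter_stamp(s['start'])} {s['title']}" for s in timeline)


def build_srt(timeline):
    """One SRT cue per segment, spanning the segment's start to its end."""
    blocks = []
    for number, seg in enumerate(timeline, start=1):
        end = seg["start"] + seg["duration"]
        blocks.append(
            f"{number}\n{srt_stamp(seg['start'])} --> {srt_stamp(end)}\n{seg['text']}\n"
        )
    return "\n".join(blocks)
```

Because captions follow segment boundaries, each cue carries a whole segment's text. Shorter segments therefore give more readable captions.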

review.html

A standalone HTML page with:

  • Master audio player (if stitched)
  • Individual segment players with titles and durations
  • Collapsible script text for each segment
  • Regeneration command hints

Templates

Templates auto-inject intro/outro segments around the script content:

| Template | Prepends | Appends |
|---|---|---|
| youtube | templates/youtube_intro.txt | templates/youtube_outro.txt |
| podcast | templates/podcast_intro.txt | |
| shortform | templates/shortform_hook.txt | |

Edit the files in templates/ to customize the intro/outro text.


Caching

Segments are cached by a hash of: text content + voice ID + language.

  • Unchanged segments are skipped on rebuild — fast iteration
  • Modified segments are re-rendered automatically
  • Use --force to re-render everything
  • Cache manifest is stored in segments/.cache.json
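
The caching rule can be sketched in Python as below. The source states only that the hash covers text content, voice ID, and language; the choice of SHA-256, the JSON encoding of the inputs, and a cache keyed by segment index are assumptions made for illustration.

```python
import hashlib
import json


def segment_cache_key(text, voice_id, language):
    """Hash the three inputs the cache depends on: text, voice ID, language."""
    # JSON-encoding the list keeps the fields unambiguous (no accidental overlap).
    payload = json.dumps([text, voice_id, language], ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def segments_to_render(segments, cache, voice_id, language, force=False):
    """Return indexes of segments whose key is missing or changed in the cache."""
    stale = []
    for index, text in enumerate(segments):
        key = segment_cache_key(text, voice_id, language)
        if force or cache.get(str(index)) != key:
            stale.append(index)
    return stale
```

Under this scheme, editing one section changes only that segment's key, while switching voice or language invalidates every segment.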

Multilingual support

Voice.ai supports 11 languages. Use --language <code> to switch:

en, es, fr, de, it, pt, pl, ru, nl, sv, ca

The pipeline auto-selects the multilingual TTS model for non-English languages.


Troubleshooting

| Issue | Solution |
|---|---|
| ffmpeg missing | Pipeline still works: you get segments, review page, chapters, captions. Install ffmpeg for master stitching and video muxing. |
| Rate limits (429) | Segments render sequentially, which stays under most limits. Wait and retry. |
| Insufficient credits (402) | Top up at voice.ai/dashboard. Cached segments won't re-use credits on retry. |
| Long scripts | Caching makes rebuilds fast. Text over 490 chars per segment is automatically split across API calls. |
| Windows paths | Wrap paths with spaces in quotes: --input "C:\My Scripts\script.md" |
See references/TROUBLESHOOTING.md for more.
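
The automatic splitting of long segment text mentioned under "Long scripts" could look roughly like the Python sketch below. Only the 490-character limit comes from the text above; splitting at sentence ends, hard-wrapping oversized sentences at a space, and the name split_for_tts are assumptions.

```python
import re


def split_for_tts(text, limit=490):
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-wrap a single sentence that is longer than the limit on its own.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            cut = sentence.rfind(" ", 0, limit + 1)
            cut = cut if cut > 0 else limit
            chunks.append(sentence[:cut].strip())
            sentence = sentence[cut:].strip()
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Packing whole sentences into each chunk keeps the number of API calls low while avoiding cuts in the middle of a sentence where possible.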


