Text To Speech
Convert text to natural speech with DIA TTS, Kokoro, Chatterbox, and more via inference.sh CLI. Models: DIA TTS (conversational), Kokoro TTS, Chatterbox, Hig...
MIT-0 · Free to use, modify, and redistribute. No attribution required.
by Ömer Karışman (@okaris)
Security Scan
OpenClaw
Suspicious
medium confidence
Purpose & Capability
Name, description, and examples consistently target text-to-speech via the inference.sh CLI and its infsh/* apps. Nothing in the SKILL.md attempts to perform unrelated tasks.
Instruction Scope
The runtime instructions tell the agent to run an installer by piping https://cli.inference.sh into sh (curl | sh) and to run infsh login. That executes remote code and initiates an authentication flow; the doc does not show local verification steps in the provided command. While using a remote CLI is coherent for this skill, executing an un-audited install script is risky and the SKILL.md's example command does not itself perform checksum verification despite a claim that the install verifies checksums.
Install Mechanism
There is no registry install spec; instead, the instructions direct users to download and run a script from cli.inference.sh, which fetches binaries from dist.inference.sh. Piping a remote script to sh is a high-risk install method. The SKILL.md claims that checksum verification is performed, but the example curl | sh command does not include an explicit checksum verification step; this is a provenance/verification gap.
Credentials
The skill declares no required environment variables or credentials, but the instructions call out 'infsh login' (an authentication step). The skill omits declaring what credentials or tokens the CLI will require/store. That mismatch means the skill will prompt for or persist credentials outside of the registry metadata — users should expect to authenticate to the external service.
Persistence & Privilege
The skill itself does not request always-on presence or special agent privileges. However, the recommended installer will create a CLI binary and the login flow will likely store tokens locally — normal for a CLI but worth noting because the registry metadata does not describe those side-effects.
What to consider before installing
This skill appears to be a simple wrapper around the inference.sh CLI (TTS) and is not obviously malicious, but there are a few practical risks to consider before installing or invoking it:
- Avoid blindly running curl https://cli.inference.sh | sh. Instead, fetch the installer script first, inspect it, and run it only after review. The example command executes remote code without local verification.
- The SKILL.md claims checksum verification, but the provided example does not perform an explicit checksum check. If you install, verify the downloaded binary's SHA-256 against the project-provided checksums (https://dist.inference.sh/cli/checksums.txt) before executing.
- Expect to authenticate to inference.sh (infsh login). The skill metadata does not declare required credentials or tokens, so be prepared that the CLI will request/store credentials locally and communicate with external servers.
- If you have sensitive data or a high-security environment, consider running the installer and CLI inside a sandboxed VM or container and limit network access until you've vetted the service and its privacy policy.
- Verify the inference.sh domain and project reputation (source code, repo, or official docs) before trusting it with your content and credentials.
If you want a lower-risk option, look for a skill that uses a pre-approved package manager install or one that declares required credentials up front and provides reproducible verification steps.
Like a lobster shell, security has layers: review code before you run it.
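The first two recommendations in the list above (inspect the installer before running it, and check the binary's SHA-256) can be sketched in shell. This is a minimal sketch under stated assumptions, not the project's official procedure: the `verify_sha256` helper and the local file names `install.sh` and `./infsh` are hypothetical, and the layout of `checksums.txt` is not described in this document, so the expected hash is copied by hand.

```shell
#!/bin/sh
# Hypothetical helper: compare a file's SHA-256 digest with an expected hex string.
# Uses sha256sum where available and falls back to shasum (common on macOS).
verify_sha256() {
  file=$1
  expected=$2
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | awk '{print $1}')
  else
    actual=$(shasum -a 256 "$file" | awk '{print $1}')
  fi
  # Succeeds (exit status 0) only when the digests match exactly.
  [ "$actual" = "$expected" ]
}

# Suggested workflow (run by hand; nothing below executes automatically):
#   curl -fsSL https://cli.inference.sh -o install.sh   # download, do not pipe to sh
#   less install.sh                                     # read the script first
#   curl -fsSL https://dist.inference.sh/cli/checksums.txt
#   verify_sha256 ./infsh "<hash copied from checksums.txt>" && echo "checksum OK"
```

The helper returns a non-zero status on mismatch, so it can gate any later step with `&&`.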
Current version: v0.1.5
SKILL.md
Text-to-Speech
Convert text to natural speech via inference.sh CLI.

Quick Start
# Install CLI
curl -fsSL https://cli.inference.sh | sh && infsh login
# Generate speech
infsh app run infsh/kokoro-tts --input '{"text": "Hello, welcome to our product demo."}'
Install note: The install script only detects your OS/architecture, downloads the matching binary from dist.inference.sh, and verifies its SHA-256 checksum. No elevated permissions or background processes. Manual install & verification available.
Available Models
| Model | App ID | Best For |
|---|---|---|
| DIA TTS | infsh/dia-tts | Conversational, expressive |
| Kokoro TTS | infsh/kokoro-tts | Fast, natural |
| Chatterbox | infsh/chatterbox | General purpose |
| Higgs Audio | infsh/higgs-audio | Emotional control |
| VibeVoice | infsh/vibevoice | Podcasts, long-form |
Browse All Audio Apps
infsh app list --category audio
Examples
Basic Text-to-Speech
infsh app run infsh/kokoro-tts --input '{"text": "Welcome to our tutorial."}'
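Inline JSON wrapped in single quotes breaks as soon as the text contains an apostrophe, and unescaped double quotes produce invalid JSON. One way around this is to write the input to a file first. The sketch below uses a hypothetical helper, `json_text_input`, and assumes that `--input` accepts a file path for this app as it does in the file-based examples in this document.

```shell
#!/bin/sh
# Hypothetical helper: wrap plain text in a {"text": ...} JSON object.
# Escapes backslashes and double quotes only; newlines and other control
# characters are NOT handled, so keep the text on one line.
json_text_input() {
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"text": "%s"}\n' "$escaped"
}

# Usage (run by hand):
#   json_text_input "I'm glad you're here." > input.json
#   infsh app run infsh/kokoro-tts --input input.json
```

For multi-line scripts or arbitrary Unicode, a real JSON encoder is the safer choice; this helper only covers the common single-line case.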
Conversational TTS with DIA
infsh app sample infsh/dia-tts --save input.json
# Edit input.json:
# {
# "text": "Hey! How are you doing today? I'm really excited to share this with you.",
# "voice": "conversational"
# }
infsh app run infsh/dia-tts --input input.json
Long-form Audio (Podcasts)
infsh app sample infsh/vibevoice --save input.json
# Edit input.json with your podcast script
infsh app run infsh/vibevoice --input input.json
Expressive Speech with Higgs
infsh app sample infsh/higgs-audio --save input.json
# {
# "text": "This is absolutely incredible!",
# "emotion": "excited"
# }
infsh app run infsh/higgs-audio --input input.json
Use Cases
- Voiceovers: Product demos, explainer videos
- Audiobooks: Convert text to spoken word
- Podcasts: Generate podcast episodes
- Accessibility: Make content accessible
- IVR: Phone system voice prompts
- Video Narration: Add narration to videos
Combine with Video
Generate speech, then create a talking head video:
# 1. Generate speech
infsh app run infsh/kokoro-tts --input '{"text": "Your script here"}' > speech.json
# 2. Use the audio URL with OmniHuman for avatar video
infsh app run bytedance/omnihuman-1-5 --input '{
"image_url": "https://portrait.jpg",
"audio_url": "<audio-url-from-step-1>"
}'
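Step 2 needs the audio URL from `speech.json`, but the structure of that output is not documented here. As a rough sketch, the hypothetical helper `first_url` below pulls the first http(s) URL out of the file with a plain text search. It assumes the audio URL is the first URL in the output, which should be confirmed by opening `speech.json`.

```shell
#!/bin/sh
# Hypothetical helper: print the first http(s) URL found in a file.
# This is a plain text search, not a JSON parser; it assumes the URL
# contains no double quotes or whitespace.
first_url() {
  grep -Eo 'https?://[^"[:space:]]+' "$1" | head -n 1
}

# Usage (run by hand):
#   audio_url=$(first_url speech.json)
#   echo "$audio_url"   # confirm it is the audio file before passing it on
```

Once the actual field name is known, reading that field with a JSON-aware tool is more robust than a text search.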
Related Skills
# Full platform skill (all 150+ apps)
npx skills add inference-sh/skills@inference-sh
# AI avatars (combine TTS with talking heads)
npx skills add inference-sh/skills@ai-avatar-video
# AI music generation
npx skills add inference-sh/skills@ai-music-generation
# Speech-to-text (transcription)
npx skills add inference-sh/skills@speech-to-text
# Video generation
npx skills add inference-sh/skills@ai-video-generation
Browse all apps: infsh app list
Documentation
- Running Apps - How to run apps via CLI
- Audio Transcription Example - Audio processing workflows
- Apps Overview - Understanding the app ecosystem
