Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

Speech To Text

Transcribe audio to text with Whisper models via inference.sh CLI. Models: Fast Whisper Large V3, Whisper V3 Large. Capabilities: transcription, translation,...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 · 1.9k · 24 current installs · 24 all-time installs
by Ömer Karışman (@okaris)
Security Scan
VirusTotal
Suspicious
OpenClaw
Suspicious
medium confidence
Purpose & Capability
Name/description align with the instructions: the SKILL.md consistently instructs use of the inference.sh CLI and specific Whisper apps for transcription, translation, timestamps, etc. Nothing requested in the doc is unrelated to transcription.
Instruction Scope
The instructions tell the user/agent to run a network installer via `curl -fsSL https://cli.inference.sh | sh` and to run `infsh login` and `infsh app run ... --input '{"audio_url": "https://..."}'`. That means audio will be uploaded/sent to inference.sh apps. The SKILL.md does not declare what credentials are required, how login works, or the privacy/retention policies for uploaded audio. It also encourages installing and running code fetched from the network, which expands the runtime scope beyond local-only transcription.
Install Mechanism
Although the registry shows no formal install spec, the instructions explicitly recommend piping a remote script to sh (download-and-run) and say it downloads binaries from dist.inference.sh. Downloading and running a remote installer is higher-risk than an instruction-only skill; while the URLs are consistent (inference.sh / dist.inference.sh) and a checksum file is referenced, the installer pattern (curl | sh) and archive extraction are not enforced by the registry metadata and may write binaries to disk.
Credentials
The skill declares no required environment variables or primary credential, which matches the registry. However, the SKILL.md tells the user to run `infsh login` (implying credentials or an account are needed) and to send audio URLs to remote apps — this implicitly requires account credentials or interactive login and causes data to leave the host. The lack of declared credential requirements and absence of privacy/retention guidance is a gap.
Persistence & Privilege
Registry flags do not request persistent/always-on privileges and the skill is instruction-only. The only persistence risk comes from the installer it recommends (a local binary), but the skill itself does not request always-on or modify other skills' configs.
What to consider before installing
This skill appears to do what it says (use inference.sh to transcribe audio) but exercise caution before following the installer steps. Avoid piping unknown remote scripts directly into sh; if you want to use this skill, manually inspect the installer at https://cli.inference.sh and verify checksums from https://dist.inference.sh/checksums.txt before running. Understand that `infsh app run ... --input {"audio_url": ...}` will send audio to inference.sh's servers—check their privacy, retention, and security policies before uploading sensitive audio. Expect to create or provide an inference.sh account (login/API key) even though no env vars are declared. If you need offline/local-only transcription or stronger privacy guarantees, consider an alternative that runs models locally without uploading data.
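A minimal sketch of the checksum check suggested above, assuming either `sha256sum` or `shasum` is installed. The helper names `sha256_of` and `verify_sha256` and the file name `./infsh-binary` are hypothetical. The layout of checksums.txt is not described here, so the expected digest is copied from it by hand.

```shell
#!/bin/sh
# Sketch: compare a file's SHA-256 digest against an expected value.
# Assumes `sha256sum` (GNU coreutils) or `shasum` (macOS/Perl) is available.

sha256_of() {
    # Print only the hex digest of the file given as $1.
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{print $1}'
    else
        shasum -a 256 "$1" | awk '{print $1}'
    fi
}

verify_sha256() {
    # $1 = file path, $2 = expected hex digest
    actual=$(sha256_of "$1")
    if [ -n "$actual" ] && [ "$actual" = "$2" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1" >&2
        return 1
    fi
}

# Example usage (not executed here; ./infsh-binary is a hypothetical file name):
#   curl -fsSL https://cli.inference.sh -o install.sh   # download instead of piping to sh
#   less install.sh                                      # read the script first
#   curl -fsSL https://dist.inference.sh/checksums.txt -o checksums.txt
#   verify_sha256 ./infsh-binary "<digest copied from checksums.txt>"
```

Saving the installer to a file first lets you read it before anything executes. The non-zero return on mismatch makes the check usable in a script with `&&`.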

Like a lobster shell, security has layers — review code before you run it.

Current version: v0.1.5
latest: vk979c27dp3d69xd9kh7c1sae1981ck6e

SKILL.md

Speech-to-Text

Transcribe audio to text via inference.sh CLI.

Quick Start

curl -fsSL https://cli.inference.sh | sh && infsh login

infsh app run infsh/fast-whisper-large-v3 --input '{"audio_url": "https://audio.mp3"}'

Install note: The install script only detects your OS/architecture, downloads the matching binary from dist.inference.sh, and verifies its SHA-256 checksum. No elevated permissions or background processes. Manual install & verification available.

Available Models

| Model | App ID | Best For |
|-------|--------|----------|
| Fast Whisper V3 | infsh/fast-whisper-large-v3 | Fast transcription |
| Whisper V3 Large | infsh/whisper-v3-large | Highest accuracy |

Examples

Basic Transcription

infsh app run infsh/fast-whisper-large-v3 --input '{"audio_url": "https://meeting.mp3"}'

With Timestamps

infsh app sample infsh/fast-whisper-large-v3 --save input.json

# {
#   "audio_url": "https://podcast.mp3",
#   "timestamps": true
# }

infsh app run infsh/fast-whisper-large-v3 --input input.json
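If you would rather write the input file by hand than edit a generated sample, a here-document is one option. This is a sketch only: the URL is a placeholder and the two fields are the ones shown in the commented JSON above.

```shell
#!/bin/sh
# Sketch: write the request body to input.json with a quoted here-document,
# so the shell does not expand anything inside the JSON.
cat > input.json <<'EOF'
{
  "audio_url": "https://podcast.mp3",
  "timestamps": true
}
EOF

# Then pass the file to the CLI, as in the command above (not executed here):
#   infsh app run infsh/fast-whisper-large-v3 --input input.json
```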

Translation (to English)

infsh app run infsh/whisper-v3-large --input '{
  "audio_url": "https://french-audio.mp3",
  "task": "translate"
}'

From Video

# Extract audio from video first
infsh app run infsh/video-audio-extractor --input '{"video_url": "https://video.mp4"}' > audio.json

# Transcribe the extracted audio (replace <audio-url> with the URL reported in audio.json)
infsh app run infsh/fast-whisper-large-v3 --input '{"audio_url": "<audio-url>"}'
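Filling in the `<audio-url>` placeholder by hand gets awkward in scripts. A small helper can build the `--input` value from a shell variable. This is a sketch: `audio_input_json` and `AUDIO_URL` are hypothetical names, and it does no JSON escaping, so it assumes the URL contains no double quotes or backslashes.

```shell
#!/bin/sh
# Sketch: build the JSON passed to --input from a URL held in a variable.
# Assumption: the URL has no double quotes or backslashes (no escaping is done).
audio_input_json() {
    # The URL is passed as a printf argument, so any % in it is printed literally.
    printf '{"audio_url": "%s"}\n' "$1"
}

# Example usage (not executed here); AUDIO_URL is a hypothetical variable holding
# the URL you copied out of audio.json:
#   AUDIO_URL="https://example.com/extracted.mp3"
#   infsh app run infsh/fast-whisper-large-v3 --input "$(audio_input_json "$AUDIO_URL")"
```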

Workflow: Video Subtitles

# 1. Transcribe video audio
infsh app run infsh/fast-whisper-large-v3 --input '{
  "audio_url": "https://video.mp4",
  "timestamps": true
}' > transcript.json

# 2. Use transcript for captions
infsh app run infsh/caption-videos --input '{
  "video_url": "https://video.mp4",
  "captions": "<transcript-from-step-1>"
}'

Supported Languages

Whisper supports 99+ languages including: English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, Hindi, Russian, and many more.

Use Cases

  • Meetings: Transcribe recordings
  • Podcasts: Generate transcripts
  • Subtitles: Create captions for videos
  • Voice Notes: Convert to searchable text
  • Interviews: Transcription for research
  • Accessibility: Make audio content accessible

Output Format

Returns JSON with:

  • text: Full transcription
  • segments: Timestamped segments (if requested)
  • language: Detected language

Related Skills

# Full platform skill (all 150+ apps)
npx skills add inference-sh/skills@inference-sh

# Text-to-speech (reverse direction)
npx skills add inference-sh/skills@text-to-speech

# Video generation (add captions)
npx skills add inference-sh/skills@ai-video-generation

# AI avatars (lipsync with transcripts)
npx skills add inference-sh/skills@ai-avatar-video

Browse all audio apps: infsh app list --category audio
