Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

audio-transcribe-summarize

Transcribe audio/video files to text and generate structured summaries using SenseAudio ASR API. Use when the user asks to transcribe, summarize, or take not...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 · 67 · 0 current installs · 0 all-time installs
Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (high confidence)
Purpose & Capability
The skill's name and description (transcribe and summarize using SenseAudio) align with the included code and API reference. However, the registry metadata declares no required environment variables, while SKILL.md and scripts/transcribe.py clearly require SENSEAUDIO_API_KEY. This is an inconsistency between declared requirements and actual needs.
Instruction Scope
SKILL.md instructs the agent to run the included Python script, which uploads audio to api.senseaudio.cn and then writes local transcript files (.txt/.json). The instructions and script operate within the stated purpose and do not attempt to read unrelated system files or any environment variable other than SENSEAUDIO_API_KEY. The script also calls ffmpeg/ffprobe via subprocess, which is expected for splitting large audio files.
Install Mechanism
There is no install spec (instruction-only with an included script). No packages are downloaded at install time. The risk surface is limited to running the provided Python script and any subprocesses it spawns (ffmpeg).
Credentials
The script requires SENSEAUDIO_API_KEY (used in Authorization header) but the registry metadata did not declare this environment variable. Requesting an API key for the remote ASR service is proportional to the functionality, but the metadata omission is misleading and could cause users to miss a sensitive requirement. Other environment access is minimal (PATH lookups for ffmpeg).
Persistence & Privilege
The skill is not always-enabled and is user-invocable. It does not request elevated or persistent platform privileges and does not modify other skills or system-wide configuration. Autonomous invocation is allowed by default but is not combined with other high-risk patterns here.
What to consider before installing
This skill appears to do what it claims (send audio to SenseAudio and produce transcripts/summaries), but note two things before installing or using it:
(1) It requires a SENSEAUDIO_API_KEY (both SKILL.md and the script require it), even though the registry metadata omits it. Make sure you supply a key and understand where it will be stored.
(2) All audio is uploaded to https://api.senseaudio.cn, so the audio content, transcripts, and possibly speaker/emotion metadata are handled by a third party. Consider privacy, confidentiality, and cost.
If you proceed, verify the API host, use a dedicated API key with appropriate permissions and quota, run the script in an isolated environment if the audio is sensitive, and confirm that the registry metadata has been corrected or ask the publisher why the API key was not declared.

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.1
latest: vk9799pw17eafysew9eq865934h83ah0s

SKILL.md

Audio/Video Transcription & Summarization

Transcribe audio/video files using the SenseASR API (api.senseaudio.cn), then summarize the content into structured notes.

{baseDir} refers to this skill's directory.

Prerequisites

  • Environment variable SENSEAUDIO_API_KEY configured (get your key at https://senseaudio.cn/platform/api-key)
  • Python 3.8+ with requests installed
  • For large files (>10MB): ffmpeg installed for splitting (macOS: brew install ffmpeg; Windows: download from ffmpeg.org and add it to PATH; Linux: apt install ffmpeg)
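
The prerequisites above can be verified before running the skill. The following is a minimal sketch, not part of the skill itself: `check_prerequisites` is a hypothetical helper name, and treating a missing ffmpeg as a reportable problem (rather than a hard failure) is an assumption, since ffmpeg is only needed for files over 10MB.

```python
import importlib.util
import os
import shutil
import sys


def check_prerequisites(env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready.

    `env` and `which` are injectable so the check can be exercised without
    touching the real environment (hypothetical helper, not part of the skill).
    """
    env = os.environ if env is None else env
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ is required")
    if importlib.util.find_spec("requests") is None:
        problems.append("the 'requests' package is not installed")
    if not env.get("SENSEAUDIO_API_KEY"):
        problems.append("SENSEAUDIO_API_KEY is not set")
    # ffmpeg is only needed when a file exceeds the 10MB per-request limit.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH (needed only for files > 10MB)")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

Running the file directly prints one warning per missing prerequisite and nothing when everything is in place.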

Quick Start

  1. Run the transcription script:
     python {baseDir}/scripts/transcribe.py <audio_file> [--model sense-asr-pro] [--language zh] [--speakers] [--sentiment] [--translate en]
  2. The script outputs a transcript .txt file alongside the source file
  3. Read the transcript and generate a summary (see Summary Format below)

Workflow

Step 1: Assess the Audio File

Check file size and format:

  • Supported formats: wav, mp3, ogg, flac, aac, m4a, mp4
  • Max file size per request: 10MB
  • If file > 10MB, the script auto-splits using ffmpeg
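
The checks in this step can be sketched in a few lines of Python. This is an illustration only: `assess_audio` is a hypothetical helper, and interpreting "10MB" as 10 × 1024 × 1024 bytes is an assumption, since the source does not say whether the limit is decimal or binary megabytes. The actual script may check differently.

```python
from pathlib import Path

# Formats listed in this step.
SUPPORTED_EXTENSIONS = {"wav", "mp3", "ogg", "flac", "aac", "m4a", "mp4"}
# Assumption: "10MB" is read as 10 MiB; the real limit may be 10,000,000 bytes.
MAX_REQUEST_BYTES = 10 * 1024 * 1024


def assess_audio(path, size_bytes=None):
    """Report whether a file has a supported extension and needs splitting.

    If `size_bytes` is omitted, the size is read from disk.
    """
    p = Path(path)
    ext = p.suffix.lower().lstrip(".")
    size = p.stat().st_size if size_bytes is None else size_bytes
    return {
        "supported": ext in SUPPORTED_EXTENSIONS,
        "size_bytes": size,
        "needs_split": size > MAX_REQUEST_BYTES,
    }
```

For example, `assess_audio("meeting.wav")` reads the file size from disk and returns a dictionary whose `needs_split` entry indicates whether the ffmpeg splitting path would be needed.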

Step 2: Choose the Right Model

| Model | Use When |
| --- | --- |
| sense-asr-lite | Quick batch transcription, simple audio, cost-sensitive |
| sense-asr | General transcription, need speaker separation or timestamps |
| sense-asr-pro | High accuracy needed: meetings, interviews, complex audio |
| sense-asr-deepthink | Noisy audio, dialects, heavy jargon, speech-to-clean-text |

Default to sense-asr-pro for best quality.
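
One way to encode the table above as a decision rule is sketched below. The function name, its parameters, and the order in which the conditions are checked are all assumptions made for illustration; the source only gives the model names, their use cases, and the sense-asr-pro default.

```python
def choose_model(noisy_or_dialect=False, cost_sensitive=False, general_purpose=False):
    """Pick a SenseASR model name from coarse requirements.

    The priority order (noise first, then cost, then general use) is an
    assumption; the table does not rank the criteria.
    """
    if noisy_or_dialect:
        return "sense-asr-deepthink"
    if cost_sensitive:
        return "sense-asr-lite"
    if general_purpose:
        return "sense-asr"
    # Documented default for best quality.
    return "sense-asr-pro"
```

Calling `choose_model()` with no arguments returns the documented default, and the result can be passed to the script's `--model` option.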

Step 3: Transcribe

Run the transcription script. Key options:

# Basic transcription
python {baseDir}/scripts/transcribe.py recording.mp3

# Meeting with multiple speakers + emotion
python {baseDir}/scripts/transcribe.py meeting.wav \
  --model sense-asr-pro \
  --speakers --max-speakers 4 \
  --sentiment \
  --timestamps segment

# Transcribe and translate to English
python {baseDir}/scripts/transcribe.py lecture.mp3 \
  --model sense-asr \
  --translate en
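
When the script is driven from Python rather than a shell, the same invocations can be assembled as an argument list. The sketch below uses only the flags shown in this document (`--model`, `--language`, `--speakers`, `--max-speakers`, `--sentiment`, `--timestamps`, `--translate`); `build_transcribe_command` is a hypothetical helper, and the caller is assumed to have already resolved `{baseDir}` into a real script path.

```python
import sys


def build_transcribe_command(script_path, audio_file, model="sense-asr-pro",
                             language=None, speakers=False, max_speakers=None,
                             sentiment=False, timestamps=None, translate=None):
    """Build an argv list for scripts/transcribe.py (hypothetical helper)."""
    cmd = [sys.executable, str(script_path), str(audio_file), "--model", model]
    if language:
        cmd += ["--language", language]
    if speakers:
        cmd.append("--speakers")
    if max_speakers is not None:
        cmd += ["--max-speakers", str(max_speakers)]
    if sentiment:
        cmd.append("--sentiment")
    if timestamps:
        cmd += ["--timestamps", timestamps]
    if translate:
        cmd += ["--translate", translate]
    return cmd


# Usage (not executed here): pass the list to subprocess.run without a shell,
# so file names containing spaces need no quoting.
#   import subprocess
#   subprocess.run(build_transcribe_command(script, "meeting.wav",
#                                           speakers=True, max_speakers=4),
#                  check=True)
```

Passing a list instead of a single command string avoids shell quoting problems with unusual file names.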

Step 4: Summarize

After transcription, read the transcript file and produce a summary using the format below.

Summary Format

Generate summaries in this structure:

# [Title - inferred from content]

**Source**: filename.mp3
**Duration**: X min Y sec
**Date**: YYYY-MM-DD
**Speakers**: [if speaker diarization was used]

## Key Points
- Point 1
- Point 2
- ...

## Detailed Summary
[2-4 paragraph summary of the content organized by topic/chronology]

## Action Items
- [ ] Action item 1 (assigned to Speaker X, if applicable)
- [ ] Action item 2

## Notable Quotes
> "Direct quote from transcript" — Speaker X, [timestamp if available]

## Full Transcript
<details>
<summary>Click to expand full transcript</summary>

[Full transcript text here, with speaker labels and timestamps if available]

</details>

Adapt the template based on content type:

  • Meeting: emphasize action items, decisions, speaker contributions
  • Lecture/Talk: emphasize key concepts, learning points, structure
  • Interview: emphasize Q&A pairs, key responses
  • Podcast: emphasize topics discussed, interesting insights
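
The metadata header of the summary template can be produced mechanically, leaving the content sections to be written from the transcript. This is a minimal sketch under the assumption that the duration is available in seconds; `format_duration` and `summary_header` are hypothetical names, and nothing here is part of the skill's script.

```python
def format_duration(seconds):
    """Format a duration in seconds as 'X min Y sec', as in the template."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes} min {secs} sec"


def summary_header(title, source, duration_seconds, date, speakers=None):
    """Render the title and metadata lines of the summary template."""
    lines = [
        f"# {title}",
        "",
        f"**Source**: {source}",
        f"**Duration**: {format_duration(duration_seconds)}",
        f"**Date**: {date}",
    ]
    # The Speakers line only applies when diarization was used.
    if speakers:
        lines.append(f"**Speakers**: {', '.join(speakers)}")
    return "\n".join(lines)
```

The Speakers line is omitted when no speaker list is given, matching the template's note that it applies only when speaker diarization was used.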

API Reference

For full SenseASR API parameters and response formats, see api-reference.md.

Files

3 total