toncall-videourl-text

Pass

Audited by VirusTotal on May 11, 2026.

Overview

Type: OpenClaw Skill
Name: toncall-videourl-text
Version: 1.0.0

The skill provides video-to-text transcription: it downloads a video, extracts the audio with ffmpeg, and uses the Volcengine (ByteDance) TOS and ASR APIs. The Python script `video_url_to_text.py` implements standard AWS V4 signature logic for cloud storage interactions and includes cleanup routines for both local and remote temporary files. No evidence of malicious intent, credential exfiltration, or prompt injection was found; the code follows the functionality described in `SKILL.md`.

Findings (0)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Note

ASI03: Identity and Privilege Abuse

What this means

If broad cloud credentials are placed in config.ini, the skill can use those credentials for its storage and ASR workflow.

Why it was flagged

The skill requires Volcengine TOS and ASR credentials, giving it delegated access to upload/delete storage objects and submit recognition jobs.

Skill content

[tos]
ak = 
sk = 
...
[asr]
app_key = 
access_key =

Recommendation

Use a dedicated, least-privilege Volcengine key and bucket for this skill, keep config.ini private, and rotate the keys if they are exposed.
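
One way to act on the "keep config.ini private" part of this recommendation is to refuse to load the file when other local users can access it. The sketch below is not taken from the skill: `load_private_config` is a hypothetical helper, and the permission check only applies on POSIX systems. It assumes the `[tos]` and `[asr]` sections shown above.

```python
import configparser
import os
import stat
from pathlib import Path


def load_private_config(path):
    """Load an INI file, refusing it on POSIX if group/others have any access.

    Hypothetical helper for illustration; not part of the audited skill.
    """
    p = Path(path)
    mode = p.stat().st_mode
    # Any group or other permission bit means the credentials are not private.
    if os.name == "posix" and mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{p} is accessible to other users; run chmod 600")
    parser = configparser.ConfigParser()
    parser.read(p, encoding="utf-8")
    return parser
```

A caller would then read values such as `cfg["tos"]["ak"]` only after the check has passed. This does not replace least-privilege keys; it only reduces accidental local exposure.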

Note

ASI02: Tool Misuse and Exploitation

What this means

Processing large, private, or untrusted video URLs may consume local resources and send derived audio to cloud services.

Why it was flagged

The skill processes a user-provided URL through a local script. This is central to the purpose, but it means the agent may download and process arbitrary direct video URLs supplied by the user.

Skill content

The user sends a directly accessible video file URL ... run the script to process it automatically ... py scripts/video_url_to_text.py <video URL>

Recommendation

Only provide trusted direct video URLs, and consider adding size/type limits or an explicit confirmation step for sensitive or very large files.
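
The size/type limit suggested here could be sketched as a check on the response headers before any download starts. Everything below is an assumption for illustration: the allow-list, the 500 MiB cap, and the `check_video_headers` helper are hypothetical and do not come from the skill.

```python
ALLOWED_TYPES = {"video/mp4", "video/webm", "video/quicktime"}  # hypothetical allow-list
MAX_BYTES = 500 * 1024 * 1024  # hypothetical 500 MiB cap


def check_video_headers(headers, max_bytes=MAX_BYTES, allowed=ALLOWED_TYPES):
    """Return (ok, reason) for a mapping of HTTP response headers."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {k.lower(): v for k, v in headers.items()}
    # Drop parameters such as "; charset=..." before comparing the media type.
    ctype = lowered.get("content-type", "").split(";")[0].strip().lower()
    if ctype not in allowed:
        return False, f"unexpected content type: {ctype or 'missing'}"
    length = lowered.get("content-length")
    if length is None or not length.isdigit():
        return False, "missing or invalid Content-Length"
    if int(length) > max_bytes:
        return False, f"file too large: {length} bytes"
    return True, "ok"
```

The headers could come from a HEAD request, for example one built with `urllib.request.Request(url, method="HEAD")`. Servers may omit or misreport these headers, so a byte counter during the actual download would still be needed to enforce the cap.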

Note

ASI05: Unexpected Code Execution

What this means

The skill depends on the ffmpeg installed on the user’s machine and will run it as part of the transcription workflow.

Why it was flagged

The script invokes the local ffmpeg binary. Local command execution is expected for audio extraction, but it is still a local executable dependency.

Skill content

subprocess.run(["ffmpeg", "-version"], capture_output=True)

Recommendation

Install ffmpeg from a trusted source and keep it updated.
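
To see which ffmpeg binary would actually run, it can help to resolve it on `PATH` and print its version banner before trusting it. This is a minimal sketch, not the skill's own code: `find_ffmpeg` is a hypothetical helper built on the same `ffmpeg -version` call quoted above.

```python
import shutil
import subprocess


def find_ffmpeg(name="ffmpeg"):
    """Return (path, first_version_line) for the binary on PATH, or None."""
    path = shutil.which(name)
    if path is None:
        return None  # nothing by that name is on PATH
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, timeout=10
    )
    if result.returncode != 0:
        return None  # the binary exists but did not run cleanly
    lines = result.stdout.splitlines()
    return path, (lines[0] if lines else "")
```

Checking the returned path against the location where the package manager installed ffmpeg is a quick way to spot an unexpected binary earlier on `PATH`.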

Note

ASI07: Insecure Inter-Agent Communication

What this means

Audio extracted from the video is uploaded or made accessible to Volcengine services for transcription.

Why it was flagged

The extracted audio URL is submitted to an external Volcengine/ByteDance ASR endpoint, matching the stated purpose but moving media-derived data outside the local environment.

Skill content

ASR_SUBMIT_URL = "https://openspeech.bytedance.com/api/v3/auc/bigmodel/submit" ... "audio": { "url": audio_url }

Recommendation

Do not use the skill for videos containing sensitive content unless the Volcengine/TOS data handling model is acceptable to you.

Note

ASI06: Memory and Context Poisoning

What this means

Recognized text remains on disk after processing unless the user removes it.

Why it was flagged

The final transcript is persisted locally in a texts/ directory. This is disclosed and purpose-aligned, but transcripts may contain sensitive information.

Skill content

Save the recognized text to the `texts/` directory and return it to the user

Recommendation

Review and delete saved transcript files when they contain private or sensitive content.
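
If manual review is impractical, old transcripts could be removed on a schedule. The sketch below assumes the transcripts are regular files directly inside `texts/`; the `purge_old_transcripts` helper and the seven-day retention period are hypothetical choices, not behavior of the skill.

```python
import time
from pathlib import Path


def purge_old_transcripts(directory="texts", max_age_days=7, now=None):
    """Delete regular files older than max_age_days; return the removed names."""
    root = Path(directory)
    if not root.is_dir():
        return []
    # Files last modified before this timestamp are considered expired.
    cutoff = (time.time() if now is None else now) - max_age_days * 86400
    removed = []
    for entry in sorted(root.iterdir()):
        # Skip subdirectories; only top-level files are touched.
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            entry.unlink()
            removed.append(entry.name)
    return removed
```

Deletion here is permanent, so transcripts worth keeping should be moved elsewhere before the helper runs.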