toncall-videourl-text
Pass
Audited by VirusTotal on May 11, 2026.
Overview
Type: OpenClaw Skill
Name: toncall-videourl-text
Version: 1.0.0
The skill provides video-to-text transcription by downloading videos, extracting audio via ffmpeg, and utilizing Volcengine (ByteDance) TOS and ASR APIs. The Python script `video_url_to_text.py` implements standard AWS V4 signature logic for cloud storage interactions and includes robust cleanup routines for both local and remote temporary files. No evidence of malicious intent, credential exfiltration, or prompt injection was found; the code strictly follows the functionality described in `SKILL.md`.
Findings (0)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
If broad cloud credentials are placed in config.ini, the skill can use those credentials for its storage and ASR workflow.
The skill requires Volcengine TOS and ASR credentials, giving it delegated access to upload/delete storage objects and submit recognition jobs.
[tos] ak = sk = ... [asr] app_key = access_key =
Use a dedicated, least-privilege Volcengine key and bucket for this skill, keep config.ini private, and rotate the keys if they are exposed.
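One way to act on the "keep config.ini private" advice is to refuse to load a credentials file that other local users can read. The following is a minimal sketch, not code from the skill: the function name `load_private_config` is hypothetical, the permission check only applies on POSIX systems, and the section and key names are taken from the excerpt above.

```python
import configparser
import os
import stat


def load_private_config(path):
    """Load config.ini, rejecting files accessible to group/others (POSIX only)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if os.name == "posix" and mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(
            f"{path} is accessible to other users (mode {oct(mode)}); "
            "restrict it, for example with chmod 600"
        )

    parser = configparser.ConfigParser()
    parser.read(path, encoding="utf-8")

    # Section and key names follow the excerpt quoted in this finding.
    required = {"tos": ("ak", "sk"), "asr": ("app_key", "access_key")}
    for section, keys in required.items():
        for key in keys:
            if not parser.get(section, key, fallback="").strip():
                raise ValueError(f"missing value for [{section}] {key}")
    return parser
```

A file-mode check does not replace a least-privilege key; it only reduces the chance that the key leaks through the local filesystem.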
Processing large, private, or untrusted video URLs may consume local resources and send derived audio to cloud services.
The skill processes a user-provided URL through a local script. This is central to the purpose, but it means the agent may download and process arbitrary direct video URLs supplied by the user.
The user sends a directly accessible video file URL ... run the script to process it automatically ... py scripts/video_url_to_text.py <video URL>
Only provide trusted direct video URLs, and consider adding size/type limits or an explicit confirmation step for sensitive or very large files.
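The size and type limits suggested above could be enforced before the download starts, using the response headers of a HEAD request. This is a sketch under stated assumptions: `precheck_download`, `MAX_BYTES` and `ALLOWED_PREFIXES` are hypothetical names, the 500 MiB limit is an arbitrary example, and nothing here is part of the audited script.

```python
from urllib.parse import urlparse

MAX_BYTES = 500 * 1024 * 1024          # hypothetical limit: 500 MiB
ALLOWED_PREFIXES = ("video/", "audio/")  # hypothetical allow-list of media types


def precheck_download(url, headers, max_bytes=MAX_BYTES):
    """Return (ok, reason) for a URL and the headers of its HEAD response."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False, "URL must be http(s) with a host"

    # Header names are case-insensitive, so normalize them first.
    normalized = {k.lower(): v for k, v in headers.items()}

    ctype = normalized.get("content-type", "").split(";")[0].strip().lower()
    if not ctype.startswith(ALLOWED_PREFIXES):
        return False, f"unexpected content type: {ctype or 'missing'}"

    length = normalized.get("content-length")
    if length is None or not length.isdigit():
        return False, "missing or invalid Content-Length"
    if int(length) > max_bytes:
        return False, "file exceeds size limit"
    return True, "ok"
```

Servers can omit or misreport these headers, so a byte counter during the actual download would still be needed to make the limit binding.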
The skill depends on the ffmpeg installed on the user’s machine and will run it as part of the transcription workflow.
The script invokes the local ffmpeg binary. Local command execution is expected for audio extraction, but it is still a local executable dependency.
subprocess.run(["ffmpeg", "-version"], capture_output=True)
Install ffmpeg from a trusted source and keep it updated.
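To see which ffmpeg binary the workflow would actually pick up, the `PATH` lookup can be resolved explicitly before running the same `-version` check quoted above. A minimal sketch, with `ffmpeg_info` as a hypothetical helper name:

```python
import shutil
import subprocess


def ffmpeg_info(binary="ffmpeg"):
    """Return (resolved_path, first_version_line), or None if unavailable."""
    path = shutil.which(binary)  # resolves the binary the same way the shell would
    if path is None:
        return None
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, timeout=10
    )
    if result.returncode != 0:
        return None
    lines = result.stdout.splitlines()
    return path, (lines[0] if lines else "")


if __name__ == "__main__":
    info = ffmpeg_info()
    print("ffmpeg not found" if info is None else f"{info[0]}: {info[1]}")
```

Printing the resolved path makes it easier to notice an unexpected binary earlier in `PATH` than the one installed from the trusted source.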
Audio extracted from the video is uploaded or made accessible to Volcengine services for transcription.
The extracted audio URL is submitted to an external Volcengine/ByteDance ASR endpoint, matching the stated purpose but moving media-derived data outside the local environment.
ASR_SUBMIT_URL = "https://openspeech.bytedance.com/api/v3/auc/bigmodel/submit" ... "audio": { "url": audio_url }
Do not use the skill for videos containing sensitive content unless the Volcengine/TOS data handling model is acceptable to you.
Recognized text remains on disk after processing unless the user removes it.
The final transcript is persisted locally in a texts/ directory. This is disclosed and purpose-aligned, but transcripts may contain sensitive information.
Save the recognized text to the `texts/` directory and return it to the user
Review and delete saved transcript files when they contain private or sensitive content.
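If manual review is impractical, a retention rule is one possible alternative. The sketch below deletes files in `texts/` older than a chosen age; `purge_old_transcripts` and the 7-day default are hypothetical, and the skill itself is not known to ship anything like it.

```python
import time
from pathlib import Path


def purge_old_transcripts(directory="texts", max_age_days=7, now=None):
    """Delete regular files in `directory` older than max_age_days.

    Returns the names of the removed files, sorted alphabetically.
    """
    root = Path(directory)
    if not root.is_dir():
        return []
    current = time.time() if now is None else now
    cutoff = current - max_age_days * 86400  # seconds per day

    removed = []
    for path in sorted(root.iterdir()):
        # Only top-level regular files are considered; subdirectories are left alone.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

Age is judged by modification time, so a transcript that is edited or touched later is kept for another full retention period.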
