GLM-V-Caption
Pass
Audited by ClawScan on May 1, 2026.
Overview
This skill looks purpose-aligned, but it runs a local captioning script and sends selected media to Zhipu using your API key.
Before installing, be comfortable with running the included Python helper, providing a ZHIPU_API_KEY, and sending selected media or media URLs to Zhipu. Use a dedicated API key, monitor usage, and avoid confidential files unless that external processing is acceptable.
Findings (4)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
If the Zhipu API is unavailable or unsuitable, the agent may stop instead of offering another way to caption the media.
These instructions force an API-only workflow and change fallback behavior. This is disclosed and purpose-aligned, but it meaningfully constrains how the agent may respond.
ONLY use GLM-V API — Execute the script `python scripts/glmv_caption.py`; NEVER caption media yourself; IF API fails — Display the error message and STOP immediately; NO fallback methods
Install this skill when you specifically want Zhipu GLM-V captioning; disable it or avoid invoking it if you want local, built-in, or fallback captioning.
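The stop-on-failure rule quoted above can be pictured with a small wrapper. This is only a sketch, not the skill's own code: `run_caption_command` is a hypothetical helper, and it assumes the captioning script signals failure with a non-zero exit code and writes its error to stderr, neither of which the audit confirms.

```python
import subprocess


def run_caption_command(cmd):
    """Run a captioning command and return its stdout.

    On failure, raise with the command's error text. No retry and no
    alternative captioner is attempted, mirroring the skill's
    "display the error and stop" rule.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Assumption: a non-zero exit code means the API call failed.
        message = result.stderr.strip() or f"exit code {result.returncode}"
        raise RuntimeError(message)
    return result.stdout.strip()
```

A caller would pass something like `["python", "scripts/glmv_caption.py", ...]`, with whatever arguments the script actually accepts, and show the raised message to the user instead of falling back to another captioning method.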
The agent will execute local helper code to prepare media and call the Zhipu API.
Using the skill involves running the included local Python script. This is central to the skill's design and is not hidden.
Execute the script `python scripts/glmv_caption.py`
Use the skill only from a source you trust, and review the included script before providing credentials or private media.
Requests may consume quota or incur costs on the configured Zhipu account.
The script authenticates requests to Zhipu with the user's API key, which is expected for this integration but gives the skill account-level API access for caption requests.
api_key = os.environ.get("ZHIPU_API_KEY") ... "Authorization": f"Bearer {api_key}"
Use a dedicated, revocable API key if possible, store it securely, and monitor usage on the Zhipu account.
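To make the credential handling concrete, the following sketch shows one way the key could be read from the environment and kept out of logs. The environment variable name and the `Bearer` header come from the excerpt above; the helper names `build_auth_headers` and `redact` are hypothetical and are not taken from the skill's script.

```python
import os


def build_auth_headers(env=os.environ):
    """Return the Authorization header, or raise if no key is configured."""
    api_key = env.get("ZHIPU_API_KEY")
    if not api_key:
        raise RuntimeError("ZHIPU_API_KEY is not set")
    return {"Authorization": f"Bearer {api_key}"}


def redact(headers):
    """Return a copy of the headers with the bearer token masked for logging."""
    safe = dict(headers)
    if "Authorization" in safe:
        safe["Authorization"] = "Bearer ***"
    return safe
```

Failing early when the key is missing avoids sending an unauthenticated request, and logging only the redacted copy keeps the key out of terminal history and log files.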
Images, prompts, and media URLs submitted for captioning may be processed by Zhipu.
Local images can be read, encoded, and sent to Zhipu's external API for captioning. This data flow is expected and disclosed, but it moves potentially sensitive data outside the local environment.
with open(path, "rb") as f: img_data = base64.b64encode(f.read()).decode() ... API_BASE_URL = "https://open.bigmodel.cn/api/paas/v4/chat/completions" ... requests.post(API_BASE_URL, headers=headers, json=payload
Avoid submitting confidential or regulated media unless Zhipu's terms, retention, and privacy practices are acceptable for that data.
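One way to see what this data flow involves is a dry run that encodes a file the same way the excerpt does but sends nothing. This is a sketch under assumptions: `encode_image` mirrors the quoted base64 step, while `outbound_summary` is a hypothetical helper that does not exist in the skill.

```python
import base64
import os

# Endpoint quoted in the audit excerpt above.
API_BASE_URL = "https://open.bigmodel.cn/api/paas/v4/chat/completions"


def encode_image(path):
    """Read a local file and return its contents as a base64 string."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()


def outbound_summary(path):
    """Describe what would leave the machine, without sending anything."""
    encoded = encode_image(path)
    return {
        "file": os.path.basename(path),
        "raw_bytes": os.path.getsize(path),
        "base64_chars": len(encoded),  # about 4/3 of the raw size
        "destination": API_BASE_URL,
    }
```

Reviewing such a summary before invoking the skill makes it explicit that the full file contents, not a thumbnail or hash, would be transmitted to the external endpoint.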
