GLM-V-Caption

v1.0.2

Generate captions (descriptions) for images, videos, and documents using the ZhiPu GLM-V multimodal model series. Use this skill whenever the user wants to descr...

by Jared Wen (@jaredforreal)
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The name and description are consistent with what the skill requests and does: it requires ZHIPU_API_KEY and python, and the code sends media to Zhipu's API (open.bigmodel.cn). The required env var and binary are appropriate and proportional to the stated goal of calling a third-party multimodal model.
Instruction Scope
SKILL.md explicitly requires running scripts/glmv_caption.py and forbids local/fallback captioning; it also mandates showing full raw model output. These instructions are restrictive but consistent with the stated design. Important privacy implication: local images are base64-encoded and uploaded to the vendor API (intentional for this skill).
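To make the privacy implication concrete, the following is a minimal sketch of how a local image is typically turned into a base64 payload before upload. It is not the code from scripts/glmv_caption.py; the function name encode_image_as_data_url and the data-URL wrapping are assumptions chosen for illustration. The point is that the full file contents leave the machine in a reversible encoding.

```python
import base64
import mimetypes
from pathlib import Path


def encode_image_as_data_url(path: str) -> str:
    """Hypothetical helper: return a local file as a base64 data URL.

    Base64 is an encoding, not encryption: anyone who receives the
    string can recover the original bytes exactly.
    """
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        # Unknown extension: fall back to a generic binary type.
        mime = "application/octet-stream"
    raw = Path(path).read_bytes()
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Because the encoding is lossless, anything visible in the image (faces, documents, screen contents) is available to the receiving service.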
Install Mechanism
No install spec; this is an instruction-only skill with a bundled Python script. Nothing is downloaded from external URLs during install, and the script is executed locally with the system python.
Credentials
Only ZHIPU_API_KEY is required and is used as the Authorization bearer token when calling the Zhipu API. No unrelated credentials or system config paths are requested. The SKILL.md notes that the key may be shared with other Zhipu skills (so the key may be placed in a shared agent config).
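As an illustration of the credential handling described above, here is a minimal sketch that reads ZHIPU_API_KEY from the environment and builds a bearer Authorization header. The function name build_auth_headers and the Content-Type value are assumptions, not taken from the bundled script. No request is sent and no endpoint path is assumed.

```python
import os


def build_auth_headers(env=None) -> dict:
    """Hypothetical helper: build request headers from ZHIPU_API_KEY.

    `env` defaults to os.environ; passing a plain dict makes the
    function easy to test without touching the real environment.
    """
    env = os.environ if env is None else env
    key = env.get("ZHIPU_API_KEY", "").strip()
    if not key:
        # Fail early rather than sending an unauthenticated request.
        raise RuntimeError("ZHIPU_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",  # assumed JSON request body
    }
```

Reading the key only at call time, and never logging the returned headers, keeps the secret out of output that the skill is required to show to the user.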
Persistence & Privilege
The always setting is false and the skill is user-invocable; it does not request permanent/force-inclusion privileges or modify other skills' configs. It follows normal autonomy defaults.
Assessment
This skill uploads images (including local images converted to base64) and file URLs to Zhipu's API (open.bigmodel.cn) using the ZHIPU_API_KEY you provide. Only enable it if you trust that service and are comfortable sending media to it. The skill requires showing the raw model output, so model responses and any included metadata will be displayed to the user. Store the API key deliberately (for example, in a skill-specific config), because the SKILL.md notes that the key may be shared with other Zhipu skills. If you must process highly sensitive images or documents, do not allow this skill to access them. Otherwise the skill appears internally consistent with its stated purpose.

Runtime requirements

🖼️ Clawdis
Bins: python
Env: ZHIPU_API_KEY
Primary env: ZHIPU_API_KEY
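
The requirements above can be checked before invoking the skill. The following is a rough sketch of such a preflight check; the function name missing_requirements and the "binary:"/"env:" labels are hypothetical and are not part of the skill or its host.

```python
import os
import shutil


def missing_requirements(bins=("python",), env_vars=("ZHIPU_API_KEY",), env=None):
    """Hypothetical preflight check: list requirements that are not met.

    Returns an empty list when every binary is on PATH and every
    environment variable is set to a non-empty value.
    """
    env = os.environ if env is None else env
    # shutil.which returns None when the executable is not found on PATH.
    missing = [f"binary:{b}" for b in bins if shutil.which(b) is None]
    missing += [f"env:{v}" for v in env_vars if not env.get(v)]
    return missing
```

Calling missing_requirements() with no arguments checks the two requirements listed here against the current environment.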