GLM-V-Caption
Analysis
This skill looks purpose-aligned, but it runs a local captioning script and sends selected media to Zhipu using your API key.
Findings (4)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Checks for instructions or behavior that redirect the agent, misuse tools, execute unexpected code, cascade across systems, exploit user trust, or continue outside the intended task.
ONLY use GLM-V API — Execute the script `python scripts/glmv_caption.py`; NEVER caption media yourself; IF API fails — Display the error message and STOP immediately; NO fallback methods
These instructions force an API-only workflow and change fallback behavior. This is disclosed and purpose-aligned, but it meaningfully constrains how the agent may respond.
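The mandated API-only, fail-stop flow can be sketched as follows. The endpoint URL is taken from the scanned script; the function name, response shape, and use of the standard library (the bundled script uses `requests`) are assumptions for illustration only:

```python
import json
import sys
import urllib.error
import urllib.request

# Endpoint quoted in the scan findings.
API_BASE_URL = "https://open.bigmodel.cn/api/paas/v4/chat/completions"

def caption_via_api(payload: dict, headers: dict, url: str = API_BASE_URL) -> str:
    """POST a caption request; on any failure, display the error and stop.

    Mirrors the skill's instructions: no local captioning, no fallback.
    The response shape (choices/message/content) is an assumption.
    """
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={**headers, "Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=60) as resp:
            body = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # "IF API fails -- Display the error message and STOP immediately"
        print(f"GLM-V API request failed: {exc}", file=sys.stderr)
        raise SystemExit(1)
    return body["choices"][0]["message"]["content"]
```

The notable design point for a reviewer is the `SystemExit` path: the agent is told it may not degrade gracefully to local captioning, so every failure becomes a hard stop.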
Execute the script `python scripts/glmv_caption.py`
Using the skill involves running the included local Python script. This is central to the skill's design and is not hidden.
Checks whether tool use, credentials, dependencies, identity, account access, or inter-agent boundaries are broader than the stated purpose.
`api_key = os.environ.get("ZHIPU_API_KEY") ... "Authorization": f"Bearer {api_key}"`
The script authenticates requests to Zhipu with the user's API key, which is expected for this integration but gives the skill account-level API access for caption requests.
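A minimal sketch of the quoted authentication step, assuming only what the snippet shows (the `ZHIPU_API_KEY` environment variable and the bearer-token header); the missing-key check and function name are illustrative additions:

```python
import os

def build_auth_headers() -> dict:
    """Read the user's Zhipu key from the environment, as the scanned
    script does, and build the Authorization header it sends."""
    api_key = os.environ.get("ZHIPU_API_KEY")
    if not api_key:
        # Illustrative guard; the scan does not show how the script
        # behaves when the variable is unset.
        raise RuntimeError("ZHIPU_API_KEY is not set")
    return {"Authorization": f"Bearer {api_key}"}
```

Because the key comes straight from the user's environment, any request the skill makes is authenticated as the user's account, which is why this is flagged as account-level access.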
Checks for exposed credentials, poisoned memory or context, unclear communication boundaries, or sensitive data that could leave the user's control.
`with open(path, "rb") as f: img_data = base64.b64encode(f.read()).decode() ... API_BASE_URL = "https://open.bigmodel.cn/api/paas/v4/chat/completions" ... requests.post(API_BASE_URL, headers=headers, json=payload`
Local images can be read, encoded, and sent to Zhipu's external API for captioning. This data flow is expected and disclosed, but it is sensitive-data movement outside the local environment.
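The read-encode-send flow can be sketched as below. The `open`/`base64` calls match the quoted snippet; the payload field names, model identifier, and data-URL wrapping are assumptions about the chat-completions message shape, not confirmed by the scan:

```python
import base64

def build_image_payload(path: str, prompt: str = "Describe this image.") -> dict:
    """Read a local image, base64-encode it, and build the JSON payload
    that would be POSTed to Zhipu's chat-completions endpoint."""
    with open(path, "rb") as f:
        img_data = base64.b64encode(f.read()).decode()
    return {
        "model": "glm-4v",  # assumed model name
        "messages": [{
            "role": "user",
            "content": [
                # Entire file contents travel inside the request body.
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{img_data}"}},
                {"type": "text", "text": prompt},
            ],
        }],
    }
```

The point of the sketch is the data flow, not the exact schema: whatever bytes the agent selects are embedded verbatim (base64-encoded) in an outbound HTTPS request, so selected media fully leaves the local environment.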
