Vision Sandbox

v1.1.0

Agentic Vision via Gemini's native Code Execution sandbox. Use for spatial grounding, visual math, and UI auditing.

by Jo Alex (@johanesalxd)
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The name and description align with the required GEMINI_API_KEY: the code imports google.genai and submits an image and prompt to a Gemini model. Requiring the `uv` CLI and the google-genai dependency is appropriate for the described CLI/script usage.
Instruction Scope
SKILL.md covers only image-based tasks and how to run the CLI. The included code reads an image file, sends it to Gemini, prints the model's text and code, and writes any inline images to local disk. However, SKILL.md also includes broad prompt templates that give the model freedom to execute arbitrary code in the remote sandbox (expected for this skill, but high-impact). The README advises copying SKILL.md into global OpenCode skill directories; that is a user action that would persist the skill in your user config and should not be done blindly.
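The flow described above (read an image, send it to Gemini with code execution enabled, print text and code, save inline images) can be sketched roughly as follows. This is not the skill's actual source: `collect_outputs`, `run_vision_task`, the `output_N.png` naming, and the PNG assumption are hypothetical, and the model name is left as a parameter because the page does not state which model the skill uses. The sketch relies on the google-genai SDK's `generate_content` call with the code execution tool.

```python
import mimetypes
import os
from pathlib import Path


def collect_outputs(parts):
    """Separate response parts into printable strings and raw image bytes."""
    texts, images = [], []
    for part in parts:
        if getattr(part, "text", None):
            texts.append(part.text)
        code = getattr(part, "executable_code", None)
        if code is not None and code.code:
            texts.append(code.code)  # code the model wrote for the sandbox
        result = getattr(part, "code_execution_result", None)
        if result is not None and result.output:
            texts.append(result.output)  # what that code printed
        inline = getattr(part, "inline_data", None)
        if inline is not None and inline.data:
            images.append(inline.data)  # image bytes returned by the sandbox
    return texts, images


def run_vision_task(image_path, prompt, model, out_dir="."):
    """Hypothetical driver: send one image plus a prompt, print and save results."""
    # Imported lazily so collect_outputs can be used without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    image = Path(image_path)
    mime_type = mimetypes.guess_type(image.name)[0] or "image/png"
    response = client.models.generate_content(
        model=model,
        contents=[
            types.Part.from_bytes(data=image.read_bytes(), mime_type=mime_type),
            prompt,
        ],
        config=types.GenerateContentConfig(
            tools=[types.Tool(code_execution=types.ToolCodeExecution())]
        ),
    )
    texts, images = collect_outputs(response.candidates[0].content.parts)
    for text in texts:
        print(text)
    for index, data in enumerate(images):
        # Assumes PNG output; a fuller version would check inline_data.mime_type.
        (Path(out_dir) / f"output_{index}.png").write_bytes(data)
    return texts, images
```

Keeping the part-handling logic in a separate function makes it easy to review exactly what gets printed and written to disk before any request is sent.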
Install Mechanism
This is an instruction-only, repository-style package with no automated install spec. A pyproject lists google-genai as a dependency; the package metadata contains no downloads from unknown URLs and no archive extraction. Risk from the install mechanism is low, but provenance is unclear (source and homepage unknown).
Credentials
Only GEMINI_API_KEY is required and used by the code to authenticate to Google GenAI — this is proportionate to the declared purpose. No unrelated secrets or config paths are requested.
Persistence & Privilege
The skill does not force inclusion (always: false) and allows normal autonomous invocation. The SKILL.md suggests copying the skill into global OpenCode dirs (user action) which would persist it in your config; exercise caution before following those instructions for untrusted packages.
Scan Findings in Context
[unicode-control-chars] unexpected: The SKILL.md triggered a unicode-control-chars prompt-injection detector. Such characters can be used to hide or manipulate visible instructions and may be an attempt to influence downstream prompt parsing. Even if the rest of the package is coherent, inspect SKILL.md and README for hidden/zero-width characters or other obfuscation before trusting or copying files into global configs.
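One way to carry out that inspection is a short script that lists every invisible or control character in a file, with its position. This is a minimal sketch, not the scanner used for this report: `find_hidden_chars` is a hypothetical helper, and the choice to flag Unicode categories Cf (format), Cc (control, except tab and carriage return), Zl, and Zp is an assumption about what counts as suspicious.

```python
import unicodedata
from pathlib import Path

# Categories treated as suspicious here: format chars (zero-width, bidi
# overrides), control chars, and the Unicode line/paragraph separators.
SUSPICIOUS_CATEGORIES = {"Cf", "Cc", "Zl", "Zp"}
ALLOWED = {"\t", "\r"}  # ordinary whitespace that is also category Cc


def find_hidden_chars(text):
    """Return (line, column, codepoint, name) for each suspicious character."""
    findings = []
    # Split on "\n" only, so other separators stay in the line and get flagged.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in ALLOWED:
                continue
            if unicodedata.category(ch) in SUSPICIOUS_CATEGORIES:
                name = unicodedata.name(ch, "UNNAMED")
                findings.append((line_no, col, f"U+{ord(ch):04X}", name))
    return findings


def scan_file(path):
    """Print every finding for one file and return the list."""
    findings = find_hidden_chars(Path(path).read_text(encoding="utf-8"))
    for line_no, col, codepoint, name in findings:
        print(f"{path}:{line_no}:{col}: {codepoint} {name}")
    return findings
```

Calling `scan_file("SKILL.md")` and `scan_file("README.md")` from the package directory would list any matches. Some hits can be harmless, for example the zero-width joiner inside emoji sequences, so each finding still needs a manual look.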
What to consider before installing
This package appears to do what it says: it sends images and prompts to Gemini and prints the model's sandbox code and output. Still, exercise caution before installing or enabling it globally:

1) Verify provenance: the registry entry lacks a trusted homepage/source; prefer packages from known repositories.
2) Inspect SKILL.md/README for hidden characters (the scanner found unicode-control chars) and any surprising instructions.
3) Use a dedicated/limited-scope GEMINI_API_KEY (don't use a broadly privileged key).
4) If you try it, run in an isolated environment (container/VM) and avoid copying skill files into your global OpenCode config until you trust the author.
5) Be aware that model-written code runs in the remote Gemini sandbox: review outputs carefully; the sandboxed code could attempt network operations or reveal sensitive info if the model is given wide context.

If you want to proceed confidently, ask the publisher for a repository link and verify commit history or request signed releases.

Like a lobster shell, security has layers — review code before you run it.

latest: vk97ajahag2tzw6yx8hs2mzd9t180aevh

Runtime requirements

🔭 Clawdis
Bins: uv
Env: GEMINI_API_KEY
Primary env: GEMINI_API_KEY
