Image Vision

v1.0.0

Analyze and interpret images by describing content, extracting text, answering questions, comparing visuals, and extracting structured data from JPG, PNG, GI...

MIT-0
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The name and description (image analysis, OCR, comparison, extraction) align with SKILL.md, which instructs the agent to call image()/images() on user-supplied image paths and use built-in vision capabilities. No unrelated env vars, binaries, or installs are requested.
Instruction Scope
Instructions only direct the agent to analyze user-provided image file paths and prompts (OCR, Q&A, comparison, extraction). The SKILL.md does not instruct reading arbitrary system files, accessing unrelated environment variables, or sending data to external endpoints beyond the platform's built-in vision processing.
Install Mechanism
No install spec or code files are present (instruction-only), so nothing will be downloaded or written to disk by the skill itself.
Credentials
The skill declares no required environment variables, credentials, or config paths, which is consistent with its described use of built-in vision capabilities.
Persistence & Privilege
The always flag is false, and there are no instructions to modify other skills or system-wide settings. Autonomous invocation is allowed by default but is not combined with broad access or persistence requests.
Assessment
This skill appears coherent and limited to analyzing images you supply. Before using it: (1) avoid uploading sensitive images (IDs, credit cards, passwords) because OCR will extract readable text; (2) confirm how the platform handles and stores image content and outputs (retention, sharing, logging); (3) test on harmless images first to verify behavior; and (4) if you need the analysis to run offline or avoid any external transmission, verify your platform's execution model for vision tasks before providing private images.

latestvk97b7r1mpnepr2ne8zhdsdff8982ykt5

SKILL.md

Vision Analyze

Analyze images using the built-in vision capabilities of multimodal AI models.

Quick Start

Analyze an Image

Describe what's in an image:

# The agent will automatically use vision when you provide an image path
image("/path/to/image.jpg", prompt="Describe what's in this image")

Extract Text (OCR)

Extract text from images:

image("/path/to/document.png", prompt="Extract all text from this image")

Analyze Multiple Images

Compare or analyze multiple images:

images(["/path/to/image1.jpg", "/path/to/image2.jpg"], 
       prompt="Compare these two images and describe the differences")

Usage Patterns

Visual Q&A

Ask specific questions about image content:

image("menu.jpg", prompt="What are the prices of the main courses?")
image("chart.png", prompt="What trend does this graph show?")
image("screenshot.png", prompt="What error message is displayed?")

Content Moderation

Check whether an image is suitable for a given context:

image("upload.jpg", prompt="Is this image appropriate for a professional setting?")

Data Extraction

Extract structured data from visual content:

image("receipt.jpg", prompt="Extract the date, total amount, and items purchased")
image("business_card.png", prompt="Extract name, phone, email, and company")
image("form.jpg", prompt="Extract all filled fields as key-value pairs")
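Extraction prompts like these are easier to consume downstream when the reply is requested as JSON. The following is a minimal sketch, assuming the model's reply comes back as plain text; `build_extraction_prompt` and `parse_json_reply` are hypothetical helper names and are not part of the skill.

```python
import json


def build_extraction_prompt(fields):
    """Build a prompt asking for the named fields as a single JSON object."""
    names = ", ".join(fields)
    return (
        f"Extract the following fields: {names}. "
        "Reply with one JSON object using exactly these keys; "
        "use null for any field that is not visible."
    )


def parse_json_reply(reply):
    """Parse the text between the first '{' and the last '}' as a JSON object.

    Returns a dict on success, or None if no valid JSON object is found.
    Assumes the reply is a plain string (an assumption about the platform).
    """
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        parsed = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None
```

The string returned by `build_extraction_prompt(["name", "phone", "email", "company"])` could be passed as the `prompt` argument to `image()`, and the reply handed to `parse_json_reply`. Returning None on malformed output lets the caller retry or fall back to the raw text.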

Visual Comparison

Identify what changed between two versions of an image:

images(["before.jpg", "after.jpg"], 
       prompt="What changes were made between these two images?")

Tips

  • Be specific: The more specific your prompt, the better the results
  • Multiple images: You can analyze up to 20 images at once
  • Supported formats: JPG, PNG, GIF, WebP
  • Size limits: Large images are automatically resized
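The format and batch-size tips above can be checked before calling `images()`. This is a minimal sketch, assuming the limits stated in the tips (four formats, 20 images per call). The names `SUPPORTED_EXTENSIONS`, `MAX_IMAGES_PER_CALL`, `unsupported_paths`, and `split_into_batches` are hypothetical, and treating `.jpeg` as equivalent to JPG is an assumption. A file extension is only a rough proxy for the actual file format.

```python
from pathlib import Path

# Assumptions taken from the Tips list; the helper names are hypothetical
# and not part of the skill itself.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}
MAX_IMAGES_PER_CALL = 20


def unsupported_paths(paths):
    """Return the paths whose file extension is not in the supported set."""
    return [p for p in paths if Path(p).suffix.lower() not in SUPPORTED_EXTENSIONS]


def split_into_batches(paths, size=MAX_IMAGES_PER_CALL):
    """Split a list of paths into consecutive batches of at most `size` items."""
    return [paths[i:i + size] for i in range(0, len(paths), size)]
```

Each batch returned by `split_into_batches` could then be passed to a separate `images()` call. Note that a comparison prompt only sees the images within its own batch.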

When to Use

  • Reading text from screenshots, documents, or photos
  • Describing visual content for accessibility
  • Analyzing charts, graphs, or diagrams
  • Comparing visual changes
  • Extracting data from forms or receipts
  • Understanding UI elements or error messages
