i-can-see

Security checks across malware telemetry and agentic risk

Overview

This skill connects to a specified ESP32-CAM, saves a photo locally, and asks the agent to analyze it. It does so openly, but users should still treat it as a deliberate camera-capture tool.

Install this only if you control the ESP32-CAM at the configured local IP and are comfortable with OpenClaw taking and saving photos when you explicitly ask it to look. Use it in spaces where camera capture is acceptable, keep snapshots in a known folder, and delete sensitive images after analysis.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Findings (6)

Lp3

Medium
Category
MCP Least Privilege
Confidence
93% confidence
Finding
The skill explicitly instructs the agent to access a networked ESP32-CAM endpoint, but no corresponding permission declaration is documented. Hidden or undeclared network access reduces transparency and weakens policy enforcement, making it easier for the skill to perform external communication without informed user or platform approval.
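
Until a network permission is declared, a wrapper around the skill could at least restrict which endpoint it may contact. The following is a minimal sketch of that idea, not the skill's actual behavior: the address in `ALLOWED_CAMERA_HOSTS` and the function name `is_allowed_camera_url` are hypothetical, and the check assumes the camera is addressed by a literal private IP.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical allowlist: replace with the IP of the ESP32-CAM you control.
ALLOWED_CAMERA_HOSTS = {"192.168.1.50"}


def is_allowed_camera_url(url: str) -> bool:
    """Return True only for http(s) URLs whose host is an allowlisted private IP."""
    try:
        parsed = urlparse(url)
        host = parsed.hostname
        if parsed.scheme not in ("http", "https") or host is None:
            return False
        # Raises ValueError for hostnames that are not literal IP addresses.
        address = ipaddress.ip_address(host)
    except ValueError:
        return False
    return address.is_private and host in ALLOWED_CAMERA_HOSTS
```

Rejecting hostnames outright is a deliberate simplification here: it avoids DNS-based redirection to a non-local address at the cost of requiring a fixed IP.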

Tp4

High
Category
MCP Tool Poisoning
Confidence
88% confidence
Finding
The skill claims to provide vision by taking a photo, but it also instructs saving images to arbitrary caller-supplied filesystem paths. That broad write capability exceeds the user-facing description and can enable unintended overwrites, persistence of sensitive photos, or storage in unsafe locations if the surrounding agent passes untrusted paths.
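
One common mitigation for caller-supplied save paths is to confine them to a single snapshot directory. The sketch below illustrates that approach under stated assumptions: `SNAPSHOT_DIR` and `resolve_snapshot_path` are hypothetical names, the report does not say where the skill actually stores images, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path

# Hypothetical snapshot folder; the skill's real default is not stated in the report.
SNAPSHOT_DIR = Path.home() / "esp32cam_snapshots"


def resolve_snapshot_path(requested: str, base_dir: Path = SNAPSHOT_DIR) -> Path:
    """Return an absolute path inside base_dir, or raise ValueError."""
    base = base_dir.resolve()
    # Joining with an absolute `requested` discards `base`, so the check below
    # also rejects absolute paths that point elsewhere.
    candidate = (base / requested).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes snapshot directory: {requested!r}")
    if candidate.suffix.lower() not in {".jpg", ".jpeg"}:
        raise ValueError(f"unexpected file extension: {requested!r}")
    return candidate
```

Resolving before comparing means `..` segments and symlinks are normalized first, so a path such as `../notes.jpg` is rejected rather than silently written outside the folder.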

Vague Triggers

Medium
Confidence
86% confidence
Finding
Activation phrases like '看看' ("take a look") and '你看到了什么' ("what do you see") are broad everyday language and can cause accidental triggering. In this skill's context, mistaken activation is more serious because it leads to real-world image capture and network/device interaction, creating privacy and consent risks from normal conversation.

Vague Triggers

Medium
Confidence
84% confidence
Finding
The example invocation phrases remain ambiguous and reinforce automatic activation for general requests to 'look' or 'help me see this.' Because the workflow immediately captures and saves an image, ambiguity can translate directly into unauthorized photo capture or retention.
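
A narrower activation rule would require the user to name the camera explicitly. The sketch below shows one way to express that; the phrases in `EXPLICIT_TRIGGERS` and the function `should_capture` are hypothetical examples, not the skill's actual trigger list.

```python
# Hypothetical explicit phrases; each one names the camera or a photo directly.
EXPLICIT_TRIGGERS = (
    "take a photo with the camera",
    "use the esp32-cam",
    "capture a camera snapshot",
)


def should_capture(message: str) -> bool:
    """Return True only when the message contains an explicit camera request."""
    # Lowercase and collapse whitespace so spacing and case do not matter.
    normalized = " ".join(message.lower().split())
    return any(trigger in normalized for trigger in EXPLICIT_TRIGGERS)
```

With this rule, a generic request such as "look at this" or '看看' does not match, so ordinary conversation would not start a capture.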

Missing User Warnings

High
Confidence
96% confidence
Finding
The skill describes taking real-world photos and analyzing them, but it provides no user-facing privacy notice about image capture, local storage, retention, or possible sensitive content in frame. In a vision skill connected to a physical camera, lack of notice and consent materially increases the risk of covert or surprising surveillance behavior.

Missing User Warnings

High
Confidence
97% confidence
Finding
The workflow tells the agent to automatically create directories, take a photo, save it, and analyze it without any step for consent, privacy warning, or cleanup. This makes the skill especially dangerous because routine operation inherently performs surveillance-like actions and persists potentially sensitive images on disk.
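
The missing consent and cleanup steps could be added around the capture itself. The following sketch assumes the capture, analysis, and confirmation steps are supplied as callables; `look_with_consent` and its parameters are hypothetical names, and the report does not describe how the skill actually talks to the camera.

```python
import os
import tempfile
from typing import Callable, Optional


def look_with_consent(
    capture: Callable[[], bytes],
    analyze: Callable[[str], str],
    confirm: Callable[[str], bool],
) -> Optional[str]:
    """Capture and analyze a photo only after confirmation, then delete it."""
    notice = (
        "This will take a photo with the ESP32-CAM and store it "
        "temporarily on disk. Continue?"
    )
    if not confirm(notice):
        return None  # No consent: the camera is never contacted.

    fd, path = tempfile.mkstemp(suffix=".jpg")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(capture())
        return analyze(path)
    finally:
        # Cleanup runs even if capture or analysis raises.
        os.remove(path)
```

Passing the steps in as callables keeps the consent gate independent of the transport, so the same wrapper could be exercised with a stub camera before it is pointed at real hardware.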

VirusTotal

61/61 vendors flagged this skill as clean.
