gesture-control-generator

Security checks across malware telemetry and agentic risk

Overview

This skill is a legitimate gesture-control HTML generator, but its generated pages automatically request webcam access before a clear user action or privacy notice.

Review before installing. This skill appears intended for creative browser demos, not malware, but generated pages may request and start webcam-based hand tracking immediately. Install only if you are comfortable with that behavior, and prefer using or modifying it so camera mode starts only after a clear Enable Camera action and includes a short notice about local processing, CDN-loaded dependencies, and mouse-only fallback.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (11)

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The framework dynamically loads executable JavaScript from public CDNs at runtime, which expands trust to third-party infrastructure and introduces supply-chain and integrity risk. If a CDN resource is compromised, replaced, or unexpectedly changed, arbitrary code would execute in the page context without the skill author shipping a reviewed copy.
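One common mitigation for this kind of CDN risk is to pin an exact version and attach a Subresource Integrity (SRI) hash, so the browser refuses to run a script whose bytes differ from the reviewed copy. The following is a minimal sketch, assuming Node.js is used to compute the hash; `sriHash` and `scriptTag` are hypothetical helpers, and the URL and source string are placeholders rather than anything taken from the skill.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: compute an SRI value for a reviewed copy of a script.
// sha384 is one of the digests browsers accept for the integrity attribute.
function sriHash(source) {
  const digest = crypto.createHash("sha384").update(source, "utf8").digest("base64");
  return `sha384-${digest}`;
}

// Hypothetical helper: build a script tag that the browser will refuse to
// execute if the fetched bytes do not match the pinned hash.
// Assumes url and integrity come from trusted, already-escaped input.
function scriptTag(url, integrity) {
  return `<script src="${url}" integrity="${integrity}" crossorigin="anonymous"></script>`;
}

// Stand-in for the bytes of a reviewed library build (placeholder content).
const reviewedSource = "console.log('reviewed library build');";

// Placeholder URL with an exact version pinned in the path.
const tag = scriptTag("https://cdn.example/lib@1.0.0/lib.min.js", sriHash(reviewedSource));
console.log(tag);
```

In practice the hash would be computed once from the exact file that was reviewed and then hard-coded into the generated page; shipping a local copy of the dependency avoids the network dependency entirely.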

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The scene automatically attempts to initialize camera-backed gesture tracking during startup, which is more invasive than a generator-style skill description suggests. This creates an unexpected privacy-sensitive permission request and webcam activation path before clear, explicit user intent is established.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: The code proactively calls getUserMedia to test permission and device availability and then starts the camera flow, so the page accesses the live webcam even though the stated purpose is only to generate an interaction scene. Even if browser permission prompts intervene, this is still an unnecessary collection capability and broadens the privacy attack surface.
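A less invasive pattern is to request the camera only from an explicit user action. The sketch below illustrates that idea under stated assumptions: `createCameraGate` is a hypothetical helper, not part of the skill, and the `getUserMedia` function is injected so the logic can be exercised outside a browser. In a real page it would wrap `navigator.mediaDevices.getUserMedia`.

```javascript
// Hypothetical controller: camera access is requested only when enable()
// is called, which is meant to happen from a user-initiated click handler.
function createCameraGate(getUserMedia) {
  let stream = null;
  return {
    isActive: () => stream !== null,

    // Call this from the click handler of an "Enable Camera" button.
    async enable() {
      if (stream) return stream; // already running, do not prompt again
      stream = await getUserMedia({ video: true });
      return stream;
    },

    // Stop every track so the camera indicator turns off.
    disable() {
      if (!stream) return;
      stream.getTracks().forEach((track) => track.stop());
      stream = null;
    },
  };
}

// Browser wiring (assumed element and fallback names, shown as comments):
// const gate = createCameraGate((c) => navigator.mediaDevices.getUserMedia(c));
// enableButton.addEventListener("click", () => gate.enable().catch(useMouseFallback));
```

Nothing in this sketch runs at page load, so the permission prompt can only appear after the user has seen the notice next to the button and chosen to click it.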

Context-Inappropriate Capability

Medium

Confidence: 86%
Finding: Binding a large set of p5 APIs and state onto the global window object creates broad page-wide capabilities and increases the chance of namespace collisions, unintended interactions, or abuse by other scripts running in the same page. In a generated-scene context, this weakens encapsulation and makes the framework more permissive than necessary.
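One way to narrow this surface is to expose a single frozen namespace object instead of copying many functions onto the global object (p5.js also documents an instance mode that keeps its API off `window`). The sketch below is illustrative only: `exposeNamespace`, `GestureScene`, and the `start`/`stop` functions are hypothetical names, and a plain object stands in for `window`.

```javascript
// Hypothetical helper: publish the scene API under one read-only global
// name, and refuse to overwrite a name that is already taken.
function exposeNamespace(globalObj, name, api) {
  if (name in globalObj) {
    throw new Error(`global "${name}" already exists`);
  }
  Object.defineProperty(globalObj, name, {
    value: Object.freeze({ ...api }), // shallow copy, then freeze
    writable: false,
    configurable: false,
    enumerable: true,
  });
  return globalObj[name];
}

const fakeWindow = {}; // stand-in for window in this sketch
const scene = exposeNamespace(fakeWindow, "GestureScene", {
  start: () => "started",
  stop: () => "stopped",
});

// Only one property was added to the global stand-in.
console.log(Object.keys(fakeWindow));
```

Other scripts on the page can still call the exposed functions, but they cannot silently replace them, and a name collision fails loudly instead of overwriting existing state.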

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The code automatically probes and then starts camera access in _tryStartCamera() during setup, before a clear user action. For a skill described as an HTML generator, silently escalating into webcam access is broader than expected behavior and can surprise users into granting sensitive device permissions.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The framework dynamically injects third-party scripts from jsDelivr at runtime for p5.js and MediaPipe. This introduces an external supply-chain and network dependency not obvious from the skill description, and if those resources are tampered with or unavailable the generated experience can execute unreviewed code or fail unexpectedly.

Missing User Warnings

Medium

Confidence: 92%
Finding: The README promotes camera-based gesture control but does not clearly disclose privacy implications such as camera access, local-only processing expectations, or whether images/landmarks are stored or transmitted. In a skill that generates browser code using webcam input, this omission can mislead users into granting sensitive permissions without informed consent.

Missing User Warnings

Medium

Confidence: 87%
Finding: The skill instructs creation of camera-enabled gesture control but does not prominently warn users about camera activation, hand-tracking, or the privacy implications of loading gesture-recognition components. Even if browser permission prompts exist, the skill still normalizes generating artifacts that rely on camera access without clear upfront disclosure, which can surprise users and weaken informed consent.

Missing User Warnings

Low

Confidence: 81%
Finding: The skill describes copying JavaScript files and writing HTML files to disk, but it does not clearly present these as potentially modifying the user’s filesystem or require explicit confirmation before doing so. Hidden or under-disclosed file operations are risky because they can overwrite files, create artifacts in unexpected locations, or cause users to approve actions they did not fully understand.
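The overwrite risk in particular can be reduced by making file creation fail when the target already exists, unless the user has explicitly opted in. The following is a minimal sketch assuming a Node.js environment; `writeGeneratedFile` and its `overwrite` option are hypothetical and do not describe how the skill currently writes files.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: write a generated page without clobbering an
// existing file unless the caller explicitly opts in.
function writeGeneratedFile(targetPath, contents, { overwrite = false } = {}) {
  // "wx" fails with EEXIST if the path already exists; "w" truncates it.
  const flag = overwrite ? "w" : "wx";
  try {
    fs.writeFileSync(targetPath, contents, { encoding: "utf8", flag });
    return { written: true, path: targetPath };
  } catch (err) {
    if (err.code === "EEXIST") {
      return { written: false, path: targetPath, reason: "exists" };
    }
    throw err; // unrelated failures (permissions, missing directory) surface as-is
  }
}

// Demonstration in a fresh temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "gesture-demo-"));
const out = path.join(dir, "scene.html");
console.log(writeGeneratedFile(out, "<!doctype html>")); // first write succeeds
console.log(writeGeneratedFile(out, "<!doctype html>")); // refused: file exists
```

A caller can treat the `reason: "exists"` result as the point at which to ask the user for confirmation before retrying with `overwrite: true`.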

Missing User Warnings

Medium

Confidence: 94%
Finding: Attempting camera access on startup without a prior in-product disclosure is a real privacy and consent problem because users encounter a sensitive permission prompt before understanding why it is needed. In this skill context, that mismatch makes the behavior more suspicious and erodes informed consent.

Missing User Warnings

Medium

Confidence: 91%
Finding: The code attempts to activate camera functionality automatically and the UI text does not clearly warn the user in advance that webcam permission will be requested. Even if the browser still enforces permission prompts, the lack of explicit in-app notice undermines informed consent for access to a highly sensitive sensor.

VirusTotal

64/64 vendors rated this skill as clean.
