Screen Monitor

Security checks across malware telemetry and agentic risk

Overview

The skill has a legitimate screen-analysis purpose, but it handles sensitive screen content through insufficiently restricted local and remote access and weak privacy disclosure.

Review before installing. Use this only on trusted networks, share a single non-sensitive window when possible, stop sharing when finished, and assume screenshots may be written locally under /tmp. Avoid using it where passwords, customer data, private messages, admin consoles, or regulated information may appear on screen.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (12)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 91%
Finding: The skill invokes shell commands (`bash command`) to generate a screen-share URL and analyze captured frames, but the metadata does not declare corresponding permissions. This creates a capability/permission mismatch that can mislead users and host systems about what the skill can do, weakening trust boundaries and review controls.

Context-Inappropriate Capability

Medium

Confidence: 98%
Finding: The server explicitly enables cross-origin requests from any origin and exposes unauthenticated endpoints for uploading screen frames and retrieving screen status. In the context of a screen-monitoring skill, this creates a broad attack surface: any local webpage or process that can reach localhost may be able to inject fake screen data, interfere with monitoring state, or interact with sensitive screen-capture functionality without user authorization.
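One way to narrow this surface is to echo CORS headers only for a known origin and to require a per-session token on the frame and status endpoints. The following is a minimal sketch, not the skill's actual server code: the allowlisted origin, the `Authorization: Bearer` scheme, and both function names are assumptions made for illustration.

```python
import hmac
from typing import Mapping, Optional

# Hypothetical allowlist: the only page origin expected to call the local server.
ALLOWED_ORIGINS = frozenset({"http://127.0.0.1:8080"})


def cors_origin_for(request_origin: Optional[str]) -> Optional[str]:
    """Return the value to send in Access-Control-Allow-Origin, or None to omit it."""
    if request_origin in ALLOWED_ORIGINS:
        return request_origin
    return None


def is_authorized(headers: Mapping[str, str], session_token: str) -> bool:
    """Check a per-session bearer token using a constant-time comparison."""
    supplied = headers.get("Authorization", "")
    expected = f"Bearer {session_token}"
    return hmac.compare_digest(supplied.encode(), expected.encode())
```

A server along these lines could generate the token once per sharing session (for example with `secrets.token_urlsafe(32)`), embed it in the screen-share URL it hands to the user, and reject any upload or status request for which `is_authorized` returns False.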

Context-Inappropriate Capability

Low

Confidence: 89%
Finding: Uploaded screen frames are written to predictable files under /tmp, which introduces local retention of potentially sensitive screen content beyond transient processing. Even if /tmp is local-only, this can expose screenshots and metadata to other local users, processes, or later forensic recovery, increasing privacy and data exposure risk.
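A possible mitigation is to write each frame to an unpredictably named, owner-only file inside a private directory that is removed as soon as processing ends. The sketch below uses only the Python standard library. The function names and the `frame-`/`screen-` prefixes are hypothetical, and the file-size call stands in for whatever analysis the skill really performs.

```python
import os
import tempfile


def store_frame(data: bytes, base_dir: str) -> str:
    """Write one frame to a randomly named file created with owner-only access."""
    # mkstemp picks an unpredictable name and creates the file with mode 0o600 on POSIX.
    fd, path = tempfile.mkstemp(prefix="frame-", suffix=".png", dir=base_dir)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


def analyze_then_discard(data: bytes) -> int:
    """Keep the frame on disk only for the duration of processing."""
    # TemporaryDirectory creates a private directory and deletes it, with its
    # contents, when the with-block exits.
    with tempfile.TemporaryDirectory(prefix="screen-") as private_dir:
        path = store_frame(data, private_dir)
        size = os.path.getsize(path)  # placeholder for the real analysis step
    return size
```

Deleting the directory limits how long screenshots remain readable by other processes. It does not prevent later forensic recovery from the underlying disk, so the guidance to avoid sharing sensitive windows still applies.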

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The script's documented purpose is screen sharing and analysis, but when no shared WebRTC frame exists it silently falls back to capturing the entire OS desktop. That expands the data collection scope beyond the apparent user intent and can expose unrelated sensitive information such as other windows, notifications, credentials, or personal data to downstream analysis.
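The fallback could instead be made explicit, so that a missing shared frame produces an error unless the user has approved desktop capture. This is a minimal sketch under assumed names: `select_frame`, `NoSharedFrameError`, and the `allow_desktop_fallback` flag do not come from the skill itself.

```python
from typing import Callable, Optional


class NoSharedFrameError(RuntimeError):
    """Raised when no shared frame exists and desktop capture was not approved."""


def select_frame(
    shared_frame: Optional[bytes],
    capture_desktop: Callable[[], bytes],
    allow_desktop_fallback: bool = False,
) -> bytes:
    """Prefer the user-shared frame; capture the desktop only with explicit approval."""
    if shared_frame is not None:
        return shared_frame
    if not allow_desktop_fallback:
        # Fail loudly rather than silently widening the capture scope.
        raise NoSharedFrameError("no shared frame available; desktop capture not approved")
    return capture_desktop()
```

With the flag defaulting to False, the broader capture happens only when the caller opts in, which keeps the collected data aligned with what the user chose to share.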

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: Taking an unprompted full-desktop screenshot is a privacy-sensitive capability that is broader than the stated screen-sharing use case. In this context, the skill can collect sensitive on-screen content even when no active shared frame is present, making accidental surveillance or data exfiltration more likely.

Missing User Warnings

Medium

Confidence: 92%
Finding: The README instructs users to start browser-based screen sharing and ask questions about their screen, but it does not warn that the shared view may include passwords, personal messages, customer data, internal documents, or other sensitive content. In a screen-analysis skill, omission of privacy guidance materially increases the likelihood of accidental data exposure because users are explicitly encouraged to grant live visual access to their desktop or browser.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill enables screen sharing, screenshot capture, frame analysis, and browser clicking/typing, but does not provide an explicit privacy and safety warning to the user. These capabilities can expose sensitive on-screen data, session tokens, personal information, or trigger unintended actions in the browser, especially because the skill advertises both passive viewing and active control.

Missing User Warnings

Medium

Confidence: 92%
Finding: The manifest explicitly advertises browser-based screen sharing and vision analysis, which implies collection and transmission of highly sensitive user content. Without any visible warning, consent language, or privacy disclosure in the package metadata, users may enable the skill without understanding that screens can contain credentials, personal data, or confidential material.

Missing User Warnings

Medium

Confidence: 95%
Finding: Declaring remote_access as a feature signals that the skill may expose the host or screen data beyond the local machine, increasing the risk of unauthorized viewing, interception, or misuse. In this context, the combination of remote access, WebRTC, and screen sharing makes the lack of prominent security warnings and trust-boundary documentation materially dangerous rather than merely informational.

Missing User Warnings

Medium

Confidence: 85%
Finding: The endpoint accepts and stores screen-frame uploads without any visible consent, warning, or disclosure mechanism in the server logic, despite handling highly sensitive visual data. In a screen-sharing skill, silent retention or handling of screenshots materially increases privacy risk because users may not realize their screen contents are being stored locally and remain accessible after capture.

Missing User Warnings

Medium

Confidence: 91%
Finding: The script captures a full-screen image and immediately passes it to an internal agent for analysis without warning the user about the privacy-sensitive operation. This creates a risk of exposing confidential on-screen material to another processing component without informed consent, especially because the capture may include the entire desktop rather than a deliberately shared surface.

Missing User Warnings

Medium

Confidence: 92%
Finding: Once screen sharing begins, the page silently captures full-resolution frames every 3 seconds and uploads them to the backend, while the UI only says the agent can analyze the screen. That does not clearly communicate the ongoing periodic transmission of potentially sensitive on-screen data such as passwords, messages, tokens, or customer information, so users may consent to viewing but not realize they are authorizing repeated backend uploads.

VirusTotal

All 66 vendors (66/66) reported this skill as clean.
