Virtual Desktop Browser

Security checks across malware telemetry and agentic risk

Overview

This skill appears to do what it says, but it gives an agent broad live-browser control and screenshot access without enough isolation or safety guardrails.

Install only in an isolated container or VM with a dedicated, non-personal browser profile. Do not use logged-in personal accounts, and treat screenshots as sensitive data. Restrict use to sites and tasks you are authorized to automate. Require human confirmation before posting, messaging, purchasing, changing settings, or deleting data, and call browser_stop after use.
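
The human-confirmation recommendation can be sketched as a small gate placed in front of any state-changing action. This is a minimal illustration, not the skill's API: the action names, `run_action`, and `ask_on_terminal` are all hypothetical.

```python
from typing import Callable

# Hypothetical action labels. The report names these categories of action,
# not a concrete API, so adapt them to whatever your agent actually calls.
SENSITIVE_ACTIONS = {"post", "message", "purchase", "change_settings", "delete"}


def run_action(action: str,
               perform: Callable[[], str],
               confirm: Callable[[str], bool]) -> str:
    """Run `perform` only if the action is non-sensitive or a human approves it."""
    if action in SENSITIVE_ACTIONS and not confirm(action):
        return "blocked"
    return perform()


def ask_on_terminal(action: str) -> bool:
    """Example confirmer: ask the operator on stdin, defaulting to 'no'."""
    answer = input(f"Allow agent to '{action}'? [y/N] ")
    return answer.strip().lower() == "y"
```

Passing the confirmer in as a parameter keeps the gate testable and lets a deployment swap the terminal prompt for a chat approval or a ticketing step.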

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (12)

subprocess module call

Medium

Category: Dangerous Code Execution
Content:
    f"--window-size={FIXED_WIDTH},{FIXED_HEIGHT}",
    ]
    chrome_cmd.append(url or "about:blank")
    chrome_proc = subprocess.Popen(chrome_cmd, env=env, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    state = {
        "display": disp,
Confidence: 90%
Finding: chrome_proc = subprocess.Popen(chrome_cmd, env=env, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
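
One way to narrow what the flagged `subprocess.Popen` call can reach is to launch the browser with a throwaway profile and a minimal environment. The sketch below only builds the arguments and does not launch anything. The helper names, the `google-chrome` binary name, and the 1280x800 size are assumptions; `--user-data-dir` and `--no-first-run` are standard Chrome flags, and `--window-size` comes from the excerpt above.

```python
import os

# Placeholder values: the real FIXED_WIDTH / FIXED_HEIGHT are not shown in the finding.
FIXED_WIDTH, FIXED_HEIGHT = 1280, 800


def build_chrome_cmd(url, profile_dir, binary="google-chrome"):
    """Return an argv list (no shell involved) that uses a dedicated profile directory."""
    return [
        binary,
        f"--user-data-dir={profile_dir}",
        "--no-first-run",
        f"--window-size={FIXED_WIDTH},{FIXED_HEIGHT}",
        url or "about:blank",
    ]


def minimal_env(display, home):
    """Pass only what the browser needs instead of the agent's full environment."""
    return {
        "DISPLAY": display,  # assumes an X11 virtual display such as Xvfb
        "HOME": home,
        "PATH": os.environ.get("PATH", "/usr/bin:/bin"),
    }
```

A caller would pass these to `subprocess.Popen(build_chrome_cmd(url, profile), env=minimal_env(disp, profile), ...)`, so API keys and tokens in the parent environment are not inherited by the browser process.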

Lp3

Medium

Category: MCP Least Privilege
Confidence: 90%
Finding: The skill exposes capabilities consistent with shell execution, environment access, and file read/write, but the manifest does not declare any permissions or trust boundaries. This is dangerous because consumers may treat the skill as low-risk documentation while it can install packages, launch browser processes, and persist state on disk, enabling unintended system modification or data exposure.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 84%
Finding: The documented purpose understates the effective surveillance and control surface: beyond browser automation, the skill can inspect arbitrary screen regions, read pixel values, activate windows by partial title, perform template matching, and persist runtime state. In a GUI automation context this broadens the scope from simple browsing to generalized desktop interaction, which can be abused to target other windows, capture sensitive on-screen information, or interfere with unrelated applications.

Missing User Warnings

Medium

Confidence: 90%
Finding: The README explicitly promotes screenshot capture and GUI automation on live browser sessions, including bot-resistant sites, but does not warn that screenshots, pixel reads, and simulated clicks/typing can capture credentials, personal data, session state, or other sensitive on-screen content. In an agent setting, missing safety guidance increases the chance that operators will use these functions on authenticated sessions or sensitive workflows without appropriate consent, redaction, or data-handling controls.

Missing User Warnings

Medium

Confidence: 91%
Finding: The documentation explicitly exposes a screenshot capability that captures the full virtual desktop and returns the image as Base64, but it does not warn that the screen may contain credentials, personal data, session tokens, private messages, or other sensitive on-screen content. In the context of a GUI browser automation skill intended for real websites, this omission increases the risk of accidental data exfiltration or unsafe logging of sensitive visual data.

Missing User Warnings

Medium

Confidence: 94%
Finding: The documentation describes automated typing and hotkeys without warning that GUI input is sent to whichever window currently has focus, which can result in credentials, commands, or destructive shortcuts being delivered to the wrong application. In a desktop automation skill that simulates human input, this is a real safety issue because focus changes, popups, or race conditions are common and can cause unintended actions.
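
A focus check before typing is one way to reduce this risk. The sketch below assumes two hypothetical callables supplied by the automation layer: `get_active_title`, which returns the focused window's title, and `send_keys`, which delivers the text. Neither is taken from the skill itself.

```python
from typing import Callable


class FocusError(RuntimeError):
    """Raised when the focused window is not the one the caller expected."""


def type_if_focused(text: str,
                    expected_title: str,
                    get_active_title: Callable[[], str],
                    send_keys: Callable[[str], None]) -> None:
    """Send `text` only when the active window title contains `expected_title`."""
    title = get_active_title()
    if expected_title.lower() not in title.lower():
        # Do not echo `text` here: it may be a credential.
        raise FocusError(f"refusing to type: focus is on {title!r}")
    send_keys(text)
```

The check narrows the window for mistakes but does not close it: focus can still change between the title lookup and the keystrokes, so it complements isolation and does not replace it.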

Missing User Warnings

Medium

Confidence: 87%
Finding: The documentation explicitly advertises full-screen and region screenshot capture with Base64 export, but does not warn that captures may include credentials, session data, personal content, or other sensitive information visible in the virtual browser. In a GUI automation skill intended for bot-resistant sites and social platforms, screenshot tooling materially increases the risk of collecting and exfiltrating sensitive on-screen data if misused or used carelessly.

Missing User Warnings

Medium

Confidence: 94%
Finding: The documentation exposes a screenshot function that returns the entire virtual screen as a Base64 PNG, but does not warn that it may include credentials, visible cookies, private messages, or personal data shown in the browser. In a skill aimed at automating sensitive, bot-resistant sites, this capability increases the risk that visual information is collected and exfiltrated without the user understanding the privacy implications.

Missing User Warnings

Medium

Confidence: 91%
Finding: The documentation describes text input and key combinations sent to the focused window without warning that focus may be on a different element or window than expected, which can cause accidental submissions, execution of destructive shortcuts, or entry of secrets in the wrong context. Because the skill simulates full human interaction on a virtual desktop, the contextual risk is higher: it can operate on authenticated sessions and real UI flows.

Missing User Warnings

Medium

Confidence: 91%
Finding: The function captures arbitrary screen contents and returns them as base64, enabling silent collection and exfiltration of whatever is visible in the virtual desktop, including credentials, messages, or session data. Because this skill is explicitly designed for GUI automation, screenshot capture is core functionality but still presents a meaningful privacy and data-exposure risk.
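
Treating screenshots as sensitive data can start with keeping them out of logs. The wrapper below is a hypothetical sketch (the class name and methods are not part of the skill): it holds the PNG bytes, prints only a size and a short hash, and requires an explicit call to export Base64.

```python
import base64
import hashlib


class SensitiveImage:
    """Holds screenshot bytes; repr() and str() never reveal the pixels."""

    def __init__(self, png_bytes: bytes):
        self._data = png_bytes

    def __repr__(self) -> str:
        # A truncated digest lets logs correlate captures without storing content.
        digest = hashlib.sha256(self._data).hexdigest()[:12]
        return f"<SensitiveImage {len(self._data)} bytes sha256={digest}>"

    def to_base64(self) -> str:
        """Explicit opt-in export, so every place the image leaves memory is easy to audit."""
        return base64.b64encode(self._data).decode("ascii")
```

Because accidental `print` or logger calls fall back to `__repr__`, the image content only leaves the process where `to_base64()` is called deliberately.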

Missing User Warnings

Medium

Confidence: 88%
Finding: Automated typing can inject arbitrary text into web forms, terminals, chats, or password prompts within the virtual desktop without any approval barrier. In this context, that can be used to submit credentials, post content, or drive workflows on remote sites in a way that is hard to distinguish from legitimate operator intent.

Missing User Warnings

High

Confidence: 94%
Finding: Hotkey execution allows arbitrary key combinations such as closing windows, opening developer tools, refreshing, selecting all, pasting secrets, or triggering OS/application shortcuts. In a GUI automation skill, unrestricted hotkeys greatly expand the attack surface and can cause destructive or stealthy actions beyond simple browser interaction.
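
An allowlist is one way to bound hotkey use. In the sketch below, the set of permitted combinations and the function name are hypothetical. A real deployment would list only the shortcuts its task needs and reject everything else before it reaches the GUI layer.

```python
# Hypothetical allowlist: each entry is an order-insensitive set of key names.
ALLOWED_HOTKEYS = {
    frozenset({"ctrl", "l"}),    # focus the address bar
    frozenset({"ctrl", "f"}),    # find in page
    frozenset({"alt", "left"}),  # navigate back
}


def is_hotkey_allowed(*keys: str) -> bool:
    """Return True only when the normalized combination is explicitly allowlisted."""
    combo = frozenset(k.strip().lower() for k in keys)
    return combo in ALLOWED_HOTKEYS
```

Default-deny keeps combinations such as closing windows, opening developer tools, or pasting blocked unless someone adds them on purpose.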

VirusTotal

65/65 vendors flagged this skill as clean.
