GUI Agent

Security checks across malware telemetry and agentic risk

Overview

This is a genuine GUI automation skill, but it gives an agent broad control over the screen, keyboard, local memory, and remote VMs, and some of its execution paths are under-scoped. Users should review those paths carefully.

Install only if you are comfortable granting the agent screen capture, accessibility control, clipboard access, local UI memory storage, and optional remote VM control. Prefer using it in a dedicated VM or non-sensitive desktop session, review and restrict remote endpoints, avoid storing sensitive screenshots/components, and be cautious with tasks involving account settings, deletion, messages, passwords, or business data.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (109)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: if args.remote and args.remote_cmd: import subprocess try: result = subprocess.run( args.remote_cmd.split() + [os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "platforms", "detect.py"), "--json"], capture_output=True, text=True, timeout=10 )
Confidence: 93%
Finding: result = subprocess.run( args.remote_cmd.split() + [os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "platforms", "detect.py"), "--json"],
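Remediation sketch for the finding above, under stated assumptions: the launcher prefix is parsed with shlex rather than str.split(), and it is accepted only when its executable is on an explicit allowlist. The name build_detect_command and the ALLOWED_LAUNCHERS set are hypothetical and do not come from the skill.

```python
import shlex

# Hypothetical allowlist of launcher executables; adjust to the deployment.
ALLOWED_LAUNCHERS = {"ssh"}


def build_detect_command(remote_cmd, detect_script):
    """Return an argv list that runs detect_script through remote_cmd.

    Raises ValueError if remote_cmd is empty or its executable is not allowlisted.
    """
    prefix = shlex.split(remote_cmd)  # honours quoting, unlike str.split()
    if not prefix or prefix[0] not in ALLOWED_LAUNCHERS:
        raise ValueError(f"launcher not allowed: {remote_cmd!r}")
    return prefix + [detect_script, "--json"]
```

This checks only the executable name. Options passed to an allowed launcher can still be dangerous, so a stricter deployment would likely fix the whole prefix rather than accept it from the command line.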

subprocess module call

Medium

Category: Dangerous Code Execution
Content: Selects the largest window if the app has multiple windows. """ try: r = subprocess.run( ["osascript", "-e", f'tell application "System Events" to tell process "{app_name}"\n' f' set best to missing value\n'
Confidence: 96%
Finding: r = subprocess.run( ["osascript", "-e", f'tell application "System Events" to tell process "{app_name}"\n' f' set best to missing value\n' f

subprocess module call

Medium

Category: Dangerous Code Execution
Content: def get_current_url(app_name="Google Chrome"): """Get the current URL from the browser address bar.""" try: r = subprocess.run( ["osascript", "-e", f'tell application "{app_name}" to return URL of active tab of front window'], capture_output=True, text=True, timeout=15 )
Confidence: 96%
Finding: r = subprocess.run( ["osascript", "-e", f'tell application "{app_name}" to return URL of active tab of front window'], capture_output=True, text=True, timeout=15

subprocess module call

Medium

Category: Dangerous Code Execution
Content: def focus(title): subprocess.run(["osascript", "-e", f'tell application "{title}" to activate'], capture_output=True) print(f"focused: {title}")
Confidence: 97%
Finding: subprocess.run(["osascript", "-e", f'tell application "{title}" to activate'], capture_output=True)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: def close(title): subprocess.run(["osascript", "-e", f'tell application "{title}" to close (every window whose name contains "{title}")'], capture_output=True) print(f"closed: {title}")
Confidence: 98%
Finding: subprocess.run(["osascript", "-e", f'tell application "{title}" to close (every window whose name contains "{title}")'], capture_output=True)
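Remediation sketch for the osascript findings above, which interpolate the app name or title into the script text. One possible approach keeps the AppleScript constant and passes the value as a separate argument that the script reads through its run handler. The helper name build_activate_command is hypothetical, and the sketch only builds the argument list; it does not run osascript.

```python
def build_activate_command(title):
    """Build an osascript argv that receives the app name as an argument.

    The script text is constant and title travels as its own argv item,
    so quotes inside title cannot change the AppleScript that runs.
    """
    if not title or title.startswith("-"):
        # A leading hyphen could be read by osascript as an option.
        raise ValueError(f"invalid application name: {title!r}")
    return [
        "osascript",
        "-e", "on run argv",
        "-e", "tell application (item 1 of argv) to activate",
        "-e", "end run",
        title,
    ]
```

The returned list can be handed to subprocess.run in place of the f-string version. The same pattern would apply to the close and frontmost variants, with a different constant script body.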

subprocess module call

Medium

Category: Dangerous Code Execution
Content: # ═══════════════════════════════════════════ def osascript(script): r = subprocess.run(["osascript", "-e", script], capture_output=True, text=True, timeout=10) return r.stdout.strip() def shell(cmd, timeout=15):
Confidence: 97%
Finding: r = subprocess.run(["osascript", "-e", script], capture_output=True, text=True, timeout=10)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: return r.stdout.strip() def shell(cmd, timeout=15): r = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=timeout, env={**os.environ, "LANG": "en_US.UTF-8", "LC_ALL": "en_US.UTF-8"}) return r.stdout.strip()
Confidence: 100%
Finding: r = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=timeout, env={**os.environ, "LANG": "en_US.UTF-8", "LC_ALL": "en_US.UTF-8"})
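Remediation sketch for the shell=True finding above: a variant of the helper that accepts only an argument list and never invokes a shell, so metacharacters in arguments stay literal. The name run_argv is hypothetical, and callers that currently rely on pipes or redirection would need to be rewritten.

```python
import os
import subprocess
import sys


def run_argv(argv, timeout=15):
    """Run a command given as a list of arguments, without a shell."""
    if isinstance(argv, str):
        raise TypeError("pass a list of arguments, not a shell string")
    r = subprocess.run(
        list(argv),
        shell=False,  # no shell, so ';', '|' and '$(...)' are not interpreted
        capture_output=True,
        text=True,
        timeout=timeout,
        env={**os.environ, "LANG": "en_US.UTF-8", "LC_ALL": "en_US.UTF-8"},
    )
    return r.stdout.strip()


# Usage: the second command is not executed; the text is passed through as-is.
echo_arg = [sys.executable, "-c", "import sys; print(sys.argv[1])", "a; echo b"]
print(run_argv(echo_arg))
```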

subprocess module call

Medium

Category: Dangerous Code Execution
Content: """Bring app window to front.""" if SYSTEM == "Darwin": try: subprocess.run(["osascript", "-e", f'tell application "System Events" to set frontmost of process "{app_name}" to true'], capture_output=True, timeout=5) time.sleep(0.3)
Confidence: 92%
Finding: subprocess.run(["osascript", "-e", f'tell application "System Events" to set frontmost of process "{app_name}" to true'], capture_output=True, timeout=5)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: """ if SYSTEM == "Darwin": try: r = subprocess.run(["osascript", "-l", "JavaScript", "-e", f''' var se = Application("System Events"); var ws = se.processes["{app_name}"].windows(); var best = null;
Confidence: 93%
Finding: r = subprocess.run(["osascript", "-l", "JavaScript", "-e", f''' var se = Application("System Events"); var ws = se.processes["{app_name}"].windows(); var best = null; var bestArea = 0; for

subprocess module call

Medium

Category: Dangerous Code Execution
Content: def take_screenshot(path=SCREENSHOT_PATH): """Take a screenshot and return the image.""" subprocess.run(["/usr/sbin/screencapture", "-x", path], check=True) img = cv2.imread(path) return img
Confidence: 88%
Finding: subprocess.run(["/usr/sbin/screencapture", "-x", path], check=True)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: return x, y = result["x"], result["y"] subprocess.run(["cliclick", f"c:{x},{y}"], check=True) print(json.dumps({ "clicked": True,
Confidence: 82%
Finding: subprocess.run(["cliclick", f"c:{x},{y}"], check=True)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: def take_window_screenshot(window_id, out_path="/tmp/ui_detect_window.png"): """Capture a specific window by ID.""" subprocess.run(["/usr/sbin/screencapture", "-x", "-l", str(window_id), out_path], check=True, timeout=5) return out_path
Confidence: 83%
Finding: subprocess.run(["/usr/sbin/screencapture", "-x", "-l", str(window_id), out_path], check=True, timeout=5)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: def take_fullscreen(out_path="/tmp/ui_detect_full.png"): """Capture full screen.""" subprocess.run(["/usr/sbin/screencapture", "-x", out_path], check=True, timeout=5) return out_path
Confidence: 84%
Finding: subprocess.run(["/usr/sbin/screencapture", "-x", out_path], check=True, timeout=5)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: def detect_ax_dock(): """Get Dock items via AX API.""" r = subprocess.run(['osascript', '-l', 'JavaScript', '-e', ''' var se = Application("System Events"); var list1 = se.processes["Dock"].uiElements[0]; var items = list1.uiElements();
Confidence: 86%
Finding: r = subprocess.run(['osascript', '-l', 'JavaScript', '-e', ''' var se = Application("System Events"); var list1 = se.processes["Dock"].uiElements[0]; var items = list1.uiElements(); var r = []; fo

subprocess module call

Medium

Category: Dangerous Code Execution
Content: def detect_ax_menubar(): """Get menu bar items via AX API.""" r = subprocess.run(['osascript', '-l', 'JavaScript', '-e', ''' var se = Application("System Events"); var front = se.processes.whose({frontmost: true})[0]; var bar = front.menuBars[0];
Confidence: 86%
Finding: r = subprocess.run(['osascript', '-l', 'JavaScript', '-e', ''' var se = Application("System Events"); var front = se.processes.whose({frontmost: true})[0]; var bar = front.menuBars[0]; var items =

subprocess module call

Medium

Category: Dangerous Code Execution
Content: print(f" 📸 Full screen screenshot") else: if app_name: subprocess.run(["osascript", "-e", f'tell application "{app_name}" to activate'], capture_output=True, timeout=5) time.sleep(0.5) win = get_window_info(app_name or get_front_app())
Confidence: 98%
Finding: subprocess.run(["osascript", "-e", f'tell application "{app_name}" to activate'], capture_output=True, timeout=5)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 90%
Finding: The skill advertises and appears to require powerful capabilities including shell, file read/write, environment access, and network access, but does not declare permissions. This creates a transparency and policy-enforcement gap: users and hosting systems cannot accurately assess or constrain what the skill can do before execution, which is especially risky for a GUI automation skill that can interact broadly with the local system.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 96%
Finding: The documented description frames the skill as simple GUI automation, but the referenced behavior includes persistent memory storage, remote VM control over HTTP, remote command execution, extensive file writes, analytics, setup/install actions, and broader workflow/state management. This mismatch is dangerous because it can mislead reviewers and users about the real attack surface, causing them to grant trust to a skill that can exfiltrate data, alter systems, or execute commands beyond the expected GUI-only scope.

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The design explicitly expands a local GUI automation skill into arbitrary remote control over HTTP and SSH, which materially changes the trust boundary and attack surface. In an agent setting, this can enable the model or a downstream prompt injection to drive actions on external systems, capture remote screenshots, and manipulate remote desktops without clear scoping, authentication, or authorization controls.
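Mitigation sketch for the scoping gap described above: before the agent talks to a remote VM, the endpoint URL could be checked against an explicit allowlist of host and port pairs. The function name is_allowed_endpoint and the ALLOWED_ENDPOINTS values are assumptions for illustration; the skill's real endpoint handling is not shown in this report.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of (host, port) pairs the agent may control.
ALLOWED_ENDPOINTS = {("127.0.0.1", 8080)}


def is_allowed_endpoint(url, allowed=ALLOWED_ENDPOINTS):
    """Return True only for http(s) URLs whose host and port are allowlisted."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    if parts.scheme not in ("http", "https"):
        return False
    if port is None:
        port = 443 if parts.scheme == "https" else 80
    return (parts.hostname, port) in allowed
```

Comparing the parsed hostname, rather than searching the raw string, avoids being fooled by URLs that place an allowed address in the userinfo part. An allowlist does not replace authentication on the remote side.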

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: Documenting SSH-based remote execution inside a GUI automation skill introduces a powerful capability that is broader than the stated visual automation purpose. Even if presented as 'future' support, SSH via subprocess can become a bridge to general remote command execution and host access, increasing the risk of abuse by an agent or attacker-controlled instruction flow.

Context-Inappropriate Capability

High

Confidence: 95%
Finding: Using subprocess-driven SSH is unjustified for the described GUI automation use case and creates a path to capabilities far beyond clicking and typing. In practice, any SSH backend tends to blur into arbitrary remote execution, which is especially dangerous when exposed through an agent interface that may act on untrusted instructions.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The open_app action allows arbitrary application launch via an unconstrained shell command template (`{app_command} &`), which expands the skill beyond GUI primitives into general command execution. In a GUI automation skill this materially increases abuse potential because a caller can start unexpected programs, invoke shell metacharacters, or launch network-enabled/debugging tools unrelated to the declared task.
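Mitigation sketch for the open_app finding above: instead of filling a shell template, the action could map a logical app name to a fixed argument list and reject anything else. The APP_COMMANDS table and the resolve_app_command name are hypothetical, and the executables listed are placeholders.

```python
# Hypothetical mapping from logical app names to fixed argument lists.
APP_COMMANDS = {
    "chromium": ["chromium"],
    "files": ["nautilus"],
}


def resolve_app_command(app_name):
    """Return a copy of the fixed argv for app_name, or raise ValueError."""
    try:
        return list(APP_COMMANDS[app_name])
    except KeyError:
        raise ValueError(f"app not allowed: {app_name!r}") from None
```

The resulting list can be passed to subprocess.Popen without shell=True, which also removes the need for the trailing `&` since Popen does not wait for the process.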

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The quit_app fallback uses `pkill -f '{process_name}'`, which can terminate arbitrary processes by pattern match rather than limiting itself to the foreground GUI app. This is risky in a GUI automation context because it enables destructive interruption of unrelated applications, potential data loss, and easy misuse beyond normal UI automation needs.
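Mitigation sketch for the quit_app finding above: a narrower fallback could validate the name and use pkill's exact-match option (-x) instead of a full command-line pattern match (-f). The helper name build_quit_command and the permitted character set are assumptions; the sketch only builds the argument list.

```python
import re

# Letters, digits, spaces, underscores and hyphens only; no regex
# metacharacters, and no leading hyphen that pkill could read as an option.
_SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9 _-]*")


def build_quit_command(process_name):
    """Return a pkill argv that matches the process name exactly."""
    if not _SAFE_NAME.fullmatch(process_name):
        raise ValueError(f"unsafe process name: {process_name!r}")
    # -x requires the whole process name to match, not a substring.
    return ["pkill", "-x", process_name]
```

This still terminates every process with that exact name, so it narrows the blast radius rather than limiting the action to the foreground app.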

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The document explicitly recommends using Chrome DevTools Protocol JavaScript to remove cookie and privacy overlays, which expands the agent from GUI-only interaction into privileged browser automation and DOM manipulation. That scope creep is security-relevant because it can bypass intended user-facing consent flows and create hidden action capability beyond the advertised screenshot→detect→act model.

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The markdown documents use of shell commands such as changing display configuration with xrandr and launching Chromium from the command line, adding host-level command execution not inherent to GUI automation. In a skill advertised as GUI-only, undeclared subprocess capability increases attack surface because it could be repurposed for arbitrary system actions outside the visible browser workflow.

VirusTotal

66/66 vendors flagged this skill as clean.
