GUI Agent

Security checks across malware telemetry and agentic risk

Overview

This is a genuine GUI automation skill, but it gives an agent broad control over the screen, keyboard, local memory, and remote VMs, and some of its execution paths are under-scoped. Users should review those paths carefully.

Install only if you are comfortable granting the agent screen capture, accessibility control, clipboard access, local UI memory storage, and optional remote VM control. Prefer using it in a dedicated VM or non-sensitive desktop session, review and restrict remote endpoints, avoid storing sensitive screenshots/components, and be cautious with tasks involving account settings, deletion, messages, passwords, or business data.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (109)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
if args.remote and args.remote_cmd:
        import subprocess
        try:
            result = subprocess.run(
                args.remote_cmd.split() + [os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "platforms", "detect.py"), "--json"],
                capture_output=True, text=True, timeout=10
            )
Confidence
93% confidence
Finding
result = subprocess.run( args.remote_cmd.split() + [os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "platforms", "detect.py"), "--json"],
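
The excerpt above splits `args.remote_cmd` on whitespace, which mishandles quoted arguments and accepts any executable as the launcher. One possible mitigation is sketched below, assuming a hypothetical allowlist `ALLOWED_REMOTE_LAUNCHERS` and a hypothetical helper `build_remote_argv`; neither name appears in the scanned skill.

```python
import shlex

# Hypothetical allowlist of launcher executables; the skill does not define one.
ALLOWED_REMOTE_LAUNCHERS = {"ssh"}

def build_remote_argv(remote_cmd, script_path):
    """Return an argv list suitable for subprocess.run, or raise ValueError."""
    parts = shlex.split(remote_cmd)  # honors shell-style quoting, unlike str.split()
    if not parts:
        raise ValueError("empty remote command")
    if parts[0] not in ALLOWED_REMOTE_LAUNCHERS:
        raise ValueError(f"launcher not allowed: {parts[0]!r}")
    return parts + [script_path, "--json"]
```

The resulting list can be passed to `subprocess.run` unchanged. Note that the check covers only the launcher; options passed to the launcher itself would still need review.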

subprocess module call

Medium
Category
Dangerous Code Execution
Content
Selects the largest window if the app has multiple windows.
    """
    try:
        r = subprocess.run(
            ["osascript", "-e",
             f'tell application "System Events" to tell process "{app_name}"\n'
             f'  set best to missing value\n'
Confidence
96% confidence
Finding
r = subprocess.run( ["osascript", "-e", f'tell application "System Events" to tell process "{app_name}"\n' f' set best to missing value\n' f

subprocess module call

Medium
Category
Dangerous Code Execution
Content
def get_current_url(app_name="Google Chrome"):
    """Get the current URL from the browser address bar."""
    try:
        r = subprocess.run(
            ["osascript", "-e", f'tell application "{app_name}" to return URL of active tab of front window'],
            capture_output=True, text=True, timeout=15
        )
Confidence
96% confidence
Finding
r = subprocess.run( ["osascript", "-e", f'tell application "{app_name}" to return URL of active tab of front window'], capture_output=True, text=True, timeout=15

subprocess module call

Medium
Category
Dangerous Code Execution
Content
def focus(title):
    subprocess.run(["osascript", "-e", f'tell application "{title}" to activate'], capture_output=True)
    print(f"focused: {title}")
Confidence
97% confidence
Finding
subprocess.run(["osascript", "-e", f'tell application "{title}" to activate'], capture_output=True)
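
Several of the osascript excerpts interpolate `title` or `app_name` directly into the AppleScript source, so a value containing a double quote can change the script's meaning. A minimal sketch of an alternative follows, assuming osascript forwards trailing command-line arguments to the script's `run` handler as its man page describes. The names `ACTIVATE_SCRIPT` and `activate_argv` are hypothetical.

```python
import subprocess

# The script text is constant; the application name arrives as data via argv.
ACTIVATE_SCRIPT = (
    "on run argv\n"
    "  tell application (item 1 of argv) to activate\n"
    "end run"
)

def activate_argv(title):
    """Build the osascript argv without interpolating title into the script."""
    return ["osascript", "-e", ACTIVATE_SCRIPT, title]

def focus(title):
    # Same behavior as the excerpt, but title is never parsed as AppleScript.
    subprocess.run(activate_argv(title), capture_output=True, timeout=5)
    print(f"focused: {title}")
```

The same pattern would apply to the other excerpts that build a script from `app_name`.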

subprocess module call

Medium
Category
Dangerous Code Execution
Content
def close(title):
    subprocess.run(["osascript", "-e",
        f'tell application "{title}" to close (every window whose name contains "{title}")'],
        capture_output=True)
    print(f"closed: {title}")
Confidence
98% confidence
Finding
subprocess.run(["osascript", "-e", f'tell application "{title}" to close (every window whose name contains "{title}")'], capture_output=True)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
# ═══════════════════════════════════════════

def osascript(script):
    r = subprocess.run(["osascript", "-e", script], capture_output=True, text=True, timeout=10)
    return r.stdout.strip()

def shell(cmd, timeout=15):
Confidence
97% confidence
Finding
r = subprocess.run(["osascript", "-e", script], capture_output=True, text=True, timeout=10)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
return r.stdout.strip()

def shell(cmd, timeout=15):
    r = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=timeout,
                       env={**os.environ, "LANG": "en_US.UTF-8", "LC_ALL": "en_US.UTF-8"})
    return r.stdout.strip()
Confidence
100% confidence
Finding
r = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=timeout, env={**os.environ, "LANG": "en_US.UTF-8", "LC_ALL": "en_US.UTF-8"})
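
The `shell()` helper above runs its argument through the system shell, so any metacharacter in `cmd` is interpreted. A hedged sketch of a drop-in variant that avoids `shell=True` is shown here; `run_argv` is a hypothetical name, and callers that rely on pipes or redirection would need to be rewritten rather than switched over directly.

```python
import os
import shlex
import subprocess

def run_argv(cmd, timeout=15):
    """Run a command without a shell; cmd is a string or a list of arguments."""
    argv = shlex.split(cmd) if isinstance(cmd, str) else list(cmd)
    if not argv:
        raise ValueError("empty command")
    r = subprocess.run(
        argv, shell=False, capture_output=True, text=True, timeout=timeout,
        env={**os.environ, "LANG": "en_US.UTF-8", "LC_ALL": "en_US.UTF-8"},
    )
    return r.stdout.strip()
```

Because no shell is involved, a string such as `$(id); ls` passed as an argument reaches the child process as literal text.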

subprocess module call

Medium
Category
Dangerous Code Execution
Content
"""Bring app window to front."""
    if SYSTEM == "Darwin":
        try:
            subprocess.run(["osascript", "-e",
                f'tell application "System Events" to set frontmost of process "{app_name}" to true'],
                capture_output=True, timeout=5)
            time.sleep(0.3)
Confidence
92% confidence
Finding
subprocess.run(["osascript", "-e", f'tell application "System Events" to set frontmost of process "{app_name}" to true'], capture_output=True, timeout=5)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
"""
    if SYSTEM == "Darwin":
        try:
            r = subprocess.run(["osascript", "-l", "JavaScript", "-e", f'''
var se = Application("System Events");
var ws = se.processes["{app_name}"].windows();
var best = null;
Confidence
93% confidence
Finding
r = subprocess.run(["osascript", "-l", "JavaScript", "-e", f''' var se = Application("System Events"); var ws = se.processes["{app_name}"].windows(); var best = null; var bestArea = 0; for

subprocess module call

Medium
Category
Dangerous Code Execution
Content
def take_screenshot(path=SCREENSHOT_PATH):
    """Take a screenshot and return the image."""
    subprocess.run(["/usr/sbin/screencapture", "-x", path], check=True)
    img = cv2.imread(path)
    return img
Confidence
88% confidence
Finding
subprocess.run(["/usr/sbin/screencapture", "-x", path], check=True)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
return
    
    x, y = result["x"], result["y"]
    subprocess.run(["cliclick", f"c:{x},{y}"], check=True)
    
    print(json.dumps({
        "clicked": True,
Confidence
82% confidence
Finding
subprocess.run(["cliclick", f"c:{x},{y}"], check=True)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
def take_window_screenshot(window_id, out_path="/tmp/ui_detect_window.png"):
    """Capture a specific window by ID."""
    subprocess.run(["/usr/sbin/screencapture", "-x", "-l", str(window_id), out_path],
                   check=True, timeout=5)
    return out_path
Confidence
83% confidence
Finding
subprocess.run(["/usr/sbin/screencapture", "-x", "-l", str(window_id), out_path], check=True, timeout=5)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
def take_fullscreen(out_path="/tmp/ui_detect_full.png"):
    """Capture full screen."""
    subprocess.run(["/usr/sbin/screencapture", "-x", out_path], check=True, timeout=5)
    return out_path
Confidence
84% confidence
Finding
subprocess.run(["/usr/sbin/screencapture", "-x", out_path], check=True, timeout=5)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
def detect_ax_dock():
    """Get Dock items via AX API."""
    r = subprocess.run(['osascript', '-l', 'JavaScript', '-e', '''
var se = Application("System Events");
var list1 = se.processes["Dock"].uiElements[0];
var items = list1.uiElements();
Confidence
86% confidence
Finding
r = subprocess.run(['osascript', '-l', 'JavaScript', '-e', ''' var se = Application("System Events"); var list1 = se.processes["Dock"].uiElements[0]; var items = list1.uiElements(); var r = []; fo

subprocess module call

Medium
Category
Dangerous Code Execution
Content
def detect_ax_menubar():
    """Get menu bar items via AX API."""
    r = subprocess.run(['osascript', '-l', 'JavaScript', '-e', '''
var se = Application("System Events");
var front = se.processes.whose({frontmost: true})[0];
var bar = front.menuBars[0];
Confidence
86% confidence
Finding
r = subprocess.run(['osascript', '-l', 'JavaScript', '-e', ''' var se = Application("System Events"); var front = se.processes.whose({frontmost: true})[0]; var bar = front.menuBars[0]; var items =

subprocess module call

Medium
Category
Dangerous Code Execution
Content
print(f"  📸 Full screen screenshot")
    else:
        if app_name:
            subprocess.run(["osascript", "-e", f'tell application "{app_name}" to activate'],
                           capture_output=True, timeout=5)
            time.sleep(0.5)
        win = get_window_info(app_name or get_front_app())
Confidence
98% confidence
Finding
subprocess.run(["osascript", "-e", f'tell application "{app_name}" to activate'], capture_output=True, timeout=5)

Lp3

Medium
Category
MCP Least Privilege
Confidence
90% confidence
Finding
The skill advertises and appears to require powerful capabilities including shell, file read/write, environment access, and network access, but does not declare permissions. This creates a transparency and policy-enforcement gap: users and hosting systems cannot accurately assess or constrain what the skill can do before execution, which is especially risky for a GUI automation skill that can interact broadly with the local system.

Tp4

High
Category
MCP Tool Poisoning
Confidence
96% confidence
Finding
The documented description frames the skill as simple GUI automation, but the referenced behavior includes persistent memory storage, remote VM control over HTTP, remote command execution, extensive file writes, analytics, setup/install actions, and broader workflow/state management. This mismatch is dangerous because it can mislead reviewers and users about the real attack surface, causing them to grant trust to a skill that can exfiltrate data, alter systems, or execute commands beyond the expected GUI-only scope.

Description-Behavior Mismatch

Medium
Confidence
88% confidence
Finding
The design explicitly expands a local GUI automation skill into arbitrary remote control over HTTP and SSH, which materially changes the trust boundary and attack surface. In an agent setting, this can enable the model or a downstream prompt injection to drive actions on external systems, capture remote screenshots, and manipulate remote desktops without clear scoping, authentication, or authorization controls.
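
One way to narrow the remote-control scope this finding describes is to check each remote URL against an explicit allowlist before any request is sent. The sketch below is illustrative only: `ALLOWED_ENDPOINTS`, `check_endpoint`, and the host name are hypothetical and not taken from the skill.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of (scheme, host, port) triples for remote VMs.
ALLOWED_ENDPOINTS = {("https", "vm.example.internal", 8443)}

def check_endpoint(url):
    """Return url if its scheme, host, and port are allowlisted; else raise ValueError."""
    parts = urlsplit(url)
    key = (parts.scheme, parts.hostname, parts.port)
    if key not in ALLOWED_ENDPOINTS:
        raise ValueError(f"remote endpoint not allowed: {url!r}")
    return url
```

Comparing the parsed host rather than the raw string avoids being fooled by URLs that embed an allowed name in the userinfo part.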

Description-Behavior Mismatch

Medium
Confidence
92% confidence
Finding
Documenting SSH-based remote execution inside a GUI automation skill introduces a powerful capability that is broader than the stated visual automation purpose. Even if presented as 'future' support, SSH via subprocess can become a bridge to general remote command execution and host access, increasing the risk of abuse by an agent or attacker-controlled instruction flow.

Context-Inappropriate Capability

High
Confidence
95% confidence
Finding
Using subprocess-driven SSH is unjustified for the described GUI automation use case and creates a path to capabilities far beyond clicking and typing. In practice, any SSH backend tends to blur into arbitrary remote execution, which is especially dangerous when exposed through an agent interface that may act on untrusted instructions.

Context-Inappropriate Capability

Medium
Confidence
95% confidence
Finding
The open_app action allows arbitrary application launch via an unconstrained shell command template (`{app_command} &`), which expands the skill beyond GUI primitives into general command execution. In a GUI automation skill this materially increases abuse potential because a caller can start unexpected programs, invoke shell metacharacters, or launch network-enabled/debugging tools unrelated to the declared task.
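
A constrained alternative to the `{app_command} &` template could map a small set of known keys to fixed argument lists and launch them without a shell. This is a sketch under stated assumptions: the `ALLOWED_APPS` table, its entries, and the `resolve_app` helper are hypothetical, and this `open_app` is not the skill's implementation.

```python
import subprocess

# Hypothetical allowlist mapping a short key to a fixed argv.
ALLOWED_APPS = {
    "chromium": ["chromium"],
    "files": ["nautilus"],
}

def resolve_app(app_key):
    """Return the argv for an allowlisted app, or raise ValueError."""
    try:
        return list(ALLOWED_APPS[app_key])
    except KeyError:
        raise ValueError(f"application not allowed: {app_key!r}") from None

def open_app(app_key):
    # Popen returns immediately, which replaces the trailing '&' of the template.
    return subprocess.Popen(
        resolve_app(app_key),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```

With this shape, the caller chooses among known applications instead of supplying a command line.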

Context-Inappropriate Capability

Medium
Confidence
91% confidence
Finding
The quit_app fallback uses `pkill -f '{process_name}'`, which can terminate arbitrary processes by pattern match rather than limiting itself to the foreground GUI app. This is risky in a GUI automation context because it enables destructive interruption of unrelated applications, potential data loss, and easy misuse beyond normal UI automation needs.
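
To limit the blast radius of the `pkill -f` fallback, the process name could be validated and matched exactly with `pkill -x` instead of as a pattern against full command lines. The sketch below assumes a hypothetical helper `quit_argv` and a deliberately strict name pattern. On Linux, `pkill` without `-f` matches against a process name that may be truncated, so exact matching can miss long names.

```python
import re

# Conservative pattern for process names: no spaces, quotes, or regex operators.
_SAFE_NAME = re.compile(r"[A-Za-z0-9._-]{1,64}")

def quit_argv(process_name):
    """Build a pkill argv that matches the process name exactly."""
    if not _SAFE_NAME.fullmatch(process_name):
        raise ValueError(f"unsafe process name: {process_name!r}")
    # '--' ends option parsing so a name starting with '-' is not read as a flag.
    return ["pkill", "-x", "--", process_name]
```

The returned list is meant to be passed to `subprocess.run` without `shell=True`.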

Context-Inappropriate Capability

Medium
Confidence
94% confidence
Finding
The document explicitly recommends using Chrome DevTools Protocol JavaScript to remove cookie and privacy overlays, which expands the agent from GUI-only interaction into privileged browser automation and DOM manipulation. That scope creep is security-relevant because it can bypass intended user-facing consent flows and create hidden action capability beyond the advertised screenshot→detect→act model.

Context-Inappropriate Capability

Medium
Confidence
89% confidence
Finding
The markdown documents use of shell commands such as changing display configuration with xrandr and launching Chromium from the command line, adding host-level command execution not inherent to GUI automation. In a skill advertised as GUI-only, undocumented subprocess capability increases attack surface because it could be repurposed for arbitrary system actions outside the visible browser workflow.

VirusTotal

66/66 vendors flagged this skill as clean.
