Security audit

Avenir Web

Security checks across malware telemetry and agentic risk

Overview

This is a coherent autonomous web-browsing skill, but it needs review because it can act on arbitrary live websites while sending screenshots and task data to OpenRouter and storing sensitive run artifacts locally.

Install only if you are comfortable with an autonomous browser agent sending page screenshots, visible content, task text, and action history to OpenRouter and saving detailed local logs/screenshots. Avoid using it on banking, healthcare, private dashboards, password entry, payment, or account-management pages unless you first add explicit confirmation, redaction, and retention controls. Review the stealth browser instrumentation and remove or gate it if you need compliance with website automation policies.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (28)

os.system() or os exec-family call

High

Category: Dangerous Code Execution
Content: config_without_key["api_keys"]["openrouter_api_key"] = "Your API key here" toml.dump(config_without_key, f) else: os.system(" ".join(["cp", str(config), str(save_file)]))
Confidence: 97%
Finding: os.system(" ".join(["cp", str(config), str(save_file)]))
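
A minimal sketch of how the copy branch could avoid the shell entirely, assuming `config` and `save_file` are filesystem paths as in the flagged line. The helper name `copy_config` is hypothetical; `shutil.copyfile` is a standard-library call swapped in for `os.system`, not the skill's own code.

```python
import shutil
from pathlib import Path


def copy_config(config, save_file):
    """Copy a config file without invoking a shell (hypothetical helper)."""
    src = Path(config)
    dst = Path(save_file)
    if not src.is_file():
        raise FileNotFoundError(f"config file not found: {src}")
    # shutil.copyfile hands the paths straight to the OS, so spaces or shell
    # metacharacters in either path are never interpreted as commands.
    shutil.copyfile(src, dst)
    return dst
```

Because no command string is ever built, a path such as `a; rm -rf ~` is treated as a literal file name rather than as shell input.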

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The agent sends screenshot-derived image data and task instructions to an external LLM service for GUI grounding via network requests. This can expose sensitive on-screen content, internal URLs, or user-entered data to a third party without any consent gate, redaction, or clear limitation in this file.

Intent-Code Divergence

Medium

Confidence: 91%
Finding: The SAY action text is explicitly labeled as 'Chain of Thought'/'thinking step' but is both logged and returned to callers, which can expose internal reasoning, prompts, task context, or sensitive intermediate data. In an agentic browser automation system, this increases risk because model-generated 'thinking' may contain secrets, personal data, or operational details that should never be surfaced externally.

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The prompt layer explicitly tells the agent not to use GOTO for navigation, but the parser, action schema, and validator still accept and normalize GOTO as a valid action. This policy/implementation mismatch can let an LLM or prompt-injected page bypass intended navigation restrictions and trigger off-policy URL navigation, increasing the chance of offsite browsing or reaching prohibited flows.
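
One way to close this gap is to make the validator enforce the same policy the prompt states. The sketch below is illustrative only: `ALLOWED_ACTIONS` and `validate_action` are hypothetical names, and the action set is an assumption rather than the skill's actual schema.

```python
# Hypothetical allow-list; the real action names live in the skill's schema.
# GOTO is deliberately absent so the validator matches the prompt's policy.
ALLOWED_ACTIONS = frozenset({"CLICK", "TYPE", "SELECT", "SAY"})


def validate_action(action: str) -> str:
    """Normalize an action name and reject anything outside the allow-list."""
    normalized = action.strip().upper()
    if normalized not in ALLOWED_ACTIONS:
        raise ValueError(f"action not permitted by policy: {normalized!r}")
    return normalized
```

With this shape, a model output or injected page text that emits `GOTO` fails validation instead of being normalized into a navigation.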

Intent-Code Divergence

Medium

Confidence: 84%
Finding: The select-failure warning tells the agent to try clicking to open a dropdown after earlier instructions say select elements must be handled only with direct SELECT and not by clicking. Conflicting guidance weakens safety controls and can cause the agent to fall back to less constrained UI interactions, making unintended clicks or navigation more likely under failure conditions.

Intent-Code Divergence

Medium

Confidence: 89%
Finding: The function contract says it saves a dictionary config, but it silently accepts other input types and switches into a shell-based file copy path. This mismatch increases the chance that callers pass attacker-controlled or unexpected values, which in this implementation reaches the dangerous `os.system` sink and can enable command injection.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The handler injects JavaScript that hides browser automation by overriding `navigator.webdriver` and interferes with normal page-close behavior by replacing `window.close` and hooking `beforeunload`. These are stealth and tampering behaviors that can bypass site bot-detection or prevent a page from terminating itself, which is risky in an agent that browses arbitrary websites and provides no consent, policy guardrails, or documented justification here.

Intent-Code Divergence

Low

Confidence: 89%
Finding: The module writes debug output to a hard-coded external path on /Volumes/Lexar, which can leak runtime information to an unexpected location and create reliability and privacy issues. Because the path is fixed and outside normal application-controlled storage, logs may be written to removable/shared media, fail unpredictably, or expose sensitive operational details without clear operator consent.
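
A possible remedy is to make debug file logging opt-in and operator-controlled rather than fixed. The sketch below assumes a hypothetical environment variable `AGENT_DEBUG_LOG_DIR` and a hypothetical helper `get_debug_logger`; neither appears in the skill.

```python
import logging
import os
from pathlib import Path


def get_debug_logger(name: str = "debug") -> logging.Logger:
    """Return a logger that writes to disk only when the operator opts in."""
    logger = logging.getLogger(name)
    if logger.handlers:
        return logger  # already configured
    # AGENT_DEBUG_LOG_DIR is a hypothetical variable name used for illustration.
    log_dir = os.environ.get("AGENT_DEBUG_LOG_DIR")
    if log_dir:
        path = Path(log_dir)
        path.mkdir(parents=True, exist_ok=True)
        handler = logging.FileHandler(path / "debug.log")
    else:
        # No directory configured: discard debug records instead of guessing a path.
        handler = logging.NullHandler()
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep debug records out of the root logger
    return logger
```

When the variable is unset nothing is written, so debug output never lands on removable or shared media by default.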

Missing User Warnings

Medium

Confidence: 88%
Finding: The README explicitly states that the system stores per-task screenshots and recorded LLM I/O, which can capture sensitive page contents, credentials, personal data, or proprietary prompts during autonomous web sessions. Although this is documentation rather than executable code, the skill context makes the issue more credible because the agent operates on arbitrary websites and long-horizon tasks, increasing the chance of collecting and retaining sensitive data without prominent privacy warnings, minimization controls, or redaction guidance.

Missing User Warnings

Medium

Confidence: 91%
Finding: The skill explicitly states that `read_page.py` captures a screenshot and sends the screenshot plus page metadata to the main model, but it does not require any user notice, consent, or data-sensitivity check before transmission. In a web-automation context, screenshots can contain credentials, personal data, account details, or confidential business information, so silent forwarding to an external model creates a real privacy and data-handling risk.

Vague Triggers

Low

Confidence: 90%
Finding: The file defines broad web-automation tasks across multiple high-impact sites without any explicit activation constraints, allowed domains beyond a single URL field, or user-confirmation boundaries for consequential actions. This is dangerous because an agent consuming this dataset could execute sensitive browsing and commerce workflows from loosely scoped task text, increasing the chance of overreach, unintended actions, or prompt/task injection through ambiguous task interpretation.

Missing User Warnings

Medium

Confidence: 97%
Finding: The Amazon task explicitly instructs the agent to add two items to the basket but provides no requirement to obtain just-in-time user confirmation before taking a state-changing shopping action. In an agent skill context, this can lead to unauthorized cart manipulation, unwanted purchases if subsequent checkout is automated, and erosion of user trust around financial or account-affecting operations.
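
A just-in-time confirmation gate could look roughly like the sketch below. It is a crude keyword heuristic under stated assumptions: `STATE_CHANGING_KEYWORDS`, `requires_confirmation`, and `run_action` are hypothetical names, and a real implementation would more likely classify actions structurally than by substring match.

```python
from typing import Callable

# Hypothetical phrases that mark an action as state-changing.
STATE_CHANGING_KEYWORDS = ("add to basket", "add to cart", "buy", "checkout", "place order")


def requires_confirmation(action_description: str) -> bool:
    """Return True when the description mentions a state-changing operation."""
    text = action_description.lower()
    return any(keyword in text for keyword in STATE_CHANGING_KEYWORDS)


def run_action(
    action_description: str,
    execute: Callable[[], None],
    confirm: Callable[[str], bool],
) -> bool:
    """Run `execute` unless the action needs confirmation and the user declines."""
    if requires_confirmation(action_description) and not confirm(action_description):
        return False  # user declined; nothing was executed
    execute()
    return True
```

The `confirm` callback is where a caller would prompt the user; passing it in keeps the gate testable without a UI.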

Missing User Warnings

Medium

Confidence: 90%
Finding: The script captures a screenshot of an arbitrary webpage and submits it, along with page metadata and a user question, to a model without any user-facing notice, consent check, or data-sensitivity guardrail. Because webpages can contain personal data, credentials, internal dashboards, or regulated information, silently exporting visual page contents to an external model can create privacy, confidentiality, and compliance risk.

Missing User Warnings

Medium

Confidence: 95%
Finding: The agent persists detailed LLM interaction records to disk, including prompts, outputs, and image paths, which may contain sensitive task data, browsing context, or secrets visible in screenshots. Storing this by default increases the blast radius of any local compromise and creates an unnecessary privacy and retention risk.

Missing User Warnings

Medium

Confidence: 98%
Finding: The grounding function base64-encodes screenshots and transmits them with textual instructions to a remote LLM endpoint. Because screenshots may include credentials, personal data, or confidential business information, this creates a direct data-exfiltration path from the user's browser session to an external provider.
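
A pre-transmission check is one way to narrow this path. The sketch below assumes a hypothetical blocklist of host suffixes and an explicit consent flag supplied by the caller; `BLOCKED_HOST_SUFFIXES` and `may_send_screenshot` are illustrative names, not part of the skill.

```python
from urllib.parse import urlparse

# Hypothetical host suffixes an operator never wants screenshotted.
BLOCKED_HOST_SUFFIXES = ("bank.example", "health.example")


def may_send_screenshot(page_url: str, user_consented: bool) -> bool:
    """Allow upload only with consent and only for hosts outside the blocklist."""
    host = (urlparse(page_url).hostname or "").lower()
    for suffix in BLOCKED_HOST_SUFFIXES:
        # Match the host itself or any subdomain, but not look-alike names.
        if host == suffix or host.endswith("." + suffix):
            return False
    return user_consented
```

The grounding call would then run only when this returns `True`, so blocked hosts are refused even if consent was given earlier in the session.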

Missing User Warnings

Medium

Confidence: 98%
Finding: The TYPE handler logs the full value being entered with `self.logger.info(f"Typed '{value}'")`, which can capture passwords, API keys, search queries, PII, or other secrets in application logs. Because this component automates arbitrary web form interactions, the surrounding skill context makes the issue more dangerous: typed values are very likely to include sensitive credentials and user data.
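
A small change would keep the log line useful without recording the value itself. In the sketch below, `describe_typed_value` and `log_typed` are hypothetical helpers; only the `logger.info` call pattern comes from the finding.

```python
import logging


def describe_typed_value(value: str) -> str:
    """Describe a typed value by length only, never by content."""
    return f"<{len(value)} characters redacted>"


def log_typed(logger: logging.Logger, value: str) -> None:
    """Log that text was typed while keeping the text out of the log."""
    logger.info("Typed %s", describe_typed_value(value))
```

The length is usually enough to debug a failed TYPE action, and the secret never reaches the log file.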

Missing User Warnings

Medium

Confidence: 91%
Finding: This code sends prompt text and optional image data to the third-party OpenRouter service via LiteLLM, including base64-encoded images, but there is no indication here of consent, disclosure, or data-classification checks before transmission. In an agent/runtime context, prompts may contain sensitive user inputs, system prompts, task data, or local image contents, so silent external transmission creates a genuine privacy and data-governance risk.

Missing User Warnings

Medium

Confidence: 95%
Finding: The in-memory LLM_IO_RECORDS store captures full messages and model outputs, and sanitization only truncates large base64 image URLs rather than removing prompt or response contents. This can retain sensitive data in process memory for later exposure through debugging endpoints, crashes, memory dumps, or unintended reuse by other components.
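
Sanitization could go further than truncating image URLs by replacing message bodies before they are recorded. The sketch below assumes OpenAI-style chat messages (a `role` plus `content` that is either a string or a list of typed parts); `redact_message` and `_summarize` are hypothetical names.

```python
import hashlib


def _summarize(text: str) -> str:
    """Replace text with its length and a short hash for correlation."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
    return f"<redacted len={len(text)} sha256={digest}>"


def redact_message(message: dict) -> dict:
    """Return a copy of a chat message with text and image payloads removed."""
    content = message.get("content")
    if isinstance(content, str):
        redacted = _summarize(content)
    elif isinstance(content, list):
        redacted = []
        for part in content:
            if part.get("type") == "text":
                redacted.append({"type": "text", "text": _summarize(part.get("text", ""))})
            else:
                # Drop image and other payloads entirely; keep only the part type.
                redacted.append({"type": part.get("type"), "redacted": True})
    else:
        redacted = content
    return {"role": message.get("role"), "content": redacted}
```

Storing the hash rather than the text still lets two records be matched to the same prompt during debugging, without retaining the prompt in memory.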

Missing User Warnings

Medium

Confidence: 84%
Finding: The function serializes action descriptions and sends them to an external/internal LLM engine for summarization. Action descriptions can contain sensitive user activity, page content, credentials, personal data, or proprietary workflow details, and this code performs no visible redaction, minimization, consent check, or policy gate before transmission.

Missing User Warnings

Medium

Confidence: 88%
Finding: This history update path sends both the prior accumulated summary and new action data to the LLM, increasing the volume and persistence of potentially sensitive information exposed to the model provider or downstream logs. Because prior summaries may already contain condensed sensitive context, repeated updates can compound privacy leakage and broaden data retention beyond what is necessary.

Missing User Warnings

Medium

Confidence: 95%
Finding: The code silently injects scripts into every opened page to conceal automation and block or intercept close behavior, with no warning or limitation based on destination origin. In the context of a browser automation skill, this increases the ability to interact with third-party sites deceptively and can undermine site safety controls or user expectations about page behavior.

Missing User Warnings

Medium

Confidence: 94%
Finding: This code captures and returns visible page text, active element metadata, and active form values, including up to 200 characters of an input's current value and selected option text. In a browser automation or agent setting, that can expose sensitive information such as typed credentials, personal data, tokens, or confidential page content to logs, downstream components, or telemetry without any minimization or consent controls visible in this code.

Missing User Warnings

Medium

Confidence: 77%
Finding: This function persists annotated screenshots that can include the active element's tag, placeholder, id, focus location, and click coordinates. On pages with login forms, internal dashboards, or sensitive workflows, these overlays can reveal contextual secrets or user activity patterns and increase the sensitivity of stored screenshots beyond a plain image.

Missing User Warnings

Low

Confidence: 84%
Finding: The code silently writes debug logs to a fixed file path with no user-facing disclosure, which can surprise users and capture stack traces or status data without their awareness. In a dashboard tied to agent activity, those logs may contain task details, errors, or operational history that should not be persisted implicitly.

Missing User Warnings

Medium

Confidence: 92%
Finding: The function sends `task_description`, `website`, and policy-derived prompt content to an external OpenRouter-backed LLM service via `litellm.acompletion`. Even if this is expected product behavior, the code has no visible minimization, consent, or sensitivity gating at this call site, so user-supplied operational data may be disclosed to a third party. In a planning/automation context, task text can easily contain credentials, internal URLs, or regulated data, making external transmission a real privacy and data-governance risk.

VirusTotal

63/63 vendors flagged this skill as clean.
