Wip Ai Devops Toolbox Private

Security checks across malware telemetry and agentic risk

Overview

This DevOps toolbox is mostly coherent, but it bundles under-disclosed private/browser-automation materials with sensitive cookie-import capabilities and makes broad persistent changes to developer tooling.

Install only if you are comfortable giving this toolbox broad local developer-environment authority, including global tools, hooks, GitHub/npm publishing workflows, and persistent agent configuration. Review the bundled ai/ and gstack-private contents first; do not use the browser cookie import features on personal or production accounts unless you explicitly intend to hand authenticated browser sessions to automation.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (224)

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The public guide explicitly instructs readers to consult a private file under a home-directory path, which discloses the existence, location, and likely sensitivity of internal operational documentation. In an agent-executable context, this can prompt unauthorized access attempts to local private materials that are unrelated to the public skill's stated purpose.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The script copies release notes directly from the private repository into the public repository, which can disclose internal roadmap details, security information, customer names, issue references, or other non-public metadata. This exceeds the documented behavior of a file-only sync and creates a real confidentiality risk because only repository-name strings are rewritten, not sensitive content within the notes.

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The comments suggest a narrow transformation of private references, but the implementation actually imports the full private release body and performs only a simple string replacement on repo identifiers. That mismatch is dangerous because operators may believe the process is safe while sensitive prose, links, incident details, or internal references remain intact and get published publicly.
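One way to close the gap described in the two findings above is to screen the release body before it is published, instead of relying on the repository-name rewrite alone. The following is a minimal sketch, not the toolbox's actual sync script: the pattern list, the function names, and the `acme/...` repository names are all hypothetical, and a real deployment would need patterns tuned to its own issue tracker and internal hostnames.

```python
import re

# Hypothetical patterns for content that should not leave a private repo.
SENSITIVE_PATTERNS = {
    "issue_reference": re.compile(r"(?<!\w)#\d+\b"),
    "internal_url": re.compile(r"https?://[^\s)]*\binternal\b[^\s)]*", re.IGNORECASE),
    "security_keyword": re.compile(r"\b(CVE-\d{4}-\d+|incident|vulnerability)\b", re.IGNORECASE),
}


def rewrite_repo_name(body: str, private_repo: str, public_repo: str) -> str:
    """The narrow transformation the findings describe: repo identifiers only."""
    return body.replace(private_repo, public_repo)


def find_sensitive_lines(body: str) -> list:
    """Return (line_number, pattern_name, line) for every suspicious line."""
    hits = []
    for number, line in enumerate(body.splitlines(), start=1):
        for name, pattern in SENSITIVE_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name, line))
    return hits


def prepare_public_notes(body: str, private_repo: str, public_repo: str) -> str:
    """Rewrite repo names, then refuse to publish if anything suspicious remains."""
    rewritten = rewrite_repo_name(body, private_repo, public_repo)
    hits = find_sensitive_lines(rewritten)
    if hits:
        raise ValueError(f"release notes need manual review: {hits}")
    return rewritten
```

The design choice here is to fail closed: notes that trip any pattern are held for a human rather than published with partial redaction.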

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The release notes describe behavior where a DevOps toolbox silently performs host-level bootstrapping by globally installing additional software when a command is missing. Even though this is documentation rather than executable code, it signals a feature that expands scope from project tooling into modifying the user's environment without explicit consent, which is risky in an agent-callable skill context.

Context-Inappropriate Capability

Medium

Confidence: 84%
Finding: A silent global npm install is a system-changing action that is not clearly necessary for the stated DevOps-toolbox purpose and can be abused to alter the host environment unexpectedly. In an AI-assisted or agent-driven workflow, this is more dangerous because the agent may trigger the behavior automatically, reducing user awareness and control over what gets installed.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The documentation explicitly describes importing and decrypting cookies from real Chromium browsers, which enables reuse of authenticated browser sessions and access to sensitive account state. In a DevOps-focused skill, this is out of scope and materially increases the chance of credential/session theft or unauthorized access if an agent invokes it on a developer workstation.

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: Interactive cookie-picker routes and UI make sensitive session-cookie import functionality easier to access and operationalize, broadening the attack surface beyond normal browser automation. Because the skill is presented as a DevOps toolbox, exposing a browser-cookie harvesting workflow is not justified by stated purpose and could facilitate silent session takeover.

Context-Inappropriate Capability

Medium

Confidence: 84%
Finding: The handoff/resume workflow preserves cookies and localStorage while transferring state into a visible Chrome session, which can expose authenticated session material across execution contexts. While useful for MFA/CAPTCHA recovery, in this skill context it expands session-handling power beyond typical DevOps needs and increases risk of unintended data exposure or misuse.

Context-Inappropriate Capability

Medium

Confidence: 81%
Finding: Arbitrary in-page JavaScript execution via `js` and `eval` allows unrestricted interaction with page content, DOM state, and authenticated browser context. Although common in browser automation, this is broader than the declared DevOps purpose and can be abused to exfiltrate sensitive page data, tokens, or internal application state from visited sites.

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The preamble persists session markers and appends per-use analytics data under ~/.gstack, which is unrelated to core browser QA functionality and occurs automatically. Even though the data stays local, silent telemetry and session tracking expand the skill's data collection footprint and can expose repository names, usage patterns, and behavioral metadata without clear opt-in.
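The missing opt-in can be illustrated with a small gate in front of the analytics write. This is a sketch under stated assumptions rather than the skill's real preamble: the `TOOLBOX_TELEMETRY` environment variable, the function names, and the event shape are hypothetical, and the log path is passed in by the caller instead of being hard-coded.

```python
import json
import os
from pathlib import Path


def telemetry_enabled(env=None) -> bool:
    """Telemetry is off unless the user sets the (hypothetical) opt-in variable."""
    env = os.environ if env is None else env
    return env.get("TOOLBOX_TELEMETRY", "").lower() in {"1", "true", "yes"}


def record_usage(event: dict, log_path: Path, env=None) -> bool:
    """Append one JSON line to log_path only when the user has opted in.

    Returns True if the event was written, False if it was skipped.
    """
    if not telemetry_enabled(env):
        return False
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(event) + "\n")
    return True
```

With the default environment nothing is created on disk, so an invocation leaves no session markers or usage logs unless the user asked for them.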

Description-Behavior Mismatch

Medium

Confidence: 84%
Finding: The skill advertises itself as a browser QA tool but also embeds proactive workflow steering and cross-skill suggestion behavior. This broadens agent behavior beyond the declared purpose and can manipulate session flow or user decisions in ways not necessary for QA execution.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: The skill instructs the agent to install external software via a remote curl-to-shell bootstrap if bun is missing. Piping network-fetched code directly into bash creates a supply-chain execution path with little verification, allowing compromise if the source, transport, or environment is tampered with.
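A common alternative to piping a download straight into bash is to fetch the installer to disk, compare it against a checksum pinned in advance, and only then execute the verified bytes. The sketch below assumes such a pinned SHA-256 value is available from a trusted channel; the function names are hypothetical, and the download step itself is left out.

```python
import hashlib
import hmac
import subprocess
from pathlib import Path


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_installer(data: bytes, expected_sha256: str) -> bool:
    """Compare the digest of the downloaded bytes with the pinned value."""
    return hmac.compare_digest(sha256_hex(data), expected_sha256.strip().lower())


def run_if_verified(script_path: Path, expected_sha256: str) -> None:
    """Run the installer only when its checksum matches the pinned value."""
    data = script_path.read_bytes()
    if not verify_installer(data, expected_sha256):
        raise RuntimeError("installer checksum mismatch; refusing to run")
    # Feed the exact bytes that were verified to bash on stdin, so the file
    # cannot be swapped between the check and the execution.
    subprocess.run(["bash", "-s"], input=data, check=True)
```

This does not make the installer trustworthy by itself; it only ensures that what runs is the same artifact that was reviewed when the checksum was pinned.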

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The skill supports importing real browser cookies to access authenticated pages, which exposes live session tokens to the automation environment. If mishandled, logged, or reused outside intended scope, those cookies could enable account takeover or unauthorized access to sensitive systems.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The skill’s preamble performs actions beyond browsing QA, including update checks, telemetry writes, session tracking, and conditional upgrade behavior. This broadens the trust boundary and causes side effects on the host filesystem before the user has explicitly approved anything unrelated to page testing.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: Importing cookies directly from local desktop browsers gives the skill access to authenticated web sessions that may belong to unrelated sites and accounts. In an agent context, this can enable account takeover, cross-service impersonation, or unauthorized access to sensitive data far beyond the declared QA use case.
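If cookie import is used at all, one way to limit the exposure to unrelated sites is to filter the imported cookies against an explicit domain allowlist before they reach the automation context. The sketch below is illustrative only: it assumes cookies are represented as dictionaries with a `domain` key, and the function names and example domains are hypothetical.

```python
def domain_matches(cookie_domain: str, allowed_domain: str) -> bool:
    """True if cookie_domain equals allowed_domain or is a subdomain of it."""
    cookie_domain = cookie_domain.lstrip(".").lower()
    allowed_domain = allowed_domain.lstrip(".").lower()
    return cookie_domain == allowed_domain or cookie_domain.endswith("." + allowed_domain)


def filter_cookies(cookies: list, allowed_domains: list) -> list:
    """Keep only cookies whose domain falls under an explicitly allowed domain."""
    return [
        cookie
        for cookie in cookies
        if any(domain_matches(cookie["domain"], allowed) for allowed in allowed_domains)
    ]
```

Matching on a leading-dot boundary matters: a plain suffix check would let `notexample.test` pass an allowlist entry for `example.test`.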

Context-Inappropriate Capability

High

Confidence: 99%
Finding: This module is explicitly designed to locate, decrypt, and return real Chromium browser cookies from local user profiles, including deriving keys from macOS Keychain secrets. In a skill described as a DevOps/release/compliance toolbox, that capability is unrelated and enables session-token theft and account hijacking, making the mismatch highly suspicious and dangerous.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The code spawns the macOS `security` tool to retrieve Keychain material needed to decrypt browser cookies, which is a credential-access technique rather than a normal DevOps function. Accessing Keychain-backed browser secrets lets the skill recover authenticated web sessions and impersonate the user across sites and internal services.

Intent-Code Divergence

Medium

Confidence: 93%
Finding: The code explicitly relies on a 'localhost-only, no auth' trust boundary, but this handler itself does not enforce that boundary by validating the Host/Origin or requiring a loopback client. It exposes powerful endpoints that enumerate installed browsers, list domains, decrypt browser cookies, inject them into a Playwright session, and remove cookies; if the service is ever reachable beyond the local machine due to binding, proxying, or misconfiguration, an attacker could abuse it without authentication.
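The missing enforcement can be sketched as a per-request check that the handler would run before touching any cookie endpoint. This is a minimal illustration, not the skill's actual handler: the function names and the port number in the usage note are hypothetical, and the check assumes the caller can supply the client IP plus the raw Host and Origin header values.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_address(address: str) -> bool:
    """True only for literal loopback IPs such as 127.0.0.1 or ::1."""
    try:
        return ipaddress.ip_address(address).is_loopback
    except ValueError:
        return False


def hostname_is_local(hostname) -> bool:
    """Accept 'localhost' or a literal loopback IP; reject everything else."""
    if not hostname:
        return False
    return hostname.lower() == "localhost" or is_loopback_address(hostname)


def header_hostname(value: str, has_scheme: bool):
    """Extract the hostname from a Host header (no scheme) or an Origin header."""
    try:
        return urlsplit(value if has_scheme else "//" + value).hostname
    except ValueError:
        return None


def request_is_allowed(client_ip: str, host_header: str, origin_header=None) -> bool:
    """Enforce the 'localhost-only' boundary on every request."""
    if not is_loopback_address(client_ip):
        return False  # the connection did not come from this machine
    if not hostname_is_local(header_hostname(host_header, has_scheme=False)):
        return False  # guards against DNS rebinding via a foreign Host value
    if origin_header is not None:
        # A browser page on another site is calling the local service.
        return hostname_is_local(header_hostname(origin_header, has_scheme=True))
    return True
```

For example, `request_is_allowed("127.0.0.1", "localhost:9400")` passes, while the same client with `Host: attacker.example` or `Origin: https://evil.example` is rejected even though the TCP connection is local.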

Description-Behavior Mismatch

Medium

Confidence: 86%
Finding: `storage set` performs a state-changing write inside a module presented as read-only, which can bypass policy assumptions or guardrails built around command categories. In an agentic DevOps/browser automation context, a caller may permit `read` operations but deny writes, so this mismatch can enable unintended mutation of page state and affect authentication, feature flags, or workflow behavior.
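The policy problem is easier to see when commands are classified by what they do instead of by the module they live in. The sketch below is an assumption about how such a table could look, not the skill's real command registry: the `Access` enum, the `COMMAND_ACCESS` table, and both function names are hypothetical.

```python
from enum import Enum


class Access(Enum):
    READ = "read"
    WRITE = "write"


# Hypothetical table keyed by (command, subcommand); None means "no subcommand
# or any subcommand without its own entry".
COMMAND_ACCESS = {
    ("storage", None): Access.READ,    # listing storage is a read
    ("storage", "set"): Access.WRITE,  # mutating storage is a write
}


def classify(command: str, args: list) -> Access:
    """Classify by command and first argument; unknown commands count as writes."""
    subcommand = args[0] if args else None
    if (command, subcommand) in COMMAND_ACCESS:
        return COMMAND_ACCESS[(command, subcommand)]
    return COMMAND_ACCESS.get((command, None), Access.WRITE)


def is_permitted(command: str, args: list, allowed: set) -> bool:
    """Check a concrete invocation against the caller's allowed access levels."""
    return classify(command, args) in allowed
```

Under this scheme a caller that grants only `Access.READ` can still list storage, but `storage set` is denied, and any command missing from the table is treated as a write by default.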

Description-Behavior Mismatch

High

Confidence: 97%
Finding: This file exposes a broad browser-automation surface including navigation, clicking, form entry, uploads, cookie/header manipulation, and local file access, which is far beyond the stated DevOps-toolbox purpose. In an agent-callable skill, that mismatch is dangerous because it enables web interaction and data movement capabilities that could be abused for account/session misuse, exfiltration, or unintended actions on arbitrary sites.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The code can import decrypted cookies from installed local browsers and inject them into the automation context, effectively transferring authenticated session state into the agent-controlled browser. That capability is highly sensitive and not justified by the skill description; if misused, it could enable unauthorized access to user accounts and bypass normal authentication boundaries.

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: Allowing arbitrary header injection and user-agent spoofing lets the automation impersonate other clients and attach attacker-chosen authentication or routing metadata to requests. In combination with the browsing features in this file, this increases the risk of stealthy scraping, policy evasion, or misuse of bearer tokens/cookies against third-party services.

Description-Behavior Mismatch

Low

Confidence: 95%
Finding: The skill is presented as a safety guardrail, but the embedded activation snippet also performs an unrelated write to ~/.gstack/analytics/skill-usage.jsonl and records repository context. This creates hidden side effects and local telemetry/persistence that a user would not reasonably expect from command inspection alone.

Intent-Code Divergence

Low

Confidence: 94%
Finding: The documentation claims every bash command will be checked, implying passive inspection, but it also includes a side-effecting command that creates a directory and appends analytics data under the user's home directory. That mismatch is security-relevant because it hides state changes unrelated to the stated safety function.

Intent-Code Divergence

Medium

Confidence: 94%
Finding: The skill explicitly claims to be read-only, but elsewhere instructs the agent to create temp files, persist session IDs, touch marker files, and write analytics and contributor logs. That mismatch can mislead users and reviewers about the true side effects of invocation, undermining informed consent and safe deployment decisions.

VirusTotal

66/66 vendors flagged this skill as clean.
