Gstack Openclaw

Security checks across malware telemetry and agentic risk

Overview

This is a Markdown-only engineering prompt pack, but it needs review because it under-discloses high-impact workflows like PR merging, production deployment, rollback, and code changes.

Install only if you intend to use it as an advisory engineering workflow pack. Do not grant it GitHub, browser, CI/CD, cloud, monitoring, notification, or production-deployment authority without per-action confirmation, least-privilege credentials, and review of diffs/commands before merges, rollbacks, telemetry collection, or code changes.
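
The per-action confirmation recommended above could be enforced with a small gate in front of every tool call. The following is a minimal sketch, not part of the skill: the action labels, `run_action`, and the `confirm` callback are all hypothetical names chosen for illustration.

```python
from typing import Callable

# Hypothetical action labels; adjust to the tools your agent actually exposes.
HIGH_IMPACT_ACTIONS = {"merge_pr", "deploy", "rollback", "modify_code"}


def run_action(
    action: str,
    execute: Callable[[], str],
    confirm: Callable[[str], bool],
) -> str:
    """Run `execute` only if the action is low impact or the operator approves."""
    if action in HIGH_IMPACT_ACTIONS and not confirm(action):
        return f"blocked: {action} requires operator confirmation"
    return execute()
```

In this sketch a denied high-impact action never reaches `execute`, so the default outcome without explicit approval is that nothing changes.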

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (19)

Intent-Code Divergence

Medium

Confidence: 88%
Finding: The README explicitly claims the skill is documentation-only and non-executable, but later advertises operational behaviors such as deployment, browser testing, GitHub integration, notifications, benchmarking, and rollback decisions. This mismatch can mislead users and reviewers about the trust boundary of the skill, causing them to enable or install it under a weaker risk assumption than its described capabilities warrant.

Description-Behavior Mismatch

Medium

Confidence: 82%
Finding: The manifest/description frames gstack as a thinking or workflow collection, while the README markets concrete execution-oriented capabilities that imply active operational behavior. This kind of capability inflation or ambiguity is dangerous because users may install a seemingly harmless planning skill while expecting or later granting access to functions with materially different security implications.

Intent-Code Divergence

Medium

Confidence: 86%
Finding: The version history reiterates that the package is documentation-only and does not involve external APIs, yet the rest of the README continues to describe behaviors that normally require environment access or third-party interaction. Repeated safety claims in tension with the advertised functionality are a red flag because they can suppress scrutiny and create a false sense of safety.

Intent-Code Divergence

Medium

Confidence: 98%
Finding: The file explicitly claims there is 'no install script' and presents the skill as documentation-only, yet later instructs users to run './install.sh'. This contradiction can mislead users and reviewers into underestimating execution risk, making social engineering or unsafe installation more likely.

Intent-Code Divergence

Low

Confidence: 81%
Finding: The security notice says the skill does not read or write user files, but the instructions tell the user to create a 'GSTACK.md' file in the project root. Even if the write is user-initiated, the claim is materially inaccurate and can cause users to trust the skill's file-handling boundaries more than they should.

Description-Behavior Mismatch

Medium

Confidence: 87%
Finding: The manifest markets the package as a documentation-only prompt skill, but the documented feature set includes deployment, browser testing, GitHub integration, notifications, and monitoring-oriented workflows. This broad operational scope conflicts with the safety framing and may cause operators to install or trust the skill without understanding its effective capabilities and required permissions.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The anomaly example states a trigger condition is met, but the numeric comparison and resulting action are inconsistent. In an operational skill that guides rollout decisions, contradictory rollback logic can cause the agent or operator to continue a bad deployment when rollback is warranted, or create confusion during an incident.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The documented auto-rollback rules say dangerous metrics should immediately trigger rollback, but later example content shows a weaker action for similar conditions. This inconsistency is dangerous because operators may trust the example over the rule, leading to delayed rollback and prolonged user impact during a faulty canary release.
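
One way to keep documented rules and worked examples from drifting apart is to derive every example from a single decision function. The sketch below assumes two metrics and illustrative limits (5% error rate, 2x latency); these names and values are hypothetical and are not taken from the skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanaryThresholds:
    # Illustrative values only; the skill's real thresholds are not reproduced here.
    max_error_rate: float = 0.05
    max_latency_ratio: float = 2.0  # canary p95 latency / baseline p95 latency


def decide(error_rate: float, latency_ratio: float,
           limits: CanaryThresholds = CanaryThresholds()) -> str:
    """Return "rollback" when any metric exceeds its limit, else "continue"."""
    if error_rate > limits.max_error_rate:
        return "rollback"
    if latency_ratio > limits.max_latency_ratio:
        return "rollback"
    return "continue"
```

Because the comparison lives in one place, an example that reports a breached threshold cannot show a weaker action than the rule prescribes. The function only returns a recommendation; whether to act on it can still be left to an operator.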

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The skill is presented as a QA/test-strategy assistant, but it also describes a workflow that progresses from bug discovery to generating fixes, local validation, and PR submission. That scope expansion is dangerous because users may invoke a seemingly advisory QA skill and unexpectedly authorize code-changing actions without a clear trust boundary or explicit consent step.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The documented auto-fix capability is not aligned with the skill's stated QA purpose, creating a deceptive capability boundary. In practice, this can lead operators to grant broader permissions than intended or trigger source changes from a skill they expected to only analyze and recommend tests.

Missing User Warnings

Medium

Confidence: 97%
Finding: The manual installation path instructs users to clone a repository and execute './install.sh' with no disclosure of what the script does or warning that it may modify the local environment. Running repository-provided shell scripts is a common supply-chain risk and is especially concerning here because the document elsewhere claims there is no install script.

Missing User Warnings

Medium

Confidence: 92%
Finding: The RUM example transmits Web Vitals data to an analytics endpoint without any notice that this constitutes telemetry collection and may include user/device/network-related metadata. In a skill meant to guide implementation, this omission can cause users to deploy tracking behavior without considering consent, retention, minimization, or applicable privacy requirements.
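
The consent and minimization concerns above could be addressed before any metric is transmitted. The following is a rough sketch assuming a hypothetical allow-list of fields and a caller-supplied consent flag; it does not reproduce the skill's RUM example.

```python
from typing import Optional

# Fields assumed to be sufficient for performance analysis; hypothetical allow-list.
ALLOWED_FIELDS = ("name", "value", "page_path")


def build_vitals_payload(metric: dict, consent_given: bool) -> Optional[dict]:
    """Return a minimized payload, or None when the user has not consented."""
    if not consent_given:
        return None
    return {key: metric[key] for key in ALLOWED_FIELDS if key in metric}
```

Anything outside the allow-list, such as a user agent string or network details, is dropped rather than forwarded to the analytics endpoint.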

Missing User Warnings

Low

Confidence: 84%
Finding: The external API examples send site URLs to third-party services and include use of API keys, but the skill does not warn that this shares operational metadata externally or that secrets must be protected. This can lead users to expose internal/pre-release URLs, misuse production credentials, or paste keys into insecure environments.
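
A warning of the kind this finding calls for could be backed by two small checks: reading the key from the environment, and refusing to send internal hosts to a third party. This is a sketch under assumptions; the variable name `PERF_API_KEY` and the internal host suffixes are hypothetical.

```python
import os
from urllib.parse import urlparse

# Hypothetical suffixes marking hosts that should never leave the organization.
INTERNAL_SUFFIXES = (".internal", ".local", ".corp.example.com")


def load_api_key(var_name: str = "PERF_API_KEY") -> str:
    """Read the key from the environment rather than hard-coding it."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    return key


def is_safe_to_share(url: str) -> bool:
    """Return False for localhost or hosts matching an internal suffix."""
    host = (urlparse(url).hostname or "").lower()
    if not host or host == "localhost":
        return False
    return not host.endswith(INTERNAL_SUFFIXES)
```

A caller would check `is_safe_to_share(url)` before passing a site URL to any external service, and would use a key scoped to testing rather than a production credential.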

Missing User Warnings

Medium

Confidence: 85%
Finding: The skill advertises automatic rollback behavior without an explicit warning that rollback is a disruptive production action with availability, data consistency, and change-management implications. In deployment contexts, an agent presenting autonomous rollback as routine can encourage unsafe use without operator confirmation or guardrails.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill is explicitly positioned to merge PRs, deploy to production, and perform rollback actions, but it does not prominently warn users that these are high-impact operations that can change live production state. In an agent setting, this increases the chance of accidental destructive actions or over-trusting automation for sensitive workflows.

Missing User Warnings

Medium

Confidence: 94%
Finding: The skill describes automatic rollback behavior that can autonomously alter production state based on detected metrics, but it does not clearly warn the user that the agent may take such action without an additional approval gate. Autonomous rollback can be operationally appropriate, but in a general-purpose skill it is dangerous because false positives, misconfiguration, or prompt misuse could trigger disruptive production changes.

Missing User Warnings

Medium

Confidence: 95%
Finding: The markdown describes an end-to-end repair flow that can modify code and create PRs, but it does not include a clear warning that repository contents may be changed. This increases the risk of unintended modifications, especially in agentic environments where documentation influences tool behavior and user expectations.

Missing User Warnings

Medium

Confidence: 97%
Finding: The '[应用修复]' ("Apply fix") action is an imperative affordance suggesting immediate code modification, yet it appears without any safety notice, review requirement, or confirmation language. Such unqualified repair actions can cause unauthorized or low-quality changes to be applied under the guise of QA assistance.

Missing User Warnings

Medium

Confidence: 97%
Finding: This repeated '[应用修复]' ("Apply fix") action reinforces the same unsafe pattern: a direct code-change affordance with no warning or consent checkpoint. Repetition makes the behavior more likely to be treated as normal and lowers user caution around source modification.

VirusTotal

63/63 vendors flagged this skill as clean.
