OpenClaw-Skill-Creator

Security checks across malware telemetry and agentic risk

Overview

This skill is not clearly malicious, but it needs review because its helper scripts can send skill/eval content through the Claude CLI, write into local Claude project files, and terminate local processes on a port.

Install only if you intentionally want a Claude Code-style skill creation and evaluation workflow. Avoid using it on proprietary or secret skill content unless you are comfortable sending prompts and eval data through your configured Claude CLI session. Prefer the safer successor mentioned in the artifact changelog, or run this in an isolated workspace and avoid the server mode unless you accept that it may terminate an existing local process on the chosen port.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (9)

Intent-Code Divergence

Medium
Confidence
91% confidence
Finding
The skill gives contradictory operational guidance about whether OpenClaw has subagents and whether to use browser/server review versus static HTML review. In an agentic workflow, inconsistent capability assumptions can cause the agent to choose the wrong execution path, skip safeguards, fail to collect feedback, or mishandle evaluation steps, reducing reliability and weakening review controls.

Intent-Code Divergence

Medium
Confidence
92% confidence
Finding
The file defines two materially different agent roles and output contracts in a single skill without an explicit dispatch boundary. An agent given mixed inputs could follow the benchmark-analysis section instead of the post-hoc analyzer section, producing the wrong output shape and analyzing the wrong artifacts, which can silently corrupt evaluation workflows or overwrite expected results with incompatible JSON.

Intent-Code Divergence

Medium
Confidence
95% confidence
Finding
The top-level docstring describes the script as generating and serving a review page, but omits that it will also terminate any process bound to the selected port. That hidden side effect can surprise operators, cause loss of service for unrelated local applications, and makes the tool more dangerous than advertised.

Intent-Code Divergence

High
Confidence
98% confidence
Finding
The helper is named and documented as if it merely checks port usage, but it actually enumerates PIDs and sends SIGTERM to each matching process. Misleading documentation around destructive behavior increases the risk of accidental misuse and can hide operationally unsafe actions during review.
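For contrast, a helper that does only what a "port check" name implies could look like the sketch below. The function name `port_in_use` and its defaults are hypothetical and not taken from the skill; the point is that the check is read-only and never signals another process.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port.

    Read-only: this never enumerates PIDs or sends signals.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds,
        # which means a listener already owns the port.
        return probe.connect_ex((host, port)) == 0
```

A caller can then decide what to do with the answer (warn, pick another port, or abort) instead of having termination happen inside a helper that reads like a query.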

Vague Triggers

High
Confidence
93% confidence
Finding
The description-writing guidance explicitly recommends making skill descriptions 'pushy' and triggering on broad adjacent phrases even when the user does not ask for the skill. This increases over-triggering and skill collisions, which can cause the wrong skill to activate, override user intent, or pull the agent into unnecessary file/tool workflows with broader access than needed.

Vague Triggers

Medium
Confidence
82% confidence
Finding
The invocation guidance tells the agent to 'jump in and help' based on rough process stage but does not provide concrete limits for when the skill should not trigger. That ambiguity can lead to unnecessary activation, context hijacking, or the agent initiating eval, packaging, or optimization workflows when the user only wanted lightweight advice.

Natural-Language Policy Violations

Medium
Confidence
90% confidence
Finding
Conflicting environment-specific instructions create ambiguous policy behavior: one section says OpenClaw lacks subagents and should skip benchmarking/baselines, while another says subagents are available and the full workflow works. This ambiguity can produce inconsistent compliance with testing, review, and feedback steps, undermining assurance and making results hard to trust.

Missing User Warnings

Medium
Confidence
97% confidence
Finding
The script unconditionally calls _kill_port(port) before binding the HTTP server, with no interactive warning or explicit user consent at the moment of execution. In context, this is a local review utility, so killing unrelated services on the same host is unnecessary and can disrupt development tools, dashboards, or other applications.
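One possible mitigation, sketched under assumptions, is to try binding and fall back to an OS-assigned free port (or fail loudly) rather than killing the current owner. The names `bind_review_server` and `allow_fallback` are hypothetical, and the handler is the standard library's `SimpleHTTPRequestHandler`, standing in for whatever handler the script actually uses.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler


class NoReuseHTTPServer(HTTPServer):
    # Disable address reuse so a busy port reliably raises OSError on bind.
    allow_reuse_address = False


def bind_review_server(port: int, host: str = "127.0.0.1",
                       allow_fallback: bool = True) -> HTTPServer:
    """Bind the review server without touching the port's current owner."""
    try:
        return NoReuseHTTPServer((host, port), SimpleHTTPRequestHandler)
    except OSError as exc:
        if not allow_fallback:
            raise RuntimeError(
                f"Port {port} is in use; refusing to terminate its owner."
            ) from exc
        # Port 0 asks the OS for any free port instead of evicting a process.
        return NoReuseHTTPServer((host, 0), SimpleHTTPRequestHandler)
```

The port actually chosen is available as `server.server_address[1]`, so the script can print the final URL for the operator.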

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The script sends the assembled prompt, including full skill content, eval history, and queries, to an external `claude` process with no consent gate or data-sensitivity check at the call site. If those inputs contain secrets, proprietary prompts, or sensitive evaluation data, they may be exposed outside the local process boundary to a remote model backend via the CLI.
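A consent gate at the call site might look like the following sketch. Everything here is an assumption for illustration: `send_with_consent`, `find_sensitive`, and the two patterns are hypothetical, and the `send` callable stands in for however the script hands the prompt to the `claude` process. The pattern list is deliberately small and would miss many real secrets.

```python
import re
from typing import Callable, List

# Illustrative patterns only; a real check would need a broader, tuned list.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"(?i)\b(api[_-]?key|secret|token|password)\b\s*[:=]\s*\S+"),
]


def find_sensitive(text: str) -> List[str]:
    """Return substrings of text that look like secrets."""
    return [m.group(0) for p in SECRET_PATTERNS for m in p.finditer(text)]


def send_with_consent(prompt: str,
                      send: Callable[[str], object],
                      confirm: Callable[[str], bool]) -> bool:
    """Call send(prompt) only after confirm() approves; return whether it was sent."""
    hits = find_sensitive(prompt)
    notice = f"About to send {len(prompt)} characters to an external model CLI"
    if hits:
        notice += f" ({len(hits)} possible secret(s) detected)"
    if not confirm(notice + ". Continue?"):
        return False
    send(prompt)
    return True
```

In an interactive run, `confirm` could wrap `input()`; in automation it could read an explicit opt-in flag, so that sending data off the machine is always a deliberate choice.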

VirusTotal

64/64 vendors flagged this skill as clean.
