Skill Factory

Security checks across malware telemetry and agentic risk

Overview

This skill is a transparent OpenClaw skill-building toolkit, but it can create, modify, scan, package, and publish persistent agent skills with broad scope.

Install only if you want an agent to author and manage OpenClaw skills. Use it in a dedicated draft workspace, review generated diffs and packaged archives for secrets before enabling or publishing, and avoid sync/publish commands unless you intend to share those skills.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Rogue Agent: Self-Modification, Session Persistence
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (13)

Lp3

Medium
Category
MCP Least Privilege
Confidence
93% confidence
Finding
The skill instructs the agent to read and write workspace files, invoke shell commands, and publish to an external service, but it declares no permissions or user-consent boundaries. That creates a capability mismatch where a user may trigger actions with broader filesystem, command execution, and network effects than expected, increasing the chance of unintended modification or exfiltration.

Vague Triggers

Medium
Confidence
81% confidence
Finding
The trigger phrases include broad commands like 'create a skill', 'build a skill', and 'improve this skill', which can match common benign requests and cause the skill to activate unexpectedly. Because this skill can write files, run shell commands, scan installed skills, and publish artifacts, accidental invocation raises the risk of unintended side effects.

Missing User Warnings

Medium
Confidence
95% confidence
Finding
The publishing section directs use of `clawhub login`, `clawhub publish`, and `clawhub sync` without clearly warning that skill contents and metadata will be transmitted to an external service. In a skill-development context, workspace skills may contain proprietary prompts, internal tooling references, secrets, or unpublished code patterns, so omission of a transmission warning materially increases disclosure risk.
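The missing transmission warning could be supplied by a consent step placed in front of any publish command. The following is a minimal sketch under stated assumptions: `confirm_publish` is a hypothetical helper, it does not invoke `clawhub` or any other command, and it simply lists what would be sent and requires an explicit "yes".

```python
from pathlib import Path
from typing import Callable


def confirm_publish(skill_dir: str, ask: Callable[[str], str] = input) -> bool:
    """List every file under skill_dir and require an explicit 'yes' to proceed.

    Hypothetical helper: it does not call any publishing command itself.
    """
    root = Path(skill_dir)
    files = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
    print(f"The following {len(files)} file(s) would leave this machine:")
    for name in files:
        print(f"  {name}")
    answer = ask("Type 'yes' to transmit these files to the external service: ")
    # Anything other than an explicit 'yes' is treated as a refusal.
    return answer.strip().lower() == "yes"
```

Defaulting to refusal keeps an accidental Enter keypress from being read as consent.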

Vague Triggers

Medium
Confidence
85% confidence
Finding
The eval encourages descriptions to include broad trigger phrases like 'disk' or 'monitor' without requiring specificity about scope, exclusions, or intended invocation context. In a skill system, this can cause over-triggering, where the skill activates on loosely related user requests and may perform actions or provide guidance outside its intended boundaries.

Vague Triggers

Low
Confidence
78% confidence
Finding
This eval checks that trigger phrases are mentioned and that 'use when' guidance exists, but it does not require boundaries, disambiguation, or negative examples. As a result, a generated skill description may optimize for broad matching rather than accurate routing, increasing the chance of unintended activation.
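The gap described here, an eval with no negative examples, can be illustrated with a small grading sketch. Everything in it is an assumption made for illustration: the `TriggerEval` structure, the `grade` function, and the substring matcher standing in for a real router are hypothetical and do not reflect the skill's actual eval format.

```python
from dataclasses import dataclass


@dataclass
class TriggerEval:
    """Hypothetical eval case: prompts that should and should not activate a skill."""
    triggers: list[str]
    should_match: list[str]
    should_not_match: list[str]


def activates(triggers: list[str], prompt: str) -> bool:
    # Naive stand-in for a router: case-insensitive substring match.
    lowered = prompt.lower()
    return any(t.lower() in lowered for t in triggers)


def grade(case: TriggerEval) -> dict[str, list[str]]:
    """Return missed positives and false activations for one eval case."""
    return {
        "missed": [p for p in case.should_match if not activates(case.triggers, p)],
        "false_activations": [
            p for p in case.should_not_match if activates(case.triggers, p)
        ],
    }


# Broad single-word triggers, like the 'disk' and 'monitor' examples in the findings.
broad = TriggerEval(
    triggers=["disk", "monitor"],
    should_match=["Check disk usage on the build server"],
    should_not_match=["Which monitor should I buy for my desk?"],
)
# Narrower multi-word triggers for the same intent.
narrow = TriggerEval(
    triggers=["disk usage", "disk space alert"],
    should_match=["Check disk usage on the build server"],
    should_not_match=["Which monitor should I buy for my desk?"],
)

if __name__ == "__main__":
    print(grade(broad))
    print(grade(narrow))
```

With only positive prompts, both trigger sets would pass. The negative prompt is what separates them: the broad set activates on the unrelated shopping question, while the narrow set does not.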

Vague Triggers

Medium
Confidence
89% confidence
Finding
This eval data rewards descriptions containing very generic trigger phrases such as 'disk' or 'monitor', which can normalize broad activation criteria when creating or grading skills. In a skill-factory context, that weakness propagates into generated skills and increases the chance of accidental or inappropriate invocation across many downstream skills.

Vague Triggers

Medium
Confidence
90% confidence
Finding
The requirement that descriptions use unconstrained example triggers creates a systematic quality issue: skills may pass evaluation despite unsafe activation wording. Because this file is part of the skill-factory eval suite, it influences how skills are authored and approved, making accidental activation more likely at scale.

Vague Triggers

Medium
Confidence
90% confidence
Finding
This assertion set appears to encode generic trigger examples as a success condition, which is unsafe for activation design. In the context of a tool that creates, evaluates, and publishes skills, weak trigger requirements can be replicated into multiple released skills, broadening the blast radius beyond a single artifact.

Vague Triggers

Medium
Confidence
85% confidence
Finding
The trigger phrases listed here are very broad and map to common user intents like 'create a skill', 'build a skill', and 'improve this skill'. In an agent routing system, overly generic triggers can cause this skill to activate unexpectedly for unrelated requests, increasing the chance that a powerful skill-factory workflow is invoked in contexts the user did not intend.

Self-Modification

High
Category
Rogue Agent
Content
_meta.json        # Auto-populated on publish
```

### Step 4 — Write SKILL.md

**Frontmatter rules:**
```yaml
Confidence
92% confidence
Finding
Write SKILL

Self-Modification

High
Category
Rogue Agent
Content
Validates structure, outputs `<skill-name>.skill` zip.

### Step 6 — Iterate
Run evals (Mode 2) → identify failures → update SKILL.md → re-package → repeat.

---
Confidence
90% confidence
Finding
update SKILL

Session Persistence

Medium
Category
Rogue Agent
Content
_meta.json        # Auto-populated on publish
```

### Step 4 — Write SKILL.md

**Frontmatter rules:**
```yaml
Confidence
87% confidence
Finding
Write SKILL.md **Frontmatter rules:** ```yaml --- name: my-skill-name # lowercase-hyphen, max 64 chars description: "One sentence: what it does AND when to use it. Include trigger phrases."

Session Persistence

Medium
Category
Rogue Agent
Content
Measure skill quality against defined expectations.

### Setup
Create `evals/evals.json`:
```json
[
  {
Confidence
83% confidence
Finding
Create `evals/evals.json`: ```json [ { "id": "basic-create", "prompt": "Create a skill that sends a Slack message", "expected_output": "SKILL.md with slack-notifier name and working comm

VirusTotal

64/64 vendors flagged this skill as clean.
