skill-creator

Security checks across static analysis, malware telemetry, and agentic risk

Overview

The skill is broadly aligned with creating and testing skills, but its eval helpers can launch nested Claude runs with ambient local credentials and workspace context, so it needs careful review before use.

Install only if you are comfortable with a skill that can run local helper scripts, invoke nested Claude sessions, and send skill/eval content to Anthropic. Use it in a sandbox or disposable repo, review eval prompts before running them, sanitize environment variables or use a separate account, and delete logs that may contain private skill content.

Static analysis

No static analysis findings were reported for this release.

VirusTotal

VirusTotal findings are pending for this skill version.

Risk analysis

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

Running benchmarks may start many Claude subprocesses that operate from the project root and use the user's local Claude setup.

Why it was flagged

The eval runner creates a temporary Claude command/skill entry and launches nested Claude CLI runs using eval-set queries. This is purpose-aligned, but the visible command does not include explicit sandboxing or permission limits.

Skill content

project_commands_dir = Path(project_root) / ".claude" / "commands" ... command_file.write_text(command_content) ... cmd = ["claude", "-p", query, "--output-format", "stream-json", "--verbose", "--include-partial-messages"]

Recommendation

Run evaluations only in a trusted or disposable workspace, review eval prompts first, and prefer explicit read-only/sandboxed Claude permissions where available.
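
One way to get a disposable workspace is to copy the project into a temporary directory and point the nested run at the copy. The following is a minimal sketch, not part of the skill: `disposable_workspace` is a hypothetical helper name, and it assumes the project is small enough to copy for each evaluation.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def disposable_workspace(project_root):
    """Yield a throwaway copy of project_root; the copy is deleted on exit."""
    with tempfile.TemporaryDirectory(prefix="skill-eval-") as tmp:
        workspace = Path(tmp) / "workspace"
        # Copy the project so files written during the run (for example under
        # .claude/commands) never touch the real checkout.
        shutil.copytree(project_root, workspace)
        yield workspace
```

Passing the yielded path as `cwd` keeps writes out of the original repository. It is not a sandbox: the child process can still read anything else the user account can reach, so explicit permission limits remain worthwhile where the CLI offers them.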

What this means

The helper may use the user's local Claude account/API configuration and inherit other environment secrets during eval runs.

Why it was flagged

The nested Claude process inherits nearly the entire local environment and explicitly removes the CLAUDECODE nesting guard. That can expose ambient credentials or profile state to the child process, while the registry metadata declares no credential or env-var requirements.

Skill content

env = {k: v for k, v in os.environ.items() if k != "CLAUDECODE"} ... subprocess.Popen(... cwd=project_root, env=env)

Recommendation

Use a sanitized environment or a separate profile or account for evaluations. Maintainers should declare the skill's credential and environment expectations and pass only the minimum required environment to the nested process.
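
The excerpt above filters the environment with a denylist: everything except `CLAUDECODE` is passed through. An allowlist inverts that default. The sketch below is illustrative only. `minimal_env` and `ALLOWED_ENV_VARS` are hypothetical names, and the variables the nested CLI actually needs (authentication, config directory) are an assumption that has to be checked against the local setup.

```python
import os

# Hypothetical allowlist; adjust to what the nested CLI actually needs.
ALLOWED_ENV_VARS = ("PATH", "HOME", "LANG", "TERM", "TMPDIR")


def minimal_env(source=None, allowed=ALLOWED_ENV_VARS, extra=None):
    """Return a copy of the environment restricted to allowlisted names."""
    source = os.environ if source is None else source
    env = {name: source[name] for name in allowed if name in source}
    # Explicitly passed values (e.g. a dedicated eval API key) are added last.
    env.update(extra or {})
    return env
```

The result can be passed as the `env` argument of `subprocess.Popen`. Any credential the child needs then has to be named explicitly through `extra`, which makes the credential expectation visible in the code instead of implicit in the caller's shell.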

What this means

Users may be surprised that an apparently instruction-only skill contains scripts that require local tooling and Python dependencies.

Why it was flagged

The registry metadata does not declare that the skill ships runnable helper scripts or that those scripts assume local tooling. The visible scripts are purpose-aligned, so this is a setup/provenance note rather than evidence of malicious behavior.

Skill content

Required binaries (all must exist): none ... Required env vars: none ... No install spec — this is an instruction-only skill. Code file presence 10 code file(s)

Recommendation

Document required tools and dependencies, especially the Claude CLI and Anthropic SDK usage, before running helper scripts.
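
Until the requirements are documented, a small preflight check can surface missing tooling before any helper script runs. This is a sketch under assumptions: the executable name `claude` and the module name `anthropic` are inferred from the excerpts in this report, and `missing_dependencies` is a hypothetical helper, not something the skill provides.

```python
import importlib.util
import shutil


def missing_dependencies(binaries=("claude",), modules=("anthropic",)):
    """Return names of required executables and Python modules not found."""
    # shutil.which returns None when the executable is not on PATH.
    missing = [f"binary:{name}" for name in binaries if shutil.which(name) is None]
    # find_spec returns None when a top-level module cannot be imported.
    missing += [
        f"module:{name}" for name in modules if importlib.util.find_spec(name) is None
    ]
    return missing
```

An empty list means every listed dependency was found. The check confirms presence only, not versions or authentication state.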

What this means

Private skill drafts, eval prompts, or benchmark results may be sent to Anthropic when using the description-improvement script.

Why it was flagged

The description improver sends skill content, eval failures, and history to the Anthropic API. This is aligned with the optimization feature, but it is a sensitive data flow to an external provider that users should understand.

Skill content

Skill content (for context on what the skill does):\n<skill_content>\n{skill_content}\n</skill_content> ... response = client.messages.create(... messages=[{"role": "user", "content": prompt}])

Recommendation

Do not run the improver on confidential skill content unless that provider data flow is acceptable; document this behavior for users.

What this means

Local logs may retain sensitive drafts or evaluation details after the task is finished.

Why it was flagged

When logging is enabled, the script persists full prompts, model thinking, responses, and parsed descriptions. Those prompts can contain skill content and eval data.

Skill content

transcript: dict = {"iteration": iteration, "prompt": prompt, "thinking": thinking_text, "response": text, ...} ... log_file.write_text(json.dumps(transcript, indent=2))

Recommendation

Store logs only in trusted locations, clean them up when no longer needed, and avoid logging sensitive skill content.
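
If logs are needed for debugging but the full text is not, the sensitive fields can be redacted before anything is written. The sketch below reuses the transcript keys shown in the excerpt (`prompt`, `thinking`, `response`). The function name `write_redacted_log` and the choice to keep only a character count are assumptions made for illustration.

```python
import json
import os
from pathlib import Path

SENSITIVE_KEYS = ("prompt", "thinking", "response")


def write_redacted_log(log_file, transcript, sensitive_keys=SENSITIVE_KEYS):
    """Write transcript as JSON, replacing sensitive text with its length."""
    redacted = {
        key: f"<redacted {len(str(value))} chars>" if key in sensitive_keys else value
        for key, value in transcript.items()
    }
    log_file = Path(log_file)
    log_file.write_text(json.dumps(redacted, indent=2))
    # Owner read/write only; on Windows os.chmod only toggles the read-only flag.
    os.chmod(log_file, 0o600)
    return redacted
```

Keeping the length preserves a rough signal about prompt and response size for debugging without retaining the content itself.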

What this means

Skills created with this guidance might over-trigger and steer the agent into workflows the user did not clearly request.

Why it was flagged

The guidance encourages generated skills to trigger aggressively. This is part of the stated goal of improving skill triggering, but it can cause future skills to activate more often than intended.

Skill content

make the skill descriptions a little bit "pushy" ... "Make sure to use this skill whenever ... even if they don't explicitly ask"

Recommendation

Review generated skill descriptions and keep trigger language tied to clear user intent and appropriate scope.
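
Part of that review can be automated with a simple phrase check that flags broad trigger wording for a human to look at. This is a rough heuristic sketch: the pattern list is hypothetical, chosen to match the style of wording quoted above, and it will miss paraphrases and flag some legitimate uses.

```python
import re

# Hypothetical phrase list for illustration; not taken from the skill.
BROAD_TRIGGER_PATTERNS = (
    r"\bwhenever\b",
    r"\balways use\b",
    r"even if (?:they|the user) (?:don't|do not|doesn't|does not) explicitly ask",
)


def broad_trigger_phrases(description):
    """Return the substrings of description that match a broad-trigger pattern."""
    hits = []
    for pattern in BROAD_TRIGGER_PATTERNS:
        hits += [m.group(0) for m in re.finditer(pattern, description, re.IGNORECASE)]
    return hits
```

A non-empty result is a prompt to reread the description, not a verdict. Whether a given phrase is too broad depends on the skill's intended scope.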