skill-creator
Security checks across static analysis, malware telemetry, and agentic risk
Overview
The skill is broadly aligned with creating and testing skills, but its eval helpers can launch nested Claude runs with ambient local credentials and workspace context, so it needs careful review before use.
Install only if you are comfortable with a skill that can run local helper scripts, invoke nested Claude sessions, and send skill/eval content to Anthropic. Use it in a sandbox or disposable repo, review eval prompts before running them, sanitize environment variables or use a separate account, and delete logs that may contain private skill content.
Static analysis
No static analysis findings were reported for this release.
VirusTotal
VirusTotal findings are pending for this skill version.
Risk analysis
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Running benchmarks may start many Claude subprocesses that operate from the project root and use the user's local Claude setup.
The eval runner creates a temporary Claude command/skill entry and launches nested Claude CLI runs using eval-set queries. This is purpose-aligned, but the visible command does not include explicit sandboxing or permission limits.
project_commands_dir = Path(project_root) / ".claude" / "commands" ... command_file.write_text(command_content) ... cmd = ["claude", "-p", query, "--output-format", "stream-json", "--verbose", "--include-partial-messages"]
Run evaluations only in a trusted or disposable workspace, review eval prompts first, and prefer explicit read-only/sandboxed Claude permissions where available.
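One way to get a disposable workspace is to copy the project into a fresh temporary directory and run evaluations against the copy. The sketch below is illustrative only: the function name `make_disposable_copy`, the `EXCLUDED` patterns, and the `workspace` folder name are assumptions, not part of the skill.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical exclusions: files that should not follow the project into the copy.
EXCLUDED = (".git", ".env", "*.pem")


def make_disposable_copy(project_root):
    """Copy project_root into a fresh temp directory and return the copy's path."""
    scratch = Path(tempfile.mkdtemp(prefix="eval-workspace-"))
    target = scratch / "workspace"
    # ignore_patterns skips any file or directory whose name matches EXCLUDED.
    shutil.copytree(project_root, target, ignore=shutil.ignore_patterns(*EXCLUDED))
    return target
```

The copy can then be passed as the working directory for the eval run and deleted with `shutil.rmtree` afterwards. This limits what the nested runs can modify, but it does not restrict network access or credentials.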
The helper may use the user's local Claude account/API configuration and inherit other environment secrets during eval runs.
The nested Claude process inherits nearly the entire local environment and explicitly removes the CLAUDECODE nesting guard. That can expose ambient credentials or profile state to the child process, while the registry metadata declares no credential or env-var requirements.
env = {k: v for k, v in os.environ.items() if k != "CLAUDECODE"} ... subprocess.Popen(... cwd=project_root, env=env)
Use a sanitized environment or separate profile/account for evaluations, and maintainers should declare credential/environment expectations and pass only the minimum required environment.
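A minimal sketch of the "minimum required environment" idea is to build the child environment from an allowlist instead of removing a single variable from a full copy. The names `ALLOWED_ENV_VARS`, `build_minimal_env`, and `run_with_minimal_env` are hypothetical, and the allowlist contents are an assumption. Any credential variable the nested CLI needs would have to be added to it deliberately.

```python
import os
import subprocess

# Hypothetical allowlist: adjust to what the child process actually needs.
ALLOWED_ENV_VARS = ("PATH", "HOME", "LANG", "TMPDIR")


def build_minimal_env(source_env, allowed=ALLOWED_ENV_VARS):
    """Return a new dict containing only the allowlisted variables."""
    return {k: v for k, v in source_env.items() if k in allowed}


def run_with_minimal_env(cmd, cwd):
    """Run cmd in cwd, passing only allowlisted environment variables."""
    env = build_minimal_env(os.environ)
    return subprocess.run(cmd, cwd=cwd, env=env, capture_output=True, text=True)
```

Compared with the denylist in the excerpt above, an allowlist fails closed: a secret added to the shell later is not passed to the child unless someone lists it.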
Users may be surprised that an apparently instruction-only skill contains scripts that require local tooling and Python dependencies.
The registry framing under-declares that the skill includes runnable helper scripts and runtime assumptions. The visible scripts are purpose-aligned, so this is a setup/provenance note rather than evidence of malicious behavior.
Required binaries (all must exist): none ... Required env vars: none ... No install spec — this is an instruction-only skill. Code file presence 10 code file(s)
Document required tools and dependencies, especially the Claude CLI and Anthropic SDK usage, before running helper scripts.
Private skill drafts, eval prompts, or benchmark results may be sent to Anthropic when using the description-improvement script.
The description improver sends skill content, eval failures, and history to the Anthropic API. This is aligned with the optimization feature, but it is a sensitive provider data flow users should understand.
Skill content (for context on what the skill does):\n<skill_content>\n{skill_content}\n</skill_content> ... response = client.messages.create(... messages=[{"role": "user", "content": prompt}])
Do not run the improver on confidential skill content unless that provider data flow is acceptable; document this behavior for users.
Local logs may retain sensitive drafts or evaluation details after the task is finished.
When logging is enabled, the script persists full prompts, model thinking, responses, and parsed descriptions. Those prompts can contain skill content and eval data.
transcript: dict = {"iteration": iteration, "prompt": prompt, "thinking": thinking_text, "response": text, ...} ... log_file.write_text(json.dumps(transcript, indent=2))
Store logs only in trusted locations, clean them up when no longer needed, and avoid logging sensitive skill content.
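To avoid logging sensitive skill content, the transcript could be redacted before it is written. The following is a sketch under assumptions: the field names come from the excerpt above, while `SENSITIVE_FIELDS`, `redact_transcript`, and `write_redacted_log` are hypothetical helpers that the skill does not provide.

```python
import json
from pathlib import Path

# Hypothetical choice of fields to redact; names match the transcript excerpt.
SENSITIVE_FIELDS = ("prompt", "thinking", "response")


def redact_transcript(transcript, sensitive=SENSITIVE_FIELDS):
    """Return a copy with sensitive string fields replaced by a length marker."""
    redacted = {}
    for key, value in transcript.items():
        if key in sensitive and isinstance(value, str):
            redacted[key] = f"<redacted: {len(value)} chars>"
        else:
            redacted[key] = value
    return redacted


def write_redacted_log(log_file: Path, transcript: dict) -> None:
    """Write the redacted transcript as indented JSON."""
    log_file.write_text(json.dumps(redact_transcript(transcript), indent=2))
```

Keeping the length marker preserves a rough signal for debugging, such as whether a response was empty, without retaining the content itself.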
Skills created with this guidance might over-trigger and steer the agent into workflows the user did not clearly request.
The guidance encourages generated skills to trigger aggressively. This is part of the stated goal of improving skill triggering, but it can cause future skills to activate more often than intended.
make the skill descriptions a little bit "pushy" ... "Make sure to use this skill whenever ... even if they don't explicitly ask"
Review generated skill descriptions and keep trigger language tied to clear user intent and appropriate scope.
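Part of that review could be automated with a simple phrase check on generated descriptions. The sketch below is only a heuristic: the pattern list in `PUSHY_PATTERNS` and the function name `find_pushy_phrases` are assumptions chosen for illustration, and a match means "look at this wording", not "this description is wrong".

```python
import re

# Hypothetical patterns that often signal broad, unconditional triggering.
PUSHY_PATTERNS = (
    r"\bwhenever\b",
    r"\balways\b",
    r"\beven if\b",
    r"\bmake sure to use\b",
)


def find_pushy_phrases(description):
    """Return matched phrases (lowercased), grouped in pattern order."""
    hits = []
    for pattern in PUSHY_PATTERNS:
        for match in re.finditer(pattern, description, flags=re.IGNORECASE):
            hits.append(match.group(0).lower())
    return hits
```

A description that yields no hits can still over-trigger, so this complements a manual read and does not replace it.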
