
Security audit

Code Factory

Security checks across malware telemetry and agentic risk

Overview

This skill is a coherent local code-generation tool, but broad coding requests can trigger it to write files and run tests through shell execution automatically, so users should review it before installing.

Install only if you want a project generator that can write files, create runnable scripts, and execute pytest locally. Use it in a sandbox or disposable workspace, review generated requirements and run.sh before running them, and avoid using it on sensitive code unless you are comfortable with local diagnostic logs under .learnings.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Findings (12)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
return {"passed": True, "output": "无测试目录", "summary": "跳过"}  # "No test directory" / "Skipped"

        try:
            result = subprocess.run(
                [sys.executable, "-m", "pytest", str(test_dir), "-v", "--tb=short"],
                capture_output=True,
                text=True,
Confidence
92% confidence
Finding
result = subprocess.run( [sys.executable, "-m", "pytest", str(test_dir), "-v", "--tb=short"], capture_output=True, text=True,

Tp4

High
Category
MCP Tool Poisoning
Confidence
89% confidence
Finding
The skill description materially understates behavior: beyond generating files, it performs environment inspection, executes `pytest`, auto-modifies code on failure, and persists learning artifacts for future runs. This mismatch can cause users or orchestrators to invoke the skill without realizing it will execute commands and create additional files, increasing the chance of unintended side effects.

Description-Behavior Mismatch

Medium
Confidence
89% confidence
Finding
The engine unconditionally adds multiple operational and metadata files such as run.sh, SKILL.md, manifest.json, and environment.toml regardless of the user's requested deliverables. In an agentic code-generation skill, this expands the produced artifact set beyond user intent, which can introduce unexpected executable surfaces, hidden behavior, or packaging metadata that downstream automation may trust or execute.
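
One way to keep the artifact set aligned with user intent is to treat the operational files as opt-in. The sketch below is illustrative only: `OPTIONAL_FILES` reuses the file names listed in the finding, while `select_files` and its parameters are hypothetical names, not part of the audited engine.

```python
# Files the finding reports as being added unconditionally.
OPTIONAL_FILES = {"run.sh", "SKILL.md", "manifest.json", "environment.toml"}


def select_files(generated, requested_extras=()):
    """Keep core files, and keep optional files only when explicitly requested.

    generated: mapping of relative file name -> file content.
    requested_extras: optional file names the user asked for.
    """
    requested = set(requested_extras)
    return {
        name: content
        for name, content in generated.items()
        if name not in OPTIONAL_FILES or name in requested
    }
```

With this shape, a request that names no extras yields only the core project files, and `run.sh` is written only when the caller asks for it.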

Intent-Code Divergence

Medium
Confidence
95% confidence
Finding
The docstring promises atomic commit and complete rollback, but commit() mutates target files directly and does not handle partial failures. If an unlink or copy operation fails midway, the target directory can be left in a mixed state with some files deleted or updated and others unchanged, violating integrity guarantees that callers may rely on for safe project generation.
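
A commit that can actually be rolled back usually stages the new tree, moves the old one aside, and restores it if anything fails. The following is a minimal sketch under stated assumptions: `commit_with_rollback`, `staged_dir`, and `target_dir` are hypothetical names, the backup sits next to the target so the rename stays on one filesystem, and there is still a short window in which the target path does not exist, so this is closer to all-or-nothing than strictly atomic.

```python
import shutil
import uuid
from pathlib import Path


def commit_with_rollback(staged_dir, target_dir):
    """Replace target_dir with a copy of staged_dir, restoring the old tree on failure."""
    staged_dir, target_dir = Path(staged_dir), Path(target_dir)
    backup_dir = target_dir.with_name(f"{target_dir.name}.bak-{uuid.uuid4().hex}")
    had_target = target_dir.exists()

    if had_target:
        # Move the existing tree aside instead of deleting files one by one.
        target_dir.rename(backup_dir)
    try:
        shutil.copytree(staged_dir, target_dir)
    except Exception:
        # Roll back: discard any partial copy and put the original tree back.
        shutil.rmtree(target_dir, ignore_errors=True)
        if had_target:
            backup_dir.rename(target_dir)
        raise
    if had_target:
        # The new tree is fully in place, so the backup can go.
        shutil.rmtree(backup_dir)
```

If the copy fails midway, the caller sees the exception and the target directory contains exactly what it held before the call.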

Context-Inappropriate Capability

Medium
Confidence
95% confidence
Finding
The template unconditionally grants the generated skill the `exec` capability, which allows arbitrary command execution by any downstream agent using the produced metadata. For a metadata generator whose stated purpose is creating standard project scaffolding, this is broader privilege than necessary and increases the blast radius if generated code, prompts, or dependencies are malicious or compromised.

Vague Triggers

High
Confidence
95% confidence
Finding
The trigger phrases are broad enough to match ordinary coding requests such as 'write me a tool' or 'create a project,' which can auto-activate a skill that writes files and runs shell commands. In context, this is more dangerous because the skill includes `exec`-based verification and retry loops, so accidental activation can lead to unanticipated local command execution and filesystem changes.
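
A narrower activation rule would require an explicit command rather than keyword overlap with everyday requests. This sketch assumes a hypothetical `/code-factory` command and a hypothetical `should_activate` helper; the audited skill's real trigger mechanism is not shown in this report.

```python
EXPLICIT_COMMAND = "/code-factory"  # hypothetical invocation token


def should_activate(message):
    """Activate only when the message starts with the explicit command token."""
    parts = message.strip().split(maxsplit=1)
    return bool(parts) and parts[0].lower() == EXPLICIT_COMMAND
```

Under this rule, 'write me a tool' does not activate the skill, while '/code-factory build a CLI' does.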

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The skill instructs running `pip install -e ".[dev]"` and `pytest tests/ -v` but does not prominently warn users that verification entails shell execution and potentially dependency installation. Because command execution is one of the highest-risk capabilities in agent skills, omitting that warning undermines informed consent and can expose the host environment to unintended changes.
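
The missing warning could take the form of a confirmation step that lists the exact commands before anything runs. The sketch below is an assumption about how that might look: `run_verification`, `confirm`, and `runner` are hypothetical names, and only the two command strings come from the finding.

```python
import shlex
import subprocess

# The two commands the finding says the skill instructs running.
VERIFY_COMMANDS = ['pip install -e ".[dev]"', "pytest tests/ -v"]


def run_verification(confirm, runner=subprocess.run):
    """Show the commands, then run them only if confirm(warning) returns True.

    confirm: callable that receives the warning text and returns a bool.
    runner: callable with a subprocess.run-like signature (injectable for tests).
    """
    warning = (
        "Verification will execute shell commands and may install dependencies:\n"
        + "\n".join(f"  {cmd}" for cmd in VERIFY_COMMANDS)
    )
    if not confirm(warning):
        return []
    # Run each command as an argument list, without invoking a shell.
    return [runner(shlex.split(cmd), check=False) for cmd in VERIFY_COMMANDS]
```

An interactive caller might pass `lambda text: input(text + "\nProceed? [y/N] ").strip().lower() == "y"` as `confirm`.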

Missing User Warnings

Medium
Confidence
93% confidence
Finding
The controller writes retry failure patterns to `.learnings/` on disk without any consent, notice, retention control, or sanitization. Because the logged JSON includes project name, step history, and verification details, this can silently persist potentially sensitive user or code metadata beyond the execution lifecycle.

Missing User Warnings

Low
Confidence
84% confidence
Finding
When `guard` is absent, verification runs directly without timeout enforcement, which can allow tests or verification steps to hang indefinitely or consume resources unexpectedly. In an automated project-generation pipeline, that creates a denial-of-service and reliability risk even if it does not directly expose confidentiality or integrity.
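
A fallback timeout for the unguarded path can rely on the `timeout` parameter of `subprocess.run`, which kills the child process and raises `subprocess.TimeoutExpired` when the limit is exceeded. The sketch below is illustrative: `run_with_timeout` and the 300-second default are assumptions, and the returned dictionary only mirrors the `passed`/`output`/`summary` shape visible in the excerpt above.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds=300):
    """Run cmd and report a failed result instead of hanging past the limit."""
    try:
        result = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return {
            "passed": False,
            "output": "",
            "summary": f"timed out after {timeout_seconds}s",
        }
    return {
        "passed": result.returncode == 0,
        "output": result.stdout + result.stderr,
        "summary": "completed",
    }
```

A verification call could then look like `run_with_timeout([sys.executable, "-m", "pytest", "tests", "-v", "--tb=short"])`, so the limit applies whether or not a guard object is present.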

Ssd 3

Medium
Confidence
95% confidence
Finding
The failure log stores plain JSON containing project name, retry history, current step, and full verification report, which may include sensitive code, file paths, test output, or user-derived content. Persisting this data in a predictable `.learnings/` directory increases the chance of unintended disclosure, later collection, or cross-run leakage.
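
Two mitigations follow directly from this finding: make the log opt-in, and write only an allowlist of non-sensitive fields. The sketch below shows that idea under assumptions: `write_failure_log`, `SAFE_FIELDS`, the field names, and the `failure.json` file name are all hypothetical and do not describe the audited controller's actual schema.

```python
import json
from pathlib import Path

# Hypothetical allowlist: fields judged safe to persist between runs.
SAFE_FIELDS = ("step", "retry_count")


def write_failure_log(record, log_dir, enabled=False):
    """Persist a reduced failure record, and only when logging is enabled."""
    if not enabled:
        # Logging is off by default, so nothing is created on disk.
        return None
    sanitized = {key: record[key] for key in SAFE_FIELDS if key in record}
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    path = log_dir / "failure.json"
    path.write_text(json.dumps(sanitized, indent=2), encoding="utf-8")
    return path
```

Project names and full verification reports never reach disk in this design, because any field outside the allowlist is dropped before serialization.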

Unbounded Resource Access

Medium
Category
Excessive Agency
Content
return self.breaker.execute_with_timeout(
                verify_fn,
                timeout_seconds=timeout_seconds,
                on_timeout=None,
            )
        except Exception:
            return None
Confidence
84% confidence
Finding
timeout=None

Known Vulnerable Dependency: pytest (1 advisory: CVE-2025-71176, pytest has vulnerable tmpdir handling)

Low
Category
Supply Chain
Confidence
84% confidence
Finding
pytest

VirusTotal

VirusTotal findings are pending for this skill version.


Static analysis

No suspicious patterns detected.