
Security audit

Zero Cover Mode

Security checks across malware telemetry and agentic risk

Overview

This skill is not clearly malicious, but it gives the agent broad authority to run project commands, write persistent workflow state, generate follow-up cron tasks, and delete workflow directories, so it should be reviewed before use.

Install only if you want this skill to actively manage bug-fix workflows in your workspace. Review the test command before each run, use a sandbox for untrusted repositories, keep secrets out of project environment/state fields, and be careful with cleanup and cron-related steps because they can persist work and remove old bug artifacts.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Output Handling: Unvalidated Output Injection, Cross-Context Output, Unbounded Output
Findings (19)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
if not cov_args:
    cov_args = ["--cov", "."]
cmd = [sys.executable, "-m", "pytest"] + cov_args + [test_path, "--no-header", "-q"]
result = subprocess.run(
    cmd,
    capture_output=True, text=True, timeout=30,
)
Confidence
87% confidence
Finding
result = subprocess.run( cmd, capture_output=True, text=True, timeout=30, )

subprocess module call

Medium
Category
Dangerous Code Execution
Content
log.info("Exec: %s (cwd=%s)", cmd, actual_cwd)
try:
    shell = os.name == "nt"
    proc = subprocess.run(
        cmd, capture_output=True, text=True,
        timeout=self.timeout, cwd=actual_cwd, shell=shell, env=env
    )
Confidence
95% confidence
Finding
proc = subprocess.run( cmd, capture_output=True, text=True, timeout=self.timeout, cwd=actual_cwd, shell=shell, env=env )

Lp3

Medium
Category
MCP Least Privilege
Confidence
93% confidence
Finding
The skill directs the agent to read/write files, run shell commands, inspect environment state, and invoke local/networked services, but it does not declare those capabilities up front. That creates a transparency and policy-enforcement gap: users or orchestrators may treat it as low-risk documentation/reporting logic while it actually performs privileged operations affecting the workspace and local system.

Tp4

High
Category
MCP Tool Poisoning
Confidence
95% confidence
Finding
The declared purpose suggests a bug-fix closure and reporting workflow, but the content expands into broader operational behavior: state persistence, event publication, cron scheduling, schema migration/rotation, and local service interaction. This mismatch is dangerous because reviewers and permission systems may underestimate the skill's authority and side effects, enabling unexpected persistence, automation, and data handling beyond the stated scope.

Context-Inappropriate Capability

Medium
Confidence
90% confidence
Finding
The skill instructs the agent to set up cron-based follow-up tasks, which introduces durable operational automation outside a one-time bug-fix workflow. Persistent scheduled execution can create unintended recurring commands, surprise system changes, and a larger attack surface if the scheduled command or its parameters are influenced by untrusted project data.

Intent-Code Divergence

Medium
Confidence
92% confidence
Finding
The CLI `rotate` path uses `os.replace()` directly and then recreates an empty file, which bypasses the safer `safe_rotate_with_backup()` flow used elsewhere in the module. This creates a data-integrity risk: if the rotation target collides, the source is moved without a backup, and any operational mistake or partial workflow failure can cause irrecoverable loss of historical NDJSON records.
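
One way to avoid the collision described in this finding is to pick a rotation target that does not exist yet before calling `os.replace()`. The sketch below is illustrative only: `rotate_without_overwrite` and its naming scheme are hypothetical, it is not the module's `safe_rotate_with_backup()`, and it does not guard against two processes rotating at the same time.

```python
import os
import time
from pathlib import Path


def rotate_without_overwrite(path: Path) -> Path:
    """Move `path` to a unique timestamped name, then recreate it empty.

    Hypothetical helper: returns the path of the rotated file.
    """
    stamp = time.strftime("%Y%m%dT%H%M%S")
    target = path.with_name(f"{path.name}.{stamp}")
    counter = 0
    # Never reuse an existing name, so an earlier rotation is not overwritten.
    while target.exists():
        counter += 1
        target = path.with_name(f"{path.name}.{stamp}.{counter}")
    os.replace(path, target)
    path.touch()  # recreate the empty live file
    return target
```

Because the target name is checked before the move, two rotations within the same second produce two distinct files instead of silently replacing the first one.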

Context-Inappropriate Capability

High
Confidence
97% confidence
Finding
The pipeline accepts a caller-controlled test_cmd and passes it into TestRunner.run(), while the CLI also exposes --test-cmd directly to users. In an agent skill context this creates a general command-execution surface that exceeds narrowly scoped bug-fix orchestration, enabling arbitrary local command execution, data access, or destructive actions if untrusted input reaches this parameter.
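
A caller-controlled test command can be narrowed before it reaches a runner. The following is a minimal sketch under stated assumptions: `parse_test_cmd` and the accepted prefixes are hypothetical and not part of the skill, and the policy (pytest only, no shell metacharacters) is just one possible choice. pytest itself still executes code from the repository, such as `conftest.py`, so a sandbox remains advisable for untrusted projects.

```python
import shlex

# Hypothetical policy: only these command prefixes are accepted.
ALLOWED_PREFIXES = (
    ["pytest"],
    ["python", "-m", "pytest"],
    ["python3", "-m", "pytest"],
)
# Characters that only matter when a shell interprets the string.
SHELL_METACHARACTERS = set(";&|<>`$\n")


def parse_test_cmd(test_cmd: str) -> list:
    """Return an argument list for `test_cmd`, or raise ValueError."""
    if any(ch in SHELL_METACHARACTERS for ch in test_cmd):
        raise ValueError("shell metacharacters are not allowed in test_cmd")
    argv = shlex.split(test_cmd)
    if not any(argv[: len(prefix)] == prefix for prefix in ALLOWED_PREFIXES):
        raise ValueError("test_cmd must start with an allowed pytest invocation")
    return argv
```

The returned list can be passed to `subprocess.run()` with `shell=False`, so the string is never handed to a shell for interpretation.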

Context-Inappropriate Capability

Medium
Confidence
88% confidence
Finding
The code generates actionable cron job instructions and prints ready-to-use scheduling payloads for later execution. In an agent environment, this grants persistence and deferred execution capability beyond simple reporting, which can be abused to trigger future actions without immediate user review.

Intent-Code Divergence

Medium
Confidence
96% confidence
Finding
The module claims to provide integrity verification and automatic recovery, but verify_integrity() uses json.loads and json.JSONDecodeError without importing json. In practice the verification/repair path raises NameError at runtime, disabling a safety mechanism that operators may rely on to detect corruption and restore state consistency.
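
For reference, an NDJSON integrity check needs the `json` import at module level. The sketch below is not the module's `verify_integrity()`; `find_corrupt_lines` is a hypothetical name, and treating blank lines as harmless is an assumption.

```python
import json
from pathlib import Path


def find_corrupt_lines(path: Path) -> list:
    """Return 1-based numbers of NDJSON lines that fail to parse (hypothetical helper)."""
    bad = []
    with path.open(encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # assumption: blank lines are not corruption
            try:
                json.loads(line)
            except json.JSONDecodeError:
                bad.append(number)
    return bad
```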

Description-Behavior Mismatch

Medium
Confidence
84% confidence
Finding
The skill performs side effects beyond passive state tracking: it creates workspace marker files and later participates in deletion/cleanup flows. When a skill's manifest/description emphasizes tracking, reporting, and state management but the implementation mutates workspace structure, users and higher-level orchestrators may grant it broader file authority than intended, creating integrity and trust-boundary risks.

Description-Behavior Mismatch

Medium
Confidence
86% confidence
Finding
set_project_env persists arbitrary project environment data into the state file, but this capability is not apparent from the manifest text focused on bug-fix workflow automation. Undisclosed persistence increases the chance that users or host systems expose sensitive or irrelevant data to the skill without understanding retention behavior.

Context-Inappropriate Capability

Medium
Confidence
88% confidence
Finding
The CLI accepts arbitrary JSON or key=value pairs and stores them as project environment data, enabling persistence of unrelated or sensitive information outside the stated bug-fix purpose. In agent settings, this can become a data-retention and scope-creep issue because operators may assume only bug metadata is stored, while the skill can silently accumulate broader project context or secrets.
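
One mitigation for the retention risk is to redact secret-looking keys before anything is written to the state file. This is a sketch only: `redact_env` and the key pattern are hypothetical, key-name matching is a heuristic, and it does not inspect values.

```python
import re

# Hypothetical heuristic: key names that usually hold credentials.
SECRET_KEY_PATTERN = re.compile(
    r"(secret|token|password|passwd|api[_-]?key|credential)", re.IGNORECASE
)


def redact_env(env: dict) -> dict:
    """Return a copy of `env` with secret-looking values replaced."""
    return {
        key: "<redacted>" if SECRET_KEY_PATTERN.search(key) else value
        for key, value in env.items()
    }
```

A stricter alternative would be an allowlist of keys the workflow actually needs, which avoids relying on naming conventions.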

Vague Triggers

High
Confidence
92% confidence
Finding
The auto-activation conditions are very broad, including implicit bug-fix requests, any failed tests, and generic errors or warnings. In context, this skill can then perform file mutations, shell execution, state updates, and follow-up automation, so ambiguous triggers materially increase the chance of unintended activation and privileged actions without clear user consent.

Missing User Warnings

Medium
Confidence
93% confidence
Finding
cleanup_bugs deletes directories with shutil.rmtree based on age and session state, without a clear user-facing confirmation or dry-run mechanism. Destructive operations on workspace directories are dangerous because path-detection or state inconsistencies can cause irreversible loss of debugging artifacts or other data under the bugs directory.
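
A dry-run default is one way to add the missing confirmation step. The sketch below is not the skill's `cleanup_bugs`: `cleanup_old_dirs` is a hypothetical function that selects only by directory age (session state is ignored here) and deletes nothing unless `dry_run=False` is passed explicitly.

```python
import shutil
import time
from pathlib import Path


def cleanup_old_dirs(bugs_root: Path, max_age_days: float, dry_run: bool = True) -> list:
    """Return directories older than `max_age_days`; delete them only if dry_run is False."""
    root = bugs_root.resolve()
    cutoff = time.time() - max_age_days * 86400
    selected = []
    for child in sorted(root.iterdir()):
        # Skip files and symlinks so only real subdirectories of root are touched.
        if child.is_symlink() or not child.is_dir():
            continue
        if child.stat().st_mtime < cutoff:
            selected.append(child)
            if not dry_run:
                shutil.rmtree(child)
    return selected
```

A caller can print the returned list, ask the user to confirm, and only then call the function again with `dry_run=False`.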

Missing User Warnings

Medium
Confidence
92% confidence
Finding
compact also performs recursive deletion of orphan directories, but the command description primarily suggests state compaction rather than filesystem cleanup. This mismatch reduces informed consent and can surprise users or orchestration layers, leading to unintended data loss when running a seemingly maintenance-only operation.

Unvalidated Output Injection

High
Category
Output Handling
Content
if not cov_args:
    cov_args = ["--cov", "."]
cmd = [sys.executable, "-m", "pytest"] + cov_args + [test_path, "--no-header", "-q"]
result = subprocess.run(
    cmd,
    capture_output=True, text=True, timeout=30,
)
Confidence
82% confidence
Finding
subprocess.run( cmd, capture_output

Unvalidated Output Injection

High
Category
Output Handling
Content
log.info("Exec: %s (cwd=%s)", cmd, actual_cwd)
try:
    shell = os.name == "nt"
    proc = subprocess.run(
        cmd, capture_output=True, text=True,
        timeout=self.timeout, cwd=actual_cwd, shell=shell, env=env
    )
Confidence
96% confidence
Finding
subprocess.run( cmd, capture_output

Unpinned Dependencies

Low
Category
Supply Chain
Content
# Zero Cover Mode (零稀泥模式): development dependencies
# pip install -r requirements-dev.txt

pytest>=7
pytest-cov>=5
Confidence
93% confidence
Finding
pytest>=7

Unpinned Dependencies

Low
Category
Supply Chain
Content
# pip install -r requirements-dev.txt

pytest>=7
pytest-cov>=5
Confidence
93% confidence
Finding
pytest-cov>=5
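
Lower-bound specifiers such as `pytest>=7` let each install resolve to a different release. A rough check for lines that are not pinned with `==` could look like the sketch below; `unpinned_requirements` is a hypothetical helper, it handles only simple `name==version` lines, and it is not how the scanner itself detects this pattern.

```python
import re

# Simple pattern: package name, optional extras, then an exact '==' version.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*[^\s;]+")


def unpinned_requirements(text: str) -> list:
    """Return requirement lines that lack an exact '==' pin."""
    flagged = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            flagged.append(line)
    return flagged
```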

VirusTotal

VirusTotal findings are pending for this skill version.


Static analysis

Detected: suspicious.dynamic_code_execution

Dynamic code execution detected.

Critical
Code
suspicious.dynamic_code_execution
Location
lib/backend_checker.py:84