Security audit

Zero Cover Mode

Security checks across malware telemetry and agentic risk

Overview

This skill is not clearly malicious, but it gives the agent broad authority to run project commands, write persistent workflow state, generate follow-up cron tasks, and delete workflow directories, so it should be reviewed before use.

Install only if you want this skill to actively manage bug-fix workflows in your workspace. Review the test command before each run, use a sandbox for untrusted repositories, keep secrets out of project environment/state fields, and be careful with cleanup and cron-related steps because they can persist work and remove old bug artifacts.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Output Handling: Unvalidated Output Injection, Cross-Context Output, Unbounded Output

Findings (19)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: if not cov_args: cov_args = ["--cov", "."] cmd = [sys.executable, "-m", "pytest"] + cov_args + [test_path, "--no-header", "-q"] result = subprocess.run( cmd, capture_output=True, text=True, timeout=30, )
Confidence: 87%
Finding: result = subprocess.run( cmd, capture_output=True, text=True, timeout=30, )

subprocess module call

Medium

Category: Dangerous Code Execution
Content: log.info("Exec: %s (cwd=%s)", cmd, actual_cwd) try: shell = os.name == "nt" proc = subprocess.run( cmd, capture_output=True, text=True, timeout=self.timeout, cwd=actual_cwd, shell=shell, env=env )
Confidence: 95%
Finding: proc = subprocess.run( cmd, capture_output=True, text=True, timeout=self.timeout, cwd=actual_cwd, shell=shell, env=env )
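The excerpt above enables shell=True whenever os.name is "nt". One way to avoid that platform-dependent shell is to pass the command as an argument list and keep shell=False everywhere. A minimal sketch, assuming a hypothetical helper run_argv that is not part of the audited skill:

```python
import subprocess
import sys
from typing import List, Optional


def run_argv(argv: List[str], cwd: Optional[str] = None,
             timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Run a command given as an argument list, never through a shell."""
    if isinstance(argv, str):
        # A plain string would invite shell-style parsing; refuse it.
        raise TypeError("pass a list of arguments, not a command string")
    return subprocess.run(
        argv,
        capture_output=True,
        text=True,
        timeout=timeout,
        cwd=cwd,
        shell=False,  # no shell interpretation on any platform
    )


# Usage: run the current interpreter with a fixed argument list.
result = run_argv([sys.executable, "-c", "print('ok')"])
```

With shell=False, commands that are shell built-ins or batch wrappers on Windows may need an explicit interpreter in the argument list, so this is a starting point rather than a drop-in replacement.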

Lp3

Medium

Category: MCP Least Privilege
Confidence: 93%
Finding: The skill directs the agent to read/write files, run shell commands, inspect environment state, and invoke local/networked services, but it does not declare those capabilities up front. That creates a transparency and policy-enforcement gap: users or orchestrators may treat it as low-risk documentation/reporting logic while it actually performs privileged operations affecting the workspace and local system.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: The declared purpose suggests a bug-fix closure and reporting workflow, but the content expands into broader operational behavior: state persistence, event publication, cron scheduling, schema migration/rotation, and local service interaction. This mismatch is dangerous because reviewers and permission systems may underestimate the skill's authority and side effects, enabling unexpected persistence, automation, and data handling beyond the stated scope.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The skill instructs the agent to set up cron-based follow-up tasks, which introduces durable operational automation outside a one-time bug-fix workflow. Persistent scheduled execution can create unintended recurring commands, surprise system changes, and a larger attack surface if the scheduled command or its parameters are influenced by untrusted project data.

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The CLI `rotate` path uses `os.replace()` directly and then recreates an empty file, which bypasses the safer `safe_rotate_with_backup()` flow used elsewhere in the module. This creates a data-integrity risk: if the rotation target collides, the source is moved without a backup, and any operational mistake or partial workflow failure can cause irrecoverable loss of historical NDJSON records.
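The collision risk described in this finding can be reduced by never letting os.replace() land on an existing file. The sketch below is an illustration under assumptions: rotate_without_clobber is a hypothetical name, and it is not the module's safe_rotate_with_backup(), whose implementation is not shown in this report.

```python
import os
import time


def rotate_without_clobber(path: str) -> str:
    """Move `path` to a unique timestamped name, then recreate it empty.

    Returns the path of the rotated file.
    """
    stamp = time.strftime("%Y%m%dT%H%M%S")
    target = f"{path}.{stamp}"
    counter = 0
    # Pick a target name that does not exist yet, so older
    # rotations are never overwritten.
    while os.path.exists(target):
        counter += 1
        target = f"{path}.{stamp}.{counter}"
    os.replace(path, target)
    # Recreate the live file; mode "x" fails if something else
    # created it in the meantime instead of truncating it.
    open(path, "x").close()
    return target
```

The existence check and the move are not atomic together, so concurrent rotations would still need a lock; the sketch only addresses the single-process collision case.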

Context-Inappropriate Capability

High

Confidence: 97%
Finding: The pipeline accepts a caller-controlled test_cmd and passes it into TestRunner.run(), while the CLI also exposes --test-cmd directly to users. In an agent skill context this creates a general command-execution surface that exceeds narrowly scoped bug-fix orchestration, enabling arbitrary local command execution, data access, or destructive actions if untrusted input reaches this parameter.
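A caller-controlled test command can be narrowed before it reaches a runner by splitting it into an argument list and checking the executable against an allowlist. This is a minimal sketch, assuming a hypothetical parse_test_cmd helper and a hypothetical allowlist; neither exists in the audited skill.

```python
import shlex
from typing import List

# Hypothetical allowlist; the audited skill does not define one.
ALLOWED_EXECUTABLES = {"pytest"}
SHELL_METACHARACTERS = set(";&|`$><\n")


def parse_test_cmd(test_cmd: str) -> List[str]:
    """Split a test command into argv and reject anything off the allowlist."""
    if any(ch in SHELL_METACHARACTERS for ch in test_cmd):
        raise ValueError("shell metacharacters are not allowed in test_cmd")
    argv = shlex.split(test_cmd)
    if not argv:
        raise ValueError("test_cmd is empty")
    if argv[0] not in ALLOWED_EXECUTABLES:
        raise ValueError(f"executable not allowed: {argv[0]!r}")
    return argv
```

An allowlist narrows the surface but does not remove it: a test runner still executes whatever code the repository contains, so a sandbox remains advisable for untrusted projects.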

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The code generates actionable cron job instructions and prints ready-to-use scheduling payloads for later execution. In an agent environment, this grants persistence and deferred execution capability beyond simple reporting, which can be abused to trigger future actions without immediate user review.

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The module claims to provide integrity verification and automatic recovery, but verify_integrity() calls json.loads and catches json.JSONDecodeError without importing json. The first reference to json raises NameError, so the verification/repair path can fail at runtime, disabling a safety mechanism that operators may rely on to detect corruption and restore state consistency.
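For reference, an NDJSON integrity check needs the json import at module level before either json.loads or json.JSONDecodeError is referenced. The function below is a minimal sketch with a hypothetical name, find_corrupt_lines; it does not reproduce the module's verify_integrity().

```python
import json
from typing import List


def find_corrupt_lines(path: str) -> List[int]:
    """Return 1-based line numbers of NDJSON lines that fail to parse."""
    bad = []
    with open(path, "r", encoding="utf-8") as handle:
        for lineno, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # blank lines are skipped, not treated as corruption
            try:
                json.loads(line)
            except json.JSONDecodeError:
                bad.append(lineno)
    return bad
```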

Description-Behavior Mismatch

Medium

Confidence: 84%
Finding: The skill performs side effects beyond passive state tracking: it creates workspace marker files and later participates in deletion/cleanup flows. When a skill's manifest/description emphasizes tracking, reporting, and state management but the implementation mutates workspace structure, users and higher-level orchestrators may grant it broader file authority than intended, creating integrity and trust-boundary risks.

Description-Behavior Mismatch

Medium

Confidence: 86%
Finding: set_project_env persists arbitrary project environment data into the state file, but this capability is not apparent from the manifest text focused on bug-fix workflow automation. Undisclosed persistence increases the chance that users or host systems expose sensitive or irrelevant data to the skill without understanding retention behavior.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The CLI accepts arbitrary JSON or key=value pairs and stores them as project environment data, enabling persistence of unrelated or sensitive information outside the stated bug-fix purpose. In agent settings, this can become a data-retention and scope-creep issue because operators may assume only bug metadata is stored, while the skill can silently accumulate broader project context or secrets.
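One mitigation for the retention risk in the two findings above is to redact secret-looking keys before project environment data is written to state. The sketch below assumes a hypothetical redact_env helper and a hypothetical key pattern; it is a name-based heuristic and will miss secrets stored under innocuous keys.

```python
import re

# Hypothetical pattern for key names that commonly hold secrets.
SECRET_KEY_PATTERN = re.compile(
    r"(secret|token|password|passwd|api[_-]?key|credential)",
    re.IGNORECASE,
)


def redact_env(env: dict) -> dict:
    """Return a copy of env with values of secret-looking keys replaced."""
    return {
        key: ("<redacted>" if SECRET_KEY_PATTERN.search(key) else value)
        for key, value in env.items()
    }
```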

Vague Triggers

High

Confidence: 92%
Finding: The auto-activation conditions are very broad, including implicit bug-fix requests, any failed tests, and generic errors or warnings. In context, this skill can then perform file mutations, shell execution, state updates, and follow-up automation, so ambiguous triggers materially increase the chance of unintended activation and privileged actions without clear user consent.

Missing User Warnings

Medium

Confidence: 93%
Finding: cleanup_bugs deletes directories with shutil.rmtree based on age and session state, without a clear user-facing confirmation or dry-run mechanism. Destructive operations on workspace directories are dangerous because path-detection or state inconsistencies can cause irreversible loss of debugging artifacts or other data under the bugs directory.
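A dry-run default is one way to add the missing confirmation step. The sketch below is illustrative only: cleanup_old_dirs is a hypothetical function, it selects directories by modification time alone, and it ignores the session-state criterion that cleanup_bugs reportedly uses.

```python
import os
import shutil
import time


def cleanup_old_dirs(bugs_root, max_age_days, dry_run=True):
    """Return directories under bugs_root older than max_age_days.

    Nothing is deleted unless dry_run is explicitly set to False.
    """
    root = os.path.realpath(bugs_root)
    cutoff = time.time() - max_age_days * 86400
    selected = []
    for entry in os.scandir(root):
        # Skip files and symlinks so deletion stays inside the root.
        if not entry.is_dir(follow_symlinks=False):
            continue
        resolved = os.path.realpath(entry.path)
        if os.path.commonpath([root, resolved]) != root:
            continue
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            selected.append(resolved)
    if not dry_run:
        for path in selected:
            shutil.rmtree(path)
    return sorted(selected)
```

Calling the function once with the default dry_run=True lets a user or orchestrator review the list before repeating the call with dry_run=False.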

Missing User Warnings

Medium

Confidence: 92%
Finding: compact also performs recursive deletion of orphan directories, but the command description primarily suggests state compaction rather than filesystem cleanup. This mismatch reduces informed consent and can surprise users or orchestration layers, leading to unintended data loss when running a seemingly maintenance-only operation.

Unvalidated Output Injection

High

Category: Output Handling
Content: if not cov_args: cov_args = ["--cov", "."] cmd = [sys.executable, "-m", "pytest"] + cov_args + [test_path, "--no-header", "-q"] result = subprocess.run( cmd, capture_output=True, text=True, timeout=30, )
Confidence: 82%
Finding: subprocess.run( cmd, capture_output

Unvalidated Output Injection

High

Category: Output Handling
Content: log.info("Exec: %s (cwd=%s)", cmd, actual_cwd) try: shell = os.name == "nt" proc = subprocess.run( cmd, capture_output=True, text=True, timeout=self.timeout, cwd=actual_cwd, shell=shell, env=env )
Confidence: 96%
Finding: subprocess.run( cmd, capture_output

Unpinned Dependencies

Low

Category: Supply Chain
Content: # Zero Cover Mode: development dependencies # pip install -r requirements-dev.txt pytest>=7 pytest-cov>=5
Confidence: 93%
Finding: pytest>=7

Unpinned Dependencies

Low

Category: Supply Chain
Content: # pip install -r requirements-dev.txt pytest>=7 pytest-cov>=5
Confidence: 93%
Finding: pytest-cov>=5
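The two unpinned-dependency findings can be reproduced with a simple check that flags any requirement line lacking an exact `==` pin. This is a rough sketch with a hypothetical function name; it does not parse environment markers, extras, or hashes the way a full requirements parser would.

```python
def unpinned_requirements(text):
    """Return requirement lines that do not use an exact '==' pin."""
    flagged = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):
            continue  # skip blanks and options such as -r or -e
        if "==" not in line:
            flagged.append(line)
    return flagged
```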

VirusTotal

VirusTotal findings are pending for this skill version.

Static analysis

Detected: suspicious.dynamic_code_execution

Dynamic code execution detected.

Critical

Code: suspicious.dynamic_code_execution
Location: lib/backend_checker.py:84