Artifact Contract Auditor

Security checks across malware telemetry and agentic risk

Overview

This local auditor mostly matches its purpose, but it under-discloses extra workspace writes and ships broader pipeline/execution code than its read-and-report description suggests.

Install only if you are comfortable with a local workspace-audit skill that may also update output/QUALITY_GATE.md and ships broader research-pipeline helpers. Use it on trusted workspaces, review PIPELINE.lock.md before running, and do not rely on the 'only writes CONTRACT_REPORT.md' claim as strict containment.
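One way to check the write scope yourself, rather than trusting the stated guardrail, is to hash the workspace before and after a run and compare. The following is a minimal sketch, not part of the skill or of the scanner; the helper names `hash_workspace` and `undeclared_writes` are hypothetical, and it assumes the workspace is small enough to read every file into memory.

```python
import hashlib
from pathlib import Path


def hash_workspace(workspace: Path) -> dict[str, str]:
    """Map each file's workspace-relative path to the SHA-256 digest of its bytes."""
    digests = {}
    for path in sorted(workspace.rglob("*")):
        if path.is_file():
            rel = path.relative_to(workspace).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def undeclared_writes(before: dict, after: dict, declared: set) -> list[str]:
    """Return paths that were created, modified, or deleted and are not in `declared`."""
    changed = {
        rel
        for rel in before.keys() | after.keys()
        if before.get(rel) != after.get(rel)
    }
    return sorted(changed - set(declared))
```

Calling `hash_workspace` before and after the audit and passing `{"output/CONTRACT_REPORT.md"}` as `declared` would list any other file the run touched, such as `output/QUALITY_GATE.md`.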

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (14)

subprocess module call

Medium

Category: Dangerous Code Execution
Content:
    log_path = workspace / log_rel
    try:
        completed = subprocess.run(cmd, check=False, capture_output=True, text=True)
        if completed.stdout or completed.stderr or completed.returncode != 0:
            ensure_dir(log_path.parent)
            body = [
Confidence: 95%
Finding: completed = subprocess.run(cmd, check=False, capture_output=True, text=True)
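A finding like this can be reproduced with a simple syntax-tree scan. The sketch below is an assumption about how such a check might work, not SkillSpector's actual implementation; `find_subprocess_calls` is a hypothetical helper, and it only catches calls written as `subprocess.<name>(...)`, so aliased imports or `from subprocess import run` would be missed.

```python
import ast


def find_subprocess_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, dotted name) for each call written as subprocess.<attr>(...)."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "subprocess"
        ):
            hits.append((node.lineno, f"subprocess.{node.func.attr}"))
    return sorted(hits)


# The flagged line from the finding, wrapped in a tiny module for parsing.
sample = (
    "import subprocess\n"
    "completed = subprocess.run(cmd, check=False, capture_output=True, text=True)\n"
)
print(find_subprocess_calls(sample))  # [(2, 'subprocess.run')]
```

The source is only parsed, never executed, so undefined names such as `cmd` do not matter to the scan.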

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: The skill declares a guardrail of analysis-only and says it should only write `output/CONTRACT_REPORT.md`, but this block also writes `QUALITY_GATE.md` via `write_quality_report`. That creates an undocumented side effect outside the stated artifact contract, which can overwrite or inject pipeline state used by downstream tooling and reviewers.

Scope Creep

High

Confidence: 98%
Finding: This code modifies files beyond the manifest-described write scope by calling `write_quality_report`, despite the skill metadata stating 'analysis-only; do not edit content artifacts; only write the report.' In agent pipelines, undeclared writes are dangerous because they break isolation assumptions and can tamper with authoritative completion or quality status files that influence subsequent execution and trust decisions.
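The declared write scope could be enforced in code rather than only stated in metadata. The following is a minimal sketch under that assumption; `guarded_write` and `ALLOWED_WRITES` are hypothetical names that do not appear in the skill, and the allowlist contains only the path the guardrail names.

```python
from pathlib import Path

# The only path the skill's guardrail says it may write (relative to the workspace).
ALLOWED_WRITES = {"output/CONTRACT_REPORT.md"}


def guarded_write(workspace: Path, rel_path: str, text: str) -> Path:
    """Write text only if rel_path resolves to an allowlisted file inside workspace."""
    root = workspace.resolve()
    target = (root / rel_path).resolve()
    try:
        rel = target.relative_to(root).as_posix()
    except ValueError:
        # The resolved target is outside the workspace, e.g. via "../".
        raise PermissionError(f"path escapes workspace: {rel_path}") from None
    if rel not in ALLOWED_WRITES:
        raise PermissionError(f"undeclared write refused: {rel}")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target
```

Routing every write through a helper like this would make a call equivalent to `write_quality_report` fail loudly instead of silently updating `QUALITY_GATE.md`.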

Context-Inappropriate Capability

High

Confidence: 93%
Finding: The module exposes many generic file- and workspace-mutating helpers such as atomic writes, backups, copy operations, and decision/status updates, even though this skill is described as analysis-only and should only emit a contract report. In an agent-skill context, bundling broad mutation primitives increases the chance that the skill or imported helpers can alter unrelated workspace artifacts, weakening least-privilege boundaries and making accidental or unauthorized state changes much easier.

Description-Behavior Mismatch

High

Confidence: 99%
Finding: This file is a general pipeline executor that mutates unit state, approvals, status logs, quality reports, reroute state, and run errors, which substantially exceeds the declared scope of an artifact-contract auditing skill. In this context, the mismatch is dangerous because users invoking a supposedly analysis-only auditor could instead trigger workflow progression and persistent workspace state changes.

Scope Creep

High

Confidence: 99%
Finding: The code writes to `UNITS.csv`, `STATUS.md`, and `DECISIONS.md` during approval handling despite the skill guardrail stating it should only write `output/CONTRACT_REPORT.md`. This violates least privilege and allows an audit invocation to alter control-plane artifacts that determine pipeline execution and approval state.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: Subprocess execution is not justified for a contract-audit skill whose advertised role is to verify completeness and self-containment. In this context, invoking `scripts/run.py` transforms the skill from an auditor into an active executor, increasing the blast radius from reporting errors to causing arbitrary downstream pipeline actions.

Intent-Code Divergence

Medium

Confidence: 83%
Finding: The helper writes a persistent `output/RUN_ERRORS.md` artifact while describing itself as a local error sink that should not crash the runner. On its own this is not code execution, but in the context of a skill that promises to only write `output/CONTRACT_REPORT.md`, it is an unauthorized persistent artifact write that can mislead users about what the skill changes.

Description-Behavior Mismatch

High

Confidence: 99%
Finding: This file implements a research ideation pipeline, not the artifact-contract auditor declared in the skill metadata. That mismatch is dangerous because users and orchestrators may grant this skill execution based on a narrow, analysis-only reporting purpose, while the code instead performs unrelated logic and can generate outputs inconsistent with the declared trust boundary.

Scope Creep

High

Confidence: 91%
Finding: The module contains generic file-writing helpers that support emitting multiple JSON, JSONL, and Markdown artifacts, which conflicts with the skill's declared single-report output model. In this skill context, undeclared artifact generation is risky because downstream systems may rely on the guardrail that the skill is analysis-only and writes only output/CONTRACT_REPORT.md; extra outputs can pollute the workspace, confuse later pipeline stages, or smuggle unexpected data into shared artifacts.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: The code builds, scores, ranks, and reports research directions, which is unrelated to artifact-contract auditing and materially expands the skill's capability beyond its declared purpose. In a tightly scoped agent skill, unjustified capability expansion is dangerous because it breaks least privilege and makes the skill capable of producing misleading or policy-violating outputs under the cover of a benign auditor label.

Intent-Code Divergence

Medium

Confidence: 84%
Finding: The inline error messages and documentation repeatedly refer to an 'ideation contract,' showing the code was copied or repurposed from a different skill and does not match the declared auditor intent. While text alone is less severe than executable behavior, in this context it is strong evidence of control confusion and increases the likelihood that operators misunderstand what the skill actually validates and permits.

Missing User Warnings

Medium

Confidence: 91%
Finding: The pipeline explicitly enables a pre-retrieval shell step, but the document does not clearly warn users that command execution may occur as part of the workflow. Even with 'approval_surface: false', hidden or implicit shell execution increases the risk of users triggering local commands without informed consent, which is especially concerning in agentic workflows where workspace contents may be adversarial.
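Informed consent for a shell step could take the form of an explicit prompt that shows the exact command before anything runs. This is a sketch only; `run_with_consent` is a hypothetical wrapper, not something the pipeline provides, and the `confirm` parameter defaults to `input` so it can be swapped out in non-interactive settings.

```python
import shlex
import subprocess


def run_with_consent(cmd: list[str], confirm=input):
    """Show the command and run it only if the user answers 'y' or 'yes'."""
    printable = shlex.join(cmd)
    answer = confirm(f"About to run: {printable}\nProceed? [y/N] ")
    if answer.strip().lower() not in {"y", "yes"}:
        return None  # Declined: nothing is executed.
    # The command is passed as a list, so no shell interprets its contents.
    return subprocess.run(cmd, check=False, capture_output=True, text=True)
```

Defaulting to "no" on any other answer means an empty or unexpected response never results in execution.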

Vague Triggers

Medium

Confidence: 82%
Finding: The routing hint "snapshot" is generic and likely to match many unrelated user requests, which can cause the pipeline to be invoked when the user did not intend this specific skill. In this case the skill is analysis/reporting oriented and non-networked, so the main risk is misrouting, confusion, or unintended report generation rather than direct compromise.
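The misrouting risk can be illustrated by comparing how often a one-word hint matches unrelated requests against a more specific phrase. The sketch below assumes routing is a simple case-insensitive word match, which may not be how the real router works; the sample requests, the alternative phrase "artifact contract audit", and the helper `matches_hint` are all invented for illustration.

```python
import re


def matches_hint(request: str, hint: str) -> bool:
    """True if the hint appears in the request as a whole word or phrase (case-insensitive)."""
    pattern = r"\b" + re.escape(hint) + r"\b"
    return re.search(pattern, request, flags=re.IGNORECASE) is not None


# Hypothetical user requests; only the last one is meant for this skill.
requests = [
    "take a snapshot of the VM before upgrading",
    "show me the latest database snapshot",
    "run an artifact contract audit snapshot on this workspace",
]

for hint in ("snapshot", "artifact contract audit"):
    matched = [r for r in requests if matches_hint(r, hint)]
    print(f"{hint!r}: {len(matched)} of {len(requests)} requests matched")
```

Under these assumptions the bare hint matches all three requests, while the longer phrase matches only the one that actually asks for an audit.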

VirusTotal

66/66 vendors flagged this skill as clean.
