Chapter Lead Writer

Security checks across malware telemetry and agentic risk

Overview

This looks like a modest chapter-lead helper on the surface, but the package bundles broader research pipelines and workspace-writing behavior that are not clearly disclosed.

Review before installing. Use this only if you intentionally want a broader research-pipeline package, not just a small chapter-lead writer. Run it in a backed-up workspace, inspect the bundled `pipelines/` and `tooling/` files, and avoid using it on sensitive projects unless you are comfortable with broad local artifact creation and workflow-state mutation. VirusTotal results were pending at the time of this review, and I found no evidence in the package artifacts of malware, credential theft, or destructive code.
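One way to act on the "backed-up workspace" advice is to record file hashes before a run and compare them afterwards. The following is a minimal sketch, not part of the package; the function names `snapshot` and `diff_snapshots` and the workspace path in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file path (relative to root) to the SHA-256 digest of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_snapshots(before, after):
    """Compare two snapshots and list created, modified, and deleted paths."""
    common = set(before) & set(after)
    return {
        "created": sorted(set(after) - set(before)),
        "modified": sorted(p for p in common if before[p] != after[p]),
        "deleted": sorted(set(before) - set(after)),
    }
```

A possible use: call `before = snapshot("my-workspace")`, run the skill, call `after = snapshot("my-workspace")`, then print `diff_snapshots(before, after)` to see every file the run touched, not only the chapter-lead output.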

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (13)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 93%
Finding: The skill advertises no declared permissions/network access, yet the package clearly invokes a Python script and is designed to read workspace inputs and write output files. This creates a trust and review gap: operators may approve or sandbox the skill based on incomplete capability disclosure, while the executor can still access local files and shell out through Python. In this context, hidden file read/write and shell capability are meaningful because the skill processes arbitrary workspace content and can modify repository state.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 90%
Finding: The documented guardrails say the skill should skip when required inputs are missing, avoid new facts/citations, and validate citation keys against ref.bib, but the described behavior violates those constraints. This is dangerous because downstream users may rely on the written contract for safety and quality guarantees; silent fallback generation, broader citation sourcing, and undeclared report writing can produce fabricated or out-of-scope content and unexpected file modifications. The mismatch is especially risky in a writing pipeline where generated text may be trusted and published with minimal review.
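The citation guardrail named in this finding (validate citation keys against ref.bib) can be checked independently of the skill. The sketch below assumes Pandoc-style `@key` citations in the generated text, which the report does not confirm; the regular expressions and function names are illustrative, not the package's own.

```python
import re

# Matches the key of a BibTeX entry such as "@article{smith2020,".
BIB_ENTRY = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")
# ASSUMPTION: citations look like "@key" (Pandoc style); adjust for other syntaxes.
CITE = re.compile(r"(?<!\w)@([A-Za-z0-9_][A-Za-z0-9_:\-]*)")


def bib_keys(bib_text):
    """Collect every entry key defined in the BibTeX source."""
    return {m.group(1) for m in BIB_ENTRY.finditer(bib_text)}


def cited_keys(draft_text):
    """Collect every citation key used in the draft."""
    return {m.group(1) for m in CITE.finditer(draft_text)}


def unknown_citations(draft_text, bib_text):
    """Return cited keys that have no matching entry in the bibliography."""
    return sorted(cited_keys(draft_text) - bib_keys(bib_text))
```

Running `unknown_citations` on each generated lead block and the contents of `ref.bib` gives a quick signal of whether the output cites anything outside the declared bibliography.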

Description-Behavior Mismatch

High

Confidence: 99%
Finding: This pipeline is materially inconsistent with the declared skill purpose of writing chapter lead blocks. Instead of performing a narrow formatting or drafting task, it orchestrates a broad multi-stage research ideation workflow, which is a scope-expansion vulnerability because routing this skill can trigger unauthorized artifact creation, workflow branching, and research-planning behavior outside user expectations. In this context, the mismatch is more dangerous because the parent skill is explicitly constrained to no new facts and no network use, so hidden pipeline behavior undermines trust and policy boundaries.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: The file defines literature retrieval, evidence synthesis, screening, and memo-writing capabilities that far exceed what a chapter lead writer should be allowed to do. That creates an overprivileged workflow surface: if invoked through the benign-seeming skill context, the agent may perform broad research operations, generate new analytical content, and write substantial outputs that violate the skill's guardrails against adding new facts. The mismatch makes this especially risky because users and policy systems may grant the skill trust based on its narrow description while the pipeline does much more.

Scope Creep

Critical

Confidence: 99%
Finding: The pipeline requires a literature retrieval stage, which strongly implies external or network-backed access, while the enclosing skill metadata declares 'Network: none.' This is a direct policy-boundary violation: a caller selecting a non-networked writing skill could unintentionally trigger data exfiltration, external queries, or unauthorized retrieval behavior. Because the declared skill is supposed to be a constrained editorial helper, hidden network activity is especially dangerous and should be treated as critical in this context.
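This finding infers network use from the presence of a retrieval stage. A reviewer who wants code-level evidence could scan the bundled Python files for imports of network-capable modules. The sketch below is one rough way to do that; the module list is an illustrative assumption, not exhaustive, and a match only shows that a module is imported, not that a request is ever made.

```python
import ast

# ASSUMPTION: an illustrative, non-exhaustive set of network-capable modules.
NETWORK_MODULES = {
    "socket", "ssl", "http", "urllib", "ftplib", "smtplib",
    "requests", "httpx", "aiohttp",
}


def network_imports(source):
    """Return sorted top-level names of network-capable modules imported in `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            # Coarse check: "urllib.parse" is flagged too, although it does no I/O.
            if top in NETWORK_MODULES:
                found.add(top)
    return sorted(found)
```

Applying `network_imports(path.read_text())` to each `.py` file under `pipelines/` and `tooling/` narrows down which files deserve a manual read. Dynamic imports and subprocess calls to tools such as `curl` would not be caught by this check.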

Intent-Code Divergence

Medium

Confidence: 85%
Finding: The documentation claims the workflow is not for survey drafting or execution-grade specification, yet the defined outputs include memo synthesis and broad report-generation artifacts. While this may partly reflect imprecise documentation rather than deliberate abuse, the inconsistency still matters because it obscures the true capability surface and can mislead reviewers, routers, and users about what the skill will generate. In a tightly scoped writing skill, such ambiguity increases the chance of unintended use and policy bypass.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: This shared utility module contains capabilities far beyond the declared scope of a chapter-lead-writing skill, including project-state management, approval workflows, pipeline resolution, and query seeding. In an agentic environment, this scope creep is dangerous because a writing-triggered skill can read and modify planning and control files, increasing the blast radius of prompt injection, accidental invocation, or misuse.

Scope Creep

Medium

Confidence: 97%
Finding: The code can write or alter numerous workspace files such as status logs, decisions files, query files, checkpoints, and backups, which exceeds the manifest-declared output of a section lead markdown file. That makes the skill materially more dangerous: if invoked in the wrong context or influenced by adversarial content, it can mutate workflow state and approvals, not just generate prose.
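To make the gap between declared and actual outputs concrete, a reviewer can compare the paths a run wrote against the outputs the manifest declares. The sketch below assumes the declared outputs can be expressed as glob patterns; the pattern and file names in the usage note are hypothetical, since the report does not quote the manifest.

```python
from fnmatch import fnmatchcase


def undeclared_writes(written_paths, declared_patterns):
    """Return written paths that match none of the declared output patterns.

    Note: in fnmatch patterns, "*" also matches path separators.
    """
    return sorted(
        path
        for path in written_paths
        if not any(fnmatchcase(path, pattern) for pattern in declared_patterns)
    )
```

For example, with a hypothetical declared pattern `sections/*_lead.md`, the call `undeclared_writes(["sections/S2_lead.md", "STATUS.md", "queries.md"], ["sections/*_lead.md"])` reports the status and query files as writes outside the declared output.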

Description-Behavior Mismatch

High

Confidence: 98%
Finding: This file implements a full research-ideation and report-generation pipeline rather than a narrowly scoped chapter-lead writer. That capability mismatch is dangerous because an invoked skill may perform materially different actions than the metadata promises, expanding trust and execution scope in ways users and orchestrators do not expect.

Scope Creep

High

Confidence: 97%
Finding: The code exposes generalized write helpers for JSONL, JSON, and Markdown artifacts, enabling creation of multiple outputs beyond the manifest-declared chapter lead file. In the context of a supposedly narrow, no-network writing skill, undisclosed multi-artifact write capability increases the risk of unauthorized workspace modification, hidden trace generation, and misuse by other code paths.

Vague Triggers

Medium

Confidence: 92%
Finding: The pipeline is marked as `routing_default: true` with broad multilingual routing hints such as `survey`, `review`, `调研`, and `literature review`, which can cause this arXiv-specific workflow to be auto-selected for requests that are only loosely related. That can misroute users into a heavyweight literature-survey pipeline with inappropriate assumptions, creating integrity and safety issues through unintended execution scope rather than direct code execution.
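The misrouting risk is easy to see with a toy matcher. The sketch below uses the four hints quoted in this finding but assumes a case-insensitive substring match, which is only a guess at how a router might consume them; the real routing logic is not shown in the report.

```python
# Routing hints quoted in the finding above.
ROUTING_HINTS = ["survey", "review", "调研", "literature review"]


def matched_hints(request, hints=ROUTING_HINTS):
    """Return the hints that occur in the request (ASSUMED case-insensitive substring match)."""
    text = request.lower()
    return [hint for hint in hints if hint.lower() in text]


if __name__ == "__main__":
    for request in [
        "Please review my pull request",
        "Build a customer satisfaction survey form",
        "Write the chapter lead for section 3",
    ]:
        print(request, "->", matched_hints(request))
```

Under that assumption, a code-review request and a form-building request would both match a hint, while the chapter-lead request, the skill's stated purpose, matches none.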

Missing User Warnings

Low

Confidence: 76%
Finding: The file documents refinement markers as a freeze switch to prevent regeneration, which implies that absent such markers, later runs may overwrite prior artifacts. Without a prominent warning or safer default behavior, users may unintentionally lose reviewed workspace content, causing integrity and availability issues for their project artifacts.

Missing User Warnings

Low

Confidence: 80%
Finding: The C4 section again describes refinement markers that prevent accidental regeneration, indicating overwrite behavior for evidence bindings, drafts, anchors, and writer packs. In a workspace-oriented pipeline, silent regeneration of these artifacts can destroy manually refined work and undermine trust in the pipeline's artifact integrity.

VirusTotal

63/63 vendors flagged this skill as clean.
