Research Paper Figure Skill Factory

Security checks across malware telemetry and agentic risk

Overview

This documentation-only skill factory is consistent with its stated paper-figure workflow purpose. It may, however, search the web, download open PDFs, write local corpus files, and use image generation when the workflow calls for it.

Install this only if you want an agent to help build reusable research-paper figure-making skills. Expect it to ask for or collect lawful paper sources, create a local corpus and manifests, generate skill-package files, and use image generation. Review the generated skill before installing it elsewhere, and avoid pointing it at private PDFs or broad local folders unless you intend those materials to be processed.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (16)

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The example instructs the agent to perform active web search, download PDFs, and write local artifacts under /mnt/data, which expands the skill from figure-making into autonomous external acquisition and filesystem operations. In an agent setting, this increases attack surface for unintended data retrieval, policy bypass, and persistence of unvetted content, especially because the workflow normalizes these actions as default behavior.
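One way to narrow the surface this finding describes is to confine every write to a single corpus directory and to require explicit confirmation before any download is planned. The sketch below is illustrative only: `resolve_inside`, `plan_download`, and the `confirmed` flag are hypothetical names and are not part of the scanned skill. It performs no network access.

```python
from pathlib import Path


def resolve_inside(root: Path, relative: str) -> Path:
    """Return the resolved path for `relative` under `root`.

    Raises ValueError if the result would land outside `root`,
    for example via `..` segments or an absolute path.
    """
    root = root.resolve()
    candidate = (root / relative).resolve()
    if not candidate.is_relative_to(root):  # requires Python 3.9+
        raise ValueError(f"refusing to write outside {root}: {relative!r}")
    return candidate


def plan_download(url: str, root: Path, filename: str, confirmed: bool) -> Path:
    """Return the target path for a download, or raise if not confirmed.

    Hypothetical helper: it only decides where a confirmed download
    would be stored and never fetches anything itself.
    """
    if not confirmed:
        raise PermissionError(f"download of {url} needs explicit user confirmation")
    return resolve_inside(root, filename)
```

A host could call `plan_download` with the corpus root (the finding mentions /mnt/data) and only fetch once a path is returned.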

Context-Inappropriate Capability

Low

Confidence: 64%
Finding: The file tells the agent to automatically continue based on current session/history and only fall back to user-provided state when history is unavailable. That creates implicit reliance on prior context for operational decisions, which can cause cross-turn confusion, stale-state execution, or unintended continuation of acquisition tasks without fresh user confirmation.
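The stale-state risk described here can be illustrated with a small continuation gate. This is a minimal sketch under stated assumptions: the `SavedState` shape, the stage names in `ACQUISITION_STAGES`, and the `max_turn_gap` threshold are all hypothetical and do not come from the scanned skill.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stage names for steps that fetch external content.
ACQUISITION_STAGES = {"search", "download"}


@dataclass
class SavedState:
    """Workflow state recovered from an earlier turn (assumed shape)."""
    stage: str
    turn_saved: int


def may_continue(state: Optional[SavedState], current_turn: int,
                 user_confirmed: bool, max_turn_gap: int = 1) -> bool:
    """Decide whether the workflow may resume from saved state."""
    if state is None:
        return False  # nothing to resume from
    if user_confirmed:
        return True  # fresh confirmation always suffices
    if state.stage in ACQUISITION_STAGES:
        return False  # acquisition never resumes silently
    # Other stages may resume only while the saved state is recent.
    return current_turn - state.turn_saved <= max_turn_gap
```

The design choice is that history alone is never enough to restart a search or download step; it can only resume low-risk stages, and only while the state is recent.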

Vague Triggers

Medium

Confidence: 89%
Finding: The skill description is extremely broad and can match generic requests about making figures, which may cause this meta-skill to activate in situations where a narrower, safer, or more appropriate skill should be used. Overbroad routing increases the chance of unintended behavior, policy bypass through misclassification, and user confusion because the skill can take over workflows well beyond its intended scope.
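A narrower trigger could require several signals at once instead of matching any figure-related request. The following is a rough sketch, not the host's actual routing logic: the keyword lists and the `should_activate` function are assumptions chosen for illustration.

```python
import re

# Hypothetical keyword groups; a real router would likely use richer signals.
FIGURE_TERMS = re.compile(r"\b(figure|figures|plot|plots|diagram|diagrams)\b", re.IGNORECASE)
SKILL_TERMS = re.compile(r"\b(skill|skills)\b", re.IGNORECASE)
PAPER_TERMS = re.compile(r"\b(paper|papers|research|publication|publications)\b", re.IGNORECASE)


def should_activate(request: str) -> bool:
    """Activate only when the request mentions figures, skills, and papers."""
    patterns = (FIGURE_TERMS, SKILL_TERMS, PAPER_TERMS)
    return all(pattern.search(request) for pattern in patterns)
```

Under this sketch, a generic request such as "Make a bar plot of my sales data" would not activate the meta-skill, while "Build a reusable skill for making research paper figures" would.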

Natural-Language Policy Violations

High

Confidence: 97%
Finding: The skill mandates Chinese-language response elements in every text reply regardless of the user's language or consent. This can degrade comprehension, mislead users, and create unsafe workflow execution if critical instructions, state, or next steps are presented in a language the user does not understand, especially in a complex multi-step figure-generation process.

Vague Triggers

Medium

Confidence: 88%
Finding: The skill enforces a rigid first-turn behavior ('STARTUP_PLAN_ONLY') without clearly defining activation boundaries or how the host should determine when this skill is appropriate. That ambiguity can cause the wrong skill to seize control of a conversation or override user intent, which is a prompt-routing and workflow integrity issue rather than direct code execution.

Natural-Language Policy Violations

Medium

Confidence: 80%
Finding: Mandating Chinese output phrases regardless of user language can degrade usability, mislead users about system state, and cause important workflow or safety information to be presented in an inaccessible language. In a skill that imposes strict procedural outputs, forced locale mismatches increase the chance of misunderstanding instructions, consent, and next-step actions.

Natural-Language Policy Violations

Medium

Confidence: 94%
Finding: The example hard-codes Chinese as the response language (`## 当前执行计划`, "Current Execution Plan", and the subsequent required state/output structure) without any indication that language should follow user preference. In a skill factory context, this can cause downstream generated skills or agent turns to ignore user language choice, reduce usability, and create prompt-level policy drift where the agent follows embedded formatting mandates rather than explicit user instructions.
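A language-aware alternative to hard-coded headings might look like the sketch below. It is a minimal illustration under stated assumptions: the `HEADINGS` table, the `heading` function, and the English renderings are hypothetical. Only the Chinese strings are taken from the findings in this report.

```python
from typing import Optional

# Chinese strings are quoted from the findings; English text is an assumed translation.
HEADINGS = {
    "zh": {"plan": "## 当前执行计划", "next": "下一步你可以这样问"},
    "en": {"plan": "## Current Execution Plan", "next": "Here is what you can ask next"},
}


def heading(key: str, user_lang: Optional[str], default: str = "en") -> str:
    """Return the heading for `key` in the user's language.

    Falls back to `default` when the language is unknown or unset.
    """
    tag = (user_lang or default).replace("_", "-")
    primary = tag.split("-")[0].lower()  # "zh-CN" -> "zh"
    table = HEADINGS.get(primary, HEADINGS[default])
    return table[key]
```

With this shape, the required output structure stays fixed while the wording follows the user's language, which is the gap the finding points at.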

Natural-Language Policy Violations

Medium

Confidence: 94%
Finding: The example opening turn is entirely in Chinese and prescribes a fixed Chinese-language interaction pattern without offering any user-language detection or choice. This can cause users to misunderstand workflow, consent, or safety-relevant instructions, and in an agent setting it may lead to incorrect execution or inaccessible behavior for non-Chinese speakers.

Natural-Language Policy Violations

Medium

Confidence: 97%
Finding: The skill content is predominantly in Chinese and presents operational instructions and user-facing prompts without offering a language-choice mechanism. This can override or conflict with a user's preferred language, increasing the risk of misunderstanding, consent issues, and unsafe execution of later workflow steps, especially in a complex multi-step skill factory context.

Natural-Language Policy Violations

Medium

Confidence: 92%
Finding: The template hard-codes a Chinese-language user prompt, which can improperly constrain user interaction language without consent and may exclude or confuse users operating in other locales. In a skill factory context, this can propagate into generated downstream skills and workflows, making the bias systematic rather than incidental.

Natural-Language Policy Violations

Medium

Confidence: 90%
Finding: The follow-up guidance again mandates Chinese phrasing, reinforcing a fixed-language workflow without opt-in. Because this file is a reusable request template for a skill factory, the restriction can spread across sessions and generated skills, increasing usability, accessibility, and policy-compliance risk.

Natural-Language Policy Violations

Medium

Confidence: 92%
Finding: The file hard-codes Chinese output requirements at the end-state prompts/checklist without offering a user-language fallback or negotiation step. This is not a code-execution issue, but it can cause unsafe or misleading agent behavior by forcing responses into a language the user may not understand, reducing transparency, informed consent, and the user's ability to review figure-generation plans or detect mistakes.

Natural-Language Policy Violations

Medium

Confidence: 94%
Finding: The protocol mandates a Chinese-language next-prompt pattern regardless of the user's language, which can inject unintended language behavior into prompts, degrade usability, and create misleading or non-consensual agent outputs. In this skill context, the issue is not direct code execution, but it can still cause workflow confusion, user manipulation, and brittle downstream behavior when prompts are copied verbatim into other systems.

Vague Triggers

Medium

Confidence: 90%
Finding: The template defines a broad figure-skill workflow but does not specify clear activation boundaries, allowed user intents, or explicit conditions for when the skill should or should not be invoked. In an agent setting, ambiguous routing can cause the skill to activate for unrelated requests, increasing the chance of prompt/behavior hijacking, policy bypass through misclassification, or misuse of image-generation and document-processing steps in contexts where they were not intended.

Natural-Language Policy Violations

Medium

Confidence: 84%
Finding: The template hard-codes Chinese prompt text for the next-step policy without checking user locale, language preference, or deployment context. This can cause unexpected language switching, user confusion, and unreliable downstream agent behavior, especially if other components assume prompts remain in the user's language or in a controlled system language.

Natural-Language Policy Violations

Medium

Confidence: 92%
Finding: The template hard-codes Chinese output markers such as `下一步你可以这样问` ("Here is what you can ask next") and a fully Chinese state footer, which can force agents to respond in a language the user did not request. This creates a prompt-quality and safety issue because users may misunderstand instructions, status, or next steps, especially in a security- or research-sensitive workflow, increasing the risk of misuse or consent failures.

VirusTotal

64/64 vendors flagged this skill as clean.
