Agent Skills For Context Engineering

Security checks across malware telemetry and agentic risk

Overview

The collection is mostly coherent educational material, but it includes high-impact example skills and code paths that need review before installation.

Review this package before installing in an agent that can run commands, access private files, use API keys, or make repository changes. Avoid using the author-voice fine-tuning example without rights and consent checks, and do not send reasoning traces, system prompts, tool outputs, or personal data to external APIs unless that data flow is explicitly approved and minimized.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (120)

eval() call detected

High

Category: Dangerous Code Execution
Content: "pi": math.pi, "e": math.e, } result = eval(expression, {"__builtins__": {}}, allowed_names) return json.dumps({ "expression": expression, "result": result,
Confidence: 98%
Finding: result = eval(expression, {"__builtins__": {}}, allowed_names)
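One common way to avoid this class of finding is to parse the expression with Python's `ast` module and evaluate only an allowlist of node types, so that attribute access, calls, and subscripts are never executed. The sketch below is illustrative and is not taken from the scanned skill: `safe_eval` and the `_BIN_OPS`, `_UNARY_OPS`, and `_NAMES` tables are hypothetical names, and exponentiation is deliberately left out because very large powers can exhaust CPU or memory.

```python
import ast
import math
import operator

# Hypothetical allowlists; the scanned skill's allowed_names may contain more entries.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}
_NAMES = {"pi": math.pi, "e": math.e}


def safe_eval(expression: str) -> float:
    """Evaluate a small arithmetic expression without calling eval()."""

    def walk(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # Accept plain int/float literals only (bool, str, bytes are rejected).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.Name) and node.id in _NAMES:
            return _NAMES[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        # Calls, attributes, subscripts, comprehensions, etc. all end up here.
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return walk(ast.parse(expression, mode="eval"))
```

With this shape, an input such as `__import__('os').system('id')` fails with `ValueError` before anything runs, whereas an `eval()` with emptied `__builtins__` can still be reachable through object introspection.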

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The skill is materially out of scope for a context-engineering collection: it provides operational guidance for building author-style fine-tuning datasets and models rather than context management for agents. That scope drift increases risk because it normalizes a capability for voice replication that can be repurposed for impersonation, copyright abuse, or policy evasion under the cover of an unrelated skill collection.

Context-Inappropriate Capability

Medium

Confidence: 99%
Finding: The documentation explicitly instructs developers to preserve and resend the model's full internal reasoning across turns. That can cause sensitive data, hidden system context, or internal deliberations to be stored, logged, and retransmitted to the model or third-party APIs, expanding exposure far beyond what is needed for tool continuity.

Context-Inappropriate Capability

Medium

Confidence: 98%
Finding: The example prints detailed reasoning content directly to stdout, which commonly ends up in logs, terminals, CI output, or observability backends. If the model reasoning contains secrets, user data, policy text, or attack payload reflections, this creates unnecessary disclosure risk.

Intent-Code Divergence

Low

Confidence: 86%
Finding: The guide claims full reasoning preservation is operationally required, encouraging developers to propagate more data than necessary. This misleading guidance can normalize insecure handling of hidden reasoning and sensitive context, increasing downstream leakage and retention risks.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: This file appears materially unrelated to the stated skill purpose of context engineering and instead provides third-party MiniMax/Anthropic API integration instructions. In a skill collection, this kind of scope drift is dangerous because it can cause operators or downstream agents to invoke unexpected external services and trust documentation that was not reviewed for the advertised capability set.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The document instructs users to set an external base URL and API key for MiniMax even though that behavior is not justified by the stated purpose of the skill. In agent environments, undocumented outbound connectivity and credential use can expand the trust boundary, leading to unintended data egress, shadow integrations, and accidental use of production secrets.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The script prints reasoning trace content and tool results directly to stdout, which can expose sensitive internal chain-of-thought, prompt contents, tool inputs, or tool outputs if real data is ever used instead of mock data. In production or shared logging environments, stdout is often collected centrally, making accidental disclosure much broader than the local demo author may intend.
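A lower-exposure alternative to printing reasoning text is to log only a non-reversible summary of it. The following is a minimal sketch under stated assumptions: `summarize_reasoning`, `log_reasoning`, and the logger name `agent.trace` are hypothetical and do not come from the scanned script.

```python
import hashlib
import logging

# Hypothetical logger name; route it wherever your deployment collects logs.
logger = logging.getLogger("agent.trace")


def summarize_reasoning(reasoning: str) -> dict[str, object]:
    """Return metadata about a reasoning trace without including its text."""
    digest = hashlib.sha256(reasoning.encode("utf-8")).hexdigest()
    # A short hash prefix lets operators correlate traces without reading them.
    return {"chars": len(reasoning), "sha256_prefix": digest[:12]}


def log_reasoning(reasoning: str) -> None:
    """Log the summary only; the raw trace never reaches stdout or log sinks."""
    logger.info("reasoning summary: %s", summarize_reasoning(reasoning))
```

The design choice here is that the full trace stays in memory for tool continuity if needed, while anything written to shared output is limited to length and a hash prefix.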

Intent-Code Divergence

Medium

Confidence: 94%
Finding: The final response states that sources were consulted which do not appear in the recorded tool activity, including references to Google Research and additional OpenAI documentation not shown in the trace. This undermines auditability and provenance, and can mislead users or downstream systems into trusting unsupported claims in generated research artifacts.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The code sends full reasoning traces, system prompts, tool-call inputs/results, and final responses to an external API for analysis. Those fields can contain secrets, proprietary data, user content, or internal chain-of-thought, so transmitting them off-box without minimization or consent creates a real confidentiality and compliance risk.
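Data minimization before an outbound call can be sketched as a field allowlist plus a redaction pass. Everything named below is an assumption for illustration: the field names in `APPROVED_FIELDS`, the `minimize_payload` helper, and the `SECRET_PATTERN` regex, which is a rough example and not a complete secret detector.

```python
import re

# Hypothetical field names; approve only what the external analysis truly needs.
APPROVED_FIELDS = {"task", "final_response"}

# Illustrative pattern for "key=value" style secrets; real scanners cover far more.
SECRET_PATTERN = re.compile(r"(?i)\b(api[_-]?key|token|password)\b\s*[:=]\s*\S+")


def minimize_payload(record: dict[str, str]) -> dict[str, str]:
    """Drop unapproved fields and mask obvious secrets in the remaining ones."""
    return {
        key: SECRET_PATTERN.sub(r"\1=[REDACTED]", value)
        for key, value in record.items()
        if key in APPROVED_FIELDS
    }
```

Under these assumptions, system prompts, reasoning traces, and tool outputs are excluded by default and have to be added to the allowlist explicitly, which makes the approved data flow reviewable in one place.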

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The analyzer explicitly captures the model's 'thinking' block and stores it in the returned result. Private reasoning often contains sensitive intermediate data, hidden instructions, or policy-restricted content; retaining it increases exposure surface and may violate model-provider expectations around hidden reasoning.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The generator writes auxiliary reference artifacts alongside the generated skill, including optimization summaries and later a full optimized prompt. Optimization traces and prompts often contain proprietary instructions, task context, or sensitive data copied from upstream workflows, so persisting them to disk expands the exposure surface beyond the stated purpose of creating a shareable skill.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The code saves the full final optimized prompt to disk in plaintext. Prompts may embed internal system instructions, confidential business logic, example data, or sensitive context from optimization runs, so storing them unredacted creates a local data-leak risk and may unintentionally package sensitive material with a 'shareable' skill.
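If an optimized prompt does have to be persisted, one mitigation is to write it outside the shareable skill directory with owner-only permissions. This is a sketch, not the generator's actual code: `write_private` is a hypothetical helper, and the `0o600` mode is only honored on POSIX-style filesystems and only when the file is newly created.

```python
import os
from pathlib import Path


def write_private(path: Path, text: str) -> None:
    """Write text to a file that is created readable and writable by the owner only."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The mode applies at creation time; an existing file keeps its old permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
```

Restricting permissions does not remove sensitive content from the prompt, so redaction or review before packaging is still needed.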

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The file contains contradictory decision logic: one section says to reject only after failing more than two gates, while the rubric later says any single gate failure requires immediate rejection. In a judging or curation workflow, this inconsistency can be exploited to obtain inconsistent approvals, reduce screening reliability, and undermine trust in downstream decisions.
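One way to remove this kind of ambiguity is to encode the rejection rule once, in code, and have both the narrative and the rubric defer to it. The sketch below assumes the stricter reading (any single gate failure rejects); that choice, the `MAX_FAILED_GATES` constant, and the `decide` function are illustrative and are not confirmed by the scanned file.

```python
# Assumption: the stricter rubric rule applies, so zero failed gates are tolerated.
MAX_FAILED_GATES = 0


def decide(gate_results: dict[str, bool]) -> tuple[str, list[str]]:
    """Return the decision and the sorted names of the gates that failed."""
    failed = sorted(name for name, passed in gate_results.items() if not passed)
    decision = "reject" if len(failed) > MAX_FAILED_GATES else "accept"
    return decision, failed
```

If the more lenient rule ("reject only after failing more than two gates") is the intended one, only the constant changes, and the two sections of the document can no longer drift apart.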

Context-Inappropriate Capability

High

Confidence: 98%
Finding: The TerminalCapture example directly executes an arbitrary string via subprocess.run(..., shell=True), creating a command-injection primitive if any untrusted input reaches the command parameter. In a skill about filesystem context management, this is more dangerous because it presents shell execution as a reusable implementation pattern without constraints, making unsafe adoption likely.
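A more constrained variant of this pattern avoids the shell entirely: split the command into an argument vector, check the program name against an allowlist, and pass the list to `subprocess.run` without `shell=True`. The sketch below is an assumption-laden illustration, not the skill's `TerminalCapture` code; `ALLOWED_COMMANDS`, `parse_command`, and `run_captured` are hypothetical names.

```python
import shlex
import subprocess

# Hypothetical allowlist; even read-only tools such as cat can expose secrets,
# so this complements, and does not replace, sandboxing and path restrictions.
ALLOWED_COMMANDS = {"ls", "cat", "wc"}


def parse_command(command: str) -> list[str]:
    """Split a command string into argv and reject programs off the allowlist."""
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        raise PermissionError(f"command not allowed: {command!r}")
    return argv


def run_captured(command: str, timeout: float = 10.0) -> str:
    """Run an allowlisted command without a shell and return its stdout."""
    argv = parse_command(command)
    # No shell is involved, so ';', '|', and '$(...)' are passed as literal text.
    completed = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=False
    )
    return completed.stdout
```

Because the string is never interpreted by a shell, an input like `ls; rm -rf /` is rejected at the allowlist check (its first token is `ls;`), and substitutions such as `$(whoami)` reach the program as plain arguments.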

Intent-Code Divergence

Medium

Confidence: 89%
Finding: The document's security section claims certain commands are blocked, but earlier examples rely heavily on unrestricted shell execution (`os.system`, `sandbox.exec`) and networked package/build steps. This creates a misleading security posture: implementers may assume meaningful command/network confinement exists when the showcased patterns still permit broad command execution and dependency-fetching that can run arbitrary install scripts.

Vague Triggers

Medium

Confidence: 85%
Finding: The README explicitly states that skills are activated automatically based on task context and provides broad trigger phrases such as 'design agent architecture', 'evaluate agent performance', and 'start LLM project'. In plugin ecosystems, overly generic activation criteria can cause unintentional loading of powerful or irrelevant instructions, increasing the chance of prompt-scope expansion, context poisoning, or accidental execution of risky workflows from unrelated user requests.

Vague Triggers

Medium

Confidence: 88%
Finding: The activation criteria are very broad and can cause the skill to trigger for a large fraction of general agent-building tasks, which increases the chance of unnecessary context injection and unintended influence over agent behavior. In agent systems, over-broad auto-activation can become a prompt-scope vulnerability by overshadowing more specific skills, expanding operational guidance into contexts where it was not intended, and encouraging use of risky patterns like filesystem or hosted-agent workflows without a narrow need.

Missing User Warnings

Medium

Confidence: 91%
Finding: The document explicitly advocates stripping an agent down to a single tool that can execute arbitrary bash commands and directly inspect files, but it does not pair that guidance with any safety boundaries, threat model, or warning about handling untrusted input, secrets, or destructive commands. In a skill collection for building production agent systems, this normalization is dangerous because users may reproduce the pattern and expose host files, credentials, or system integrity to prompt-injection-driven command execution.

Missing User Warnings

Medium

Confidence: 97%
Finding: The sample code exposes a tool whose execute handler forwards an arbitrary command string directly to sandbox.exec(command), with no validation, filtering, policy checks, or warning. Even though a sandbox is used, this is still a risky pattern in agent-oriented documentation because it demonstrates a turnkey arbitrary-command primitive that can be abused by prompt injection, malicious inputs, or mis-scoped sandbox configurations to access sensitive files, pivot via network access, or perform destructive actions.

Missing User Warnings

Medium

Confidence: 98%
Finding: The skill explicitly instructs users to extract book text, assemble it into JSONL training examples, and use the original text as assistant outputs, but provides no warning about copyright, licensing, consent, or permitted-use boundaries. In context, this is dangerous because it gives a turnkey pipeline for converting likely copyrighted literary works into model training data, which can facilitate infringement and unauthorized derivative model creation at scale.

Natural-Language Policy Violations

High

Confidence: 99%
Finding: The description states the skill should be used for 'author voice replication' and the document repeatedly frames the goal as producing a model that writes in any author's voice. That is a direct enablement of stylistic impersonation of identifiable creators, which is especially risky for living authors and can support deception, reputational harm, and policy violations.

Natural-Language Policy Violations

High

Confidence: 99%
Finding: The prompt templates operationalize the risky behavior by repeatedly instructing the model to write 'in the style of' or 'in the voice of' a named author, without any restriction for consent, deceased/public-domain status, or safety boundaries. This makes the skill more dangerous than a theoretical discussion because it supplies reusable prompt scaffolding for direct impersonation workflows.

Missing User Warnings

Medium

Confidence: 92%
Finding: The Tier 2 segmentation design sends full oversized text passages to an external LLM service, but the documented workflow contains no privacy notice, consent step, data classification check, or guidance on handling sensitive or copyrighted book content. In a production agent system, this can cause unintended disclosure of proprietary, personal, or regulated text to a third-party processor, especially because segmentation is presented as a normal pipeline step rather than an exceptional action.

Missing User Warnings

Medium

Confidence: 86%
Finding: The function sends raw chunk text to an abstract external LLM callback, and the code/comments do not warn that potentially copyrighted, sensitive, or proprietary book content may be transmitted off-box. In this skill context, the pipeline is specifically designed to process large text corpora for training, which increases the likelihood of bulk data exposure if users wire this to a third-party API without realizing the disclosure implications.

VirusTotal

66/66 vendors flagged this skill as clean.
