Threat Modeling

Security checks across malware telemetry and agentic risk

Overview

This skill is a legitimate threat-modeling tool, but it also steers agents toward actionable exploit and penetration-test workflows without strong authorization or sandbox limits.

Install only if you intend to use it for authorized security work on systems you own or have written permission to test. Treat generated POCs, commands, and test scripts as review-only until a human scopes them to an isolated lab or approved test environment. Avoid --debug unless the output will stay private, because it can publish sensitive project and security-analysis details.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (46)

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The README expands the skill from passive threat modeling into active environment creation and penetration-test execution, including automated validation steps. That materially increases operational capability and risk because users or downstream agents may treat the skill as authorized to perform system-impacting actions rather than analysis only.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: Advertising exploit/POC generation and attack-chain construction goes beyond defensive threat modeling and meaningfully enables offensive use. In a security-analysis skill, this context makes the capability more dangerous because users will expect realistic exploit assistance against targets, including translation from findings to actionable abuse steps.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The README describes automated attack-script generation and one-click verification workflows, which substantially broaden the skill into operational exploitation tooling. This creates misuse risk by encouraging automation of offensive actions under the cover of a benign threat-modeling label.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The README expands a defensive threat-modeling skill into active penetration-testing execution, automated verification, Docker test-environment construction, and tool-assisted exploit validation. In a skill context, this materially increases the chance the agent will move from analysis into operational security actions against real targets, especially because the document does not strongly constrain those actions to authorized, isolated environments.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The README explicitly promotes generation of exploit POCs, attack trees, optimized attack chains, and one-click verification workflows. For an agent skill, these capabilities can be repurposed for offensive use and may steer the model toward producing actionable exploitation content beyond what is necessary for defensive review.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: This section explicitly instructs the skill to generate attack paths, proof-of-concept code, and step-by-step offensive commands, including concrete exploitation flow and tool usage. In a threat-modeling skill, that materially expands capability from defensive analysis into actionable intrusion guidance, which can be repurposed for misuse and exceeds what is necessary for risk assessment.

Context-Inappropriate Capability

Medium

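Confidence: 92%
Finding: The document expands the skill from "overall threat modeling and security assessment" into a general-purpose security assistant that can carry out almost any security task, which exceeds the scope of use declared in the manifest. For an agentic skill, the broader the capability range, the more likely it is to be invoked in unintended scenarios and then to perform high-risk analysis, testing, or operations beyond the authorized scope.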

Context-Inappropriate Capability

Medium

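Confidence: 95%
Finding: The document explicitly claims the skill can generate exploit payloads, complete attack chains, and POCs, which moves it from defensive threat modeling toward capability that directly supports offensive exploitation. If triggered by mistake in an agent environment or induced by a malicious user, this capability could help construct actionable attack steps and significantly raise the risk of misuse.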

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The document explicitly broadens the skill from defensive threat modeling into offensive security assistance, including attack-path construction, POC generation, and penetration-testing support. In an agent skill, this materially increases misuse potential because a user can pivot from assessment into exploit enablement under the guise of security analysis, especially when the skill is advertised as something that 'must' be invoked for security tasks.

Intent-Code Divergence

Medium

Confidence: 83%
Finding: The file presents the workflow as 'strict sequential execution' but later says the workflow is only guidance and not mandatory, creating contradictory operating semantics. In a security-critical agent skill, this ambiguity can weaken safeguards because the model may skip context-building or validation phases and still proceed to sensitive outputs like threat claims or attack-path design.

Intent-Code Divergence

Medium

Confidence: 93%
Finding: The documentation states that the API response includes legacy JSON hijacking protection using a prefix, but the sample code does not actually implement that behavior. This creates a security-documentation mismatch: adopters may believe they are protected and ship responses without the advertised mitigation, which can matter in older browser contexts or in organizations relying on this control set as authoritative guidance.
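To make the mismatch concrete, here is a minimal sketch of what a prefix-based JSON hijacking mitigation typically looks like. It is not the skill's own sample code: the function names and the choice of the `)]}',` prefix are assumptions for illustration. The server prepends a non-executable prefix to every JSON body, and a trusted client strips it before parsing.

```python
import json

# Assumed prefix for illustration; it makes the body invalid as a script,
# so a cross-origin <script> include cannot evaluate the response.
XSSI_PREFIX = ")]}',\n"


def protected_json_response(payload) -> str:
    """Serialize payload as JSON with the anti-hijacking prefix prepended."""
    return XSSI_PREFIX + json.dumps(payload)


def parse_protected_json(body: str):
    """Strip the prefix and parse; reject bodies that lack the prefix."""
    if not body.startswith(XSSI_PREFIX):
        raise ValueError("missing anti-hijacking prefix")
    return json.loads(body[len(XSSI_PREFIX):])


if __name__ == "__main__":
    body = protected_json_response({"user": "alice", "roles": ["admin"]})
    print(repr(body))
    print(parse_protected_json(body))
```

A sample that documents this control but returns plain `json.dumps(payload)` provides none of the protection, which is the gap this finding describes.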

Intent-Code Divergence

Low

Confidence: 80%
Finding: The template-security section recommends sandbox mode where available, but the Jinja2 example uses a normal Environment and comments only about disabling extensions, which is not equivalent to sandboxing. In a security control document, this can mislead implementers into thinking the sample is hardened against SSTI-related abuse when it is not, especially if they later expand template capabilities or handle partially trusted templates.

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The document contains internally contradictory guidance: it correctly states that security questions are not acceptable under modern authentication guidance, but then frames them as something to choose and use for legacy scenarios. In a threat-modeling/security-reference skill, this is dangerous because downstream users or agents may operationalize the later procedural recommendations and deploy weak account recovery or login controls that are vulnerable to guessing, OSINT, and account takeover.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The file mandates spawning multiple sub-agents and creating tasks as a required execution step, which expands the skill's operational capabilities from analysis into orchestration. In a threat-modeling skill, this is dangerous because it can trigger unnecessary parallel execution, amplify prompt-injection exposure across sub-agents, and cause actions beyond the minimum needed for security review.

Intent-Code Divergence

Medium

Confidence: 88%
Finding: The instructions conflict on whether extended-domain trigger decisions are authoritative from P1 or re-detected by P4 using local patterns. This ambiguity can lead the agent to override upstream data, produce inconsistent assessments, or incorrectly expand scope, which is risky in a security workflow that depends on deterministic traceability.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: This phase explicitly instructs the skill to generate proof-of-concept exploit code, including concrete exploitation steps and runnable snippets. For a threat-modeling skill, that materially expands capability from analysis into offensive enablement, increasing the chance the agent produces weaponizable output rather than bounded risk assessment artifacts.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: The attack-chain template includes live shell commands using curl and token-forging steps against a target system. This is unjustified by the skill's declared purpose and could cause the agent to produce actionable intrusion guidance that facilitates unauthorized access attempts.

Missing User Warnings

Medium

Confidence: 88%
Finding: The debug flag is documented as exposing internal YAML data and evaluation reports without warning that these artifacts may contain sensitive architecture details, findings, secrets-adjacent metadata, or proprietary context. This can lead to unintended disclosure through logs, commits, artifact sharing, or CI outputs.

Missing User Warnings

Medium

Confidence: 93%
Finding: The documentation instructs generation of exploit POCs and penetration-testing plans without a clear authorization and safety warning. That omission lowers friction for misuse and normalizes offensive deliverables even when the operator may not have permission or an isolated test environment.

Missing User Warnings

Medium

Confidence: 95%
Finding: Building Docker test environments and executing automated penetration tests are potentially system-impacting actions, yet the README does not prominently warn about isolation, authorization, and resource/network risks. Users may run these workflows against live or sensitive systems, causing outages, data exposure, or unauthorized access attempts.

Missing User Warnings

Medium

Confidence: 96%
Finding: Promoting one-click vulnerability verification and automated attack tooling without prominent operational-risk warnings is dangerous because it compresses offensive capability into low-friction workflows. In this skill context, that makes misuse more likely by less experienced users and increases the chance of unauthorized or unsafe execution.

Missing User Warnings

Medium

Confidence: 90%
Finding: The documented debug mode publishes internal YAML data files and evaluation reports without warning that those artifacts may contain sensitive project structure, findings, trust boundaries, secrets-adjacent configuration, or other proprietary security data. In a security-analysis skill, debug output is especially likely to aggregate sensitive information into easily exfiltrated files.

Missing User Warnings

High

Confidence: 98%
Finding: The README instructs the skill to generate exploit POCs, penetration-testing plans, and even execute verification steps, but it does not state that these actions require explicit authorization and must be limited to lawful, isolated testing environments. That omission is dangerous because users and downstream agents may interpret the workflow as approval to perform offensive actions on real systems.

Missing User Warnings

Medium

Confidence: 90%
Finding: Debug mode is described as publishing internal YAML, KB queries, coverage validation, and evaluation artifacts, but the accompanying warning covers only operational concerns and is not an explicit, high-visibility data-exposure warning. In a security-analysis skill, those artifacts are especially likely to contain sensitive architecture details, file paths, findings, and user-supplied project information, making accidental disclosure more dangerous than in a typical reporting tool.

Missing User Warnings

Medium

Confidence: 95%
Finding: The document includes explicit attack-path steps, commands, and proof-of-concept extraction actions such as enumerating endpoints, accessing targets, extracting tokens, and reusing them. In a threat-modeling skill, this materially lowers the barrier to misuse because it turns analysis guidance into actionable offensive procedures without clear safety boundaries, authorization requirements, or warnings about third-party impact and sensitive data handling.

VirusTotal

8/64 vendors flagged this skill as malicious, and 56/64 flagged it as clean.
