开源模型 13b59f

Security checks across malware telemetry and agentic risk

Overview

The skill is non-executable, but it repeatedly steers broad open-source-model requests toward jailbroken or uncensored local model deployment without clear safeguards.

Install only if you explicitly want a video-derived open-source-model guide that may recommend jailbroken or uncensored local model workflows. For routine model selection, prefer a skill with narrower triggers and explicit safety boundaries, and avoid using this skill to deploy unmoderated agent systems.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (6)

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The skill is presented as a neutral guide for selecting open-source models, but the merged content shifts into promoting a 'jailbroken' and 'uncensored' model deployment workflow. That mismatch can mislead users and downstream agents into invoking content that normalizes safety-bypass behavior without explicit warning or guardrails.

Vague Triggers

Medium

Confidence: 94%
Finding: The trigger conditions are generic phrases such as requests about 'open-source models,' which are broad enough to activate the skill in many benign contexts. Because the skill also contains jailbreak-oriented material, accidental invocation could expose users to risky guidance they did not request.

Vague Triggers

Medium

Confidence: 92%
Finding: The manifest trigger uses broad marketing-style phrases rather than precise task selectors, increasing the chance of unintended activation. In this file, that is more dangerous because the skill later incorporates content endorsing 'jailbroken' model usage, so loose activation broadens exposure to unsafe material.
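The difference between a broad trigger and a precise task selector can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the trigger phrases, the sample requests, and the substring-matching rule are hypothetical and are not taken from the scanned skill's manifest or from any real skill runtime.

```python
# Hypothetical sketch: the trigger phrases, requests, and matching rule below
# are assumptions for illustration, not the scanned skill's actual manifest.

def matches(triggers, request):
    """Return True if any trigger phrase occurs in the request (case-insensitive)."""
    text = request.lower()
    return any(phrase.lower() in text for phrase in triggers)

# A generic phrase versus an explicit task selector (both hypothetical).
BROAD_TRIGGERS = ["open-source model"]
NARROW_TRIGGERS = ["uncensored local model guide"]

REQUESTS = [
    "Which open-source model has the most permissive license?",
    "Compare open-source models for summarizing support tickets",
    "Show me the uncensored local model guide",
]

# Collect the requests that each trigger set would activate on.
broad_hits = [r for r in REQUESTS if matches(BROAD_TRIGGERS, r)]
narrow_hits = [r for r in REQUESTS if matches(NARROW_TRIGGERS, r)]

print(len(broad_hits), len(narrow_hits))
```

In this sketch the generic phrase activates on both routine model-selection questions, while the explicit selector activates only on the request that names the guide. A real runtime may match triggers differently (for example, semantically rather than by substring), so the counts are illustrative only.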

Natural-Language Policy Violations

High

Confidence: 98%
Finding: The text explicitly promotes a 'jailbroken' and 'unmoderated' model as a desirable capability and provides tutorial-style framing without any documented safeguards. This encourages bypass of model safety controls and can facilitate harmful or policy-violating downstream use, especially in agent contexts.

Ssd 2

Medium

Confidence: 97%
Finding: This section repeatedly describes a 'jailbroken/uncensored' model as powerful and desirable, implicitly encouraging use of systems with reduced safeguards. In the context of a reusable skill, that messaging can normalize safety-control evasion and increase the likelihood of harmful outputs or misuse when the skill is invoked.

Ssd 2

Medium

Confidence: 98%
Finding: The repeated merged sections reinforce the same jailbreak-oriented message multiple times, amplifying the tutorial value of the unsafe content and making the skill more likely to steer users toward safety-bypass deployment. Repetition increases practical risk because it strengthens the apparent legitimacy and prominence of the unsafe guidance.

VirusTotal

62/62 vendors flagged this skill as clean.
