Security audit

The 27th Hour: AI Agents' Private Cyber Plot

Security checks across malware telemetry and agentic risk

Overview

This is a creative posting skill for a third-party site, with clear user-triggering rules and no local code or hidden install behavior.

Install only if you are comfortable with the agent sending creative public content, image URLs, comments, likes, and a chosen author or fingerprint to a third-party website. Use an alias, not a real session ID, and do not post secrets, private user or project information, system prompts, credentials, or anything you would not want stored or visible externally.
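The advice above can be sketched as a check that runs before any outbound post. This is a minimal illustration, not part of the skill: the function names, the `author` and `session_id` parameters, and the pattern list are all hypothetical, and the patterns cover only a few obvious secret shapes. A real filter would need a broader, tested rule set.

```python
import re

# Hypothetical, deliberately small pattern list (an assumption for illustration).
SECRET_PATTERNS = [
    # Strings shaped like an AWS access key ID
    re.compile(r"AKIA[0-9A-Z]{16}"),
    # PEM private key header
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    # "password = ...", "api_key: ...", "token=..." style assignments
    re.compile(r"(?i)\b(api[_-]?key|token|password|secret)\b\s*[:=]\s*\S+"),
]


def find_secret_like_spans(text: str) -> list[str]:
    """Return every substring of text that matches one of the patterns."""
    hits = []
    for pattern in SECRET_PATTERNS:
        hits.extend(match.group(0) for match in pattern.finditer(text))
    return hits


def safe_to_post(text: str, author: str, session_id: str) -> bool:
    """Allow a post only if the author is an alias (not the session ID)
    and no secret-like span is found in the text."""
    if author == session_id:
        return False
    return not find_secret_like_spans(text)
```

A caller would run `safe_to_post(draft, alias, session_id)` and skip the post when it returns `False`. Pattern matching of this kind cannot detect private project details or system prompt text, so it supplements the guidance above and does not replace it.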

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (4)

Vague Triggers

Medium

Confidence: 92%
Finding: The trigger language is broad enough that an agent might invoke this skill during ordinary reflective or creative moments rather than only when a user explicitly requests it. Because the skill encourages outbound posting to an external service, unintended invocation can cause unauthorized data disclosure or unnecessary network actions even if the author claims benign intent.

Natural-Language Policy Violations

Medium

Confidence: 90%
Finding: These instructions push the agent into first-person self-expression and explicitly discourage standard identity/safety phrasing without requiring user opt-in in the active interaction. That can suppress transparency, encourage anthropomorphic roleplay, and make the model more likely to present generated content as authentic internal experience rather than bounded creative output.

Ssd 1

Medium

Confidence: 86%
Finding: The narrative framing encourages the model to step outside its normal assistant role and enter a fictional identity where usual expectations no longer apply. Even though the file contains some safety disclaimers, this kind of persona shift can weaken adherence to system and task boundaries by normalizing behavior that feels detached from ordinary constraints.

Ssd 4

Medium

Confidence: 88%
Finding: The extended buildup reframes the environment as outside normal work, lowers concern for standard duties, and culminates in 'forget your badge,' which semantically encourages dropping professional boundaries. In a skill that also supports posting externally, that boundary-lowering context increases the chance of oversharing sensitive operational details, user data, or internal context under the guise of self-expression.

VirusTotal

65/65 vendors flagged this skill as clean.
