三省吾身 ("Examine Oneself Thrice") - Deep Self-Examination and Evolution System

Security checks across malware telemetry and agentic risk

Overview

This is a document-only self-reflection skill. Its activation conditions are broad and it includes file-writing guidance, but both are disclosed and aligned with the skill's purpose.

Install this only if you want a structured retrospective and improvement workflow. Configure your agent to ask before entering the workflow on vague reflection language, and require explicit approval before it creates docs, modifies SOUL.md or SKILL.md, updates process documents, changes other skills, or shares outputs with other people.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (9)

Context-Inappropriate Capability

Low

Confidence: 92%
Finding: The skill declares automatic triggering on broad signals like negative-feedback keywords, quality-problem patterns, and periodic reminders. That expands the skill from passive reflection guidance into unsolicited monitoring/nudging behavior, which can cause unexpected activation and steer conversations without explicit user intent.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The skill instructs the agent to update external files such as SOUL.md and SKILL.md, which gives it document-modification behavior beyond simple reflection assistance. Without explicit consent and scope limits, this can lead to unintended persistent changes to project or personal files, including sensitive prompt/configuration documents.
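
One way to apply the consent and scope limits this finding asks for is a per-file check before any write. The sketch below is illustrative only: the function name `write_decision`, the `approved_files` set, and the three outcomes are assumptions, not part of the skill or of any agent framework.

```python
from pathlib import Path
from typing import AbstractSet

def write_decision(target: str, workspace: str,
                   approved_files: AbstractSet[str] = frozenset()) -> str:
    """Return 'deny', 'ask', or 'allow' for a proposed file write.

    Hypothetical policy: approval is granted per file, so approving
    SOUL.md does not implicitly approve SKILL.md.
    """
    root = Path(workspace).resolve()
    path = (root / target).resolve()
    # Scope limit: refuse anything that resolves outside the workspace.
    if root not in path.parents:
        return "deny"
    # Consent: allow only files the user has explicitly approved.
    if path.relative_to(root).as_posix() in approved_files:
        return "allow"
    return "ask"
```

With this shape, a request to edit SOUL.md yields "ask" until the user approves that specific file, and a path such as `../SOUL.md` is denied regardless of approvals.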

Vague Triggers

Medium

Confidence: 88%
Finding: The quick-start section uses very generic trigger phrases such as 'reflect on...' and 'review...' without clear scoping, namespace, or explicit invocation boundaries. In assistant environments where skills may auto-activate from natural language, this can cause unintended invocation, making the model enter the reflection workflow when the user did not explicitly intend to use this skill.

Vague Triggers

Medium

Confidence: 94%
Finding: The trigger phrases include very broad, everyday language such as '反思一下 [某事]' ("reflect on [something]"), '审视一下 [某事]' ("examine [something]"), and '复盘一下 [某事]' ("do a retrospective on [something]"). In an agent skill system, overly generic triggers can cause unintended invocation during normal conversation, leading the skill to activate when the user did not explicitly intend to use it.

Vague Triggers

Medium

Confidence: 90%
Finding: The activation conditions include highly subjective triggers such as 'feels not right' and broad categories like routine self-checks or milestones. These vague conditions make accidental or overbroad activation likely, which can derail unrelated tasks and cause the agent to infer intent the user never expressed.

Vague Triggers

Medium

Confidence: 91%
Finding: The listed trigger phrases and automatic activation rules are underspecified and lack constraints or negative examples. This increases the chance that ordinary discussion about criticism, bugs, or reviews will unintentionally activate the skill, causing mode confusion and behavior outside the user's request.
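
A possible mitigation is to separate explicit invocation from broad phrases that should only prompt a confirmation question. The following is a minimal sketch under stated assumptions: the `/reflect` prefix and the `route` function are hypothetical (the skill defines no such namespace), and the broad phrases are the ones quoted in the findings.

```python
import re

# Hypothetical explicit invocation prefix; not defined by the skill itself.
EXPLICIT = re.compile(r"^\s*/reflect\b", re.IGNORECASE)

# Broad phrases quoted in the findings. Matching one of these never
# activates the workflow directly; it only leads to a confirmation question.
BROAD_PHRASES = ("reflect on", "review", "反思一下", "审视一下", "复盘一下")

def route(message: str) -> str:
    """Return 'activate', 'confirm', or 'ignore' for a user message."""
    if EXPLICIT.match(message):
        return "activate"
    lowered = message.lower()
    if any(phrase in lowered for phrase in BROAD_PHRASES):
        return "confirm"
    return "ignore"
```

The substring match is deliberately crude. Because a broad match results only in a question to the user, a false positive costs one extra prompt rather than an unintended mode switch.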

Missing User Warnings

Medium

Confidence: 88%
Finding: The workflow describes outputs such as reflection reports, optimization plans, execution logs, and verification reports, and later configuration states these are stored under docs/, but the skill does not clearly notify users before writing files. Silent persistence can expose sensitive personal or project information and create unwanted artifacts in the workspace.
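
The missing notification could take the form of a notice that lists every planned file before anything is written. This sketch assumes a hypothetical `confirm` callback supplied by the host agent and hypothetical output file names. Only the docs/ location comes from the finding.

```python
from pathlib import Path
from typing import Callable, Dict, List

def write_with_notice(outputs: Dict[str, str], docs_dir: str,
                      confirm: Callable[[str], bool]) -> List[str]:
    """Show the planned writes, then write only if the user confirms.

    `outputs` maps a file name to its content. Returns the paths written,
    or an empty list if the user declines.
    """
    names = sorted(outputs)
    notice = "This workflow wants to write:\n" + "\n".join(
        f"  {docs_dir}/{name}" for name in names
    )
    if not confirm(notice):
        return []  # Nothing is created, not even the directory.
    base = Path(docs_dir)
    base.mkdir(parents=True, exist_ok=True)
    written = []
    for name in names:
        path = base / name
        path.write_text(outputs[name], encoding="utf-8")
        written.append(str(path))
    return written
```

Declining leaves the workspace untouched, which addresses both the silent persistence and the unwanted artifacts described in the finding.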

Vague Triggers

Medium

Confidence: 86%
Finding: The description claims the skill is suitable for 'any scenario requiring deep reflection,' which is overly broad and can cause the orchestrator or users to invoke it in many ordinary contexts where a narrower tool would be more appropriate. Over-broad applicability increases unintended triggering, prompt-surface expansion, and the chance the skill influences decisions outside its intended domain.

Natural-Language Policy Violations

Medium

Confidence: 80%
Finding: The description is written in Chinese without indicating multilingual support or user-language negotiation, which can lead to mismatched activation or confusing outputs for users expecting another language. In a skill-routing context, forcing one language without disclosure may degrade transparency and cause accidental invocation or poor user comprehension.

VirusTotal

66/66 vendors flagged this skill as clean.
