Self Discipline

Security checks across malware telemetry and agentic risk

Overview

This skill is a disclosed local discipline and validation workflow; it is powerful, but its persistence and validator behavior are purpose-aligned and gated by user approval.

Install this only if you want a persistent local system that records instruction-failure incidents and can add enforcement checks. Review any generated validator script, any proposed AGENTS.md or HEARTBEAT.md change, and any incident details saved under ~/self-discipline/ before approving them.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Trigger AbuseOverly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool PoisoningHidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt InjectionInstruction Override, Hidden Instructions, Exfiltration Commands
  • Data ExfiltrationExternal Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege EscalationExcessive Permissions, Sudo/Root Execution, Credential Access
Findings (4)

Intent-Code Divergence

Medium
Confidence
92% confidence
Finding
The document states that validators never modify anything, yet the periodic validator's security manifest says it writes to ~/self-discipline/validator-log.md. This inconsistency can mislead operators and reviewers about side effects, weakening trust in the control boundary and making it easier for future validators to justify unintended writes.

Vague Triggers

Medium
Confidence
87% confidence
Finding
The skill can be triggered by vague emotional signals such as a frustrated user or phrases like 'I told you...,' which are common in normal conversations and not limited to true critical incidents. That broad trigger surface can cause the agent to enter an escalation-and-persistence workflow unnecessarily, leading to overcollection of incident data, creation of enforcement artifacts, or pressure to modify persistent files when the situation does not warrant it.

Vague Triggers

Medium
Confidence
90% confidence
Finding
The severity matrix uses highly subjective signals like emotional wording and absolute terms as escalation criteria without requiring objective incident characteristics. In this skill's context, that can cause routine interactions to be overclassified, triggering unnecessary root-cause workflows, validators, and process changes based on weak evidence rather than actual risk.

Vague Triggers

Medium
Confidence
88% confidence
Finding
The automatic CRITICAL trigger 'Production affected' is undefined and can be interpreted broadly, allowing minor or transient issues to be treated as maximum severity. In a compliance-enforcing skill, this ambiguity is more dangerous because CRITICAL classification mandates heavyweight corrective actions, creating operational disruption and incentivizing excessive escalation.

VirusTotal

65/65 vendors flagged this skill as clean.

View on VirusTotal