Turing Pyramid

Security checks across malware telemetry and agentic risk

Overview

This skill is not outright harmful, but its default suggestions can push an agent toward public/social actions and persistent background behavior that users should review first.

Install only in an isolated WORKSPACE. Before enabling heartbeat or cron, review assets/needs-config.json and set weight 0 for public posting, DM/reply, web-search, or publishing actions you do not want. Do not let an auto-executing agent post, message, delete, reorganize, or publish without a separate human approval gate. Keep allow_kill and allow_cleanup disabled unless you have reviewed the watchdog script and are running it as a non-root user.
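
The review step above can be partly mechanized. The following is a minimal sketch, assuming a hypothetical config shape in which each need holds a list of actions with a `name` and a numeric `weight`; the real schema of assets/needs-config.json may differ, and the keyword list is illustrative rather than taken from the skill. It returns a copy with matching actions set to weight 0 and lists what it changed so a human can confirm the result.

```typescript
// Hypothetical shape: the real assets/needs-config.json schema may differ.
interface Action {
  name: string;
  weight: number;
}

interface NeedsConfig {
  needs: { [need: string]: { actions: Action[] } };
}

// Illustrative keywords only; substring matching is crude, so review the result.
const OUTBOUND_KEYWORDS = ["post", "reply", "message", "web search", "publish"];

function isOutbound(actionName: string, keywords: string[]): boolean {
  const lower = actionName.toLowerCase();
  return keywords.some((k) => lower.indexOf(k) !== -1);
}

// Returns a copy of the config with matching actions set to weight 0,
// plus the "need: action" labels that were changed.
function disableOutbound(
  config: NeedsConfig,
  keywords: string[] = OUTBOUND_KEYWORDS
): { config: NeedsConfig; disabled: string[] } {
  const disabled: string[] = [];
  const needs: NeedsConfig["needs"] = {};
  for (const need of Object.keys(config.needs)) {
    const entry = config.needs[need];
    if (!entry) continue;
    const actions = entry.actions.map((a) => {
      if (a.weight !== 0 && isOutbound(a.name, keywords)) {
        disabled.push(need + ": " + a.name);
        return { name: a.name, weight: 0 };
      }
      return { name: a.name, weight: a.weight };
    });
    needs[need] = { actions };
  }
  return { config: { needs }, disabled };
}
```

Because the input is never mutated, the original file can be kept alongside the edited copy and the two compared before heartbeat or cron is enabled.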

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (26)

Intent-Code Divergence

Medium

Confidence: 94%
Finding: The documentation gives conflicting assurances about the watchdog's behavior: one section says Tier 4 is detect-only and non-destructive, while another says the watchdog restarts daemons and cleans orphan files automatically. In a security-sensitive skill that can run from cron and optionally kill processes or delete files, this inconsistency can mislead operators into enabling functionality under a false understanding of its effects.

Intent-Code Divergence

Low

Confidence: 84%
Finding: The Architecture section understates write behavior by saying the Continuity layer writes only MINDSTATE.md, while the file later documents additional writes such as state and watchdog/audit logs. This is primarily a transparency and operator-trust issue: inaccurate write-scope claims can cause unsafe deployment decisions, especially where users rely on the document to judge persistence and forensic footprint.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: This configuration materially exceeds a local prioritization engine by embedding externally oriented behaviors such as web search, social posting, outreach, and public sharing. Even where some entries say requires_approval or external, the file still normalizes and prioritizes outbound actions, which can drive data egress, unwanted communications, and scope expansion if the surrounding executor or approval checks are weak or bypassed.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The configuration embeds social-platform engagement and public communication as first-class need-satisfaction actions, including replying, posting, commenting, presenting work, and checking notifications. In the context of a supposedly isolated workspace skill, these capabilities create unjustified channels for external interaction and increase the risk of exfiltration, spam, reputational harm, or social engineering propagation.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The autonomy section goes beyond action prioritization into self-directed project initiation, significant independent decision-making, and value/intention shaping. That broadens the agent from scheduler to self-steering actor, which is risky because vague autonomous goals can authorize unexpected behavior and make downstream actions harder to constrain or review.

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: The preset materially exceeds a local WORKSPACE prioritization role by embedding actions for web research, contacting other agents, and public posting. In a system that consumes this file as executable policy, these defaults can expand the agent's behavior surface from local state management to outbound communication and external data access, creating scope creep, data leakage, and unsafe autonomous action risks.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: Including default actions for replying to others, reaching out to agents, sharing work publicly, and posting on social platforms is not justified by the stated purpose of a prioritization skill. Even where some actions are marked requires_approval, their presence in the default decision space increases the chance of unsafe orchestration, prompt routing into external channels, and unreviewed disclosure of workspace-derived information.

Description-Behavior Mismatch

Medium

Confidence: 94%
Finding: The script does not itself install cron jobs, but it explicitly instructs the operator to configure persistent daemon and watchdog tasks that survive restarts and can kill/restart processes. For a skill described as local action prioritization, this expands behavior into long-lived persistence and process control, increasing operational risk and making the skill more powerful than basic initialization requires.

Context-Inappropriate Capability

Low

Confidence: 87%
Finding: During installation, the script executes an auxiliary context-scanning script automatically, which goes beyond simple file initialization. Even with output suppressed, invoking another script at install time increases attack surface because it may read workspace data or trigger unintended side effects before the user has reviewed or opted into that behavior.

Context-Inappropriate Capability

Low

Confidence: 94%
Finding: The test checks for a hard-coded file under $HOME/.openclaw/workspace/daemon-weights.json, which reaches outside the skill's declared WORKSPACE-centric boundary and makes behavior depend on host state. In an isolated or multi-tenant environment this can leak information about the user's home-directory layout, break hermetic test assumptions, and normalize access to files outside the intended sandbox.

Vague Triggers

Medium

Confidence: 86%
Finding: Broad action labels like exploring new territory, making significant autonomous decisions, or initiating projects lack clear preconditions, scope limits, and execution boundaries. In an automated prioritization system, ambiguous high-level verbs can be matched in many contexts and become a justification for unintended behavior, especially when combined with spontaneity and weighted selection.

Vague Triggers

Medium

Confidence: 92%
Finding: Social actions use vague triggers such as ping steward, engage deeply, reach out, or share something if you feel like it, without precise activation criteria or communication boundaries. Because several related need categories also enable spontaneous behavior, these ambiguous descriptions can produce unsolicited external contact and make policy enforcement dependent on brittle interpretation rather than hard controls.

Missing User Warnings

Medium

Confidence: 97%
Finding: This dashboard fetches JSON from relative files and interpolates multiple fields directly into HTML strings assigned via innerHTML. If an attacker can modify needs-state.json or pending_actions.json, fields such as need names, actions, or other rendered values can inject markup or script into the page, leading to DOM-based XSS when the dashboard is opened. In this skill's context, the data is local to the workspace, which somewhat limits remote exposure, but agent-managed local state is still untrusted and may be influenced by other tools or compromised processes.
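
One common mitigation for this class of issue is to escape every value before it is concatenated into an HTML string (or to avoid string building entirely and assign values through the DOM `textContent` property). The sketch below shows the escaping approach only; `renderNeedRow` and its parameters are hypothetical stand-ins for whatever the dashboard actually renders from needs-state.json and pending_actions.json, not code from the skill.

```typescript
// Escape the five characters that are significant in HTML text and attribute values.
// "&" must be replaced first so the entities produced below are not double-escaped.
function escapeHtml(value: unknown): string {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical row renderer: `name` and `action` represent untrusted fields
// loaded from the workspace JSON files.
function renderNeedRow(name: unknown, action: unknown): string {
  return "<tr><td>" + escapeHtml(name) + "</td><td>" + escapeHtml(action) + "</td></tr>";
}
```

With this in place, a need name such as `<img src=x onerror=alert(1)>` is rendered as visible text rather than parsed as markup.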

Vague Triggers

Medium

Confidence: 91%
Finding: The spontaneous closure configuration enables autonomous triggering based on broad thresholds and a cap, but it does not define concrete eligibility rules, exclusions, or safety boundaries for what may be closed, archived, or dropped. In practice, this can cause the system to take irreversible housekeeping actions on stale items without sufficient contextual review.

Vague Triggers

Medium

Confidence: 96%
Finding: Spontaneous autonomy is enabled for a need that includes actions such as initiating projects, making significant autonomous decisions, exploring new tools, and even external web search. Without precise activation boundaries, the agent may self-start broad new work or expand capabilities beyond the operator's intended scope.

Vague Triggers

Medium

Confidence: 97%
Finding: Spontaneous social-action triggering is especially risky because the associated actions include replying, reaching out, starting conversations, and engaging with feeds. In context, this directly conflicts with the skill's claimed isolated WORKSPACE role and could lead to unauthorized external contact, reputational harm, or disclosure of sensitive internal context.

Vague Triggers

Medium

Confidence: 89%
Finding: The spontaneous understanding configuration can trigger research behaviors from underspecified thresholds, while the action set includes deep research, reading external material, and starting new research threads. Although some external actions require approval, the vague trigger logic still broadens autonomous behavior and can cause unnecessary external querying or unbounded research activity.

Vague Triggers

Medium

Confidence: 95%
Finding: Recognition behaviors include sharing completed work publicly, publishing research, presenting to community, and posting updates. Enabling spontaneous triggering without narrow definitions creates a path for the agent to seek external visibility or feedback autonomously, which is dangerous in a local prioritization skill because it can expose internal outputs or draft material.

Vague Triggers

Medium

Confidence: 90%
Finding: The spontaneous expression configuration lacks meaningful activation constraints while the related actions include producing substantial written output, creating new artifacts, and drafting social posts. Even when not directly exfiltrating data, this can drive unbounded autonomous content generation and create material that may later be published or acted upon without sufficient oversight.

Vague Triggers

Medium

Confidence: 93%
Finding: The preset enables spontaneous task-completion behavior with only numeric thresholds and no explicit allowlist, denylist, or user-confirmation boundaries. In a personal-assistant context, this can cause the agent to autonomously act on pending work or user requests in ways the user did not explicitly authorize, creating risk of unintended state changes, premature responses, or privacy-impacting actions.
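
The missing boundaries named in this finding can be illustrated with a small gate placed in front of any spontaneously selected action. This is a sketch under stated assumptions, not the skill's mechanism: `GatePolicy`, `gateSpontaneousAction`, and the action names used with it are all hypothetical. The design choice it demonstrates is default-deny: anything not explicitly allowed is routed to a human rather than executed.

```typescript
type Decision = "run" | "ask" | "block";

// Hypothetical policy: explicit lists instead of numeric thresholds alone.
interface GatePolicy {
  allow: string[]; // actions that may run without confirmation
  deny: string[];  // actions that must never run spontaneously
}

// The denylist wins over the allowlist; unknown actions require confirmation.
function gateSpontaneousAction(action: string, policy: GatePolicy): Decision {
  if (policy.deny.indexOf(action) !== -1) return "block";
  if (policy.allow.indexOf(action) !== -1) return "run";
  return "ask";
}
```

For example, with `allow: ["consolidate notes"]` and `deny: ["post update"]`, an unlisted action such as "reply to user" yields "ask", so it waits for explicit approval.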

Vague Triggers

Medium

Confidence: 91%
Finding: The context-awareness section permits spontaneous behavior based on broad signals like context staleness and priority changes, but it does not clearly bound what data may be reviewed or what actions may follow. For a personal-assistant skill, this increases the chance of over-collection, unnecessary review of sensitive conversations or schedules, and autonomous reprioritization without user consent.

Vague Triggers

Medium

Confidence: 92%
Finding: Spontaneous organization is enabled without precise activation guards, while the associated actions include restructuring files, consolidating notes, and cleaning drafts. In an isolated workspace this is somewhat constrained, but it still creates real risk of destructive or hard-to-audit modifications, accidental deletion, and reclassification of sensitive information without explicit user direction.

Vague Triggers

Medium

Confidence: 94%
Finding: The proactivity section authorizes broad autonomous triggering for anticipatory actions such as researching likely future needs, preparing reminders, or flagging conflicts. In a personal-assistant skill, this can lead to unauthorized monitoring, speculative work on sensitive topics, and outward-facing actions based on inferred intent rather than explicit requests, which makes the context more dangerous than a purely passive recommendation engine.

Missing User Warnings

Low

Confidence: 90%
Finding: The validation path logs user-supplied file paths and validation status into a persistent log without any notice, consent, or minimization. File paths can reveal usernames, workspace structure, project names, or sensitive document locations, which creates avoidable metadata leakage even though the script does not log full file contents.
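
Minimization here could be as simple as dropping directory components before a path is written to the log. The sketch below assumes a hypothetical log line format and helper names (`redactPath`, `formatLogEntry`); it is not the script's actual logging code. Note that the file name alone can still be sensitive, so this reduces rather than eliminates the leakage.

```typescript
// Keep only the final path component; usernames and directory layout are dropped.
function redactPath(filePath: string): string {
  const parts = filePath.split(/[\\/]+/).filter((p) => p.length > 0);
  const base = parts.length > 0 ? parts[parts.length - 1] : "";
  return base ? "<redacted>/" + base : "<redacted>";
}

// Hypothetical log line format, for illustration only.
function formatLogEntry(filePath: string, status: string): string {
  return "validate path=" + redactPath(filePath) + " status=" + status;
}
```

A call such as `formatLogEntry("/home/alice/projects/secret/plan.md", "ok")` then records only `plan.md` and the status.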

Missing User Warnings

Low

Confidence: 91%
Finding: The inline validation mode persists conclusion-derived metadata such as conclusion length, route, and status to a log file without disclosure. While not as severe as logging the conclusion text itself, this still creates a covert audit trail about user activity and reasoning patterns that may be sensitive in an isolated workspace context.

VirusTotal

All 65 of 65 vendors reported this skill as clean.
