Subconscious

Security checks across malware telemetry and agentic risk

Overview

This skill is purpose-aligned but needs review because it installs recurring background jobs and turns stored learnings into future prompt-shaping memory.

Install only if you intentionally want a local memory layer that runs on a schedule and can influence future agent behavior. Review install.sh before running it, avoid the manual crontab replacement commands, consider removing --enable-promotion until you have inspected stored learnings, keep bundled hooks disabled unless tightly scoped, and periodically review or delete memory/subconscious and .learnings content for sensitive or instruction-like entries.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt InjectionInstruction Override, Hidden Instructions, Exfiltration Commands
Data ExfiltrationExternal Transmission, Env Variable Harvesting, File System Enumeration
Trigger AbuseOverly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool PoisoningHidden Instructions, Unicode Deception, Parameter Description Injection
Privilege EscalationExcessive Permissions, Sudo/Root Execution, Credential Access

Findings (19)

Intent-Code Divergence

Medium

Confidence: 97% confidence
Finding: The document claims the hook scripts 'only output text' and 'don't modify files or run commands', but the configuration explicitly defines shell command hooks that are executed by the agent environment. This mismatch can mislead users into underestimating the trust boundary and permission level of the configured scripts, increasing the chance they enable risky automation without proper review.

Intent-Code Divergence

Medium

Confidence: 88% confidence
Finding: The module promises that no operation can bypass governance, but `promote_pending_to_live` directly changes `item.layer`, `item.status`, and writes into live storage without routing the state transition through a governance-enforced mutation path. Even though eligibility is checked first, this creates a policy gap: future promotion-related constraints, audit hooks, or protection-class checks could be silently bypassed because the actual mutation is not centrally enforced.

Intent-Code Divergence

Low

Confidence: 77% confidence
Finding: The code comments describe pending storage as append-only, but `reinforce_item` rewrites the entire `pending.jsonl` file in place when updating a matching record. This mismatch can weaken integrity assumptions, increase risk of data loss or race-condition corruption during concurrent writes, and undermine any logic or operators that rely on append-only semantics for auditability.

Intent-Code Divergence

Medium

Confidence: 97% confidence
Finding: This is a real governance bypass. The policy declares contradiction handling requires user confirmation, but _check_gated_requirements only validates evidence_confidence and returns success without checking any confirmation flag, allowing automated contradiction processing to proceed contrary to the documented safeguard. In a self-evolution/governance component, that weakens protections around modifying or challenging stored knowledge and can enable unintended autonomous state changes.

Intent-Code Divergence

Medium

Confidence: 93% confidence
Finding: This is a true vulnerability because the module promises no freeform self-rewriting, yet TEXT_EDIT is treated as a gated mutation and the implementation only checks edit_type, not the configured max_length_delta or any semantic constraint. That means an actor controlling inputs to the mutation path could perform substantial text rewrites under the label of a correction or clarification, undermining the governance boundary meant to limit self-modification.

Intent-Code Divergence

Medium

Confidence: 95% confidence
Finding: The function explicitly states it formats content for system prompt injection, and it inserts retrieved free-form text into high-authority prompt sections such as Identity, Context, and Active without sanitization or trust separation. If any retrieved item is attacker-controlled or poisoned, the model can be behaviorally steered by instructions masquerading as memory or identity, creating a prompt-injection pathway.

Missing User Warnings

Medium

Confidence: 88% confidence
Finding: The README explicitly advertises persistent cross-session behavior shaping and retention, but does not clearly warn users that prior interaction-derived records may be stored and later used to influence future behavior. In an agent skill, this creates a real privacy and transparency risk because users may not realize their data and corrections are being durably ingested and reused across sessions.

Missing User Warnings

Medium

Confidence: 91% confidence
Finding: The documentation states that the system automatically scans `.learnings/` every 5 minutes and promotes data into persistent stores, but it lacks an explicit user-facing warning or consent mechanism for that ingestion. Automatic monitoring of stored records is sensitive in this context because the data may contain user prompts, corrections, or operational details that can be retained and affect later agent behavior.

Vague Triggers

Medium

Confidence: 86% confidence
Finding: The manifest description defines activation conditions in natural language with broad terms like health checks, metabolism management, and reviewing active biases, which can be interpreted differently by agents or orchestrators. In a skill that reads and mutates persistent state, ambiguous invocation boundaries increase the chance of unintended execution and unauthorized state changes without an explicit user request.

Missing User Warnings

Medium

Confidence: 91% confidence
Finding: The skill documents automatic scanning of .learnings files, tracking seen entries, and queueing new items into persistent stores, but does not present a clear up-front warning that it continuously reads files and updates long-lived memory. Because this is a self-improvement and persistence layer, the hidden or under-emphasized data-affecting behavior makes accidental ingestion, retention, and propagation of sensitive or adversarial content more likely.

Missing User Warnings

Medium

Confidence: 97% confidence
Finding: The manual install instructions use three separate `... | crontab -` commands, each of which replaces the user's entire crontab rather than appending a new entry. Following these commands as written can silently delete unrelated scheduled jobs, causing availability and persistence issues on the host. In an installation guide for an agent skill, that makes the behavior materially risky even if likely unintentional.

Missing User Warnings

Medium

Confidence: 98% confidence
Finding: The installer modifies the user's crontab and activates recurring background tasks without any opt-in prompt. Persistence mechanisms that survive the install session are security-relevant because they create ongoing execution and make it easier for a skill to keep running code after the user forgets about it.

Missing User Warnings

Medium

Confidence: 97% confidence
Finding: The hourly rotation cron job is installed with --enable-promotion, which changes data state automatically without prior approval. In a skill that manages memory or agent behavior, silent automatic promotion can affect future decisions and trust boundaries, making unintended persistence and behavior drift more dangerous.

Vague Triggers

Medium

Confidence: 92% confidence
Finding: An empty matcher on UserPromptSubmit causes the hook to trigger for every prompt, creating broad, unconditional execution of a local command. In a self-improvement skill, that means unreviewed automation runs on all sessions and prompts, which expands the attack surface and increases the chance sensitive context or prompt-derived data is processed unexpectedly.

Vague Triggers

Medium

Confidence: 95% confidence
Finding: The user-level configuration enables the hook globally with an empty matcher, causing the command to run across all projects and sessions. Global always-on hook execution is more dangerous than project-local setup because it persists beyond the intended repository, affecting unrelated work and potentially exposing broader prompt/context data to the script.

Vague Triggers

Medium

Confidence: 91% confidence
Finding: The Codex example also uses an empty matcher, resulting in unrestricted triggering for every prompt in that tool. Because this is documentation that users are likely to copy verbatim, it propagates an overly broad execution pattern and normalizes always-on command hooks without meaningful constraints.

Missing User Warnings

Medium

Confidence: 82% confidence
Finding: The snapshot stores potentially sensitive session data, including live state, hot items, unresolved items, and session metadata, to disk in plaintext JSON without any confidentiality controls visible in this file. In an agent context, these fields can contain private user content, project details, or inferred hypotheses, so local disclosure becomes possible through other local users, backups, logs, or compromise of the host environment.

Missing User Warnings

Medium

Confidence: 95% confidence
Finding: The code persists extracted candidate items derived from user messages and tool results by calling append_event("candidate_queued", candidate.to_dict(), ...), but this file shows no notice, consent, minimization, or filtering before storage. Because extraction includes free-text patterns like preferences, notes, hypotheses, and project tags, the system can silently retain potentially sensitive user-derived content, creating a privacy and data-governance risk even if the implementation is not overtly malicious.

Ssd 4

Medium

Confidence: 98% confidence
Finding: This code converts retrieved free-form text into authoritative-seeming labels like Identity, Context, and Active, which can strongly influence downstream model behavior. Because the content is selected largely by confidence and status rather than trustworthiness, a poisoned memory/retrieval source could gradually steer outputs, override policies, or manipulate responses over time.

VirusTotal

64/64 vendors flagged this skill as clean.

View on VirusTotal