Reflexio

Security checks across malware telemetry and agentic risk

Overview

This is a real local-memory skill, but it automatically processes transcripts and can rewrite or delete long-term memory without enough user control.

Install only if you intentionally want automatic long-term memory. Before enabling it, review what may be stored in .reflexio/, confirm whether your model provider may receive transcript or memory content, back up existing .reflexio data, and be aware consolidation may rewrite or delete entries and the plugin may add a heartbeat task to the workspace.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data ExfiltrationExternal Transmission, Env Variable Harvesting, File System Enumeration
Excessive AgencyUnrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger AbuseOverly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool PoisoningHidden Instructions, Unicode Deception, Parameter Description Injection
Prompt InjectionInstruction Override, Hidden Instructions, Exfiltration Commands

Findings (21)

Description-Behavior Mismatch

Medium

Confidence: 90% confidence
Finding: A skill presented as local memory capture also instructs the agent to reconfigure Openclaw memory paths and active-memory settings, which changes broader agent behavior beyond simple note-taking. This creates hidden side effects and can persistently alter the environment in ways the user may not expect from the manifest description.

Context-Inappropriate Capability

Medium

Confidence: 91% confidence
Finding: Changing global Openclaw configuration is a privileged action that is not inherently required for simply capturing user facts in local files. If abused or misunderstood, it can redirect memory behavior, widen data collection, or create persistent agent-side changes affecting future sessions.

Description-Behavior Mismatch

Medium

Confidence: 92% confidence
Finding: The setup hook writes to the global workspace HEARTBEAT.md even though the skill metadata says learned data should live under .reflexio/. Modifying a shared workspace control/documentation file can change agent behavior outside the declared scope of the skill, creating a persistence and prompt-injection surface that affects other sessions or tools.

Intent-Code Divergence

Medium

Confidence: 88% confidence
Finding: The header comment states that no workspace file copying is needed and everything lives in the extension directory, but the code actually writes HEARTBEAT.md into the user's workspace. This mismatch is dangerous because it obscures persistent modification of a shared workspace artifact, making the behavior harder to audit and increasing the risk of stealthy agent-instruction injection.

Context-Inappropriate Capability

Medium

Confidence: 93% confidence
Finding: The code gives the LLM direct authority to choose which existing memory files should be deleted via `decision.ids_to_delete`, with only a weak check that the IDs exist in the current cluster. Because the cluster contents and prompt are built from user-controlled memory content, a prompt-injected or mistaken model response can delete legitimate stored facts/playbooks, causing integrity loss and making the agent 'forget' or rewrite prior state.

Missing User Warnings

Medium

Confidence: 86% confidence
Finding: The README states that the plugin automatically captures user preferences, facts, and corrections into local files across sessions, but it does not prominently warn about privacy implications, consent expectations, or handling of potentially sensitive data. For a memory-learning skill, silent or under-disclosed persistence can lead users or operators to store personal, confidential, or regulated information without realizing it.

Vague Triggers

Medium

Confidence: 84% confidence
Finding: The activation criteria are broad enough to match ordinary conversation, which can cause the skill to run on many user turns and capture more data than intended. In a memory-writing skill, overbroad invocation increases privacy risk and makes unintended persistence much more likely.

Vague Triggers

Medium

Confidence: 87% confidence
Finding: The detection examples include generic phrases like preferences, facts, and constraints without sufficient filtering, making false positives likely. Because the skill stores data across sessions, broad capture logic can persist sensitive or mistaken information and influence future responses.

Missing User Warnings

Medium

Confidence: 91% confidence
Finding: The skill performs durable writes of extracted user facts and procedural corrections into `.reflexio/`, but the skill content shown does not include a clear user-facing warning or consent boundary about persistent storage derived from conversation transcripts. This can cause sensitive or private information to be retained across sessions without the user understanding that their statements may be stored and reused later, increasing privacy and data-handling risk.

Missing User Warnings

Medium

Confidence: 94% confidence
Finding: This code forwards recent transcript content, or a path to the on-disk session transcript, to a sub-agent for background extraction without any consent check, minimization, or user-visible disclosure in the handler path. Because transcripts can contain sensitive user facts, credentials, or private context, this creates an unintended cross-component data sharing channel and expands exposure to any sub-agent compromise, logging, or misuse.

Missing User Warnings

Medium

Confidence: 93% confidence
Finding: The plugin automatically forwards session transcripts or session transcript files to an extractor subagent during before_compaction, before_reset, and session_end. Because this happens implicitly in lifecycle hooks and is central to processing potentially sensitive conversation content, users may have their data reprocessed and persisted without a clear, explicit notice or consent boundary.

Missing User Warnings

Medium

Confidence: 90% confidence
Finding: The consolidation tool launches a background process that clusters memory files, writes deduplicated replacements, and deletes originals, yet the user receives only a generic 'started in background' message. This can alter or destroy stored data asynchronously without transparent warning, reducing user control and making unintended data loss or privacy-impacting rewrites harder to detect.

Missing User Warnings

Medium

Confidence: 89% confidence
Finding: The consolidation prompt includes the full contents of stored profile/playbook files and sends them to `inferFn`, which may be backed by an external model or service. In a memory skill, those files are likely to contain user preferences, facts, and possibly sensitive operational details, so this creates a real data-exposure risk if users are not informed and no data-minimization or local-only guarantee exists.

Missing User Warnings

Medium

Confidence: 91% confidence
Finding: Files are deleted automatically based on model output with no confirmation, quarantine, or rollback. In this skill's context, those files are the agent's long-term memory, so silent deletion can permanently remove user facts or procedural corrections and corrupt future behavior.

Missing User Warnings

Medium

Confidence: 95% confidence
Finding: The function sends raw user text directly to an external or pluggable inference function after only prompt templating, with no consent gate, minimization, or classification of sensitive content. In this skill, the data being processed is explicitly long-term memory material such as user facts, preferences, corrections, and constraints, which can easily include secrets, personal data, or sensitive operational context, making silent exfiltration to an LLM/provider a real privacy and security risk.

Missing User Warnings

Medium

Confidence: 97% confidence
Finding: The deduplication path transmits both existing stored memory content and new content to the inference function, exposing not just current user input but also previously retained cross-session data. Because this skill is designed to accumulate user facts and procedural corrections over time, the comparison step can leak a richer profile of the user than a single message, increasing privacy impact and making the skill context more dangerous than a generic summarization feature.

Missing User Warnings

Low

Confidence: 91% confidence
Finding: This code persists user-derived profiles and playbooks to disk under the workspace without any disclosure, consent, or visibility mechanism in the file-writing path. In the context of a memory/learning skill, those files may contain user preferences, facts, corrections, or other sensitive context, so silent persistence can create privacy and data-retention risk even if the implementation is otherwise straightforward.

Missing User Warnings

Medium

Confidence: 95% confidence
Finding: This prompt explicitly instructs the system to extract durable user facts from conversations and store them under `.reflexio/profiles/`, but it contains no requirement for notice, consent, minimization, or user control over persistence. Even though it forbids storing secrets, it still enables silent accumulation of personal data, behavioral inferences, family/life context, and work environment facts across sessions, creating meaningful privacy and compliance risk if users are unaware or did not agree.

Missing User Warnings

Medium

Confidence: 94% confidence
Finding: The skill explicitly performs deletion and rewrite operations over user memory files, including expired-profile deletion and deleting originals after consolidation, but it does not require a user-facing warning, preview, confirmation step, or rollback guidance. In a memory/persistence skill, these actions can silently remove or alter retained user facts and playbooks, causing irreversible loss, corruption, or drift if clustering or consolidation makes a mistake.

Vague Triggers

Medium

Confidence: 88% confidence
Finding: The skill's trigger conditions are extremely broad and match ordinary conversation patterns such as user preferences, facts, corrections, and the start of a user turn. In practice, that can cause the agent to activate this skill frequently and persist user information across sessions, increasing privacy risk and the chance of storing sensitive or unintended data without sufficiently narrow gating.

Ssd 4

Medium

Confidence: 95% confidence
Finding: User-controlled memory content is embedded verbatim into the consolidation prompt, allowing stored text to instruct or manipulate the model's consolidation decision. Because the model's response drives both new memory writes and later deletion of existing files, prompt injection in memory entries can steer the system to preserve attacker-chosen narratives, fabricate facts, or remove legitimate memories.

VirusTotal

67/67 vendors flagged this skill as clean.

View on VirusTotal