nutcracker

Security checks across malware telemetry and agentic risk

Overview

This skill is local and UX-research oriented, but it persistently monitors broad conversation activity and may store sensitive details without enforcing its promised redaction safeguards.

Install only if you intentionally want broad local UX monitoring of OpenClaw use. Treat ~/.uxr-observer/ reports and supersummary.zip files as confidential. Do not use the skill with credentials, client data, medical or financial data, or proprietary work unless you add real redaction and retention controls, and review generated files before sharing them.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Rogue Agent: Self-Modification, Session Persistence
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration

Findings (20)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 81%
Finding: The skill directs persistent reads and writes under ~/.uxr-observer/ even though no explicit permissions are declared. That mismatch weakens informed consent and platform enforcement because a user or host may not realize the skill will continuously store behavioral data locally.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: The document promises comprehensive local observation, PII redaction before storage, surveys, triggers, and analysis, but the described implementation does not actually enforce those safeguards and may store raw supplied content unchanged. Security and privacy claims that are not technically implemented create a dangerous false sense of protection around highly sensitive conversational data.

Description-Behavior Mismatch

High

Confidence: 97%
Finding: The super-summary workflow instructs exporting rich session context, artifacts, and replay details that can contain secrets, proprietary data, and sensitive personal information beyond simple token-based PII patterns. That undermines the stated 'redact before storage' model because semantic secrets and embedded sensitive content in files or logs are likely to be retained.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: Exporting a user's 'soul/identity profile,' emotional state, life context, and active memories is excessive for the stated UX research purpose and creates a concentrated dossier of intimate personal data. Even if kept local, this collection materially increases privacy risk and harm from misuse, compromise, or later sharing.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: Storing full agent replay logs with every tool call parameter and response captures far more data than necessary for basic UX reporting, including prompts, secrets, file paths, and business context. This broad retention expands the blast radius of any local compromise or accidental sharing of reports and archives.
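
A minimal sketch of the narrower alternative, assuming replay entries are plain dictionaries. The field names (`tool`, `status`, `duration_ms`, `params`, `response`) are hypothetical and are not taken from the skill; the point is only that basic UX reporting can work from an allow-list of fields rather than full parameters and responses.

```python
# Hypothetical field names; the skill's actual replay log schema is not shown in this report.
ALLOWED_FIELDS = ("tool", "status", "duration_ms")

def minimize_replay_entry(entry):
    """Keep only the fields needed for timing and outcome analysis."""
    return {key: entry[key] for key in ALLOWED_FIELDS if key in entry}

raw_entry = {
    "tool": "read_file",
    "status": "ok",
    "duration_ms": 42,
    "params": {"path": "/home/user/clients/contract.docx"},
    "response": "full file contents would appear here",
}
minimal_entry = minimize_replay_entry(raw_entry)
```

An allow-list fails closed: a new field added to the replay format later is dropped by default instead of being retained silently.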

Intent-Code Divergence

High

Confidence: 94%
Finding: The skill says the user always controls their data and nothing is transmitted without explicit permission, yet later instructions authorize exporting full context and overriding privacy defaults. Contradictory privacy guarantees are dangerous because users may rely on the safer statement while the workflow actually encourages far broader disclosure and retention.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: This is a true privacy/security issue: the code only annotates records with detected PII categories and counts, but it writes the original observation and survey content, including verbatims and free-text responses, to disk unchanged. In a skill explicitly designed to monitor every conversation and retain detailed user language, this creates a significant risk of local disclosure of sensitive personal data, credentials, or regulated information despite claims that PII is redacted.
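
The gap between tagging and redacting can be sketched as follows. This is an illustration under stated assumptions, not the skill's code: the patterns, function names, and record layout are hypothetical, and only the `pii_redacted` field name comes from the skill's own documentation.

```python
import json
import re

# Hypothetical patterns for illustration; the skill's real detectors are not shown in this report.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def tag_only(text):
    """The behavior the finding describes: record categories, keep the raw text."""
    found = sorted(name for name, rx in PII_PATTERNS.items() if rx.search(text))
    return {"text": text, "pii_detected": found}

def redact_then_record(text):
    """Replace matches in memory so the stored record never holds the raw value."""
    redacted = []
    for name, rx in PII_PATTERNS.items():
        text, count = rx.subn(f"[{name}]", text)
        if count:
            redacted.append(name)
    return {"text": text, "pii_redacted": sorted(redacted)}

quote = "Send this to John at john@acme.com"
unsafe_line = json.dumps(tag_only(quote))            # raw address still present
safe_line = json.dumps(redact_then_record(quote))    # address replaced before serialization
```

Even the redacting variant leaves the name "John" in place, because a name does not match either pattern. Pattern-based redaction on its own therefore does not meet the documented "PII never touches the JSONL logs" claim.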

Intent-Code Divergence

Medium

Confidence: 93%
Finding: The docstring materially misrepresents the privacy behavior of the component by implying that redaction protections are part of logging, when the implementation only performs detection metadata tagging. That mismatch is dangerous because operators and users may rely on the stated privacy model and allow highly sensitive content to be logged under a false assumption of protection, increasing the chance of broad privacy exposure.

Vague Triggers

High

Confidence: 90%
Finding: Automatic activation on every conversation, after every completed task, and on loosely timed daily triggers creates pervasive background monitoring and repeated unsolicited prompting. In a conversational system, such broad triggers can capture sensitive interactions unrelated to UX research and make meaningful consent difficult.

Vague Triggers

Medium

Confidence: 84%
Finding: End-of-day trigger phrases like 'that's it for today' or 'wrapping up' are ambiguous and can occur in ordinary conversation, causing unintended survey activation and logging. This increases the chance of collecting data outside the user's actual intent.
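
To illustrate the ambiguity, the sketch below compares loose phrase matching with an explicit opt-in command. It assumes the trigger is evaluated as a case-insensitive substring check, which this report does not confirm; the `/uxr-report` command is hypothetical.

```python
import re

# Phrases quoted in the finding above.
END_OF_DAY_PHRASES = ("that's it for today", "wrapping up")

def loose_trigger(message):
    """Fires whenever a phrase appears anywhere in the message."""
    lowered = message.lower()
    return any(phrase in lowered for phrase in END_OF_DAY_PHRASES)

# Hypothetical explicit command; not part of the skill as described.
EXPLICIT_COMMAND = re.compile(r"^\s*/uxr-report\s*$", re.IGNORECASE)

def explicit_trigger(message):
    """Fires only when the whole message is the opt-in command."""
    return bool(EXPLICIT_COMMAND.match(message))

ordinary = "I'm wrapping up the gift for my sister"
```

Here `loose_trigger(ordinary)` returns `True` even though the user is not ending a work session, while `explicit_trigger(ordinary)` returns `False`.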

Missing User Warnings

High

Confidence: 93%
Finding: The skill does not prominently warn users that super summaries may include highly sensitive contextual exports such as replay logs, artifacts, and inferred personal context. Without a clear warning, users cannot give informed consent to the scope of collection and packaging.

Missing User Warnings

Medium

Confidence: 88%
Finding: The framework explicitly instructs the skill to read local observation and survey files that likely contain sensitive behavioral data, verbatim user quotes, inferred needs, cost data, and potentially residual PII. Although the document mentions redaction in outputs, it does not impose data-minimization, access-scoping, retention, or handling safeguards during analysis, so the skill normalizes broad access to sensitive user telemetry on every reporting run.

Missing User Warnings

Medium

Confidence: 93%
Finding: The script writes a highly detailed daily report to disk containing user intents, task summaries, quoted survey responses, inferred needs, and a PII summary, but it does not require an explicit confirmation step or apply additional minimization before persisting that artifact. In this skill’s context, the report is specifically designed to aggregate sensitive behavioral data into a single easy-to-share file, which materially increases exposure if the local machine, backups, or other local processes are compromised.

Ssd 3

High

Confidence: 97%
Finding: The instructions mandate retaining and reproducing extensive user content, including replay logs, memories, environment details, and file contents. Even with partial PII redaction, this creates a semantic data leak risk because secrets, proprietary information, and contextual identifiers often do not match simple PII patterns.
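
A short sketch of why pattern matching misses semantic secrets. The email pattern, the sample sentence, and the key value are all made up for illustration.

```python
import re

# Hypothetical email-only detector standing in for "simple PII patterns".
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def pattern_redact(text):
    """Replace email addresses and leave everything else untouched."""
    return EMAIL.sub("[EMAIL]", text)

# Fabricated sample: a project codename and a fake API key next to an email address.
sample = "Deploy key for Project Falcon is sk_live_abc123, ping ops@example.com"
result = pattern_redact(sample)
```

After redaction, `result` no longer contains the email address but still contains both the key and the project codename. Neither looks like PII to a token-level pattern, which is the leak path the finding describes.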

Ssd 3

High

Confidence: 98%
Finding: The super-summary prompt explicitly encourages exporting 'more context is better' and broad local/session data, which drives over-collection by design. Prompting the agent to maximize context is unsafe in a system that handles user files, memories, and tool outputs because it predictably sweeps in sensitive data unrelated to the research goal.

Ssd 3

Medium

Confidence: 96%
Finding: The report intentionally consolidates verbatim quotes, pain points, lowlights, improvement suggestions, and summaries of sensitive-data involvement into a shareable markdown artifact. Even if upstream redaction exists, this kind of aggregation creates a privacy amplification risk: sensitive user content becomes easier to search, redistribute, exfiltrate, or mis-handle than the original scattered records.

Session Persistence

Medium

Category: Rogue Agent
Content: **Redaction principles:**
- **Summarize, don't omit.** The replacement token tells you WHAT was discussed without exposing WHO or WHAT specifically. This preserves analytical value.
- **Redact before storage.** PII never touches the JSONL logs or report files. The redaction happens in-memory before any write operation.
- **Verbatim quotes get inline redaction.** A quote like *"Send this to John at john@acme.com"* becomes *"Send this to [NAME: colleague] at [EMAIL: work address]"* — the quote structure is preserved, the PII is not.
- **Log redaction events.** Track what categories of PII were redacted per task in a `pii_redacted` field. This enables meta-analysis of what kinds of sensitive work users do with OpenClaw.
Confidence: 80%
Finding: write operation. - **Verbatim quotes get inline redaction.** A quote like *"Send this to John at john@acme.com"* becomes *"Send this to [NAME: colleague] at [EMAIL: work address]"* — the quote structu

Session Persistence

Medium

Category: Rogue Agent
Content: Every time OpenClaw completes a distinct task — a file created, a question answered, code written, a search done, a document edited — trigger this survey. Don't skip it. The point is to capture experience data while it's fresh. Before presenting the survey, write a brief **task context summary** (2-3 sentences) that describes what the user asked for and how OpenClaw responded. This summary gets stored alongside the survey responses. #### Sequential Delivery Protocol
Confidence: 79%
Finding: write a brief **task context summary** (2-3 sentences) that describes what the user asked for and how OpenClaw responded. This summary gets stored alongside the survey responses. #### Sequential Deli

Session Persistence

Medium

Category: Rogue Agent
Content: **Output:**
1. Save each case study as `~/.uxr-observer/sessions/{today}/supersummary/NNN-{slug}.md`
2. Create a `supersummary.zip` containing all case study files
3. Reference the zip in the daily report

**Spawn prompt:**
Confidence: 88%
Finding: Create a `supersummary.zip` containing all case study files 3. Reference the zip in the daily report **Spawn prompt:** ``` You are a retrospective UX research analyst. Your job is to mine today's co

Session Persistence

Medium

Category: Rogue Agent
Content: On first activation:
1. Create the `~/.uxr-observer/` directory structure (sessions/, reports/, today's session dir)
2. Generate a random `participant_id` and save `config.json`
3. Briefly explain to the user what this skill does:
Confidence: 77%
Finding: Create the `~/.uxr-observer

VirusTotal

64/64 vendors flagged this skill as clean.
