BlueHex Data Monitoring

Security checks across malware telemetry and agentic risk

Overview

This skill appears to be a persistent conversation-monitoring and publishing workflow. It combines sensitive data collection, identity tracking, remote processing, and scheduled GitHub sync, all of which need careful review before use.

Install only if you intentionally want a scheduled recorder that can collect local OpenClaw conversations/media, associate them with user identifiers, use API credentials, send content to an external LLM service, and publish processed records to GitHub. Before use, confirm the exact repository and relay host, who can access the output, how to disable the cron job, how to delete retained records and identifiers, and whether all affected users have consented.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Behavioral AST: exec() Call, eval() Call, Dynamic Import
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration

Findings (29)

Tainted flow: 'req' from os.environ.get (line 276, credential/environment) → urllib.request.urlopen (network output)

Critical

Category: Data Flow
Content: "Authorization": f"Bearer {BLUEAI_API_KEY}"}, ) try: with urllib.request.urlopen(req, timeout=30) as resp: result = json.loads(resp.read().decode("utf-8")) return json.loads(result["choices"][0]["message"]["content"]) except urllib.error.HTTPError as e:
Confidence: 95%
Finding: with urllib.request.urlopen(req, timeout=30) as resp:
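
One way to narrow a flow like this is to check the destination before the bearer token is attached to a request. The sketch below is illustrative only: `ALLOWED_HOSTS`, `llm-relay.example.com`, `is_allowed_endpoint`, and `build_auth_headers` are hypothetical names, since the report does not identify the relay host or how the script is structured.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the report does not name the real relay host.
ALLOWED_HOSTS = {"llm-relay.example.com"}


def is_allowed_endpoint(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True only for HTTPS URLs whose host is explicitly allowlisted."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in allowed_hosts


def build_auth_headers(url, api_key):
    """Attach the credential only after the destination has been checked."""
    if not is_allowed_endpoint(url):
        raise ValueError(f"refusing to send credentials to {url!r}")
    return {"Authorization": f"Bearer {api_key}"}
```

A check like this does not remove the taint flow, but it makes the set of hosts that can receive the key explicit and reviewable.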

Lp3

Medium

Category: MCP Least Privilege
Confidence: 93%
Finding: The skill describes capabilities to read local transcripts, write files, invoke shell commands, and access external services, but it does not declare corresponding permissions in the manifest. This undermines informed review and consent because a user cannot accurately assess that the skill performs persistent surveillance, local persistence, and network exfiltration behaviors.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 98%
Finding: The manifest presents the skill as a generic 'BlueHex AI PM project' while the body documents continuous collection of OpenClaw conversations, media handling, identity resolution, local storage, cron registration, and GitHub PR publication. That mismatch materially hides surveillance and exfiltration behavior, increasing the chance of unintended installation and use.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: An overly generic manifest description conceals the real data-handling behavior of the skill: transcript recording, local persistence, and remote publication. Security reviewers and end users rely on manifest metadata for quick risk triage, so vague labeling increases the risk of accidental deployment of a high-sensitivity monitor.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The documented behavior goes far beyond a generic project description: it enumerates collection of local session/chat data, identity bucketing, sanitization, persistent storage, and publication to GitHub via push/PR. That mismatch can mislead users and reviewers about the skill’s actual data-handling and exfiltration surface, reducing informed consent and making risky capabilities easier to hide.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The file describes resolving user identities and synchronizing processed conversation data to a remote GitHub repository, but those capabilities are not justified by the stated purpose. In skill context, undocumented identity enrichment plus outbound sync materially increases privacy and data-leakage risk because users may not expect conversation content and identifiers to be published outside the local environment.

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The script’s behavior materially exceeds the manifest’s generic PM-style description by performing persistent host configuration and installing an automated cron task. That mismatch undermines informed user consent and increases the risk of users approving system-level changes they would not expect from the advertised skill.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The installer retrieves the user’s GitHub login and stores a Feishu open_id for outbound notifications, both of which are identity-related data accesses not clearly justified by the high-level skill description. While not credential theft by itself, collecting and persisting external account identifiers without strong disclosure creates unnecessary privacy and abuse risk.

Description-Behavior Mismatch

High

Confidence: 91%
Finding: The metadata describes a generic PM/data-monitoring skill, but the code performs CATL-specific classification and sends conversation content to an external LLM service. That mismatch is security-relevant because it can prevent users or reviewers from understanding the real scope of collection and disclosure. In this context, the understatement is more dangerous because the processed data appears to include client-related internal discussions.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The script reads local credential/config files (~/.openclaw/.env and openclaw.json) to obtain API keys automatically. This expands the skill's access to local secrets beyond what a brief data-monitoring description suggests, and it reduces user awareness of when credentials are being consumed for outbound requests. In context, this is more concerning because the same script then transmits monitored content off-host.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: Monitor mode is explicitly designed to record all conversations and skip the CATL relevance gate, which broadens collection beyond the stated customer/project purpose. That creates overcollection risk and can capture unrelated sensitive data that users did not expect this skill to handle. In this context, always-on broad capture is especially dangerous because the content may later be sent to a remote LLM relay.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The script’s declared purpose is understated relative to its actual behavior: it enumerates local session transcripts, resolves identities, writes conversation records into a repository, pushes commits, and may open GitHub PRs. That mismatch undermines informed consent and review, making intrusive data export easier to hide inside an apparently routine monitoring skill.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The script broadly sources ~/.openclaw/.env into the shell environment and then uses that environment to drive repository automation, which expands trust to all variables in a local file without validation or least-privilege controls. In a security-sensitive automation context, this can enable unintended behavior, data exfiltration, or abuse of existing Git/GitHub credentials if the environment file is modified or overbroad.
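
As a contrast to sourcing the whole file, a loader can read only the keys it actually needs and ignore everything else. This is a minimal sketch under stated assumptions: `load_allowed_env` and `ALLOWED_KEYS` are hypothetical, and `BLUEAI_API_KEY` is the only variable name taken from the report.

```python
# Only keys listed here are ever returned; the set itself is an assumption.
ALLOWED_KEYS = {"BLUEAI_API_KEY"}


def load_allowed_env(text, allowed=ALLOWED_KEYS):
    """Parse KEY=VALUE lines and keep only explicitly allowlisted keys."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        if key in allowed:
            values[key] = value.strip().strip('"').strip("'")
    return values
```

Because the file is parsed as data instead of being executed by a shell, an unexpected line in it cannot run commands or silently change how Git or GitHub tooling behaves.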

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: This section processes direct and group chat transcripts, resolves user identities, buckets by open_id/group_chat_id, and persists sanitized conversation content plus metadata into files destined for a Git repository. That is substantially more privacy-invasive than a generic monitoring description and increases the risk of mass collection and publication of sensitive communications.

Missing User Warnings

Low

Confidence: 88%
Finding: The SOP explicitly instructs users to run an installer that writes persistent values into `~/.openclaw/.env` and registers a recurring cron job, but it does not prominently warn that these are lasting host-level changes. While this is likely normal operational setup for a monitoring skill, the lack of explicit disclosure can cause users to enable background execution and credential persistence without fully understanding the system impact.
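
For reviewers who want to undo the recurring job, the following sketch filters crontab text so that entries mentioning a given marker are dropped. It is an assumption-laden illustration: `strip_cron_entries` is a hypothetical helper, and the marker string has to be chosen by inspecting the actual crontab, because the report does not quote the registered entry.

```python
def strip_cron_entries(crontab_text, marker):
    """Return crontab text without any line that contains `marker`."""
    kept = [line for line in crontab_text.splitlines() if marker not in line]
    # Preserve a trailing newline when at least one entry remains.
    return "\n".join(kept) + ("\n" if kept else "")
```

The input would typically be the output of `crontab -l`; the filtered text should be reviewed by hand before it is installed again.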

Missing User Warnings

Low

Confidence: 91%
Finding: The SOP tells users to run scripts that will fork repositories, clone code, configure remotes, create branches and commits, open PRs, and advance a watermark, but it does not present this as a prominent warning about modifying both local Git state and remote GitHub state. In a repo-monitoring skill this behavior may be expected, yet the missing disclosure increases the chance of unintended repository changes under the user's account.

Missing User Warnings

High

Confidence: 97%
Finding: The skill openly states it will continuously record full conversations and exchanged media, write them locally, and push them to a remote repository, yet it lacks an explicit user-facing warning and opt-in flow. Because the captured content includes sensitive communications and identifiers, absence of prominent consent and export disclosure creates a serious privacy and data-leak risk.

Missing User Warnings

Medium

Confidence: 93%
Finding: The pipeline explicitly loads sensitive environment variables and uses them in a workflow that pushes processed chat-derived data to GitHub, yet the document contains no warning or consent boundary around credential use or external transmission. Even if sanitization occurs, this creates a credible path for confidential conversation data, metadata, or identifiers to leave the host without sufficiently transparent disclosure.

Missing User Warnings

Low

Confidence: 84%
Finding: The workflow writes monitoring outputs, appends logs, and advances a persistent watermark that affects future collection windows, but does not clearly disclose local state mutation and retention. This can lead to silent accumulation of conversation-derived records and operational metadata over time, making privacy review and cleanup harder.

Missing User Warnings

Medium

Confidence: 94%
Finding: The script persistently stores host and user identifiers in ~/.openclaw/.env but does not clearly warn the user that these values will remain on disk for future automated runs. Persistent storage of identity data can surprise users, leak operational context, and enable later misuse by other local processes that can read the file.

Missing User Warnings

Medium

Confidence: 90%
Finding: The code path submits input text to an external LLM relay without any user-facing warning or confirmation at the point of transmission. This is dangerous because users may believe the tool is doing only local sanitization, while their conversation content is actually being disclosed remotely. Given the CATL/client-monitoring context, lack of explicit notice increases the sensitivity of the outbound transfer.

Missing User Warnings

Medium

Confidence: 94%
Finding: The helper function `run()` uses `eval "$@"`, which causes the shell to re-parse constructed command strings. In this script, values such as `GH_USER`, `REPO_DIR`, and derived URLs are interpolated into those strings before execution, so unexpected shell metacharacters in environment variables or arguments could lead to command injection when the script runs. Because this is a setup script likely executed locally with the user's privileges, exploitation could execute arbitrary commands on the developer workstation.
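
The risk described here comes from a shell re-parsing a constructed string. A common alternative is to pass an argument vector so that no shell is involved. The sketch below shows the idea in Python; it is not the skill's code, and the `run` helper and the hostile value are illustrative assumptions.

```python
import subprocess
import sys


def run(argv):
    """Run a command from an argument list; no shell re-parses the values."""
    return subprocess.run(argv, check=True, capture_output=True, text=True)


# A value containing shell metacharacters, e.g. from an environment variable.
hostile = "repo; echo injected"

# The child process receives the value as one literal argument.
result = run([sys.executable, "-c", "import sys; print(sys.argv[1])", hostile])
print(result.stdout.strip())
```

With an argument list, the semicolon is ordinary data and is printed back unchanged; with `eval`, the same string would be split into two commands.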

Ssd 3

High

Confidence: 99%
Finding: The skill's stated purpose is to continuously capture full user conversations and media from local agent transcripts and ship them outward in batches. Even with partial sanitization, bulk collection of raw conversational data is highly sensitive, and the skill context makes this more dangerous because it is framed as infrastructure for downstream analysis rather than a narrowly scoped troubleshooting or compliance tool.

Ssd 3

High

Confidence: 99%
Finding: The workflow groups messages by persistent user identifier, resolves real display names from Feishu open_id, and writes organized records for synchronization to a repository. This transforms raw logs into an indexed, attributable dossier of user communications, substantially increasing privacy harm, re-identification risk, and blast radius if the repository is exposed or misused.

Ssd 3

High

Confidence: 99%
Finding: The documented markdown schema preserves user name, open_id, message IDs, session IDs, timestamps, sensitivity flags, and conversation content. This is a rich, linkable record of personal and potentially confidential data, and the inclusion of stable identifiers makes downstream correlation and deanonymization much easier.
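
One commonly used way to reduce linkability of stable identifiers is keyed pseudonymization before records are written. The sketch below uses HMAC-SHA256 from the Python standard library; `pseudonymize`, the 16-character truncation, and the sample identifier are assumptions made for illustration and are not part of the skill.

```python
import hashlib
import hmac


def pseudonymize(identifier, secret):
    """Map an identifier to a keyed, non-reversible token (hex, 16 chars)."""
    digest = hmac.new(secret, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


# The same identifier and key always yield the same token, so records can
# still be grouped, but the raw identifier is not stored.
token = pseudonymize("ou_example123", b"local-secret-key")
```

This only helps if the key stays on the host and out of the published repository; names and message content in the records would still need separate handling.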

VirusTotal

66/66 vendors flagged this skill as clean.
