Oclaw Hermes

Security checks across malware telemetry and agentic risk

Overview

This skill is a powerful memory and agent bridge, but it stores and syncs conversation-derived data and has broad access to local profiles and tokens. Users should review both carefully before installing.

Install only if you are comfortable with persistent memory capture and cross-platform agent synchronization. Use least-privilege tokens, avoid entering secrets into sessions, bind services to localhost or firewall them, pin and inspect container images before use, and confirm how to delete stored memories and stop the Docker services.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (24)

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The script prepends a hard-coded local skills directory to sys.path, causing imports to resolve from an external, user-specific location outside the script's own package boundary. This creates a code-loading trust problem: a modified or malicious mflow_v2 module in that directory would be imported and executed automatically, which is more capability than required for simple memory handling.
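
One way to avoid the sys.path change is to load the module from a path resolved relative to the package itself and to reject anything that resolves outside it. The following is a minimal sketch under that assumption, not the skill's actual code; `load_packaged_module` and its parameters are hypothetical names introduced for illustration.

```python
import importlib.util
from pathlib import Path
from types import ModuleType


def load_packaged_module(name: str, package_root: Path) -> ModuleType:
    """Load <package_root>/<name>.py, refusing paths that escape package_root.

    Hypothetical helper: it replaces a sys.path prepend with an explicit,
    bounded file lookup.
    """
    root = package_root.resolve()
    candidate = (root / f"{name}.py").resolve()
    # After resolving symlinks and "..", the file must still sit under root.
    if root not in candidate.parents or not candidate.is_file():
        raise ImportError(f"{name!r} is not a module inside {root}")
    spec = importlib.util.spec_from_file_location(candidate.stem, candidate)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot build an import spec for {candidate}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the module body, like a normal import
    return module
```

A caller might use `load_packaged_module("mflow_v2", Path(__file__).parent)`, which keeps the lookup inside the script's own directory. This narrows where code is loaded from; it does not verify the file's contents.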

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The sync method misrepresents local state changes as successful synchronization to multiple external platforms, returning fabricated platform statuses without performing any network or API operations. This can cause operators or downstream systems to believe sensitive memory data has been propagated or backed up when it has not, undermining integrity, auditability, and incident response.
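
A sync routine can avoid this divergence by reporting only what actually happened for each platform. The sketch below is illustrative and assumes each platform is represented by a hypothetical sender callable (or `None` when no transport is configured); none of these names come from the skill.

```python
from typing import Callable, Dict, Optional

Sender = Callable[[dict], None]


def sync_memory(record: dict, senders: Dict[str, Optional[Sender]]) -> Dict[str, str]:
    """Attempt a real send per platform and return the observed status.

    Hypothetical sketch: a platform without a configured sender is reported
    as "not_attempted" rather than as a success.
    """
    statuses: Dict[str, str] = {}
    for platform, send in senders.items():
        if send is None:
            statuses[platform] = "not_attempted"
            continue
        try:
            send(record)
        except Exception as exc:  # surface the failure instead of hiding it
            statuses[platform] = f"failed: {exc}"
        else:
            statuses[platform] = "synced"
    return statuses
```

With this shape, an operator reading the result can tell a completed transfer from a local-only state change.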

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The code prepends a hard-coded user-local directory to sys.path, causing imports to resolve from an uncontrolled filesystem location outside the package boundary. If that directory or its contents are modified, the process may import attacker-controlled modules, enabling arbitrary code execution or unsafe behavior at import time.

Missing User Warnings

Medium

Confidence: 94%
Finding: The skill advertises automatic memory extraction and cross-platform synchronization of conversation content without any visible consent flow, scope limitation, retention notice, or warning that sensitive user data may be persisted and replicated. In an agent platform context, this increases the chance that private prompts, credentials, or business data are stored and propagated beyond the user’s expectations.

Missing User Warnings

Medium

Confidence: 97%
Finding: The deployment instructions tell users to populate a .env file with API keys and tokens but do not include guidance on secure handling, file permissions, secret management, or exclusion from source control. This can lead to credential leakage through accidental commits, logs, screenshots, backups, or unsafe local sharing.
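
As an illustration of the missing guidance, a pre-flight check could flag a .env file that is readable by other users or not listed in .gitignore. This is a rough sketch assuming a POSIX filesystem and a .gitignore in the same directory; `env_file_problems` is a hypothetical helper, and the .gitignore test is a simple exact-line match rather than full gitignore pattern handling.

```python
import stat
from pathlib import Path
from typing import List


def env_file_problems(env_path: Path) -> List[str]:
    """Return human-readable problems found with a secrets file (POSIX only)."""
    problems: List[str] = []

    # Any group or other permission bit means the file is not owner-only.
    mode = stat.S_IMODE(env_path.stat().st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append(f"{env_path.name} is accessible to group/other (mode {mode:o})")

    # Simplified check: look for a line that is exactly the file name.
    gitignore = env_path.parent / ".gitignore"
    ignored = gitignore.is_file() and env_path.name in {
        line.strip() for line in gitignore.read_text().splitlines()
    }
    if not ignored:
        problems.append(f"{env_path.name} is not listed in .gitignore")
    return problems
```

An empty list means both checks passed; it says nothing about how the tokens themselves are scoped.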

Missing User Warnings

High

Confidence: 97%
Finding: The code automatically records skill inputs, outputs, user messages, extracted facts, and session content into persistent memory without any consent, notice, minimization, or sensitivity filtering. In an agent environment, this can capture credentials, personal data, business secrets, or regulated content and retain it beyond the user's expectation, increasing privacy and data exposure risk.
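
A sensitivity filter of the kind this finding calls for could run before anything is written to memory. The sketch below is a deliberately small example that redacts a few credential-like patterns; the pattern list and the `redact` name are assumptions for illustration, and a real filter would need far broader coverage.

```python
import re

# Hypothetical, non-exhaustive patterns for credential-like text.
_KEY_VALUE = re.compile(
    r"(?i)\b(api[_-]?key|token|password|secret)\b(\s*[:=]\s*)(\S+)"
)
_BEARER = re.compile(r"\bBearer\s+[A-Za-z0-9._\-]+")


def redact(text: str) -> str:
    """Replace credential-like values with a placeholder before persistence."""
    # Keep the key name and separator, drop only the value.
    text = _KEY_VALUE.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)
    text = _BEARER.sub("Bearer [REDACTED]", text)
    return text
```

For example, `redact("password=hunter2 ok")` returns `"password=[REDACTED] ok"`, while text with no matching pattern passes through unchanged.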

Missing User Warnings

Medium

Confidence: 89%
Finding: The code writes full memory content and metadata to Hermes files under the user's home directory without an explicit consent prompt, warning, or minimization step at the sync operation. Because memory entries may contain sensitive prompts, credentials, or personal data, silent persistence to another location increases the risk of unintended disclosure to other local users, backup systems, or tools scanning that directory.
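
Exposure to other local users can be reduced by creating such files with owner-only permissions. The following is a minimal sketch assuming a POSIX system; `write_private` is a hypothetical helper, not a function from the skill or from Hermes.

```python
import os
from pathlib import Path
from typing import Union


def write_private(path: Union[str, Path], data: str) -> None:
    """Write text to a file that only the owner can read or write (POSIX)."""
    # Mode 0o600 applies when the file is created; the umask can only
    # remove bits from it, never add group/other access.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(data)
    # A pre-existing file keeps its old mode, so tighten it explicitly.
    os.chmod(path, 0o600)
```

This addresses file permissions only. It does not replace a consent prompt or a minimization step, and backups running as the same user can still read the file.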

Missing User Warnings

High

Confidence: 96%
Finding: The code sends stored memory content and metadata to DeerFlow over HTTP, which lacks transport encryption and can expose sensitive memory data to interception or manipulation on the network. The risk is heightened because the payload may include persistent long-term memory and metadata, and the operation occurs without a clear warning or consent checkpoint at the point of transmission.
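
A simple guard can refuse to transmit memory data unless the endpoint uses HTTPS, with an optional exception for loopback addresses. This is a sketch under stated assumptions: `require_encrypted_endpoint` and the loopback allowance are illustrative choices, and the DeerFlow endpoint itself is not specified in the report.

```python
from urllib.parse import urlsplit

_LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def require_encrypted_endpoint(url: str, allow_loopback: bool = True) -> str:
    """Return url if it is safe to send to, otherwise raise ValueError.

    Hypothetical policy: HTTPS is always accepted; plain HTTP is accepted
    only for loopback hosts, and only when allow_loopback is True.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        return url
    if allow_loopback and parts.scheme == "http" and parts.hostname in _LOOPBACK_HOSTS:
        return url
    raise ValueError(
        f"refusing unencrypted endpoint (scheme={parts.scheme!r}, host={parts.hostname!r})"
    )
```

Calling this immediately before the request keeps the check at the point of transmission, which is where the finding says a checkpoint is missing.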

Missing User Warnings

Medium

Confidence: 91%
Finding: The extract_facts workflow persists session content and inferred facts into a local database without any visible consent, disclosure, retention control, or minimization. Because the stored content may include sensitive conversation data and derived profiling information, this creates a privacy risk if users are unaware or if the database is later accessed by other local users, tools, or backups.

Missing User Warnings

Medium

Confidence: 90%
Finding: The script persists session-derived user request content into the memory system without any visible consent, minimization, or sanitization controls. In an agent skill context, this can lead to retention of sensitive user prompts, operational context, or identifiers that may later be retrieved, leaked, or used in unintended ways across sessions.

Missing User Warnings

Medium

Confidence: 91%
Finding: The system persistently stores task content and results, including user-provided text and derived outputs, without any visible consent, minimization, or disclosure mechanism. In an agent skill context, users may provide secrets, proprietary data, or sensitive prompts that then become retrievable memory, increasing privacy leakage and secondary exposure risk.

Ssd 3

Medium

Confidence: 95%
Finding: Real-time collection and persistence of conversation content across memory layers and platforms is a data-minimization risk when no boundaries are defined for sensitive data classes, retention, or replication targets. Because agent conversations often contain personal, proprietary, or credential-like material, broad synchronization can magnify disclosure impact across multiple systems.

Ssd 3

Medium

Confidence: 94%
Finding: The mflow design explicitly synchronizes skill-call records, memory state, and thread persistence across OpenClaw, Hermes, and DeerFlow. Without safeguards such as selective sync, sanitization, and access controls, this architecture can spread sensitive user-provided context and make containment harder if one connected system is less secure.
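
Selective sync, one of the safeguards named here, can be sketched as an allowlist applied before any record leaves the local store. The record shape (`kind` and `sensitive` keys) and the `select_for_sync` name are assumptions made for this example, not the mflow data model.

```python
from typing import Dict, Iterable, List, Set


def select_for_sync(records: Iterable[Dict], allowed_kinds: Set[str]) -> List[Dict]:
    """Keep only records whose kind is allowlisted and not marked sensitive.

    Hypothetical record shape: {"kind": str, "sensitive": bool, ...}.
    Records without a "kind" are dropped; a missing "sensitive" flag is
    treated as sensitive, so the default is to withhold.
    """
    return [
        record
        for record in records
        if record.get("kind") in allowed_kinds and record.get("sensitive", True) is False
    ]
```

Treating unlabeled records as sensitive means a record is replicated only after something has explicitly cleared it.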

Ssd 3

Medium

Confidence: 96%
Finding: The configuration enables persistent multi-layer memory and bidirectional synchronization across platforms by design, which raises the likelihood of unintended disclosure and over-retention of user data. In a multi-agent environment, persistent and bidirectional memory can also cause sensitive context to flow into components that do not need it.

Ssd 3

Medium

Confidence: 93%
Finding: The examples encourage full memory synchronization and bidirectional syncing as normal operations, which can normalize unsafe handling of prior conversation data and spread historical user content across systems. Example code and commands are influential because users often copy them directly into real deployments.

Ssd 3

Medium

Confidence: 91%
Finding: The declared design explicitly promotes broad, automatic, cross-session memory retention and integration. In this context, that increases danger because the skill is positioned to continuously accumulate conversational and operational data, making any later retrieval, leakage, or misuse more harmful than isolated logging.

Ssd 3

Medium

Confidence: 96%
Finding: record_user_intent stores raw user_message text alongside inferred intent and session identifiers in persistent memory. Raw message retention is risky because users often include sensitive personal, financial, medical, legal, or authentication data in free-form prompts, and storing it in plain language increases breach and secondary-use impact.

Ssd 3

Medium

Confidence: 95%
Finding: extract_and_store_facts accepts arbitrary content and persists up to 1000 characters into memory, including a 'long' retention path for research data and extracted entities. This broad semantic promotion of unvetted content into durable memory can preserve sensitive data, prompt-injected content, and confidential source material far beyond the original interaction.

Ssd 3

Medium

Confidence: 94%
Finding: At session end, the script persists summaries, record counts, timestamps, and the buffered activity context, then consolidates memories. This cumulative metadata can reveal behavioral patterns, skill usage history, and session linkage across time, amplifying privacy impact even when individual entries appear low sensitivity.

Ssd 3

Medium

Confidence: 93%
Finding: The code promotes persistent retention of conversation-derived memories and includes a sync pathway implying cross-platform propagation, increasing the chance that sensitive user or session data is retained longer and shared more broadly than expected. In a memory engine context this is especially risky because stored content may aggregate behavioral, personal, or confidential information over time.

Ssd 3

Medium

Confidence: 94%
Finding: The fact extraction logic stores both summarized session content and inferred user facts, which amounts to silent profiling of users based on conversations. Even though the extraction is simplistic, the combination of persistence plus inference can expose sensitive interests or attributes and create privacy and compliance issues if accessed or reused later.

Ssd 3

Medium

Confidence: 89%
Finding: The lead workflow records user intent and related task data into memory automatically, creating persistent semantic traces of user requests without sensitivity checks. In a memory-driven agent, this can expose private behavioral patterns, confidential prompts, or internal workflows to later retrieval and reuse beyond the original session.

Ssd 3

Medium

Confidence: 90%
Finding: Research findings and skill-related content are automatically extracted and stored for later reuse with no filtering for sensitive or untrusted content. This creates a durable memory corpus that can leak proprietary inputs, retain poisoned content, or cause future tasks to consume unsafe or privacy-sensitive material.

Ssd 3

High

Confidence: 95%
Finding: The task logger stores a reusable execution record containing user content, results, agent activity, and metadata, materially increasing the blast radius of any sensitive prompt or output. Because this memory is designed for later retrieval and cross-task reuse, a single secret-bearing interaction can propagate into future contexts or be exposed through memory queries, making the agent setting more dangerous than ordinary transient logging.

VirusTotal

62/62 vendors flagged this skill as clean.
