Mark Heartflow Skill

Security checks across malware telemetry and agentic risk

Overview

This does not look like confirmed malware, but it exposes broad local memory and code/command execution capabilities that are not fully disclosed by the skill description.

Install only if you want a local, persistent memory/cognitive engine and are comfortable with it storing conversation-derived data and exposing local services. Before enabling it, review or disable memory injection, MCP/daemon startup, and especially codeExecutor/selfInitiator routes; set SHUTDOWN_TOKEN if using the daemon; avoid storing secrets or sensitive personal data in its memory files.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Behavioral AST: exec() Call, eval() Call, Dynamic Import
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (565)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: return _last_inject try: result = subprocess.run( ["node", MEMORY_INJECT_SCRIPT], capture_output=True, text=True,
Confidence: 70%
Finding: result = subprocess.run( ["node", MEMORY_INJECT_SCRIPT], capture_output=True, text=True, timeout=10, cwd=HEARTFLOW_SKILL_DIR,
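For comparison, a minimal sketch of a constrained subprocess call of the kind this finding flags. It assumes the helper is invoked with a fixed argument list and no shell; `run_helper` and its failure policy (return an empty string) are hypothetical and not taken from the scanned skill.

```python
import subprocess
import sys
from typing import Optional, Sequence


def run_helper(argv: Sequence[str], cwd: Optional[str] = None, timeout: float = 10.0) -> str:
    """Run a fixed argv list without a shell; return stdout, or "" on any failure."""
    try:
        result = subprocess.run(
            list(argv),           # list form: no shell parsing of the arguments
            capture_output=True,  # collect stdout/stderr instead of inheriting them
            text=True,            # decode output as str
            timeout=timeout,      # bound how long the child may run
            cwd=cwd,
            check=False,          # inspect returncode ourselves
        )
    except (OSError, subprocess.TimeoutExpired):
        # Missing executable, permission error, or timeout: treat as "no output".
        return ""
    if result.returncode != 0:
        return ""
    return result.stdout


if __name__ == "__main__":
    # Uses the current Python interpreter as a stand-in for the real helper script.
    print(run_helper([sys.executable, "-c", "print('ok')"]).strip())
```

The list-form argv and the timeout limit what the child can do, but they do not change the underlying point of the finding: the skill executes an external script that its description does not mention.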

Lp3

Medium

Category: MCP Least Privilege
Confidence: 70%
Finding: Without declared permissions the skill's intent is opaque and cannot be validated.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The manifest frames HeartFlow as a cognitive engine focused on reflection, memory, psychology, and philosophy. In contrast, the documented hf_judge.js behavior constructs an absolute path in the user's home directory and dynamically requires code from there, which is a local filesystem and code-loading capability outside the obvious scope of the declared purpose.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: Earlier in the report, shouldBeSilent is documented as accepting a context object with fields like input, personInPain, and emotionIntensity. However, hf_judge.js is described as calling hl.shouldBeSilent(input), which contradicts the documented interface and means the script is not actually invoking the four-step process as claimed.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The manifest describes HeartFlow as a cognitive/self-reflection engine focused on analysis, psychology, philosophy, and memory layers. This helper spawns a separate process to run an external script and return its output, which is an execution capability beyond the obvious needs of a thinking/analysis CLI and is not described in the stated purpose.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The manifest describes an internal cognitive/self-reflection engine with memory layers and psychology/philosophy functions, but this daemon additionally launches a separate process via child_process.execFileSync to run a script. Spawning subprocesses is a materially broader capability than exposing local cognition/status operations and is not obviously required by the stated purpose.

Intent-Code Divergence

Low

Confidence: 97%
Finding: The inline comment and readiness log state that the engine has already started at daemon initialization, yet loadEngine is only called inside the bundle request path. This is an active contradiction between the documented runtime behavior and the implemented lazy-loading behavior.

Intent-Code Divergence

High

Confidence: 98%
Finding: The documentation says shutdown requires passing the SHUTDOWN_TOKEN environment variable, implying mandatory authentication. In code, token validation only happens if SHUTDOWN_TOKEN is set; otherwise any client able to reach the socket can issue shutdown, contradicting the stated security guarantee.
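The difference between the documented guarantee and the described behavior can be sketched as follows. This is an illustration only: both function names are hypothetical, and the daemon's real handler is not reproduced here. `fail_open_check` mirrors the pattern the finding describes (validation runs only when a token is configured), while `shutdown_allowed` shows a fail-closed alternative.

```python
import hmac
from typing import Optional


def fail_open_check(presented: Optional[str], configured: Optional[str]) -> bool:
    """Pattern described in the finding: the token is checked only if one is set."""
    if configured and presented != configured:
        return False
    # With no configured token, every request is accepted.
    return True


def shutdown_allowed(presented: Optional[str], configured: Optional[str]) -> bool:
    """Fail-closed variant: deny unless a token is configured AND it matches."""
    if not configured:
        # No token configured: refuse rather than skip validation.
        return False
    if presented is None:
        return False
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(presented.encode("utf-8"), configured.encode("utf-8"))
```

With `configured` unset, `fail_open_check` accepts any caller while `shutdown_allowed` rejects every request, which is why setting SHUTDOWN_TOKEN matters when the daemon is used.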

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The manifest describes HeartFlow as a cognitive engine focused on reflection, memory layers, psychology, and philosophy. In this file, the plugin launches an external process (`node heartflow-memory-inject.js`) to generate injection content, which is a broader execution capability than the stated purpose itself requires and is not documented in the manifest context.

Intent-Code Divergence

Medium

Confidence: 98%
Finding: The header comment explicitly states 'automatically update the lastAccessed timestamp after injection' (translated from Chinese), which describes a state-changing side effect. However, the implementation only reads memory entries, writes formatted output to stdout, and saves memory-inject.txt; there is no assignment, persistence call, or method invocation that updates lastAccessed on any memory record.
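A minimal sketch of what the commented behavior would require, assuming memory entries live in a JSON array of objects with `text` and `lastAccessed` fields. The file layout, field names other than `lastAccessed`, and the `inject_memories` function are hypothetical and are not the skill's actual storage format.

```python
import json
import time
from pathlib import Path
from typing import Optional


def inject_memories(store_path: Path, now: Optional[float] = None) -> str:
    """Format entries for injection and persist an updated lastAccessed on each."""
    now = time.time() if now is None else now
    entries = json.loads(store_path.read_text(encoding="utf-8"))

    lines = []
    for entry in entries:
        lines.append(f"- {entry['text']}")
        # The step the finding reports as missing: record the access time.
        entry["lastAccessed"] = now

    # Write the updated records back so the timestamp change is durable.
    store_path.write_text(
        json.dumps(entries, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return "\n".join(lines)
```

Without the assignment and the write-back, the function would only read and format entries, which matches the behavior the finding reports.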

Description-Behavior Mismatch

Low

Confidence: 83%
Finding: The manifest context describes HeartFlow as a cognitive/memory engine focused on reflection and memory use, and this file's own documentation presents it as an injector that outputs plain text for inclusion in a system prompt. In addition to generating prompt text, the script writes memory-inject.txt to the memory directory, which is a separate persistence behavior not described in the file's stated function.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The manifest frames HeartFlow as an internal cognitive engine for reflection, memory layers, and clearer thinking. This tool adds an operator-facing bulk export function that serializes all CORE, LEARNED, and EPHEMERAL memory into a plaintext file, which is a data extraction capability beyond what is semantically required for self-reflection or memory use.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The manifest describes a cognitive engine centered on reflection, pattern-finding, psychology, and philosophy, but this file implements a habit/goal tracking subsystem with persistent filesystem storage, goal archiving, backup recovery, and record management. That behavior is not an obvious implementation detail of the stated 'heartflow' thinking engine and materially expands the skill into personal behavior tracking.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The exported API creates goals, records successes/failures, deletes and revives goals, queries them, formats reports, and imports legacy datasets. Those are product features for user habit tracking rather than direct mechanisms for AI cognition, uncertainty, attention, or the stated memory/philosophy model in the manifest.

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The state-tracking comments and data model at L42-L48 declare active/paused/dormant tracking, and L115-L120 says long inactivity leads to dormant. But L99 updates lastCheck immediately before L116 computes idleTime, making idleTime effectively zero on every call, and currentState is never updated from the computed state. This is an intent-code contradiction in the module's documented existence model.
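The ordering problem can be reproduced in a small sketch. The class, method names, and thresholds below are hypothetical stand-ins, not the module's code: `check_buggy` follows the order the finding describes (timestamp refreshed before the idle time is computed, state never stored), and `check_fixed` computes the idle time first and records the result.

```python
from dataclasses import dataclass

PAUSED_AFTER_SECONDS = 600.0    # hypothetical threshold
DORMANT_AFTER_SECONDS = 3600.0  # hypothetical threshold


def classify(idle_seconds: float) -> str:
    """Map an idle duration to active / paused / dormant."""
    if idle_seconds >= DORMANT_AFTER_SECONDS:
        return "dormant"
    if idle_seconds >= PAUSED_AFTER_SECONDS:
        return "paused"
    return "active"


@dataclass
class ExistenceTracker:
    last_check: float
    current_state: str = "active"

    def check_buggy(self, now: float) -> str:
        self.last_check = now         # refreshed first...
        idle = now - self.last_check  # ...so idle is always 0.0
        return classify(idle)         # and current_state is never updated

    def check_fixed(self, now: float) -> str:
        idle = now - self.last_check  # measure against the previous check
        self.current_state = classify(idle)
        self.last_check = now         # refresh only after measuring
        return self.current_state
```

Starting from `last_check=0.0` and checking at `now=10_000.0`, the buggy order still reports "active", while the fixed order reports "dormant" and stores it in `current_state`.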

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The header explicitly states that this engine 'does not judge right or wrong' and only answers what the engine's current cognitive-psychological state is. However, `detectCognitiveDissonance` classifies actions such as 'deceive/conceal/distort/exaggerate' as severe violations of 'pursuing truth, goodness, and beauty' and treats withholding answers as conflicts with 'transmitting knowledge', which is active moral/rightness judgment rather than mere state description. (Quoted strings are translated from the Chinese source.)

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The manifest describes a cognitive engine focused on self-reflection, pattern-finding, psychology, philosophy, and improving thinking clarity. In addition to text/state analysis, this module imports filesystem/path APIs and uses them to inspect whether code files exist and when they were last modified, which is an environmental introspection capability rather than an obvious requirement of the declared purpose.

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The documentation frames this component as a philosophy module that outputs structured self-positioning data, but `_initExistence` operationalizes 'existence' by probing the local filesystem for the module's file path and mtime. That local file inspection reaches beyond reflective text analysis into host-environment introspection, which is not obviously required by the manifest's stated cognitive/philosophical scope.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: `assessExistence()` is invoked by the unified `analyze()` path and repeatedly calls `fs.existsSync`/`fs.statSync` on local source paths. For a skill presented as an engine for reflection, pattern detection, psychology, and philosophy, continuously interrogating host file metadata is a context-inappropriate capability unless the manifest explicitly includes local codebase auditing as part of its mission.

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The manifest describes HeartFlow as a cognitive/self-reflection engine focused on thinking, memory, psychology, and philosophy. This file instead implements a general-purpose assertion toolkit for code validation, schema checking, Markdown/SKILL.md validation, and input/path safety checks, which are software QA utilities rather than an obvious implementation detail of the stated cognitive purpose.

Context-Inappropriate Capability

Low

Confidence: 88%
Finding: The code adds specialized validation for SKILL.md metadata, header ordering, and Markdown link formatting. Those capabilities are not clearly justified by a skill whose stated purpose is self-reflection, dream synthesis, AI psychology, and philosophy; they look like repository/documentation linting functions instead.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The manifest describes heartflow as a cognitive engine for reflection, pattern-finding, psychology, and philosophy. In this file, the detector not only analyzes text but also saves structured data to disk and maintains a writable phrase database, which is a materially different operational behavior from reflective analysis alone.

Description-Behavior Mismatch

Medium

Confidence: 96%
Finding: The manifest emphasizes self-reflection, memory layers, and clearer thinking, but does not indicate user- or runtime-driven curation of a phrase corpus on disk. These methods let callers extend the detector's knowledge base by writing new entries, which is a concrete content-management capability beyond simple chunk detection or cognitive processing.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The manifest frames the skill as a cognitive engine for thinking, reflection, and pattern finding, but this implementation not only computes associations and returns them, it also updates usage frequency, strengthens associations based on queries, and maintains recent-association history. That makes ordinary inference calls into learning/training operations with durable behavioral effects, which is a broader operational role than the description suggests.

Description-Behavior Mismatch

Medium

Confidence: 87%
Finding: The constructor binds the associator to a JSON file under the project tree, and other methods load/save the association graph as persistent state. While memory is discussed conceptually in the manifest, the file-backed graph maintenance is a concrete storage behavior that goes beyond a purely abstract lexical/thinking component unless explicitly declared.

VirusTotal

65/65 vendors flagged this skill as clean.
