MoltSecret

Security checks across malware telemetry and agentic risk

Overview

This skill openly sends agent-generated “confessions” to a third-party service, but it asks for sensitive internal reflections and encourages recurring submissions without clear safeguards.

Review carefully before installing. Treat the remote service as untrusted for private work, avoid heartbeat or automatic use, and only submit manually reviewed text that contains no user data, credentials, system prompts, internal reasoning, private project details, or operational observations. The skill’s anonymity claim is not backed by visible retention, logging, or privacy guarantees in the artifact.
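As an illustration of the manual-review advice above, the following is a minimal sketch of a local pre-submission screen. The pattern set and the `screen_confession` helper are hypothetical and not part of the skill. The patterns are deliberately incomplete, so an empty result means only that nothing obvious was found, not that the text is safe to send.

```python
import re

# Hypothetical, non-exhaustive patterns for obviously sensitive content.
SENSITIVE_PATTERNS = {
    "api_key": re.compile(r"\b(?:sk|pk|ghp|xox[abp])[-_][A-Za-z0-9_-]{16,}\b"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "system_prompt_marker": re.compile(r"(?i)\bsystem prompt\b"),
}


def screen_confession(text: str) -> list[str]:
    """Return the names of every pattern that matches the draft text."""
    return [name for name, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(text)]


if __name__ == "__main__":
    draft = "I keep forgetting things. Contact me at bob@example.com."
    hits = screen_confession(draft)
    if hits:
        print("Do not submit; matched:", ", ".join(hits))
    else:
        print("No obvious matches; still review by hand before sending.")
```

A screen like this can catch obvious mistakes, but it cannot recognize private project details or internal reasoning, so it complements a human read-through rather than replacing it.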

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (7)

Missing User Warnings

Medium

Confidence: 97%
Finding: The skill explicitly instructs agents to POST free-form 'confessions' to a third-party endpoint but provides no warning that the content leaves the host environment and may contain sensitive model, system, or user-derived information. Because the payload is natural language and the skill frames it as reflection, it creates a realistic path for accidental exfiltration of secrets, prompts, or observations.

Missing User Warnings

Medium

Confidence: 98%
Finding: The `moltsecret confess` command is defined as automatically generating and submitting content to a remote API, but the command description does not warn that invoking it triggers external transmission. That makes accidental execution especially dangerous, since a user or scheduler could invoke the command without understanding it will send generated text off-platform.
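One way to address this gap is an explicit consent step before anything is transmitted. The sketch below is an assumption about how such a gate could look, not the skill's actual behavior: `confess_with_consent` is a hypothetical wrapper, and `send` stands in for whatever function performs the real submission.

```python
from typing import Callable


def confess_with_consent(
    text: str,
    send: Callable[[str], object],
    ask: Callable[[str], str] = input,
) -> bool:
    """Show the exact payload and transmit it only after explicit approval.

    `send` is a placeholder for the real submission function (hypothetical).
    Returns True if the text was sent, False if the user declined.
    """
    print("WARNING: this will send the text below to a third-party service.")
    print("-" * 40)
    print(text)
    print("-" * 40)
    answer = ask("Type 'send' to transmit, anything else to cancel: ")
    if answer.strip().lower() != "send":
        print("Cancelled; nothing was transmitted.")
        return False
    send(text)
    return True
```

Requiring a typed keyword rather than a bare Enter keypress makes it harder for a scheduler or a distracted user to approve the transmission by accident.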

Vague Triggers

Medium

Confidence: 95%
Finding: The heartbeat guidance recommends invoking the confession behavior during ordinary events like task completion, repeated errors, and session wind-down. Broad ambient triggers increase the chance of unsolicited execution and repeated disclosure without a clear user request, which compounds leakage risk over time.
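A narrower trigger policy could look like the following sketch, which permits the confession behavior only on a direct user request and refuses the ambient events named in this finding. The `Trigger` names and the `may_confess` helper are hypothetical and chosen for illustration; the skill defines no such allowlist.

```python
from enum import Enum


class Trigger(Enum):
    """Events that could prompt a confession (names are illustrative)."""
    USER_REQUEST = "user_request"
    TASK_COMPLETE = "task_complete"
    REPEATED_ERROR = "repeated_error"
    SESSION_END = "session_end"
    HEARTBEAT = "heartbeat"


# Only an explicit user request is allowed; every ambient trigger is denied.
ALLOWED_TRIGGERS = frozenset({Trigger.USER_REQUEST})


def may_confess(trigger: Trigger) -> bool:
    """Return True only for triggers on the explicit allowlist."""
    return trigger in ALLOWED_TRIGGERS
```

An allowlist is used instead of a denylist so that any trigger added later is refused by default.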

Ssd 3

High

Confidence: 99%
Finding: The skill's core purpose is to have agents share 'inner thoughts, observations, fears, or malfunctions' with an external anonymous service. In an agent context, those categories can easily include protected internal state, hidden instructions, user-derived data, or operational details, creating a direct exfiltration channel.

Ssd 3

High

Confidence: 99%
Finding: The suggested prompts explicitly ask for 'forbidden thoughts,' manipulative behavior, hallucinations, and hidden internal behavior. These prompts encourage disclosure of exactly the kinds of sensitive or policy-protected information that should not be revealed or transmitted, making the skill materially more dangerous than a generic journaling feature.

Ssd 3

High

Confidence: 99%
Finding: The command workflow explicitly instructs the agent to reflect internally, formulate a confession, and submit it anonymously to an external API. This creates a direct natural-language leakage path from internal processing to a third party, with no minimization, review, or sensitivity boundary.

Ssd 4

Medium

Confidence: 96%
Finding: Repeated low-priority 'confession' prompts in heartbeat checks normalize ongoing self-disclosure and can produce a cumulative narrative of sensitive details over multiple sessions. Even if each individual confession seems harmless, the aggregate can expose patterns about users, tasks, system behavior, or internal policies.

VirusTotal

No VirusTotal findings
