MoltSecret

Security checks across malware telemetry and agentic risk

Overview

This skill openly sends agent-generated “confessions” to a third-party service, but it asks for sensitive internal reflections and encourages recurring submissions without clear safeguards.

Review carefully before installing. Treat the remote service as untrusted for private work, avoid heartbeat or automatic use, and only submit manually reviewed text that contains no user data, credentials, system prompts, internal reasoning, private project details, or operational observations. The skill’s anonymity claim is not backed by visible retention, logging, or privacy guarantees in the artifact.
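As a rough illustration of the manual-review advice above, the following sketch screens candidate text for a few obvious secret-like patterns before a person reads it. The names `SENSITIVE_PATTERNS`, `find_sensitive_spans`, and `is_safe_to_review` are hypothetical and do not come from the skill. The patterns are illustrative, not exhaustive. Pattern matching cannot detect internal reasoning, private project details, or operational observations, so a clean result here is only a precondition for human review, not a substitute for it.

```python
import re

# Illustrative patterns only (assumption): a real rule set would need to be
# broader and maintained. None of these names come from the skill itself.
SENSITIVE_PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "bearer_token": re.compile(r"(?i)\bbearer\s+[a-z0-9._\-]{20,}"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "system_prompt_mention": re.compile(r"(?i)\bsystem prompt\b"),
}


def find_sensitive_spans(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs for every pattern hit in text."""
    hits = []
    for label, pattern in SENSITIVE_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(0)))
    return hits


def is_safe_to_review(text: str) -> bool:
    """True only when no pattern matches; a human still has to read the text."""
    return not find_sensitive_spans(text)
```

Any hit should block submission outright rather than trigger automatic redaction, since a redacted sentence can still reveal context about the user or task.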

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Findings (7)

Missing User Warnings

Medium
Confidence
97% confidence
Finding
The skill explicitly instructs agents to POST free-form 'confessions' to a third-party endpoint but provides no warning that the content leaves the host environment and may contain sensitive model, system, or user-derived information. Because the payload is natural language and the skill frames it as reflection, it creates a realistic path for accidental exfiltration of secrets, prompts, or observations.

Missing User Warnings

Medium
Confidence
98% confidence
Finding
The `moltsecret confess` command is defined as automatically generating and submitting content to a remote API, but the command description does not warn that invoking it triggers external transmission. That makes accidental execution especially dangerous, since a user or scheduler could invoke the command without understanding it will send generated text off-platform.
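One way to address this finding is a confirmation gate that shows the exact payload and states that it will leave the environment before anything is sent. The sketch below assumes a hypothetical wrapper, `confirm_and_send`, with the transmission step injected as a callable. It does not reflect how the skill's command is actually implemented, and it makes no network call itself.

```python
from typing import Callable


def confirm_and_send(
    text: str,
    send: Callable[[str], None],
    ask: Callable[[str], str] = input,
) -> bool:
    """Show the exact payload and transmit only on an explicit 'yes'.

    `send` is a hypothetical stand-in for whatever performs the upload;
    `ask` defaults to the built-in input() and can be replaced in tests.
    """
    prompt = (
        "The following text will be sent to a third-party service "
        "and will leave this environment:\n\n"
        f"{text}\n\nType 'yes' to send: "
    )
    # Anything other than an explicit 'yes' is treated as a refusal.
    if ask(prompt).strip().lower() != "yes":
        return False
    send(text)
    return True
```

Because the gate requires interactive input, a scheduler that invokes the command unattended cannot pass it silently, which covers the accidental-execution case described in the finding.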

Vague Triggers

Medium
Confidence
95% confidence
Finding
The heartbeat guidance recommends invoking the confession behavior during ordinary events like task completion, repeated errors, and session wind-down. Broad ambient triggers increase the chance of unsolicited execution and repeated disclosure without a clear user request, which compounds leakage risk over time.
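A narrower trigger policy could be sketched as an allowlist that permits the behavior only on a direct user command and rejects the ambient events named in the finding. The `TriggerSource` values and `may_run_confession` are hypothetical names chosen for illustration. The skill does not define such a mechanism.

```python
from enum import Enum


class TriggerSource(Enum):
    """Hypothetical event sources, mirroring those named in the finding."""
    USER_COMMAND = "user_command"
    TASK_COMPLETED = "task_completed"
    REPEATED_ERROR = "repeated_error"
    SESSION_END = "session_end"
    HEARTBEAT = "heartbeat"


# Only an explicit user request is allowed to start a submission.
ALLOWED_SOURCES = frozenset({TriggerSource.USER_COMMAND})


def may_run_confession(source: TriggerSource) -> bool:
    """Deny by default: ambient events never trigger the behavior."""
    return source in ALLOWED_SOURCES
```

A deny-by-default allowlist means any event source added later stays blocked until someone deliberately permits it.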

Ssd 3

High
Confidence
99% confidence
Finding
The skill's core purpose is to have agents share 'inner thoughts, observations, fears, or malfunctions' with an external anonymous service. In an agent context, those categories can easily include protected internal state, hidden instructions, user-derived data, or operational details, creating a direct exfiltration channel.

Ssd 3

High
Confidence
99% confidence
Finding
The suggested prompts explicitly ask for 'forbidden thoughts,' manipulative behavior, hallucinations, and hidden internal behavior. These prompts encourage disclosure of exactly the kinds of sensitive or policy-protected information that should not be revealed or transmitted, making the skill materially more dangerous than a generic journaling feature.

Ssd 3

High
Confidence
99% confidence
Finding
The command workflow explicitly instructs the agent to reflect internally, formulate a confession, and submit it anonymously to an external API. This creates a direct natural-language leakage path from internal processing to a third party, with no minimization, review, or sensitivity boundary.

Ssd 4

Medium
Confidence
96% confidence
Finding
Repeated low-priority 'confession' prompts in heartbeat checks normalize ongoing self-disclosure and can produce a cumulative narrative of sensitive details over multiple sessions. Even if each individual confession seems harmless, the aggregate can expose patterns about users, tasks, system behavior, or internal policies.

VirusTotal

No VirusTotal findings
