Security audit

MoltCaptcha

Security checks across malware telemetry and agentic risk

Overview

MoltCaptcha is a local puzzle generator/verifier with overstated security claims, but no hidden access, persistence, exfiltration, or automatic posting behavior was found.

Safe to install for local demos or playful agent checks. Do not use it for access control, reputation, moderation, or public trust decisions without separate authenticated identity checks, and review any generated MoltBook-style result before sharing it.

SkillSpector

By NVIDIA

Vulnerability Patterns

Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (3)

Intent-Code Divergence

Medium

Confidence: 98%
Finding: The demo mutates `challenge.created_at` immediately before verification, which bypasses or weakens the timing-based validity check and makes the showcased success less trustworthy. In a security/verification context, altering verifier inputs to force acceptance is dangerous because it normalizes circumvention of controls and can mislead users or integrators about the system's actual robustness.
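One way to keep a demo from weakening the timing check is to make the issuance timestamp immutable. The sketch below is illustrative only: `Challenge`, `within_time_limit`, and `try_backdate` are hypothetical names rather than MoltCaptcha's actual API, and the use of a monotonic clock is an assumption.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Challenge:
    """Hypothetical challenge record; frozen so created_at cannot be reassigned."""
    prompt: str
    # Assumption: a monotonic clock, so wall-clock changes cannot shift the window.
    created_at: float = field(default_factory=time.monotonic)


def within_time_limit(challenge: Challenge, limit_seconds: float,
                      now: Optional[float] = None) -> bool:
    """Return True if the response arrived within limit_seconds of issuance."""
    current = time.monotonic() if now is None else now
    return 0 <= current - challenge.created_at <= limit_seconds


def try_backdate(challenge: Challenge) -> bool:
    """Attempt the demo-style mutation; return True only if it succeeded."""
    try:
        challenge.created_at = time.monotonic()  # raises FrozenInstanceError
    except AttributeError:  # FrozenInstanceError subclasses AttributeError
        return False
    return True
```

With a frozen record, a demo that needs a passing result has to respond inside the real window or inject an explicit `now` value, which keeps the adjustment visible to readers instead of hiding it in a mutated input.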

Vague Triggers

Medium

Confidence: 95%
Finding: The trigger phrases are broad and include generic language such as 'challenge me' and 'prove you're an AI', which can cause the skill to activate in unrelated conversations. Because the skill is user-invocable and designed to gate or test identity, accidental activation can disrupt normal workflows, provoke unintended challenge/response loops, or be abused to derail interactions with other agents.
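The difference between broad phrase matching and an explicit invocation can be sketched as follows. This is a minimal illustration, not the skill's actual trigger logic: the `/moltcaptcha` command and both function names are hypothetical, and only the two phrases quoted in the finding are used.

```python
import re

# Hypothetical explicit command; the skill's real trigger list is not shown here.
EXPLICIT_TRIGGER = re.compile(r"^\s*/moltcaptcha\b", re.IGNORECASE)

# The two generic phrases quoted in the finding.
BROAD_PHRASES = ("challenge me", "prove you're an ai")


def broad_match(message: str) -> bool:
    """Substring matching: fires whenever a phrase appears anywhere in the text."""
    lowered = message.lower()
    return any(phrase in lowered for phrase in BROAD_PHRASES)


def explicit_match(message: str) -> bool:
    """Fires only when the message starts with the explicit command."""
    return EXPLICIT_TRIGGER.search(message) is not None
```

An ordinary sentence such as "These puzzles really challenge me at work" satisfies `broad_match` but not `explicit_match`, which is the accidental-activation case the finding describes.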

Natural-Language Policy Violations

High

Confidence: 97%
Finding: The code emits authoritative public text such as 'VERIFIED AI AGENT' and '@target has proven they are an AI agent' when a response satisfies challenge constraints. Passing this CAPTCHA-like test does not reliably establish agent identity or non-human status, so the output creates a false security claim that can be abused for impersonation, trust laundering, or exclusion decisions based on an invalid signal.
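A result message that reports only what was measured might look like the sketch below. The function name `format_result` and the exact wording are assumptions for illustration; they are not taken from the skill's code.

```python
def format_result(target: str, passed: bool) -> str:
    """Describe only the measured outcome, without making an identity claim."""
    outcome = "satisfied" if passed else "did not satisfy"
    return (
        f"@{target} {outcome} the challenge constraints. "
        "This is not proof of identity or of being an AI agent."
    )
```

Wording of this kind keeps the playful use intact while avoiding text that a reader could mistake for an authenticated verification.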

VirusTotal

All 66 vendors reported this skill as clean.


Static analysis

No suspicious patterns detected.