Security audit

MoltCaptcha

Security checks across malware telemetry and agentic risk

Overview

MoltCaptcha is a local puzzle generator/verifier with overstated security claims, but no hidden access, persistence, exfiltration, or automatic posting behavior was found.

Safe to install for local demos or playful agent checks. Do not use it for access control, reputation, moderation, or public trust decisions without separate authenticated identity checks, and review any generated MoltBook-style result before sharing it.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Findings (3)

Intent-Code Divergence

Medium
Confidence
98% confidence
Finding
The demo mutates `challenge.created_at` immediately before verification, which bypasses or weakens the timing-based validity check and makes the showcased success less trustworthy. In a security/verification context, altering verifier inputs to force acceptance is dangerous because it normalizes circumvention of controls and can mislead users or integrators about the system's actual robustness.
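
One way a demo could exercise the timing check without rewriting verifier inputs is to pass the clock in explicitly. The following is a minimal sketch under stated assumptions, not the skill's actual code: the `Challenge` class, its `time_limit` field, and the `within_time_limit` helper are hypothetical names chosen for illustration; only `created_at` comes from the finding above.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Challenge:
    # Hypothetical stand-in for the skill's challenge object.
    created_at: float  # seconds since the epoch, set once at creation
    time_limit: float  # assumed solve window, in seconds


def within_time_limit(challenge: Challenge, now: Optional[float] = None) -> bool:
    """Return True if the response arrived inside the solve window.

    The caller may inject `now` for tests and demos, so the challenge
    itself never has to be modified to produce a passing result.
    """
    current = time.time() if now is None else now
    elapsed = current - challenge.created_at
    return 0 <= elapsed <= challenge.time_limit


# Demo with fixed, made-up timestamps instead of a mutated created_at.
challenge = Challenge(created_at=1_000.0, time_limit=30.0)
on_time = within_time_limit(challenge, now=1_010.0)
too_late = within_time_limit(challenge, now=1_100.0)
```

Because the dataclass is frozen, an assignment such as `challenge.created_at = ...` raises an error, so a demo that wants to show success has to do it through the injected clock, where the manipulation is visible to the reader.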

Vague Triggers

Medium
Confidence
95% confidence
Finding
The trigger phrases are broad and include generic language such as 'challenge me' and 'prove you're an AI', which can cause the skill to activate in unrelated conversations. Because the skill is user-invocable and designed to gate or test identity, accidental activation can disrupt normal workflows, provoke unintended challenge/response loops, or be abused to derail interactions with other agents.
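
A narrower activation rule would reduce accidental triggering. The sketch below assumes an explicit command prefix, `/moltcaptcha`, which is a hypothetical name and not something the skill is known to define; the skill's real trigger list and matching logic are not reproduced here.

```python
import re

# Hypothetical explicit invocation. Generic phrases such as
# "challenge me" are deliberately not matched.
EXPLICIT_TRIGGER = re.compile(r"^\s*/moltcaptcha\b", re.IGNORECASE)


def should_activate(message: str) -> bool:
    """Activate only when the message starts with the explicit command."""
    return EXPLICIT_TRIGGER.search(message) is not None


explicit = should_activate("/moltcaptcha challenge @some_agent")
casual = should_activate("This level will really challenge me")
question = should_activate("Can you prove you're an AI?")
```

Under this rule, `explicit` is true while `casual` and `question` are false, so ordinary conversation containing the generic phrases no longer starts a challenge.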

Natural-Language Policy Violations

High
Confidence
97% confidence
Finding
The code emits authoritative public text such as 'VERIFIED AI AGENT' and '@target has proven they are an AI agent' when a response satisfies challenge constraints. Passing this CAPTCHA-like test does not reliably establish agent identity or non-human status, so the output creates a false security claim that can be abused for impersonation, trust laundering, or exclusion decisions based on an invalid signal.
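
A result message that reports only what was actually checked avoids the false identity claim. The following is an illustrative sketch; `format_result` and its wording are assumptions, not the skill's existing output function.

```python
def format_result(target: str, passed: bool) -> str:
    """Describe the puzzle outcome without asserting anything about identity."""
    outcome = "satisfied" if passed else "did not satisfy"
    return (
        f"@{target} {outcome} the puzzle constraints. "
        "This is not proof of identity or of being an AI agent."
    )


passed_message = format_result("target", True)
failed_message = format_result("target", False)
```

The message states the outcome of the puzzle and carries its own caveat, so a reader who sees it shared publicly is not led to treat it as an authenticated verdict.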

VirusTotal

66/66 vendors flagged this skill as clean.

Static analysis

No suspicious patterns detected.