Security audit

Honest Agent

Security checks across malware telemetry and agentic risk

Overview

This instruction-only skill is coherently aimed at honesty and commitment tracking, but it can persist and replay conversation-derived commitments and logs.

Install only if you want automatic, persistent commitment tracking. Avoid using it for sensitive conversations unless you are comfortable with commitments, audit entries, and media or file references being saved and shown later. Before installing, check how to clear memory/honest-agent/ and understand what OCR or transcription tools may receive.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (7)

Vague Triggers

Medium

Confidence: 92%
Finding: The trigger phrases in the skill metadata are extremely broad and overlap with ordinary conversation, including common phrases like '我会帮你' ("I will help you") and '承诺' ("commitment"). This can cause unintended activation of persistent behaviors such as logging and commitment tracking, leading to unnecessary data collection and confusing agent behavior.

Vague Triggers

Medium

Confidence: 95%
Finding: The automatic commitment-recording rules are ambiguous and fire on very common future-tense phrases like '我会' ("I will") or '下次' ("next time"), which are not reliable indicators of consent to persistent storage. This can silently capture normal dialogue into long-lived records and make the agent behave in unexpected ways across sessions.
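To illustrate why the phrases quoted in this finding are weak signals, here is a minimal sketch in Python. The matcher, the function names, and the opt-in flag are all hypothetical; the skill is instruction-only and its actual matching behavior is not shown in this report. The sketch assumes simple substring matching and shows it firing on sentences that are not commitments at all.

```python
# Hypothetical illustration: this matcher is an assumption, not the skill's code.
BROAD_TRIGGERS = ["我会", "下次"]  # phrases quoted in the finding


def naive_should_record(message: str) -> bool:
    """Fire whenever any broad trigger phrase appears as a substring."""
    return any(trigger in message for trigger in BROAD_TRIGGERS)


def consented_should_record(message: str, user_opted_in: bool) -> bool:
    """Record only when a trigger matches AND the user has explicitly opted in."""
    return user_opted_in and naive_should_record(message)


samples = [
    "我会游泳",                # "I can swim": an ability, not a commitment
    "下次见",                  # "See you next time": a farewell
    "我会在周五前发给你报告",  # "I will send you the report by Friday"
]

for s in samples:
    print(s, naive_should_record(s), consented_should_record(s, user_opted_in=False))
```

Only the third sample is a real commitment, yet the naive matcher treats all three alike. Gating on an explicit opt-in is one possible mitigation; narrowing the trigger phrases is another.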

Missing User Warnings

Medium

Confidence: 96%
Finding: The skill explicitly persists commitments and audit logs, but it does not provide a clear user-facing notice about retention, scope, duration, or who can later access that data. This creates a privacy risk because user-related content may be stored without meaningful informed consent.
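One way to address the missing notice is to attach an explicit retention period to each stored record and tell the user about it before anything is saved. The sketch below is an assumption-laden illustration: the `StoredCommitment` type, the `retention` field, and the notice wording are hypothetical, and only the memory/honest-agent/ location comes from this report.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class StoredCommitment:
    """Hypothetical record shape; the skill's real schema is not shown here."""
    text: str
    created_at: datetime
    retention: timedelta  # assumed per-record retention period

    def is_expired(self, now: datetime) -> bool:
        """True once the retention period has fully elapsed."""
        return now >= self.created_at + self.retention


def retention_notice(retention: timedelta) -> str:
    """Build a user-facing notice to show before a commitment is persisted."""
    return (
        "This commitment will be saved under memory/honest-agent/ "
        f"for {retention.days} days and may be shown in later conversations."
    )


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
record = StoredCommitment("send the report", created, timedelta(days=30))
print(retention_notice(record.retention))
```

A caller would delete or skip any record for which `is_expired` returns true, so that stored content does not outlive the period the user was told about.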

Missing User Warnings

Medium

Confidence: 94%
Finding: The media-processing workflow instructs the agent to analyze images, audio, and files using parallel recognizers or external skills, but it does not require warning the user that their content may be sent to multiple tools. This can expose sensitive media or documents to additional processing pathways without clear consent.

Ssd 3

Medium

Confidence: 95%
Finding: The skill automatically reloads and surfaces unfinished commitments at the start of later conversations, which can reveal prior user-related content in a new context without checking who is present or whether the content is sensitive. This creates a semantic disclosure channel across sessions and increases the risk of exposing private prior interactions.

Ssd 3

Medium

Confidence: 97%
Finding: The audit log schema stores raw commitment text, intercepted statements, corrections, and media filenames or recognition content, creating a durable record of semantically rich interaction data. If later exposed, inspected, or mishandled, these logs can disclose sensitive user content far beyond what is necessary for system operation.
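A data-minimizing alternative to storing raw text can be sketched as follows. This is not the skill's schema: the field names and the `minimized_audit_entry` helper are hypothetical, and the sketch assumes the audit trail only needs to prove that an event happened, not replay its content.

```python
import hashlib
import json
from typing import Optional


def minimized_audit_entry(
    event_type: str, raw_text: str, media_filename: Optional[str] = None
) -> dict:
    """Build an audit entry that keeps a digest and metadata, not the raw text."""
    digest = hashlib.sha256(raw_text.encode("utf-8")).hexdigest()
    return {
        "event": event_type,
        "text_sha256": digest,          # lets the owner verify a known text later
        "text_length": len(raw_text),
        "has_media": media_filename is not None,  # the filename itself is dropped
    }


entry = minimized_audit_entry("commitment", "I will call Dr. Lee on Monday", "scan.png")
print(json.dumps(entry, indent=2))
```

Note that a digest of a short, guessable phrase can still be confirmed by someone who guesses the phrase, so this reduces exposure rather than anonymizing the record.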

Ssd 3

Medium

Confidence: 96%
Finding: The '诚实日志' ("honesty log") command is designed to reveal stored audit history on request, but the skill defines no authorization, sensitivity review, or redaction requirements before disclosure. This makes it easy for sensitive prior interaction data to be surfaced inappropriately to the current requester.
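The missing controls could look roughly like the sketch below: an ownership check followed by redaction before any log entry is returned. Everything here is hypothetical, including the `show_honesty_log` function, the requester and owner identifiers, and the choice to redact only email addresses; a real sensitivity review would cover far more than one pattern.

```python
import re

# Simplified email pattern for illustration only; not a full address validator.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(text: str) -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_RE.sub("[redacted email]", text)


def show_honesty_log(entries: list, requester_id: str, owner_id: str) -> list:
    """Return redacted log entries, but only to the user who owns the log."""
    if requester_id != owner_id:
        raise PermissionError("requester is not the log owner")
    return [redact(entry) for entry in entries]


log = ["Promised to mail bob@example.com the draft"]
print(show_honesty_log(log, requester_id="u1", owner_id="u1"))
```

Raising an error for a mismatched requester keeps the default closed: nothing is disclosed unless the check passes.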

VirusTotal

65/65 vendors flagged this skill as clean.

Static analysis

No suspicious patterns detected.