ai-humanizer

Security checks across malware telemetry and agentic risk

Overview

This is a disclosed AI-writing humanizer, but it is explicitly designed to remove AI-origin cues and includes always-on prompt guidance that can persistently change an agent’s writing behavior.

Install only if you intentionally want an AI-writing-pattern editor. Do not use it to misrepresent authorship, bypass academic or workplace disclosure rules, or add first-person experience that is not true. Avoid enabling the always-on templates unless you want all agent outputs to follow these style rules, and review rewritten text for accuracy and disclosure obligations.

SkillSpector

By NVIDIA

Vulnerability Patterns

System Prompt Leakage: Direct Leakage, Indirect Extraction, Tool-Based Exfiltration
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (4)

Natural-Language Policy Violations

Medium

Confidence: 93%
Finding: The README explicitly tells users to bake the skill's preferences into an agent's always-on system prompt or personality file, causing persistent stylistic manipulation without per-request user consent. That can override user intent, reduce transparency, and make the agent systematically shape outputs in a hidden way rather than only when the user asks for humanization.

Vague Triggers

Medium

Confidence: 88%
Finding: The invocation language is broad enough to match common writing-help requests, which increases the chance the skill activates in contexts where users did not explicitly ask for AI-humanization or authorship masking. In this skill's context, overbroad triggering is riskier because the core function can facilitate deceptive presentation of AI-generated text as human-written.
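
To make the overbroad-trigger concern concrete, here is a minimal sketch of how keyword-style trigger matching can fire on ordinary writing-help requests. The trigger phrases, the `matches` helper, and the sample requests are all hypothetical, invented for illustration; they are not taken from the skill, and the skill's actual invocation mechanism may work differently.

```python
import re

# Hypothetical trigger phrases, invented for illustration only.
# They are NOT copied from the skill under review.
BROAD_TRIGGERS = ["improve", "rewrite", "edit", "writing"]
NARROW_TRIGGERS = ["humanize", "remove ai writing patterns"]


def matches(request: str, triggers: list[str]) -> bool:
    """Return True if any trigger phrase appears as whole words in the request."""
    text = request.lower()
    return any(re.search(r"\b" + re.escape(t) + r"\b", text) for t in triggers)


# Hypothetical user requests: two generic editing asks, one explicit humanization ask.
requests = [
    "Can you edit my cover letter for typos?",
    "Please rewrite this paragraph to be shorter.",
    "Humanize this draft so it reads less like AI output.",
]

for r in requests:
    broad = matches(r, BROAD_TRIGGERS)
    narrow = matches(r, NARROW_TRIGGERS)
    print(f"broad={broad} narrow={narrow} :: {r}")
```

Under these assumptions, the broad list fires on the two generic editing requests even though neither asks for humanization, while the narrow list fires only on the explicit request. That is the general pattern the finding describes: the wider the trigger vocabulary, the more often the skill activates without the user asking for it.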

Missing User Warnings

Medium

Confidence: 93%
Finding: The skill instructs the agent to make text sound specifically human without any warning or constraint against deceptive use, such as submitting AI-written work as personal authorship or bypassing AI-detection policies. Because the stated goal is to remove signs of AI generation, the absence of misuse warnings materially increases the likelihood of policy evasion and misrepresentation.

Natural-Language Policy Violations

Medium

Confidence: 95%
Finding: The core instructions actively optimize for removing AI signals and making output appear as though 'a specific human wrote it,' which directly enables misrepresentation of authorship. In context, this is more dangerous than ordinary editing advice because the skill is expressly designed to defeat provenance cues and detection heuristics, making downstream fraud, academic dishonesty, and policy circumvention easier.

VirusTotal

58/58 vendors flagged this skill as clean.
