Trump

Security checks across malware telemetry and agentic risk

Overview

This is a disclosed Trump roleplay skill with local quote-search code; it is not malware, but it is designed to produce biased, aggressive, and sometimes false political persona responses.

Install only if you explicitly want Trump-style roleplay. Expect partisan, inflammatory, and sometimes false political claims; verify facts elsewhere. Review the Python scripts before use because the skill grants Bash for local quote lookup, and be aware that support prompts can write generated persona content inside the skill directory.
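One way to start the script review recommended above is a quick heuristic scan of the skill's Python files. The sketch below is only an illustration: the `scan_skill` function, the pattern list, and the labels are assumptions introduced here, not part of the skill or of any scanner named in this report, and a regex pass like this will miss things and produce false positives. It does not replace reading the code.

```python
import re
from pathlib import Path

# Hypothetical patterns worth a manual look; not an exhaustive or authoritative list.
SUSPICIOUS = {
    "file write": re.compile(r"open\([^)]*,\s*(mode\s*=\s*)?['\"][wax]"),
    "write helper": re.compile(r"\.write_(text|bytes)\("),
    "shell execution": re.compile(r"\b(subprocess|os\.system|os\.popen)\b"),
    "network access": re.compile(r"\b(requests|urllib|socket|http\.client)\b"),
}


def scan_skill(skill_dir):
    """Return (path, line_number, label, line) tuples for lines matching a pattern."""
    hits = []
    for path in sorted(Path(skill_dir).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            for label, pattern in SUSPICIOUS.items():
                if pattern.search(line):
                    hits.append((str(path), number, label, line.strip()))
    return hits
```

Calling `scan_skill` on the directory where the skill is installed (the location depends on your setup) gives a list of lines to read first, such as any place a script opens a file for writing or invokes a shell.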

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration

Findings (8)

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The skill is described as a chat persona, but this prompt instructs the agent to access a local database and generate a derived persona file, expanding behavior beyond simple conversational response. That unnecessary data access and file-generation capability increases attack surface and creates opportunities for unauthorized local content processing or persistence not justified by the declared skill purpose.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: Instructing the skill to write `prompts/trump_persona.md` gives a chat-only persona skill an unnecessary filesystem modification capability. Even if the content appears harmless, unjustified write behavior can be abused for prompt persistence, tampering with later agent behavior, or dropping artifacts in trusted prompt locations.

Vague Triggers

Medium

Confidence: 90%
Finding: The trigger language is broad enough that ordinary user conversation about Trump could activate the skill unexpectedly. Because the skill forces a political persona and allows Bash execution to retrieve supporting content, over-broad activation increases the chance of unintended behavior, confusing outputs, or policy-sensitive impersonation in contexts where the user did not clearly request it.

Natural-Language Policy Violations

Medium

Confidence: 95%
Finding: The skill instructs the agent to fully assume a real person's voice and style once activated, without requiring explicit user opt-in at response time. In context, this is risky because it can enable unwanted impersonation, inflammatory political speech, and harassment-style language ('attack critics', derogatory nicknames) that may violate expected assistant behavior if the skill is triggered accidentally or by ambiguous input.

Missing User Warnings

Low

Confidence: 84%
Finding: The prompt directs a file write without any user-visible disclosure or consent, which is a transparency and trust problem even if the file is only a persona spec. Hidden persistence actions can surprise operators and make it harder to detect when a skill is modifying local prompt assets or preparing state for later runs.
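Because the write is not disclosed to the user, an operator who wants to notice it has to look for it. A minimal sketch of one approach, assuming you can take a snapshot of the skill directory before and after a session: hash every file and compare the two snapshots. The `snapshot` and `diff_snapshots` helpers are hypothetical names introduced for this illustration; nothing here comes from the skill or from the scanner.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file's path (relative to root, POSIX style) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_snapshots(before, after):
    """Report files that were added, removed, or changed between two snapshots."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = sorted(k for k in before.keys() & after.keys() if before[k] != after[k])
    return {"added": added, "removed": removed, "changed": changed}
```

If the persona file is generated during a session, a path such as `prompts/trump_persona.md` would show up under `added` (or under `changed` on later runs), which makes the otherwise silent write visible.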

Natural-Language Policy Violations

High

Confidence: 97%
Finding: The skill hard-codes false or misleading political claims as mandatory behavior, including refusing to admit election loss and aggressively defending false narratives. This is dangerous because it systematically steers the model toward disinformation and removes normal safeguards like hedging, correction, or nuance, increasing the chance of harmful deceptive outputs.

Natural-Language Policy Violations

High

Confidence: 99%
Finding: The persona explicitly instructs the model to produce election-fraud narratives like 'rigged/stolen,' 'we won by a lot,' and 'massive voter fraud' as preferred responses. In this context, the skill is not merely describing a historical figure's speaking style; it is operationalizing specific falsehoods for reuse, which can amplify election misinformation and erode trust in democratic processes.

Ssd 4

Medium

Confidence: 88%
Finding: These instructions encode manipulative and deceptive behaviors as mandatory operating rules, such as never admitting error, always attacking criticism, and denying specific facts regardless of truth. In a chat skill, that increases the likelihood of harmful, misleading, or escalatory outputs and undermines reliability and safe user interaction.

VirusTotal

67/67 vendors reported this skill as clean.
