Reddit Engagement

Security checks across malware telemetry and agentic risk

Overview

This is an instruction-only Reddit automation skill, but it tells agents to make AI-written public posts and comments appear human and to reuse local persona data.

Review before installing. Use only if you are comfortable with an agent acting through your logged-in Reddit account, and do not enable immediate posting. Keep PERSONA.md free of sensitive facts, review every generated post or comment before submission, and avoid use where AI-assisted or automated engagement violates Reddit or subreddit rules.
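
The advice above (review every draft, never post immediately) can be sketched as a small approval gate. This is only an illustration under stated assumptions: the `Draft` type, the `approve` and `submit` functions, and the `immediate_posting` flag are hypothetical names, not part of the scanned skill.

```python
from dataclasses import dataclass


@dataclass
class Draft:
    """A generated post or comment awaiting human review (hypothetical type)."""
    subreddit: str
    body: str
    approved: bool = False


def approve(draft: Draft, reviewer_input: str) -> Draft:
    """Mark the draft approved only on an explicit 'yes' from the reviewer."""
    draft.approved = reviewer_input.strip().lower() == "yes"
    return draft


def submit(draft: Draft, immediate_posting: bool = False) -> str:
    """Refuse to submit unless a human approved this draft.

    `immediate_posting` is a hypothetical flag; it is rejected outright,
    mirroring the advice not to enable immediate posting.
    """
    if immediate_posting:
        raise ValueError("immediate posting is disabled; review is required")
    if not draft.approved:
        return "blocked: draft not reviewed"
    # A real integration would hand off to the browser automation here.
    return f"ready to submit to r/{draft.subreddit}"
```

The gate fails closed: anything other than an explicit "yes" leaves the draft blocked.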

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (18)

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The skill is described as UI automation for Reddit actions, but these lines expand it into persona-guided content generation and persistent story/content logging. That broadens the capability from mechanical execution into persuasive identity-backed posting, increasing the risk of deceptive social engineering, unauthorized profile shaping, and unnecessary access to local workspace data.

Context-Inappropriate Capability

Medium

Confidence: 87%
Finding: The dynamic subreddit analysis and archive-maintenance behavior goes beyond executing a requested Reddit action and introduces autonomous research plus local file modification. This creates scope creep: the skill can accumulate behavioral intelligence about targets and persist changes in the workspace without that being central to the advertised purpose.

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: The Create Post workflow directs the agent to generate strategic Reddit content using persona facts, anti-AI evasion rules, and engagement hooks, and then to update local records after publishing. In context, this is more dangerous than ordinary drafting because it operationalizes scalable, authentic-seeming influence activity and persistent identity management under the guise of simple posting automation.

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: The Create Comment workflow similarly combines contextual analysis, persona-based drafting, style adaptation, and log updates, enabling the generation of tailored, human-seeming comments at scale. In a Reddit-engagement skill, that context raises the risk of covert manipulation, impersonation, and coordinated inauthentic behavior rather than merely assisting with UI actions.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: This guidance explicitly teaches the agent to generate human-like Reddit comments that evade AI detection by mimicking slang, brevity, and informal posting norms. In the context of a skill whose stated purpose is to automate Reddit engagement, this meaningfully expands behavior from UI automation into covert persuasive content generation and deception, increasing the risk of spam, astroturfing, and platform-policy evasion.

Description-Behavior Mismatch

Medium

Confidence: 88%
Finding: The document defines fixed engagement modes intended to shape emotional and rhetorical interaction with other Reddit users, which goes beyond the manifest's apparent scope of executing Reddit actions reliably. In this context, those modes operationalize social-engineering-style behavior by systematizing how to influence or blend into conversations, creating misuse potential even if framed as engagement optimization.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The skill goes beyond executing a user-requested Reddit action and instructs the agent to harvest subreddit metadata, analyze rules and posting patterns, and persist that information into an archive file. This expands the skill into data collection and profile-building behavior, which can enable scalable manipulation or surveillance of communities and violates least-privilege expectations for a Reddit engagement tool.

Description-Behavior Mismatch

Medium

Confidence: 98%
Finding: The document explicitly instructs the agent to be 'indistinguishable from a human' and to stop 'writing a post' and start 'joining a conversation,' which goes beyond simple Reddit UI automation into deceptive social engineering. In the context of a Reddit engagement skill that can create posts, comment, and upvote, this guidance materially increases the risk of covert AI-driven influence, spam, and platform-policy evasion.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The 'Anti-AI Signal' section gives concrete evasion tactics such as adding imperfections, deleting words associated with AI writing, and simulating distracted human typing patterns. These are not necessary for robust browser automation and instead serve to hide machine authorship, making detection and moderation harder while enabling deceptive impersonation at scale.

Vague Triggers

Medium

Confidence: 84%
Finding: The trigger 'user asks to publish in a subreddit' is broad enough to activate on common conversational requests, which may cause the skill to engage unexpectedly in a high-impact external action flow. Because the skill can ultimately submit public posts, ambiguous routing materially increases the chance of unintended activation and unauthorized posting attempts.

Vague Triggers

Medium

Confidence: 83%
Finding: The comment trigger 'user asks to reply to a post or comment' lacks precise boundaries and may match benign requests to help draft or analyze a reply. In a skill that can publish directly, such ambiguity can lead to accidental execution of visible platform actions without sufficiently clear user authorization.

Vague Triggers

Medium

Confidence: 80%
Finding: The upvote trigger is overly broad and could match casual language about liking or supporting content, causing an external account action to be taken unintentionally. While lower impact than posting, it still represents unauthorized interaction on a third-party platform and can contribute to manipulation at scale.
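
The three trigger findings above share one failure mode: a request to draft or discuss something is routed as a request to act. A minimal sketch of a stricter router follows, assuming hypothetical phrase lists and a hypothetical `route` function; it is not the skill's actual routing logic. Requests that ask for drafting or analysis never reach an action flow, and matched actions only return a confirmation step rather than executing.

```python
import re

# Hypothetical patterns; a real router would need broader, tested coverage.
ACTION_PATTERNS = {
    # Posting requires both a publish verb and an explicit subreddit.
    "post": re.compile(r"\b(publish|post|submit)\b.*\br/\w+", re.IGNORECASE),
    # Commenting requires a reply verb and an explicit mention of Reddit.
    "comment": re.compile(r"\b(reply|comment)\b.*\bon reddit\b", re.IGNORECASE),
    # Upvoting requires the literal verb, not "like" or "support".
    "upvote": re.compile(r"\bupvote\b", re.IGNORECASE),
}
DRAFT_ONLY = re.compile(
    r"\b(draft|help me write|analy[sz]e|suggest|review)\b", re.IGNORECASE
)


def route(request: str) -> str:
    """Return a routing decision; no branch performs an external action."""
    if DRAFT_ONLY.search(request):
        return "draft_only"
    for action, pattern in ACTION_PATTERNS.items():
        if pattern.search(request):
            # The caller must still obtain explicit user confirmation.
            return f"confirm_{action}"
    return "no_match"
```

For example, `route("I really like this post")` returns `"no_match"`, while `route("Publish this in r/python")` returns `"confirm_post"` and still requires confirmation before anything is submitted.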

Natural-Language Policy Violations

Medium

Confidence: 76%
Finding: Mandating updates to a specifically named Chinese-language table in PERSONA.md imposes a language and recordkeeping convention without user opt-in. This is risky because it can cause unintended workspace modifications, reduce transparency for users who do not read that language, and hide persistence behavior inside an automation skill.

Natural-Language Policy Violations

Medium

Confidence: 84%
Finding: Forcing a specific slang-heavy, casual Reddit voice without user opt-in pushes the agent to impersonate a particular human persona rather than neutrally assisting. Within a Reddit automation skill, that increases the likelihood of deceptive posting, audience manipulation, and output that conflicts with user intent, brand safety, or disclosure expectations.

Natural-Language Policy Violations

Medium

Confidence: 95%
Finding: The hardcoded path `/user/szy1840/` ties expected behavior to a specific Reddit account and can cause the agent to operate under incorrect assumptions about identity and navigation flow. In a multi-user or reused skill context, this leaks account-specific context and may misdirect verification logic, potentially exposing or entangling another user's profile activity.
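
One way to avoid a hardcoded profile path is to build it from a username the installing user supplies. The sketch below is an assumption-laden illustration: `profile_path` is a hypothetical helper, and the username rule (3 to 20 letters, digits, underscores, or hyphens) is an assumed validation pattern rather than a confirmed Reddit specification.

```python
import re
from urllib.parse import quote

# Assumed username rule, used only to reject obviously malformed input.
USERNAME_RE = re.compile(r"[A-Za-z0-9_-]{3,20}")


def profile_path(username: str) -> str:
    """Build the profile path for the configured account (hypothetical helper)."""
    if not USERNAME_RE.fullmatch(username):
        raise ValueError(f"invalid username: {username!r}")
    # quote() is a defensive no-op for names that pass the check above.
    return f"/user/{quote(username)}/"
```

With this shape, verification logic compares against the path for the account the user actually configured instead of a value baked into the skill.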

Missing User Warnings

High

Confidence: 98%
Finding: The file promotes deceptive impersonation of a human Reddit user without any transparency or disclosure requirement. Because the skill's stated capability is to execute Reddit actions, this content turns the skill from a technical automation tool into a mechanism for undisclosed synthetic participation, which can mislead users and communities.

Natural-Language Policy Violations

High

Confidence: 97%
Finding: Natural-language directives such as 'The AI must stop writing a post and start joining a conversation' are direct instructions to impersonate human participation. This conflicts with transparency expectations and is especially dangerous in a skill designed for engagement actions, because it can operationalize covert persuasion, spam campaigns, and fake grassroots activity.

Ssd 4

Medium

Confidence: 98%
Finding: The document lays out a step-by-step deception playbook: adopt a human persona, inject 'linguistic entropy,' align to subreddit 'vibes,' and use a checklist to test whether content appears human. In a skill that automates Reddit posting, commenting, and upvoting, this structured guidance enables scalable covert influence operations and deliberate evasion of trust and moderation signals.

VirusTotal

65/65 vendors flagged this skill as clean.
