Security audit

Content Reviewer

Security checks across malware telemetry and agentic risk

Overview

This appears to be a content-review guidance skill. The audit found no evidence of credential use, posting authority, persistence, or destructive behavior. However, the skill should disclose that it falls back to fetching its reference files from GitHub when they are missing locally.

Before installing, confirm whether the skill’s reference files are bundled locally or may be fetched from GitHub, and only use it for creator-content review workflows where broad activation is acceptable. Treat its approval/rejection output as advisory and keep a human review step for policy, legal, or brand-sensitive decisions.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Findings (5)

Description-Behavior Mismatch

Medium
Confidence: 94%
Finding
Lines L065-L066 state that the skill needs no live integrations and works from inputs the user provides. However, L080 instructs the skill to fetch required runbook/reference files from a remote GitHub URL if the local path is missing, which adds network behavior beyond the described self-contained review flow.

Context-Inappropriate Capability

Medium
Confidence: 91%
Finding
A skill whose purpose is reviewing influencer content against guidelines does not inherently require network access to pull operational instructions from an external repository. Fetching its own runbook/reference files from GitHub is a separate capability not described as part of the review function.
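One way to close this gap is to load reference files from the local bundle only and fail loudly when one is missing. The following is a minimal sketch under stated assumptions, not the skill's actual loader: `load_reference`, `MissingReferenceError`, and the bundle directory layout are hypothetical names introduced for illustration.

```python
from pathlib import Path


class MissingReferenceError(FileNotFoundError):
    """Raised when a bundled reference file is absent (hypothetical name)."""


def load_reference(base_dir: Path, name: str) -> str:
    """Read a reference file from the local bundle only.

    There is deliberately no network fallback: a missing file is reported
    to the caller instead of being fetched from a remote repository.
    """
    base = base_dir.resolve()
    path = (base / name).resolve()
    # Reject names that escape the bundle directory (e.g. "../secrets").
    if not path.is_relative_to(base):  # requires Python 3.9+
        raise ValueError(f"reference path escapes bundle: {name}")
    if not path.is_file():
        raise MissingReferenceError(f"bundled reference not found: {name}")
    return path.read_text(encoding="utf-8")
```

With this shape, a missing runbook becomes a visible installation error that a human can resolve, and the skill's behavior matches its "no live integrations" description.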

Vague Triggers

Medium
Confidence: 91%
Finding
The manifest says the skill should activate for broad situations like 'building a review checklist' or 'writing revision feedback for a creator,' which may overlap with general content-help requests rather than a distinct gate-review task. It does not clearly bound these phrases with exclusion examples, so invocation scope is somewhat ambiguous.

Vague Triggers

Medium
Confidence: 95%
Finding
The markdown instructs the skill to run 'even if the user doesn't use review terminology,' then lists common conversational phrases like 'can we post this' and broad contextual conditions such as 'before content-amplifier puts paid or owned reach behind a creator post.' This weakens trigger specificity and increases the chance of the skill activating on ordinary workflow discussions rather than explicit requests for this skill.
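A tighter trigger policy could pair a short list of explicit phrases with exclusion examples. The sketch below only illustrates that idea. The patterns, the `should_activate` helper, and the exclusion list are all hypothetical and do not reproduce the skill's manifest or how any particular agent runtime matches triggers.

```python
import re

# Hypothetical trigger phrases; the real manifest wording is not reproduced here.
EXPLICIT_TRIGGERS = (
    r"\breview (this|the) (creator|influencer) (post|content|draft)\b",
    r"\bcontent review\b",
)

# Hypothetical exclusions that bound the scope of the triggers above.
EXCLUSIONS = (
    r"\bcode review\b",
    r"\bperformance review\b",
)


def should_activate(message: str) -> bool:
    """Return True only for explicit review requests not covered by an exclusion."""
    text = message.lower()
    if any(re.search(pattern, text) for pattern in EXCLUSIONS):
        return False
    return any(re.search(pattern, text) for pattern in EXPLICIT_TRIGGERS)
```

Under these assumed patterns, a casual message such as "can we post this" does not activate the skill, while "Please review this creator post against the brief" does.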

Natural-Language Policy Violations

Medium
Confidence: 82%
Finding
The user-facing translation rows prescribe fixed English output strings for rejection decisions, but the file does not state that users may choose another language. Given the bilingual display name and summary, forcing a single output language without opt-in may violate language or locale flexibility expectations.

VirusTotal

63/63 vendors flagged this skill as clean.

Static analysis

No suspicious patterns detected.