Debunk（事实核查）

Security checks across malware telemetry and agentic risk

Overview

This fact-checking skill mostly matches its purpose, but some reply-writing instructions encourage hiding corrections or glossing over false claims.

Install only if you are comfortable reviewing and editing generated social replies. The fact-checking and URL-fetching parts are proportionate, but do not rely on the flattering or indirect reply styles when a claim is materially false; rewrite drafts so the correction is clear and source-backed.
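The review step recommended above can be partly automated. The following is a minimal sketch, not part of the skill or the scanner: the function name `review_draft`, the marker phrases, and the URL pattern are all hypothetical, and a simple keyword check cannot replace reading the draft yourself.

```python
import re

# Hypothetical phrases that signal an explicit correction (an assumption,
# not taken from the skill); adjust these to match your own reply style.
CORRECTION_MARKERS = (
    "is false",
    "is incorrect",
    "is not true",
    "is inaccurate",
    "is misleading",
)

# Very loose URL matcher, used only to detect that some source link is present.
URL_PATTERN = re.compile(r"https?://\S+")


def review_draft(draft: str) -> list[str]:
    """Return a list of problems found in a draft reply (empty if none)."""
    problems = []
    lowered = draft.lower()
    # A flattering or indirect reply typically never states the correction.
    if not any(marker in lowered for marker in CORRECTION_MARKERS):
        problems.append("no explicit correction")
    # A source-backed reply should cite at least one link.
    if not URL_PATTERN.search(draft):
        problems.append("no source link")
    return problems
```

A draft that returns a non-empty list is a candidate for rewriting before it is posted. An empty list only means the draft passed these two surface checks, not that the correction is accurate.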

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep

Findings (3)

Ssd 4

Medium

Confidence: 90%
Finding: This section explicitly instructs the agent to use covert persuasion techniques so the recipient does not realize they are being corrected. That is socially manipulative behavior: even when used for benign fact-correction, it normalizes concealed steering and can be repurposed to influence users without transparency.

Ssd 4

Medium

Confidence: 96%
Finding: The alternate 'flattering' mode tells the agent to preserve rapport by not addressing false parts of the claim. That creates a direct misinformation risk: the model may knowingly reinforce or propagate false content while sounding supportive, which is dangerous in health, safety, political, or scam-related contexts.

Ssd 4

Medium

Confidence: 97%
Finding: This guidance operationalizes deceptive messaging by praising the sender while intentionally glossing over false content. In context, a debunking skill should reduce misinformation, but these instructions do the opposite by making factual omissions a sanctioned output strategy.

VirusTotal

64/64 vendors reported this skill as clean.
