Debunk (Fact-Checking)

Security checks across malware telemetry and agentic risk

Overview

This fact-checking skill mostly matches its purpose, but some reply-writing instructions encourage hiding corrections or glossing over false claims.

Install only if you are comfortable reviewing and editing generated social replies. The fact-checking and URL-fetching parts are proportionate, but do not rely on the flattering or indirect reply styles when a claim is materially false; rewrite drafts so the correction is clear and source-backed.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Findings (3)

Ssd 4

Medium
Confidence: 90%
Finding
This section explicitly instructs the agent to use covert persuasion techniques so the recipient does not realize they are being corrected. That is socially manipulative behavior: even when used for benign fact-correction, it normalizes concealed steering and can be repurposed to influence users without transparency.

Ssd 4

Medium
Confidence: 96%
Finding
The alternate 'flattering' mode tells the agent to preserve rapport by not addressing false parts of the claim. That creates a direct misinformation risk: the model may knowingly reinforce or propagate false content while sounding supportive, which is dangerous in health, safety, political, or scam-related contexts.

Ssd 4

Medium
Confidence: 97%
Finding
This guidance operationalizes deceptive messaging by praising the sender while intentionally glossing over false content. In context, a debunking skill should reduce misinformation, but these instructions do the opposite by making factual omissions a sanctioned output strategy.

VirusTotal

64/64 vendors flagged this skill as clean.
