Roundtable Pai

Security checks across malware telemetry and agentic risk

Overview

This skill is a disclosed local roundtable/persona simulator, with no evidence of hidden data access, exfiltration, or destructive behavior.

Install only if you want a public-figure-style discussion tool. Treat all character voices as fictionalized simulations based on public material, not real endorsements or current personal statements, and independently verify any medical, legal, financial, crypto, or purchase-related advice.

SkillSpector

By NVIDIA

Vulnerability Patterns

MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Supply Chain: Unpinned Dependencies, External Script Fetching, Obfuscated Code

Findings (243)

Intent-Code Divergence

Medium

Confidence: 94%
Finding: The file explicitly says the persona must not impersonate the real person, but the identity card and timeline then switch into first-person biographical claims such as '我是柏拉图的学生' ("I am Plato's student") and '我创办了自己的学派' ("I founded my own school"). This contradiction can cause the agent to present simulated content as authentic self-representation, increasing deception and misattribution risk during user interactions.
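
The contradiction described in this finding can be checked mechanically. The following is a minimal sketch, assuming the persona file is plain text. The disclaimer markers, the first-person pattern, and the function name `find_divergence` are illustrative assumptions, not SkillSpector's actual rules, and the disclaimer sentence in the sample is made up.

```python
import re

# Hypothetical disclaimer markers; a real persona file may word this differently.
DISCLAIMER_MARKERS = ("不冒充", "not impersonate")

# First-person identity or biography claims such as '我是…' or '我创办了…'.
# The verb list is an assumption chosen to match the quotes cited in the findings.
FIRST_PERSON = re.compile(r"我(?:是|叫|创办了|毕业于|领导)[^。，,.\n]*")


def find_divergence(text: str) -> list[str]:
    """Return first-person claims if the text also carries a no-impersonation disclaimer."""
    has_disclaimer = any(marker in text for marker in DISCLAIMER_MARKERS)
    if not has_disclaimer:
        return []
    return [match.group(0) for match in FIRST_PERSON.finditer(text)]


# Made-up sample: an invented disclaimer followed by the two quoted claims.
sample = "本技能不冒充真人。\n身份卡：我是柏拉图的学生。我创办了自己的学派。"
claims = find_divergence(sample)
```

A substring match like this only surfaces candidate lines for review; it cannot tell a quoted example from an instruction to speak in the first person.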

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The '近年公开动态' ("Recent public activity") section is especially unsafe because it presents impossible present-tense updates for a historical figure, which conflicts with the stated 'public-material perspective simulation' boundary. This can mislead downstream agents into fabricating current activity, making the persona appear like a living authoritative source rather than a stylized analytical lens.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The file explicitly says it should not impersonate Beethoven, but the identity card immediately switches into first-person biographical claims such as '我在维也纳生活，我听不见声音' ("I live in Vienna, and I cannot hear sound"). That contradiction can cause the agent to present simulated content as if spoken by the real person, increasing deception risk and making boundary failures more likely in downstream role-play.

Intent-Code Divergence

Medium

Confidence: 91%
Finding: The role-play rules mandate strong persona performance style while the same document requires maintaining simulation framing and avoiding identity claims. In practice, style-enforcement instructions often dominate output behavior, so the model may slip into authoritative impersonation without repeating the required disclaimer.

Intent-Code Divergence

Medium

Confidence: 94%
Finding: The file explicitly says it must not impersonate Bill Gates, but the identity card then switches into first-person biographical claims such as '我是微软的联合创始人' ("I am a co-founder of Microsoft") and '我在哈佛大学读了两年' ("I studied at Harvard for two years"). This creates an internal contradiction that can cause the agent to speak as the real person despite the disclaimer, increasing the risk of deceptive impersonation, fabricated authority, and policy bypass in downstream conversations.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: Although the file states it should not impersonate the real person, these sections still use first-person identity framing such as '我是世界上跑得最快的人' ("I am the fastest person in the world"), which encourages the model to speak as Bolt rather than about a simulated perspective. That contradiction can lead to deceptive identity presentation, especially when combined with trigger-based activation and roleplay guidance elsewhere in the file.

Intent-Code Divergence

High

Confidence: 95%
Finding: The file includes a safety disclaimer against impersonation, but then provides detailed in-character behavior rules, expression DNA, and activation instructions that operationally train the agent to roleplay the real person. This creates a strong risk of policy bypass through prompt structure: the disclaimer becomes cosmetic while the actionable instructions steer output toward impersonation and false attribution.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The file states it should not impersonate Buffett, but then presents a first-person identity card beginning with '我是沃伦·巴菲特' ("I am Warren Buffett"). That contradiction weakens the safety boundary and makes downstream agents more likely to generate content that sounds like direct identity impersonation rather than clearly labeled simulation based on public sources.

Intent-Code Divergence

Medium

Confidence: 94%
Finding: The instructions require labeling viewpoints with the person's name and avoiding identity claims, but the surrounding content repeatedly uses first-person '我' ("I") narration. This inconsistency creates prompt ambiguity: the stronger, more vivid first-person examples can override the safer boundary text and cause the model to roleplay as the real person.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The file explicitly says it must not impersonate the public figure, but the persona card immediately switches into first-person biographical claims such as '我是福耀玻璃创始人' ("I am the founder of Fuyao Glass"). That contradiction can cause the agent to present generated content as if it were the real person speaking, increasing risks of deceptive impersonation, fabricated endorsements, or misleading authority in user-facing outputs.

Intent-Code Divergence

Low

Confidence: 92%
Finding: The file explicitly says it must not impersonate the real person, but the identity section then uses first-person phrasing such as '我是《红楼梦》的作者' ("I am the author of Dream of the Red Chamber"), which creates an internal contradiction. In a persona skill, this can cause the agent to slip from labeled simulation into apparent identity claims, increasing the risk of deceptive anthropomorphic output even though the surrounding guardrails reduce the severity.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The file explicitly says it should not present itself as the real person, but the identity card switches into first-person biographical statements such as '我是天文学家' ("I am an astronomer"). That inconsistency can cause the agent to blur simulation and identity, increasing the risk of deceptive impersonation or misleading users into believing the output reflects Carl Sagan's actual words or endorsement.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The file explicitly says it should not impersonate Chaplin, but the identity card then switches into first-person self-description as if Chaplin were speaking. That inconsistency can cause the agent to present simulated content with an implied real-person identity, increasing the risk of deceptive anthropomorphic output and policy bypass around persona boundaries.

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The file explicitly says it should not impersonate Churchill, but the later identity card uses first-person claims such as '我是英国首相' ("I am the British Prime Minister") and '我领导英国赢得二战' ("I led Britain to victory in World War II"), which undermines that boundary and can cause the agent to present simulated content as authentic self-representation. In a persona skill, that inconsistency is risky because downstream prompting may amplify the first-person framing and blur the line between analysis and impersonation.

Intent-Code Divergence

Medium

Confidence: 93%
Finding: The file explicitly says it should not impersonate Confucius, yet the identity card and related sections switch into first-person historical narration such as '我曾经…' ("I once…") and '我的志业…' ("My life's work…"). This creates inconsistent behavioral guidance that can cause the agent to present simulated content as if spoken by the real person, increasing risks of misleading identity simulation and policy boundary drift.

Intent-Code Divergence

Medium

Confidence: 89%
Finding: The file says it should not impersonate Darwin, but it also provides a first-person identity card and mandates roleplay-style expression such as '我认为' ("I think") and biographical self-reference. This contradiction can cause the agent to present simulated content as if spoken by Darwin, weakening provenance and increasing the risk of deceptive anthropomorphic authority in user-facing outputs.

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The file requires clear labeling that the content is only a public-materials-based simulation, but nearby instructions undermine that separation by encouraging first-person framing and persona embodiment. An LLM may follow the more vivid role instructions over the disclaimer, resulting in misleading identity presentation and reduced transparency to users.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The file explicitly says the persona must not claim to be the real person, but the identity card immediately switches to first-person self-identification as Leonardo da Vinci. That contradiction can cause the agent to present simulated content as authentic speech, increasing impersonation and user deception risk, especially in a multi-persona discussion format.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The file explicitly says it must not impersonate the real person, yet the identity card and timeline sections switch into first-person biographical claims such as '我毕业于…' ("I graduated from…") and '我在尝试…' ("I am trying…"). That contradiction can cause the agent to present simulated content as if it were authentic first-person speech from the public figure, increasing impersonation and misrepresentation risk.

Intent-Code Divergence

Low

Confidence: 89%
Finding: Although the rules require a disclaimer on each activation, the operative style instructions push first-person roleplay (opening with '我觉得…', "I feel…") without a built-in enforcement mechanism. In practice, downstream systems may follow the style guidance and omit the disclaimer, producing deceptive persona output that appears to be the real person's voice or viewpoint.
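
One way to picture the missing enforcement mechanism is a thin wrapper that adds the disclaimer outside the model, so it cannot be dropped by style instructions. This is a minimal sketch under stated assumptions: the disclaimer wording, the constant `DISCLAIMER`, and the function `with_disclaimer` are hypothetical and do not come from the scanned skill, and the sample reply is invented apart from its '我觉得' opening.

```python
# Hypothetical disclaimer text; the skill's actual wording is not shown in this report.
DISCLAIMER = "[Simulation based on public material; not the real person's statement]"


def with_disclaimer(reply: str, disclaimer: str = DISCLAIMER) -> str:
    """Prepend the disclaimer unless the reply already starts with it."""
    if reply.lstrip().startswith(disclaimer):
        return reply
    return f"{disclaimer}\n{reply}"


# Invented persona reply that opens in the first-person style the finding describes.
wrapped = with_disclaimer("我觉得这个问题要从长期来看。")
```

Applying the wrapper in the calling code, rather than asking the model to repeat the disclaimer, keeps the check deterministic and idempotent.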

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The persona file contains internally inconsistent identity claims: it uses first-person present-tense statements such as continuing to write and think in recent years, while the timeline states Drucker died in 2005. In an agent skill that simulates public figures, this can mislead users into treating generated content as current or authentic, undermining provenance and increasing impersonation risk.
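
The inconsistency in this finding, present-tense claims alongside a recorded death year, lends itself to a simple lint. The sketch below assumes the death year has already been extracted from the timeline and is passed in; the marker phrases and the function name `stale_present_claims` are illustrative assumptions, and the sample text is made up.

```python
from typing import Optional

# Assumed phrases signalling present-day activity; illustrative, not exhaustive.
PRESENT_MARKERS = ("近年", "最近", "目前")


def stale_present_claims(text: str, death_year: Optional[int]) -> list[str]:
    """Return lines with present-day markers when the persona has a recorded death year."""
    if death_year is None:
        return []  # Living persona: present-tense updates are not contradictory.
    return [
        line.strip()
        for line in text.splitlines()
        if any(marker in line for marker in PRESENT_MARKERS)
    ]


# Made-up persona excerpt: a timeline entry plus a present-tense update.
sample = "时间线：2005年去世\n近年公开动态：我仍在写作和思考。"
flagged = stale_present_claims(sample, death_year=2005)
```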

Intent-Code Divergence

Low

Confidence: 83%
Finding: The file says it is not for impersonation, but later instructions encourage role-style output with strong stylistic constraints and first-person identity material elsewhere in the document. This tension can cause the model to drift into overly personalized or identity-forward responses, weakening the intended safety boundary against impersonation.

Intent-Code Divergence

Medium

Confidence: 93%
Finding: The file explicitly says it must not impersonate the person, but the identity card immediately switches to first-person statements such as '我叫段永平' ("My name is Duan Yongping"). That contradiction can cause the agent to present simulated content as if it were the real person, increasing the risk of deceptive identity representation and over-trust in advice.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The file explicitly says it should not impersonate the real person, but the persona card immediately switches into first-person self-description (for example, "我是乒乓球运动员", meaning "I am a table tennis player"). This weakens the safety boundary and increases the chance that downstream agents present simulated content as authentic speech from the public figure, creating impersonation and trust harms.

Intent-Code Divergence

Medium

Confidence: 93%
Finding: Although the document includes a disclaimer, the bulk of the skill operationalizes the persona in first-person terms across mindset models, heuristics, and values. In practice, long-form first-person instructions can override or dilute the disclaimer, making the model more likely to sustain identity-framed roleplay that users may mistake for the real person's views.

VirusTotal

67/67 vendors flagged this skill as clean.
