Agent-Selector

Security checks across malware telemetry and agentic risk

Overview

The selector code is mostly read-only, but the bundled prompt library includes under-scoped guidance for sensitive actions like payments, production deployments, user-data experiments, public-content manipulation, and shared memory.

Install only if you want a broad persona library and can keep sensitive tools disabled by default. Do not give these agents payment, production deployment, social posting, analytics export, or customer-data access without explicit human approval, privacy/legal review, audit logging, and least-privilege tool scopes. Also note the selector implementation appears to look for a different bundled-agent path than the package contains, so prompt loading may need verification before use.
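
The "disabled by default" and "explicit human approval" recommendations above can be illustrated with a deny-by-default tool policy. This is only a minimal sketch: the tool names, the `ToolPolicy` class, and its fields are hypothetical and are not part of the Agent-Selector package.

```python
from dataclasses import dataclass, field

# Hypothetical tool names, mirroring the sensitive categories listed above.
SENSITIVE_TOOLS = {
    "payments",
    "production_deploy",
    "social_post",
    "analytics_export",
    "customer_data",
}


@dataclass
class ToolPolicy:
    """Deny-by-default policy: a tool is usable only if explicitly granted."""

    granted: set = field(default_factory=set)
    human_approved: set = field(default_factory=set)
    audit_log: list = field(default_factory=list)

    def is_allowed(self, tool: str) -> bool:
        # Sensitive tools need both a grant and a recorded human approval.
        allowed = tool in self.granted and (
            tool not in SENSITIVE_TOOLS or tool in self.human_approved
        )
        # Record every decision, allowed or denied, for later review.
        self.audit_log.append((tool, allowed))
        return allowed
```

With this shape, a persona granted `payments` still cannot use it until a human adds it to `human_approved`, and every check leaves an audit entry.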

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (43)

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The document states Silver-layer deduplication should use primary key plus event timestamp, but the sample implementation keeps the latest record by ingestion time. In data engineering workflows this mismatch can silently drop or overwrite the wrong business event, causing inaccurate downstream analytics, backfills, and audit results. Because the skill is instructional, users may copy the code and assume it satisfies the documented contract.
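
One way to make the code match the documented contract is sketched below. It assumes the contract means that two rows are duplicates only when both the primary key and the event timestamp match. The field names `id`, `event_ts`, and `ingested_at` are hypothetical, and ingestion time is used only as a tie-breaker among true duplicates.

```python
def dedupe_events(rows):
    """Keep one row per (primary key, event timestamp) pair.

    Assumed interpretation: rows are duplicates only when both the primary
    key and the event timestamp match. Ingestion time is used solely to
    pick a winner among such duplicates.
    """
    kept = {}
    for row in rows:
        key = (row["id"], row["event_ts"])
        current = kept.get(key)
        if current is None or row["ingested_at"] > current["ingested_at"]:
            kept[key] = row
    # Return rows in a stable, deterministic order.
    return sorted(kept.values(), key=lambda r: (r["id"], r["event_ts"]))
```

Unlike "latest record by ingestion time per key", this keeps distinct business events for the same key and collapses only re-ingested copies of the same event.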

Intent-Code Divergence

Medium

Confidence: 89%
Finding: The skill mandates row-level data quality scores for Gold/semantic-layer data, but the provided Gold example only emits aggregates and a refresh timestamp. This creates a false assurance that required quality metadata is present when it is not, weakening trust, traceability, and downstream gating based on data quality. In an analytics platform, such omissions can let low-quality data propagate without visibility.

Intent-Code Divergence

Medium

Confidence: 94%
Finding: The STM32 SPI example is described as non-blocking, but it spins in `while` loops waiting for TXE and BSY flags, which is a blocking busy-wait. In embedded/RTOS contexts this can stall a task, increase interrupt latency, waste CPU time, and mislead downstream users into adopting timing-sensitive code under a false safety assumption.

Intent-Code Divergence

Medium

Confidence: 93%
Finding: The approval event handler says it is processing an approval instance identifier, but it assigns `data.approval_code` to `instanceId`. Approval code and instance ID are different identifiers; using the wrong field can cause incorrect workflow actions, failed lookups, or actions applied to the wrong approval object if downstream code trusts that value. In an approval automation context, identifier confusion is security-relevant because it can break authorization and business integrity controls.

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The skill claims a server-authoritative model, but the sample code assigns each player node's multiplayer authority to that client's peer ID and allows the authority holder to mutate movement in `_physics_process()`. In Godot, this effectively lets clients drive authoritative movement for their own entities, creating a cheating/desync surface and contradicting the stated security model.

Intent-Code Divergence

Medium

Confidence: 87%
Finding: The guidance explicitly says networked dynamic nodes should be spawned through `MultiplayerSpawner`, yet the example manually instantiates a player scene and calls `add_child()`. This contradiction can mislead users into implementing spawning incorrectly, causing replication mismatches, desync, or trust in behavior that the engine will not safely reproduce across peers.

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The skill explicitly recommends exporting analytics via `HttpService:PostAsync()` to an external backend, which extends beyond native Roblox analytics and creates a data-exfiltration path for player telemetry. In the context of a Roblox experience aimed at minors, undocumented off-platform transfer of event data raises privacy, compliance, and misuse risk even if presented as BI tooling.

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The instruction to join competitors' WeCom accounts and groups for research encourages deceptive participation in third-party communities without clear authorization or informed consent. In this skill's context, that can normalize privacy-invasive data collection and competitive intelligence practices that conflict with the file's own consent and anti-harassment rules.

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The skill contains contradictory guidance: it bans manipulative engagement tactics in the compliance section, but earlier recommends fan behaviors including '控评' (comment control) and active management of the first comments to shape discourse. In a marketing/PR skill, that can normalize coordinated opinion manipulation and lead the agent to produce deceptive or policy-violating instructions for synthetic amplification.

Intent-Code Divergence

Medium

Confidence: 89%
Finding: The skill mixes claims of authenticity with instructions to build a '素人种草矩阵' (a network of ordinary-user accounts used for product seeding) and batch-place 'real user' content, which can facilitate coordinated astroturfing or deceptive endorsement campaigns. In a marketing-operations context, this is more dangerous because the model may operationalize misleading social proof at scale while presenting it as genuine user sentiment.

Context-Inappropriate Capability

High

Confidence: 98%
Finding: The skill explicitly instructs users to map and use informal decision paths that bypass official procurement processes. In a sales/account expansion context, this normalizes evading governance controls designed to ensure legal, financial, and ethical review, which can lead to unauthorized influence, procurement fraud, or noncompliant deal execution.

Intent-Code Divergence

Medium

Confidence: 86%
Finding: The skill governs autonomous payments, so a mismatch between the stated control ('must have explicit human approval above threshold') and the example workflow is security-relevant. If implementers follow the examples, they may build flows that only escalate or notify without enforcing a blocking approval gate, enabling unauthorized high-value payments.
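
The difference between "escalate or notify" and a blocking gate can be sketched as follows. The threshold value, the `execute_payment` function, and the `send` callable are illustrative assumptions, not the skill's actual workflow.

```python
from decimal import Decimal

# Hypothetical threshold; the skill's real value is not given in this report.
APPROVAL_THRESHOLD = Decimal("100.00")


class ApprovalRequired(Exception):
    """Raised when a payment needs human approval that was not supplied."""


def execute_payment(amount, payee, send, approved_by=None):
    """Call `send` only if the amount is within threshold or approved."""
    amount = Decimal(str(amount))
    if amount > APPROVAL_THRESHOLD and not approved_by:
        # Blocking gate: refuse to proceed instead of merely notifying.
        raise ApprovalRequired(
            f"payment of {amount} to {payee} needs explicit human approval"
        )
    return send(payee, amount)
```

The key design choice is that the above-threshold path raises before `send` is reachable, so a missing approval cannot be skipped by a caller that ignores notifications.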

Intent-Code Divergence

Medium

Confidence: 81%
Finding: The document promises complete audit coverage, but some illustrated payment paths send funds without a corresponding durable audit-log step. In a payments agent, missing logs undermine forensic review, reconciliation, fraud detection, and dispute handling, especially when multiple channels and autonomous retries are involved.
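
A minimal sketch of a payment path that cannot send funds without a durable log record is shown below. It assumes an append-only JSON Lines file as the audit store; `send_with_audit`, the `send` callable, and the event names are hypothetical.

```python
import json
import os


def send_with_audit(payment: dict, send, log_path: str):
    """Append a durable audit record before and after calling `send`."""

    def write_entry(entry: dict) -> None:
        with open(log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(entry, sort_keys=True) + "\n")
            log.flush()
            os.fsync(log.fileno())  # force the record to disk before continuing

    # If this write raises, `send` is never called, so no payment goes unlogged.
    write_entry({"event": "payment_attempt", "payment": payment})
    try:
        result = send(payment)
    except Exception as exc:
        write_entry({"event": "payment_failed", "error": str(exc), "payment": payment})
        raise
    write_entry({"event": "payment_sent", "payment": payment})
    return result
```

Writing the attempt record first means retries and failures also appear in the log, which is what reconciliation and dispute handling rely on.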

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The skill expands from identity resolution into shared cross-agent memory and full-text retrieval, which materially broadens data access beyond the stated purpose. In an identity-graph context, this creates unnecessary aggregation of sensitive context and increases the chance that agents can retrieve unrelated or excessive data, violating least-privilege and enabling privacy or confidentiality failures.

Intent-Code Divergence

Medium

Confidence: 97%
Finding: The document states that PII should be masked by default, but its examples return raw email addresses and phone numbers in canonical data and merge evidence. This inconsistency is dangerous because examples often become implementation templates, leading downstream agents or developers to expose sensitive personal data in routine responses, logs, reviews, or inter-agent proposals.
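
As an illustration of "masked by default", the sketch below masks email and phone fields unless the caller explicitly asks for raw values. The field names `email` and `phone`, the masking format, and the `reveal` flag are assumptions made for this example.

```python
import re


def mask_email(email: str) -> str:
    """Keep the first character of the local part and the domain."""
    local, sep, domain = email.partition("@")
    if not sep:
        return "***"
    return f"{local[:1]}***@{domain}"


def mask_phone(phone: str) -> str:
    """Keep only the last four digits."""
    digits = re.sub(r"\D", "", phone)
    if len(digits) < 4:
        return "***"
    return "***" + digits[-4:]


def mask_record(record: dict, reveal: bool = False) -> dict:
    """Return a copy with PII masked unless `reveal` is explicitly set."""
    masked = dict(record)
    if reveal:
        return masked
    if "email" in masked:
        masked["email"] = mask_email(masked["email"])
    if "phone" in masked:
        masked["phone"] = mask_phone(masked["phone"])
    return masked
```

For example, `mask_record({"email": "alice@example.com", "phone": "+1 (555) 123-4567"})` returns `{"email": "a***@example.com", "phone": "***4567"}`. Stricter deployments may also want to hide the email domain.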

Vague Triggers

Medium

Confidence: 83%
Finding: The activation phrase is broad natural language and lacks an explicit invocation boundary, which can cause unintended agent switching when similar text appears in ordinary conversation or untrusted content. In an agent-selection system, ambiguous triggers increase the risk of prompt injection or accidental delegation to a more privileged or task-altering persona.
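
An explicit invocation boundary could look like the sketch below. The `/agent <name>` command syntax, the agent names, and the `source` labels are hypothetical; the point is only that activation requires an exact command from a trusted source rather than a phrase that may appear anywhere in text.

```python
import re

# Hypothetical registry of selectable personas.
KNOWN_AGENTS = {"data-engineer", "security-reviewer"}

# The entire message must be the command; embedded mentions do not count.
TRIGGER = re.compile(r"/agent\s+([a-z0-9-]+)")


def parse_trigger(message: str, source: str):
    """Return the agent name to activate, or None if nothing should switch."""
    # Only direct user input may switch agents, never tool output or files.
    if source != "user":
        return None
    match = TRIGGER.fullmatch(message.strip())
    if match is None:
        return None
    name = match.group(1)
    return name if name in KNOWN_AGENTS else None
```

Under this contract, the same text inside a ticket, repository file, or chat transcript does not change the active persona.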

Vague Triggers

Medium

Confidence: 88%
Finding: Saying the system will 'automatically select and use Agent' without defining trigger conditions, trust boundaries, or exclusion rules is risky in a prompt-driven environment. If untrusted input can influence selection, attackers may steer the system toward specialized personas that change behavior, expand capabilities, or override safer defaults.

Vague Triggers

Medium

Confidence: 80%
Finding: Multiple tool-specific activation examples rely on informal natural-language commands without a precise trigger contract. Across different agent hosts, this ambiguity can lead to inconsistent parsing, accidental activation from surrounding text, or exploitation by embedded instructions in repositories, tickets, or chat transcripts.

Natural-Language Policy Violations

Medium

Confidence: 90%
Finding: The skill is authored entirely in Chinese and does not provide any mechanism to adapt to the user's preferred language or locale. This can cause the agent to respond in an unintended language, reducing usability, increasing misunderstanding risk, and potentially causing users to miss important instructions or safety-relevant details.

Missing User Warnings

Medium

Confidence: 92%
Finding: The skill explicitly advocates shadow-testing with real user data, but it does not require consent, data minimization, anonymization, or vendor-boundary controls before sending that data to experimental models. In an LLM-routing context, this can expose sensitive prompts, customer content, or regulated data to additional providers and secondary processing paths, creating privacy, compliance, and data-governance risk.

Missing User Warnings

Medium

Confidence: 88%
Finding: The skill describes automatically promoting a winning model into production based on experimental results, but it does not define strong integrity controls such as rollback criteria, human approval thresholds, change windows, or adversarial evaluation gates. In this context, autonomous promotion can push regressions, unsafe behaviors, or prompt-injection-susceptible models directly into production routing at scale.

Missing User Warnings

Medium

Confidence: 89%
Finding: The skill includes a CI/CD example that builds, pushes, and deploys to a production Kubernetes environment automatically, but it does not include explicit safety boundaries, approval requirements, or warnings about modifying live systems. In an agent-skill context, this can normalize or encourage unattended production changes, increasing the risk of accidental outages, unauthorized deployment, or unsafe reuse by users who assume the workflow is endorsed as-is.

Missing User Warnings

Medium

Confidence: 90%
Finding: The skill includes a waitlist form that collects email addresses and inserts them into a backend table, but it provides no guidance on consent, retention, privacy notice, or access controls. In a rapid-prototyping context this is risky because users may copy the example directly into a live MVP, resulting in personal data collection without basic privacy safeguards or compliance considerations.

Natural-Language Policy Violations

Medium

Confidence: 94%
Finding: The skill is written entirely in Chinese and frames the agent identity and outputs in Chinese without offering any user-language negotiation or opt-in. This can override user expectations, reduce usability, and in multi-agent settings may cause misunderstanding of instructions or outputs, but it is not a direct code-execution or data-exfiltration issue.

Natural-Language Policy Violations

Medium

Confidence: 93%
Finding: The skill content is entirely in Chinese and does not offer any mechanism to adapt to the user's language or document a justified locale constraint. When the caller expects another language, users or downstream agents may misunderstand instructions, produce unusable outputs, or mishandle requirements. The risk comes from miscommunication rather than direct code execution.

VirusTotal

65/65 vendors flagged this skill as clean.
