Security audit

Virtual User Skill

Security checks across malware telemetry and agentic risk

Overview

This appears to be a legitimate local virtual-user research skill, but it needs review because it can over-ingest or expose sensitive user-research data.

Install only if you intend to manage a local, sensitive research dataset. Before running it, remove or disable plaintext sample exports, avoid running the Downloads merge scripts unless every spreadsheet has been reviewed, do not share or log ~/.virtual_user/.key, pin dependencies, and do not expose the Flask API example without authentication, TLS, and network restrictions.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection

Findings (30)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 95%
Finding: The skill instructs the agent to execute a local Python script via shell, relies on a local virtual environment, and references sensitive local assets including an encryption key path. Yet the skill declares no permissions. This mismatch is dangerous because it can cause an agent or platform to perform shell, file, and environment access without transparent user approval or proper sandboxing, increasing the risk of unauthorized local data access or command execution.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The deployment guide adds an externally reachable Flask API even though the skill is described as local and non-API based. This changes the trust boundary and can expose sensitive scenario-library data or model outputs over the network, especially because the example binds to 0.0.0.0 and includes no authentication or transport protections.
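
One way to narrow the exposure described in this finding is to bind the service to the loopback interface and require a shared token on every request. The following is a minimal standard-library sketch, not the skill's code: `BIND_HOST`, `is_authorized`, and the environment variable `VIRTUAL_USER_API_TOKEN` are hypothetical names introduced for illustration.

```python
import hmac
import os

# Assumption: loopback-only binding instead of the guide's 0.0.0.0.
BIND_HOST = "127.0.0.1"


def is_authorized(auth_header, expected_token):
    """Return True only for an exact 'Bearer <token>' match."""
    if not auth_header or not expected_token:
        return False  # missing header or unset token: fail closed
    scheme, _, presented = auth_header.partition(" ")
    if scheme != "Bearer":
        return False
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(presented.encode(), expected_token.encode())


# Hypothetical variable name; the skill does not define one.
API_TOKEN = os.environ.get("VIRTUAL_USER_API_TOKEN", "")
```

In a Flask app, a check like this would typically run from a `before_request` hook, with `app.run(host=BIND_HOST)` and TLS terminated by a reverse proxy in front of the service.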

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The document claims API service mode can provide 'higher-level protection' while the provided example exposes an unauthenticated service. This is dangerous because operators may deploy it believing it is safer, when it actually increases exposure by allowing network callers to invoke the skill and potentially access sensitive content without access control.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The script explicitly writes `sample_scenarios.json` in plaintext after claiming encrypted storage as a core protection. Even though the records are partially sanitized, they still contain scenario content, user background information, tasks, pain points, and other potentially sensitive business or personal data, so this creates a direct confidentiality leak on disk.
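
A plaintext sample does not have to be removed outright; it can be made opt-in so that nothing unencrypted is written by default. The sketch below assumes a hypothetical helper `write_sample` with an `allow_plaintext` flag, neither of which exists in the audited script. The default limit of five mirrors the number of scenarios the script is reported to export.

```python
import json
import os


def write_sample(scenarios, path, allow_plaintext=False, limit=5):
    """Write a small sample only when the operator explicitly opts in."""
    if not allow_plaintext:
        return False  # default: no plaintext artifact is created
    # O_EXCL refuses to overwrite; 0o600 limits access to the owner (POSIX).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(scenarios[:limit], handle, ensure_ascii=False, indent=2)
    return True
```

Called without the flag, the function returns `False` and leaves the disk untouched, so an operator who relies on the documented encrypted-only behavior gets exactly that.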

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The documentation states that only encrypted storage and specific outputs are produced, but the implementation also creates an additional plaintext sample file. This mismatch is security-relevant because operators may rely on the documented behavior and fail to protect, monitor, or purge the unexpected unencrypted artifact.

Description-Behavior Mismatch

Medium

Confidence: 98%
Finding: The script claims to convert the dataset into an encrypted format, but it also writes the first five scenarios to a plaintext JSON file. Because the scenario records contain fields such as user name/background and detailed scenarios, this creates an unintended data disclosure path that bypasses the encryption step.

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The documented output states that the script generates an encrypted scenario library and embeddings, but omits that it also creates an unencrypted sample JSON file. This mismatch is security-relevant because operators may handle the script assuming all exported scenario content is protected, increasing the risk of accidental exposure of sensitive user research data.

Description-Behavior Mismatch

Medium

Confidence: 84%
Finding: This script performs bulk local dataset ingestion, mutation, deduplication, encryption, and embedding generation, which is materially broader than the advertised runtime behavior of a virtual-user interview skill. Even if intended as an offline maintenance utility, such hidden data-processing capability increases the attack surface because it can silently reshape local corpora and influence downstream model behavior without explicit operator awareness.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The code is hardwired to enumerate and process all .xlsx files in the user's Downloads directory, which is a privacy-sensitive location likely to contain unrelated personal or business documents. For a virtual-user skill, indiscriminate access to Downloads is not necessary for normal operation and creates a risk of unintended ingestion of sensitive data into the scenario library and embeddings.
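
Instead of globbing the whole Downloads directory, the merge step could accept only files that the operator has placed in a dedicated import folder. This is a sketch under that assumption; `select_inputs` and the idea of a separate import folder are illustrative and not part of the skill.

```python
from pathlib import Path


def select_inputs(candidates, import_dir):
    """Keep only .xlsx files that resolve inside the dedicated import folder."""
    root = Path(import_dir).resolve()
    accepted = []
    for candidate in candidates:
        path = Path(candidate).resolve()
        inside = root in path.parents  # rejects ../ escapes and symlinks out
        if inside and path.suffix.lower() == ".xlsx" and path.is_file():
            accepted.append(path)
    return accepted
```

Resolving each path before the containment check means a spreadsheet elsewhere on disk is rejected even when it is referenced through a relative path that starts inside the import folder.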

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The standalone demo/test block prints both the encryption key and decrypted sample plaintext to stdout, which exposes secrets and defeats the purpose of the module's data protection. Even if only executed when run directly, such code is easily triggered in development, packaging, CI, or troubleshooting workflows and can leak sensitive material into terminal history or logs.
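
A demo block can confirm that a key was generated or loaded without printing the key itself. The sketch below logs a truncated SHA-256 digest instead; `key_fingerprint` and `describe_key` are hypothetical helpers, not functions from the audited module.

```python
import hashlib


def key_fingerprint(key):
    """Return a short, non-reversible identifier for a key."""
    return hashlib.sha256(key).hexdigest()[:12]


def describe_key(key):
    # Safe to log: reveals length and a truncated digest, never the key itself.
    return f"key loaded ({len(key)} bytes, sha256 prefix {key_fingerprint(key)})"
```

Two runs that load the same key print the same prefix, which is usually all a troubleshooting session needs to verify.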

Intent-Code Divergence

High

Confidence: 97%
Finding: The file claims its purpose is to protect scenario-library data, but the included test behavior prints the generated key, directly undermining confidentiality. In the context of a skill handling a large local scenario library, disclosure of the Fernet key would allow an attacker with access to encrypted data or logs to decrypt protected content.

Intent-Code Divergence

Medium

Confidence: 95%
Finding: The report makes absolute safety claims such as 'no risk of plaintext data leakage' and 'can be safely committed to GitHub' (both translated from Chinese) even though it also states that the encrypted dataset is decryptable with a locally stored key. This is dangerous because it can mislead maintainers into publishing privacy-sensitive assets without a full threat-model review, especially if the local key is exposed via backups, logs, malware, or operational mistakes.

Intent-Code Divergence

Low

Confidence: 91%
Finding: The document claims the data is 'fully encrypted (no plaintext)' while separately allowing publication of `.npy` embedding files that are only described as 'unreadable' (both translated from Chinese), not as privacy-reviewed or protected. Embeddings can still leak semantic or reconstructable information, so treating them as harmless binary artifacts may result in unintentional disclosure of sensitive user-scenario content.

Missing User Warnings

Medium

Confidence: 90%
Finding: The API deployment example exposes the skill as a network service without warning that user questions, scenario-library-derived content, and generated outputs may traverse or be accessible via a remote interface. For a skill marketed as local and privacy-preserving, this omission materially increases the risk of unintended data disclosure and unsafe deployment decisions.

Missing User Warnings

High

Confidence: 97%
Finding: The instructions tell the operator to delete the encryption key and regenerate it, but do not warn that previously encrypted data may become unreadable if it was encrypted with the old key. This can cause permanent loss of access to the scenario library and operational denial of service, especially if backups or re-encryption steps are not in place.
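
Deleting the key is only safe once the existing data has been decrypted with it and re-encrypted under the replacement. A minimal sketch that keeps a copy of the old key so that step remains possible is shown here; `archive_key` is a hypothetical helper, and the re-encryption itself depends on the skill's own storage code and is not shown.

```python
import os
import shutil
import time


def archive_key(key_path):
    """Copy the current key to a timestamped backup before it is replaced."""
    backup_path = f"{key_path}.{time.strftime('%Y%m%dT%H%M%S')}.bak"
    shutil.copy2(key_path, backup_path)
    os.chmod(backup_path, 0o600)  # owner-only on POSIX
    return backup_path
```

The backup is itself a secret and needs the same handling as the live key; it should be removed only after the library has been verified readable under the new key.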

Vague Triggers

Medium

Confidence: 76%
Finding: The trigger description uses broad phrases such as references to virtual users, user portraits, interview simulation, and scenario-based evaluation, which can match many ordinary user requests. Overbroad triggering is risky because it may invoke a skill that performs local retrieval and shell-backed processing in contexts where the user did not intend to access this dataset or run this workflow.

Missing User Warnings

Medium

Confidence: 97%
Finding: Writing a plaintext sample file containing processed scenarios without warning or consent exposes sensitive content to anyone with filesystem access, backups, sync tools, or log collection visibility. In this skill's context, the dataset is a large local scenario library derived from real user research, so even a small sample may disclose confidential user or business information.

Missing User Warnings

Medium

Confidence: 97%
Finding: Writing sample scenario records to disk in plaintext exposes potentially identifying user content without any explicit notice or consent checkpoint. In this skill's context, the dataset is a large user-research scenario library, so even a small exported sample can leak personal background details or sensitive behavioral information.

Missing User Warnings

Medium

Confidence: 94%
Finding: The script reads local spreadsheets, merges them into a persistent encrypted dataset, and writes embeddings to disk without any interactive confirmation or explicit disclosure at runtime. This can surprise users, permanently incorporate unintended sensitive content, and create derived artifacts that are harder to inspect or remove than the original source files.
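
The missing runtime disclosure could be addressed with a confirmation step that lists every file before anything is merged. The sketch below is illustrative only; `confirm_ingest` is a hypothetical name, and the `ask` parameter exists so the prompt can be replaced in tests or non-interactive runs.

```python
def confirm_ingest(files, ask=input):
    """List every file and require an explicit 'yes' before merging."""
    if not files:
        return False
    listing = "\n".join(f"  - {name}" for name in files)
    answer = ask(
        f"About to merge into the scenario library:\n{listing}\n"
        "Type 'yes' to continue: "
    )
    return answer.strip().lower() == "yes"
```

Any answer other than `yes`, including an empty line, is treated as a refusal, so an accidental Enter does not ingest data.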

Missing User Warnings

Low

Confidence: 78%
Finding: Loading a SentenceTransformer by model name may cause the environment to fetch model artifacts from an external registry if they are not already cached. In environments expecting fully local processing, this can leak metadata about usage or unexpectedly initiate network access contrary to user expectations.
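
Where fully local processing is expected, the Hugging Face libraries can be told to stay offline through environment variables. This sketch sets `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE`, which are documented by `huggingface_hub` and `transformers` respectively; `offline_mode_enabled` is a hypothetical helper, and the assumption is that the model is already in the local cache.

```python
import os

# Set before importing sentence_transformers / huggingface_hub so the
# libraries see the flags when they are first loaded.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"


def offline_mode_enabled():
    """Report whether both offline flags are set to '1'."""
    return all(
        os.environ.get(name) == "1"
        for name in ("HF_HUB_OFFLINE", "TRANSFORMERS_OFFLINE")
    )
```

With these flags set, loading a model that is not cached should fail with an error rather than reach out to the registry, which makes unexpected network access visible.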

Missing User Warnings

High

Confidence: 99%
Finding: Printing the encryption key to stdout is a direct secret disclosure vulnerability. Stdout is commonly captured by shells, terminals, IDE consoles, CI pipelines, system logs, or support transcripts, so anyone who obtains that output can decrypt any data protected with the same key.

Missing User Warnings

Medium

Confidence: 91%
Finding: The report instructs operators to export and process a real user research Excel dataset but does not include any privacy, consent, minimization, or handling guidance for potentially sensitive personal data. In this skill's context, the dataset appears to contain identifiable user profiles and behavioral information, so omitting safeguards can lead to accidental collection, local storage, encryption of already over-collected data, and downstream sharing of personal data without proper controls.

Missing User Warnings

Medium

Confidence: 92%
Finding: The report explicitly describes exporting a corpus of real user-research records containing background information and scenarios, encrypting it, and treating GitHub publication as safe. Encryption alone does not eliminate privacy, consent, retention, re-identification, and key-management risks, so this can mislead operators into distributing sensitive personal data without adequate legal or security controls.

Missing User Warnings

Low

Confidence: 76%
Finding: The document states that the first run will automatically generate and store a decryption key under a fixed local path, but does not clearly warn users that a sensitive credential is being written to disk. While not inherently malicious, undisclosed secret-material creation can surprise users, complicate secure deployment, and increase the chance of accidental backup, leakage, or mishandling on shared systems.
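
If a key must be written on first run, it can at least be created with owner-only permissions and without silently overwriting an existing file. The following is a sketch for POSIX systems; `write_key_file` and `is_private` are hypothetical helpers, and how the skill actually creates its key file is not shown in the audited material.

```python
import os
import stat


def write_key_file(path, key):
    """Create the key file owner-readable only; refuse to overwrite."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(key)


def is_private(path):
    """True when group and others have no permission bits (POSIX)."""
    return stat.S_IMODE(os.stat(path).st_mode) & 0o077 == 0
```

A first-run message that names the key's location, without printing its contents, would also address the disclosure gap this finding describes.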

Missing User Warnings

Medium

Confidence: 93%
Finding: The report promotes safe publication while emphasizing that the system is built from 1859 real user scenarios, yet it provides no warning about consent, lawful basis, anonymization, retention, or privacy review. In this skill context, the data appears inherently human-subject and potentially sensitive, so omission of privacy safeguards materially increases the risk of inappropriate disclosure or non-compliant reuse.

VirusTotal

66/66 vendors flagged this skill as clean.
