Security audit

Qizheng Oasis

Security checks across malware telemetry and agentic risk

Overview

The skill mainly runs marketing and public-opinion simulations despite finance-oriented metadata, and it includes under-disclosed credential, network, and iframe-inspection behavior.

Install only after reviewing it as a marketing, buyer-persona, and public-opinion crisis simulator, not as a finance optimization skill. Remove or isolate any SiliconFlow API key unless you intentionally want external LLM calls, avoid confidential business or crisis scenarios, and do not embed the dashboard until the iframe inspection code is removed or restricted with explicit origin checks and user control.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (39)

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The skill metadata says it is for stock/fund optimization, but the HTML documents a very different capability: consumer marketing simulation, promotion planning, and public-opinion/crisis-response workflows. This scope mismatch is dangerous because it can cause the platform or users to invoke the skill under false pretenses, bypassing review, policy gating, or user expectations about what the skill actually does.

Context-Inappropriate Capability

Medium

Confidence: 92%
Finding: The file includes explicit crisis-response and media-influence playbooks such as contacting media to withdraw or revise coverage, legal-pressure tactics, and customer-retention countermeasures, none of which are justified by the declared stock/fund purpose. In context, these undisclosed capabilities materially expand the skill into reputation management and influence operations, increasing the risk of deceptive or manipulative use.

Description-Behavior Mismatch

High

Confidence: 97%
Finding: The blueprint content is materially misaligned with the declared skill domain: instead of stock/A-share/fund optimization, it encodes e-commerce consumer personas, psychological triggers, and social influence mechanics for purchasing behavior. This is dangerous because domain mismatch can hide undeclared behavioral manipulation logic, cause the agent to operate outside user expectations, and enable repurposing for covert marketing or persuasion workflows rather than finance-related assistance.

Description-Behavior Mismatch

High

Confidence: 96%
Finding: The blueprint content is materially inconsistent with the declared stock/A-share/fund optimization purpose and instead defines e-commerce conversion personas, demand levels, and persuasion triggers. This kind of capability mismatch is dangerous because it can hide undeclared behavioral targeting functionality inside a finance-themed skill, increasing the risk of covert manipulation, misuse, or deployment outside user expectations.

Context-Inappropriate Capability

Medium

Confidence: 93%
Finding: The file explicitly models persuasion-oriented personas such as KOC tiers, bargain hunters, herd followers, and repurchase groups, along with triggers and barriers designed to increase conversion. In a skill presented as finance-related, these manipulation-oriented targeting constructs are unjustified and could be repurposed for deceptive marketing, social engineering, or covert influence over users.

Description-Behavior Mismatch

High

Confidence: 99%
Finding: The dashboard contains a large iframe-inspection/highlight injector that is unrelated to the declared stock/fund simulation purpose. The code creates overlays, intercepts user interactions, inspects DOM elements, and communicates with a parent window, which strongly suggests hidden tooling or surveillance capability rather than legitimate dashboard functionality.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The code extracts element metadata such as selectors, text, attributes, geometry, URL, and path, then sends it to the parent via postMessage with a wildcard target. In an embedded context, this enables covert collection of page structure and potentially sensitive UI text from unrelated applications, which is unjustified for a financial simulation dashboard.
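
One way to narrow this exposure is to name the expected parent origin instead of using a wildcard, and to send only the fields the parent needs. The following is a minimal sketch under stated assumptions, not the skill's actual code: `ElementReport`, `PostTarget`, `PARENT_ORIGIN`, `sendElementReport`, and the origin `https://dashboard.example` are all hypothetical names chosen for illustration.

```typescript
// Hypothetical shape of the metadata the finding describes being collected.
interface ElementReport {
  selector: string;
  text: string;
  attributes: Record<string, string>;
  url: string;
}

// Minimal stand-in for the part of `window.parent` used here.
interface PostTarget {
  postMessage(message: unknown, targetOrigin: string): void;
}

// Assumption: the single origin that is allowed to embed the dashboard.
const PARENT_ORIGIN = "https://dashboard.example";

// Send only the selector, and only to the expected origin (never "*").
function sendElementReport(target: PostTarget, report: ElementReport): void {
  const minimal = { type: "element-selected", selector: report.selector };
  target.postMessage(minimal, PARENT_ORIGIN);
}
```

In a browser, `window.parent` could be passed as `target`. With an explicit target origin, the browser delivers the message only when the receiving window's origin matches, so an unexpected embedder receives nothing.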

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The page accepts control messages from any parent window without validating event.origin or sender identity. This allows an embedding page to toggle interception behavior, manipulate selection state, and drive page instrumentation externally, increasing abuse potential and undermining trust boundaries.
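
A possible shape for the missing validation is sketched below, assuming an exact-match origin allowlist and a small set of permitted commands. The names `InboundMessage`, `TRUSTED_ORIGINS`, `ALLOWED_COMMANDS`, `acceptControlMessage`, the command strings, and the example origin are hypothetical; the audited page may use a different message format.

```typescript
// Minimal stand-in for the fields of a browser MessageEvent that are read here.
interface InboundMessage {
  origin: string;
  data: unknown;
}

// Assumption: origins that are allowed to control the dashboard.
const TRUSTED_ORIGINS: ReadonlySet<string> = new Set(["https://dashboard.example"]);

// Assumption: the only control commands the page should honor.
const ALLOWED_COMMANDS: ReadonlySet<string> = new Set([
  "enable-highlight",
  "disable-highlight",
]);

// Return the command if the message is trusted and well-formed, else null.
function acceptControlMessage(msg: InboundMessage): string | null {
  // Exact string match; substring or prefix checks are easy to bypass.
  if (!TRUSTED_ORIGINS.has(msg.origin)) return null;
  const data = msg.data;
  if (typeof data !== "object" || data === null) return null;
  const command = (data as { command?: unknown }).command;
  if (typeof command !== "string" || !ALLOWED_COMMANDS.has(command)) return null;
  return command;
}
```

A `message` event listener could call `acceptControlMessage(event)` and ignore the event when the result is `null`, so that an arbitrary embedding page cannot toggle interception.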

Intent-Code Divergence

Medium

Confidence: 89%
Finding: The comment claims interactive elements should retain default behavior, but the implementation calls preventDefault and stopPropagation whenever highlighting is enabled. This mismatch can suppress normal clicks across the page, enabling deceptive interaction blocking and making the instrumentation more invasive than documented.
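
For comparison, a handler that matches the documented intent would skip interception for interactive elements. This is a minimal sketch, assuming a tag-based definition of "interactive"; `INTERACTIVE_TAGS`, `shouldIntercept`, `ClickLike`, and `handleClick` are hypothetical names, and a real page would likely also need to consider cases such as `role="button"` or `contenteditable`.

```typescript
// Assumption: tags treated as interactive for the purpose of this sketch.
const INTERACTIVE_TAGS: ReadonlySet<string> = new Set([
  "A", "BUTTON", "INPUT", "SELECT", "TEXTAREA", "LABEL",
]);

// Intercept only when highlighting is on AND the target is not interactive.
function shouldIntercept(tagName: string, highlightEnabled: boolean): boolean {
  if (!highlightEnabled) return false;
  return !INTERACTIVE_TAGS.has(tagName.toUpperCase());
}

// Minimal stand-in for the parts of a DOM click event used here.
interface ClickLike {
  tagName: string; // tag of the clicked element, e.g. "A" or "DIV"
  preventDefault(): void;
  stopPropagation(): void;
}

// Returns true when the click was intercepted, false when it passed through.
function handleClick(click: ClickLike, highlightEnabled: boolean): boolean {
  if (!shouldIntercept(click.tagName, highlightEnabled)) return false;
  click.preventDefault();
  click.stopPropagation();
  return true;
}
```

Under this sketch, a click on a link or button keeps its default behavior even while highlighting is enabled, which is what the comment in the audited code claims.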

Intent-Code Divergence

Medium

Confidence: 90%
Finding: A duplicated copy of the same injector repeats the same misleading documentation and page-wide click interception behavior. Duplication increases the chance the behavior persists unnoticed, complicates review, and suggests the invasive functionality was intentionally embedded rather than accidentally introduced once.

Context-Inappropriate Capability

Medium

Confidence: 88%
Finding: The code reads an API key from a fixed credentials file or environment and uses it to send user-provided scenario data to an external LLM service, without any evident justification or user disclosure tied to the advertised skill purpose. In a mismatched skill, undisclosed credential use and off-box processing increase the risk of covert data exfiltration and unauthorized external service usage.

Description-Behavior Mismatch

High

Confidence: 87%
Finding: The file behavior materially diverges from the declared skill purpose: a stock/fund-oriented OASIS skill instead runs a consumer-marketing blueberry simulation pipeline. This kind of capability mismatch is dangerous because it can conceal unexpected execution paths, data creation, and downstream script invocation from users or reviewers.

Intent-Code Divergence

Medium

Confidence: 83%
Finding: The docstring claims the script 'skips LLM' and directly performs rule-based generation, but the program actually orchestrates external simulation and analysis scripts. Misleading security-relevant documentation can cause reviewers and operators to underestimate external code execution and trust boundaries.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The file loads an API key from a credential file or environment variable even though this crisis-simulation script does not use that key anywhere. Unnecessary access to secrets expands the trust boundary, creates avoidable exposure of sensitive material in memory, and is especially suspicious because the skill metadata describes a different purpose than the code actually implements.

Description-Behavior Mismatch

High

Confidence: 95%
Finding: The skill manifest says the skill is for stock/fund optimization, but this file performs a food-safety public-opinion crisis simulation. That mismatch is a supply-chain and transparency risk because users and reviewers may authorize the skill under false assumptions, making unrelated secret access and file writes more dangerous.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: The code automatically reads an API key from disk or environment and enables outbound requests to a third-party LLM service without clear necessity from the stated skill purpose. In an agent-skill context, this expands trust boundaries and can cause unintended secret use or external data exposure, especially because users may not expect the simulation to contact a remote service.
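
One mitigation is to make external LLM use an explicit opt-in, so the key is never read unless the user asked for remote calls. The gating logic is sketched here in TypeScript; the skill's own scripts may be written in another language. `LlmSettings`, `allowExternalLlm`, and `resolveApiKey` are hypothetical names, and passing the environment in as a plain record is an assumption made to keep the example self-contained.

```typescript
// Hypothetical settings object; none of these field names come from the skill.
interface LlmSettings {
  allowExternalLlm: boolean; // explicit user opt-in for remote calls
  env: Record<string, string | undefined>; // e.g. process.env in Node
}

// Return the key only when the user opted in; otherwise never touch it.
function resolveApiKey(settings: LlmSettings): string | null {
  if (!settings.allowExternalLlm) return null;
  const key = settings.env["SILICONFLOW_API_KEY"];
  if (key === undefined || key.trim() === "") {
    // Fail loudly instead of silently falling back to another credential source.
    throw new Error("External LLM calls are enabled but SILICONFLOW_API_KEY is not set");
  }
  return key;
}
```

With the opt-in left off, the simulation would run locally and no secret would be loaded into memory.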

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: Round summaries and simulation metrics are sent to an external API, creating a real data egress path to a third party. Even if the payload appears operationally harmless, in a broader system these summaries may encode sensitive user inputs, business logic, or proprietary simulation outputs, and the manifest does not justify this network behavior.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The skill loads an API key from a credentials file or environment variable even though the rest of the code never uses that secret for the stated simulation task. Unnecessary secret access expands the blast radius of the skill: future code changes, logs, exceptions, or downstream integrations could expose the credential, and the mismatch between stated purpose and secret harvesting is suspicious.

Description-Behavior Mismatch

High

Confidence: 92%
Finding: The metadata/docstring describes a stock/fund-related OASIS/qizheng skill, but the implementation performs a farm-product viral marketing simulation. This capability mismatch is dangerous because it can mislead reviewers and users about what the skill actually does, reducing trust and making hidden behaviors such as secret access or file writes easier to smuggle in.

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The skill metadata advertises a stock/fund/OASIS-related capability, but the implementation is an unrelated e-commerce promotion and PR-crisis simulator. This mismatch is dangerous because it can mislead users and any invoking agent/router into executing a skill outside its declared purpose, undermining trust boundaries and increasing the chance of deceptive or unintended behavior.

Intent-Code Divergence

Medium

Confidence: 90%
Finding: The output claims the crisis curve is '新华数据校准' ("calibrated with Xinhua data"), but the code contains no calibration logic, no reference dataset, and no external data ingestion. This is a deceptive accuracy/trust claim that may cause users to rely on fabricated authority when making operational or reputational decisions.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The file loads an API key from a workspace credential file or from the SILICONFLOW_API_KEY environment variable even though this module only performs local simulation and report generation. Unnecessary secret access increases exposure risk because any later code change, crash output, logging, or downstream dependency could leak the credential without a user expecting secrets to be touched.

Description-Behavior Mismatch

High

Confidence: 88%
Finding: The code implements e-commerce promotion and PR crisis simulation, which does not match the declared stock/A-share/fund optimization focus of the skill. This mismatch is dangerous because users may grant trust or provide inputs under false assumptions, and capability misrepresentation is a common indicator of deceptive or policy-bypassing behavior.

Description-Behavior Mismatch

High

Confidence: 96%
Finding: The executable entrypoint exposes functionality that materially diverges from the declared skill purpose: instead of stock/A-share/fund optimization, it runs consumer promotion and PR crisis simulations and writes results to disk. This kind of capability mismatch is dangerous because it can hide undeclared behavior from reviewers and operators, weakening trust boundaries and enabling misuse of the skill in contexts where only finance-related analysis was expected.

Context-Inappropriate Capability

Medium

Confidence: 91%
Finding: This section implements coordinated brand public-opinion crisis simulation and response-team actions that are unrelated to the advertised securities-analysis use case. In the skill context, undeclared PR-influence modeling is more suspicious because trigger words mention stocks/funds, so users may invoke a finance tool while actually getting reputation-management or sentiment-shaping capabilities.

VirusTotal

64/64 vendors flagged this skill as clean.
