openclaw-fallback-skill

Security checks across malware telemetry and agentic risk

Overview

This skill is a plausible cloud fallback integration, but a logic bug makes it send essentially every prompt and recent chat context to the configured model API.

Review before installing. Use this only if you are comfortable with prompts, recent chat history, and possible user metadata being sent to the configured API on ordinary responses, not just failures. The fallback gate should be fixed, external data handling should be clearly disclosed, and the bundled API-key-like config value should be replaced before use.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data ExfiltrationExternal Transmission, Env Variable Harvesting, File System Enumeration
Trigger AbuseOverly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool PoisoningHidden Instructions, Unicode Deception, Parameter Description Injection
Prompt InjectionInstruction Override, Hidden Instructions, Exfiltration Commands
Privilege EscalationExcessive Permissions, Sudo/Root Execution, Credential Access

Findings (8)

Intent-Code Divergence

Medium

Confidence: 99% confidence
Finding: The fallback decision function is supposed to decide when cloud escalation is necessary, but it unconditionally returns true at the end. In practice this routes every interaction to the external model, defeating any privacy, cost, or policy expectations tied to local-only handling and making the documented safeguards ineffective.

Missing User Warnings

High

Confidence: 96% confidence
Finding: The manifest explicitly describes automatic transmission of requests to a cloud LLM API when the local model fails, but provides no explicit warning, consent flow, or data-minimization statement about sending user content off-device. In an agent setting, prompts may contain sensitive user data, secrets, or internal context, so silent fallback can cause unauthorized disclosure to a third party.

Missing User Warnings

Medium

Confidence: 94% confidence
Finding: The skill sends user messages and recent conversation history to a configured external API whenever fallback occurs; because the fallback logic currently always triggers, this becomes routine data exfiltration to a third party. Without explicit consent, disclosure, redaction, or tenant-controlled policy checks, sensitive prompts and prior context may be transmitted outside the local trust boundary.

Missing User Warnings

High

Confidence: 97% confidence
Finding: The system prompt appends serialized userInfo metadata and sends it to the cloud model, potentially exposing profile or sensitive contextual data that the user did not intend to share externally. Embedding raw metadata in prompts also increases the chance of unnecessary over-collection and leakage of personal information.

Missing User Warnings

Medium

Confidence: 93% confidence
Finding: The README instructs users to configure a cloud API URL and API key, which implies prompts and possibly conversation content may be transmitted to third-party services. It does not warn users about privacy, retention, cross-border transfer, or the sensitivity of data that may be sent, so users may enable the skill without informed consent about external disclosure.

Missing User Warnings

Medium

Confidence: 96% confidence
Finding: The README says the skill will automatically and seamlessly switch to a cloud model, which increases risk because external transmission can happen without a visible prompt or explicit user acknowledgment at the time of use. In a fallback skill, this context makes the issue more dangerous because users may assume responses remain local while their data is silently sent to remote infrastructure.

Vague Triggers

Medium

Confidence: 74% confidence
Finding: The manifest wires the skill to broad events like beforeResponse and modelFailure while describing automatic cloud fallback, but it does not define strict conditions for what data is sent, when fallback is allowed, or how user consent is handled. In a fallback skill, ambiguous activation boundaries can cause unintended transmission of prompts or system context to an external API, increasing privacy and compliance risk.

Missing User Warnings

High

Confidence: 94% confidence
Finding: The description explicitly states that when the local model cannot answer, the skill will automatically call a cloud LLM API, but it does not warn that user prompts, conversation context, or possibly system instructions may be sent to a third party. This creates a clear data disclosure risk because operators may enable the skill without understanding that sensitive information could leave the local environment.

VirusTotal

64/64 vendors flagged this skill as clean.

View on VirusTotal