Clawpaw Android Control Template

Security checks across malware telemetry and agentic risk

Overview

This is a disclosed Android remote-control skill, but it gives an agent broad phone control and sensitive data access without enough scoping or safety gates.

Install only if you intentionally want an agent to control your Android phone. Use the minimum Android permissions needed, avoid SMS/call/storage/contact/photo permissions unless required, keep HTTP access on a private trusted network or tunnel, leave vision analysis disabled unless you trust the provider, and supervise any action that can send messages, make calls, buy items, post content, read private data, or change phone settings.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (13)

Description-Behavior Mismatch

High

Confidence: 94%
Finding: The README advertises extensive capabilities beyond basic Android UI automation, including access to location, contacts, calendar, photos, SMS, phone calls, installed apps, and broad file access. Even though this is documentation rather than executable code, documenting and normalizing such broad data-access scope creates a real security concern because it signals a high-privilege skill design that could be used for surveillance, data exfiltration, or account abuse if enabled.

Intent-Code Divergence

Medium

Confidence: 87%
Finding: The storage section describes the feature as read-oriented while also listing WRITE_EXTERNAL_STORAGE, which is write-capable and materially expands risk. This can mislead reviewers and users into granting broader filesystem access than expected, enabling modification or overwriting of user files in addition to reading them.
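
A reviewer could catch this kind of mismatch mechanically. The following is a minimal sketch, not part of the skill or the scanner: the function name, the `declared_mode` values, and the sample permission list are all hypothetical, and only the Android permission strings are real.

```python
# Hypothetical reviewer helper: compares a feature's documented access mode
# with the Android permissions it requests. Not taken from the skill.
WRITE_CAPABLE = {
    "android.permission.WRITE_EXTERNAL_STORAGE",
    "android.permission.MANAGE_EXTERNAL_STORAGE",
}


def write_permissions_beyond_claim(declared_mode, requested):
    """Return write-capable permissions requested by a feature documented as read-only."""
    if declared_mode != "read":
        return []
    return sorted(p for p in requested if p in WRITE_CAPABLE)


# Assumed example input mirroring the finding: a read-oriented storage
# feature that also lists WRITE_EXTERNAL_STORAGE.
requested = [
    "android.permission.READ_EXTERNAL_STORAGE",
    "android.permission.WRITE_EXTERNAL_STORAGE",
]
mismatches = write_permissions_beyond_claim("read", requested)
print(mismatches)
```

Any non-empty result means the documentation understates what the permission set allows.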

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The top-level description frames the skill as an Android control utility, but the documented commands include broad access to sensitive personal data and powerful system actions beyond that narrow framing. In security terms, incomplete disclosure is dangerous because it obscures privacy, integrity, and abuse potential from reviewers and end users.

Description-Behavior Mismatch

Medium

Confidence: 97%
Finding: HTTP mode sends screenshots and UI/layout-derived content to an external DashScope model for visual analysis, but this external data transfer is not disclosed in the manifest description. That creates a significant privacy risk because screen contents may contain messages, credentials, personal media, or other sensitive app data leaving the local environment.

Description-Behavior Mismatch

High

Confidence: 96%
Finding: The skill metadata describes general Android automation, but the implementation also exposes broad access to sensitive personal/device data such as contacts, photos, calendar, notifications, location, Bluetooth, Wi‑Fi, and file reads. This is a material scope expansion that can mislead users and downstream agents into granting or invoking capabilities with privacy and surveillance impact they were not clearly told about.

Description-Behavior Mismatch

High

Confidence: 99%
Finding: The code can send screenshots and up to 30,000 characters of UI layout XML to an external vision LLM, but this data sharing is not disclosed in the skill description. Screenshots and layout XML may contain credentials, messages, personal content, and app metadata, so undisclosed third-party transmission creates a serious privacy and compliance risk.

Context-Inappropriate Capability

High

Confidence: 97%
Finding: External vision-model upload is not necessary for the stated core purpose of Android control and introduces an additional exfiltration path for highly sensitive phone data. Because the skill already has privileged device access, adding third-party model transmission materially increases risk beyond what a user would reasonably expect from a control utility.

Vague Triggers

Medium

Confidence: 91%
Finding: The README tells OpenClaw to automatically invoke this skill for very broad natural-language requests such as checking notifications, ordering food, or scrolling apps, without defining authorization, confirmation, or scope boundaries. For a skill that can remotely control a phone and access sensitive data, broad activation guidance materially increases the chance of unintended execution, privacy violations, and risky real-world actions.
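
One way to bound broad triggers is a deny-by-default gate in front of the skill. The sketch below is only an illustration: the action class names, the `LOW_RISK` set, and the `gate` function are hypothetical, since the skill's real command names are not shown in this report.

```python
from dataclasses import dataclass

# Hypothetical action classes considered safe to run without confirmation.
# Everything else, including unknown classes, needs explicit user approval.
LOW_RISK = {"scroll", "go_home"}


@dataclass
class Decision:
    allowed: bool
    reason: str


def gate(action_class, user_confirmed):
    """Allow low-risk actions; require confirmation for all other classes."""
    if action_class in LOW_RISK:
        return Decision(True, "low-risk action")
    if user_confirmed:
        return Decision(True, "confirmed by user")
    return Decision(False, "confirmation required")


print(gate("scroll", False))
print(gate("order_food", False))
print(gate("order_food", True))
```

Treating unknown classes as sensitive means a newly added command cannot run unattended until someone classifies it.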

Vague Triggers

Medium

Confidence: 91%
Finding: The Chinese section repeats the same overly broad invocation examples, encouraging automatic triggering from common conversational requests with no mention of user confirmation or restricted operation classes. Because the skill can drive UI interactions and retrieve private phone data, ambiguous activation text increases the likelihood of misuse or accidental execution.

Missing User Warnings

Medium

Confidence: 88%
Finding: The README advertises remote control, screenshots, layout extraction, location, notifications, contacts, calendar, and SMS access, but does not prominently warn about the security and privacy consequences of granting those permissions or exposing the device over networked control paths. In this context, the skill is inherently more dangerous because it bridges an LLM-driven automation system to a personal mobile device with extensive privileged access.

Missing User Warnings

Medium

Confidence: 92%
Finding: The skill documents access to highly sensitive data and privileged controls—such as contacts, photos, SMS, phone calls, notifications, files, camera, and WiFi changes—without prominent privacy and safety warnings proportionate to the risk. In this context, the skill is more dangerous because it operates on a real personal mobile device with accessibility and other elevated permissions, enabling both surveillance and direct action.

Missing User Warnings

High

Confidence: 98%
Finding: The analyze workflow retrieves a live phone screenshot and prepares it for transmission to an external API without an explicit warning or confirmation at the point of upload. Since phone screenshots often contain highly sensitive content, silent or implicit transfer to a third party can cause privacy breaches and accidental data disclosure.
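
A confirmation step at the point of upload might look like the sketch below. It is an assumption-laden illustration rather than the skill's code: `upload_with_consent`, the `confirm` and `upload` callbacks, and the destination label are all hypothetical.

```python
def upload_with_consent(image_bytes, destination, confirm, upload):
    """Call upload only after confirm approves a prompt naming the destination.

    confirm: callable taking the prompt string and returning True or False.
    upload: callable taking the image bytes; stands in for the real API call.
    Returns the upload result, or None when the user declines.
    """
    prompt = f"Send a {len(image_bytes)}-byte screenshot to {destination}?"
    if not confirm(prompt):
        return None
    return upload(image_bytes)


# Usage with stand-in callbacks; nothing here touches a network.
sent = []


def fake_upload(data):
    sent.append(data)
    return "uploaded"


declined = upload_with_consent(b"\x89PNG", "external vision API", lambda p: False, fake_upload)
approved = upload_with_consent(b"\x89PNG", "external vision API", lambda p: True, fake_upload)
print(declined, approved, len(sent))
```

Passing the confirmation and upload steps in as callbacks keeps the consent check testable without a device or an API key.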

Missing User Warnings

High

Confidence: 98%
Finding: The code embeds UI layout XML into the prompt sent to the vision API, and that XML can include on-screen text, resource identifiers, and other metadata revealing app state and private content. Sending this structured data externally without clear disclosure expands the leak surface beyond the screenshot itself and may expose secrets that are not visually obvious.
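
If layout XML must be shared at all, on-screen text could be stripped first. The sketch below assumes a uiautomator-style dump in which each `node` element carries `text` and `content-desc` attributes; the report does not confirm the skill's XML format, and `redact_layout` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Attributes assumed to carry user-visible text in a uiautomator-style dump.
REDACT_ATTRS = ("text", "content-desc")


def redact_layout(xml_text):
    """Replace non-empty text-bearing attributes with a placeholder."""
    root = ET.fromstring(xml_text)
    for node in root.iter():
        for attr in REDACT_ATTRS:
            if node.attrib.get(attr):
                node.set(attr, "[redacted]")
    return ET.tostring(root, encoding="unicode")


# Made-up sample layout containing a one-time code on screen.
sample = (
    '<hierarchy><node class="android.widget.TextView" text="OTP 493021" '
    'content-desc="" resource-id="com.example:id/code" /></hierarchy>'
)
redacted = redact_layout(sample)
print(redacted)
```

This sketch leaves `resource-id` intact so UI automation can still locate elements. Resource identifiers also reveal app state, as the finding notes, so a stricter policy would redact or hash them too.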

VirusTotal

66/66 vendors flagged this skill as clean.
