
Security audit

Embodied-OS - AI Robot Control System

Security checks across malware telemetry and agentic risk

Overview

This robot-control skill is not clearly malicious, but it grants high-impact physical robot authority and bundles unrelated video-generation instructions with extra installs and API use.

Review before installing. Use this only in simulation or a supervised, bounded robot workspace until you have verified the installed packages, hardware limits, emergency stop behavior, and provider data flows. Remove or ignore the bundled old video-generator files, use dedicated API keys with spend limits, and require explicit confirmation before any real robot movement, long-running monitoring, repository install, or external API submission.
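The "require explicit confirmation" recommendation above could be sketched as a small gate in front of every high-impact action. This is a minimal illustration only: the `Action` type, the `HIGH_IMPACT` category names, and the `confirm` callback are hypothetical and are not part of the audited skill.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical category names, mirroring the four action types listed in the
# recommendation (robot movement, long-running monitoring, repository install,
# external API submission).
HIGH_IMPACT = {
    "robot_motion",
    "long_running_monitor",
    "repo_install",
    "external_api_submit",
}


@dataclass(frozen=True)
class Action:
    kind: str          # one of the category names above, or a low-impact kind
    description: str   # human-readable summary shown to the operator


def run_with_confirmation(
    action: Action,
    execute: Callable[[], None],
    confirm: Callable[[Action], bool],
) -> bool:
    """Run execute() only if the action is low-impact or the operator confirms.

    Returns True if the action ran, False if it was refused.
    """
    if action.kind in HIGH_IMPACT and not confirm(action):
        return False
    execute()
    return True
```

In practice `confirm` would prompt a human operator; keeping it as an injected callback means the gate can be tested without any hardware attached.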

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (10)

Description-Behavior Mismatch

High
Confidence
98% confidence
Finding
The file's behavior is materially unrelated to the parent skill metadata for embodied robot control, instead describing a general-purpose video generation workflow. This kind of scope mismatch is dangerous because it can cause an agent or reviewer to trust and invoke capabilities outside the declared domain, increasing the chance of unexpected code execution, repository cloning, and API use under a misleading label.

Context-Inappropriate Capability

Medium
Confidence
94% confidence
Finding
The skill instructs the agent to clone an external repository, install dependencies, set credentials, and run local shell/Node commands. In a skill ecosystem, these actions create a supply-chain and arbitrary code execution risk, especially when they are not tightly justified by the surrounding embodied-robot-control context and could be triggered by ordinary user requests.

Vague Triggers

Medium
Confidence
92% confidence
Finding
The auto-trigger keywords are broad enough to match many ordinary conversations about videos, causing the skill to activate in situations where the user may not have intended repository access, dependency installation, or script execution. Overbroad activation increases the risk that unsafe actions occur with weak user intent validation.
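To make the difference concrete, the sketch below contrasts a broad keyword trigger with an explicit invocation. Both the keyword list and the `/generate-video` command are illustrative assumptions; the audit does not reproduce the skill's actual trigger configuration.

```python
import re

# Illustrative keyword list only; not the skill's real trigger set.
BROAD_KEYWORDS = ("video", "script")


def broad_trigger(message: str) -> bool:
    """Fires on any message that merely mentions a keyword."""
    text = message.lower()
    return any(keyword in text for keyword in BROAD_KEYWORDS)


# Hypothetical explicit command the user must type to start the pipeline.
EXPLICIT_COMMAND = re.compile(r"^\s*/generate-video\b", re.IGNORECASE)


def explicit_trigger(message: str) -> bool:
    """Fires only when the message starts with the explicit command."""
    return EXPLICIT_COMMAND.match(message) is not None
```

A message such as "I watched a great video yesterday" satisfies `broad_trigger` but not `explicit_trigger`, which is the kind of accidental activation the finding describes.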

Vague Triggers

Medium
Confidence
90% confidence
Finding
The condition that text merely 'looks like a video script' is ambiguous and subjective, which can cause false activations. In this skill, false activation is more dangerous because the documented workflow escalates quickly from interpretation to shell commands and external code execution.

Vague Triggers

Medium
Confidence
94% confidence
Finding
The auto-trigger criteria are broad enough to activate on normal discussion about videos, scripts, or related tooling, which can cause the agent to invoke this skill when the user did not clearly request execution. In this skill, activation can lead to cloning repos, installing packages, and sending user-provided script content to external APIs, so accidental triggering creates meaningful privacy and safety risk.

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The skill directs use of OpenAI TTS and Whisper and instructs users to configure an API key, but it does not clearly warn at the point of use that user scripts and audio-derived content may be transmitted to third-party services. Because this skill is designed to process arbitrary user-provided content, missing disclosure and consent can expose sensitive or proprietary data to external providers.
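A point-of-use disclosure could look roughly like the following sketch, which shows the user what is about to be sent and refuses to transmit without consent. The `send` and `ask` callbacks and the `ConsentRequired` exception are hypothetical stand-ins; no real provider API is called here.

```python
from typing import Callable


class ConsentRequired(Exception):
    """Raised when content would be transmitted without user consent."""


def disclosure(provider: str, n_chars: int) -> str:
    """Build the warning shown to the user before any transmission."""
    return (
        f"This will send {n_chars} characters of your content to {provider}. "
        "Do not proceed with sensitive or proprietary material."
    )


def submit_with_consent(
    text: str,
    provider: str,
    send: Callable[[str], str],   # hypothetical wrapper around the provider call
    ask: Callable[[str], bool],   # shows the warning, returns the user's answer
) -> str:
    """Transmit text only after the user has seen the disclosure and agreed."""
    if not ask(disclosure(provider, len(text))):
        raise ConsentRequired(provider)
    return send(text)
```

Raising an exception on refusal, rather than silently skipping the call, makes it harder for a calling pipeline to continue as if the transmission had succeeded.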

Vague Triggers

Medium
Confidence
94% confidence
Finding
The skill defines very broad automatic trigger conditions such as generic video-generation-related queries and any text that merely looks like a script. This can cause unintended invocation of the skill in normal conversation, leading the agent to run local shell commands, access external services, and incur API usage or side effects without sufficiently explicit user intent.

Vague Triggers

Medium
Confidence
92% confidence
Finding
The examples state that common phrases like '做个视频' ("make a video") and '我想要一个关于AI的短视频' ("I want a short video about AI") must use this skill, even though such requests may be underspecified and may not include enough detail for safe execution. This increases the risk of the agent prematurely invoking an operational pipeline, running shell commands, and making network/API calls based on vague prompts.

Vague Triggers

Medium
Confidence
93% confidence
Finding
The activation criteria are overly broad for a skill that can control physical robots, so the skill may be invoked during general robotics discussions rather than only when the user explicitly intends robot actuation. In this context, unintended invocation is more dangerous than a typical software skill because it can route natural-language requests into planning or action pathways that affect real-world systems.

Missing User Warnings

High
Confidence
97% confidence
Finding
The skill advertises natural-language control of physical robots without prominent warnings about motion hazards, property damage, or injury risk. Because the skill targets embodied systems and presents high-level autonomous actions, omission of explicit safety warnings and operational limits materially increases the chance of unsafe use or overtrust by users.
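One form the missing operational limits could take is a pre-motion check that rejects any command outside a bounded workspace or above a speed cap. The sketch below is an assumption-laden illustration: the `Limits` fields, units (metres and metres per second), and example values are hypothetical and do not come from the skill or from any particular robot.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Limits:
    # Hypothetical workspace box in metres, as (min, max) per axis.
    x: Tuple[float, float]
    y: Tuple[float, float]
    z: Tuple[float, float]
    max_speed: float  # hypothetical speed cap in metres per second


def validate_move(
    target: Tuple[float, float, float],
    speed: float,
    limits: Limits,
) -> List[str]:
    """Return a list of limit violations; an empty list means the move is allowed."""
    violations = []
    bounds = (("x", limits.x), ("y", limits.y), ("z", limits.z))
    for value, (axis, (low, high)) in zip(target, bounds):
        if not low <= value <= high:
            violations.append(f"{axis}={value} outside [{low}, {high}]")
    if not 0.0 <= speed <= limits.max_speed:
        violations.append(f"speed={speed} outside [0.0, {limits.max_speed}]")
    return violations
```

A software check of this kind complements, and does not replace, a hardware emergency stop and human supervision.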

VirusTotal

64/64 vendors flagged this skill as clean.
