xgorobot

Security checks across malware telemetry and agentic risk

Overview

This appears to be a genuine XGO robot-control skill, but it needs review because it combines physical robot control, camera, audio, and cloud features with several unsafe command-execution paths.

Install only if you trust the publisher and intentionally want this environment to control an XGO robot. Review every command before execution, avoid untrusted filenames, URLs, prompts, or generated speech text, and treat camera/audio features as privacy-sensitive because data may be saved locally or sent to DashScope. Firmware, calibration, motor unload, and long-running motion should be limited to explicit operator-approved maintenance or supervised workflows.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
Behavioral AST: exec() Call, eval() Call, Dynamic Import
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain

Findings (47)

os.system() or os exec-family call

High

Category: Dangerous Code Execution
Content: ''' def xgoSpeaker(self,filename): path="/home/pi/xgoMusic/" os.system("mplayer"+" "+path+filename) def xgoVideoAudio(self,filename): path="/home/pi/xgoVideos/"
Confidence: 99%
Finding: os.system("mplayer"+" "+path+filename)
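
One way to close this path, sketched under assumptions: pass the player its arguments as a list so that no shell ever parses the filename, and reject names that are not plain file names inside the media directory. The helper names `safe_media_path` and `play_audio` are hypothetical and not part of the skill; only the directory `/home/pi/xgoMusic/` and the `mplayer` invocation come from the flagged code.

```python
import subprocess
from pathlib import Path

MUSIC_DIR = Path("/home/pi/xgoMusic")  # directory taken from the flagged code


def safe_media_path(filename: str, base_dir: Path = MUSIC_DIR) -> Path:
    """Return base_dir/filename, rejecting anything that is not a plain file name."""
    # Path(...).name strips directory components, so any mismatch means the
    # caller supplied a path separator or a traversal attempt.
    if not filename or filename in (".", "..") or Path(filename).name != filename:
        raise ValueError(f"unsafe media filename: {filename!r}")
    return base_dir / filename


def play_audio(filename: str) -> None:
    """Hypothetical replacement for the flagged helper: argv list, no shell."""
    path = safe_media_path(filename)
    subprocess.run(["mplayer", str(path)], check=True)
```

With an argv list, characters such as `;` or `$(...)` in a filename reach `mplayer` as literal text instead of being interpreted by a shell.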

os.system() or os exec-family call

High

Category: Dangerous Code Execution
Content: path="/home/pi/xgoVideos/" time.sleep(0.2) # audio and video speeds are synchronized but the timelines may drift; tune here cmd="sudo mplayer "+path+filename+" -novideo" os.system(cmd) def xgoVideo(self,filename): path="/home/pi/xgoVideos/"
Confidence: 99%
Finding: os.system(cmd)

os.system() or os exec-family call

High

Category: Dangerous Code Execution
Content: command2 = "-f S32_LE -r 8000 -c 1 -t wav" cmd=command1+" "+str(seconds)+" "+command2+" "+path+filename print(cmd) os.system(cmd) def xgoCamera(self,switch): global camera_still
Confidence: 98%
Finding: os.system(cmd)

subprocess module call

Medium

Category: Dangerous Code Execution
Content: edu.lcd_clear() edu.lcd_text(5, 5, "播放中...", "GREEN", 14) subprocess.run(f'mplayer -really-quiet "{audio_url}"', shell=True, check=True) print(f"语音合成完成: {args.text}") print(f"音色: {args.voice} ({VOICE_OPTIONS.get(args.voice, '')})") else:
Confidence: 98%
Finding: subprocess.run(f'mplayer -really-quiet "{audio_url}"', shell=True, check=True)

os.system() or os exec-family call

High

Category: Dangerous Code Execution
Content: parser.add_argument('--filename', type=str, required=True, help='音频文件名(位于/home/pi/Music/)') args = parser.parse_args() os.system(f"mplayer /home/pi/Music/{args.filename}") print(f"已播放音频: {args.filename}") if __name__ == '__main__':
Confidence: 99%
Finding: os.system(f"mplayer /home/pi/Music/{args.filename}")

subprocess module call

Medium

Category: Dangerous Code Execution
Content: args = parser.parse_args() cmd = f'mplayer "{args.url}"' subprocess.run(cmd, shell=True, check=True) print(f"已播放HTTP音频: {args.url}") if __name__ == '__main__':
Confidence: 99%
Finding: subprocess.run(cmd, shell=True, check=True)
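
A minimal sketch of a safer shape for URL playback, assuming the goal is only to stream HTTP(S) audio: validate the scheme first, then hand the URL to `mplayer` as a single argv element without `shell=True`. The names `build_player_argv`, `play_url`, and `ALLOWED_SCHEMES` are hypothetical and do not appear in the skill.

```python
import subprocess
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # assumption: only web audio is intended


def build_player_argv(url: str) -> list[str]:
    """Validate a media URL and return an argv list for mplayer (no shell)."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname:
        raise ValueError(f"refusing to play URL: {url!r}")
    # The URL is one list element, so quotes or semicolons inside it are
    # never interpreted as shell syntax.
    return ["mplayer", "-really-quiet", url]


def play_url(url: str) -> None:
    """Hypothetical replacement for the flagged call."""
    subprocess.run(build_player_argv(url), check=True)
```

Requiring an explicit scheme also rejects inputs that begin with `-`, which a player could otherwise read as a command-line option. A host allowlist could be layered on top where the set of audio sources is known.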

subprocess module call

Medium

Category: Dangerous Code Execution
Content: # eSpeak Chinese voice; speed sets the speaking rate (default 120; lower is slower and clumsier) # -v zh Chinese, -s speaking rate, -p pitch (50 = deep) cmd = f'espeak -v zh -s {speed} -p 50 "{text}"' subprocess.run(cmd, shell=True, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL) return True except Exception as e:
Confidence: 98%
Finding: subprocess.run(cmd, shell=True, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
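
A possible alternative, offered as a sketch: keep the same `espeak` flags but build an argv list and feed the text on standard input, so the text never appears on a command line at all. This assumes the installed `espeak` supports its `--stdin` option; `build_espeak_argv` and `speak` are hypothetical names.

```python
import subprocess


def build_espeak_argv(speed) -> list[str]:
    """Build the espeak argv; the text itself is supplied on stdin."""
    # int() raises ValueError for anything that is not a plain number,
    # so a value like "120; reboot" is rejected before any process starts.
    return ["espeak", "-v", "zh", "-s", str(int(speed)), "-p", "50", "--stdin"]


def speak(text: str, speed: int = 120) -> None:
    """Hypothetical replacement for the flagged TTS helper (no shell)."""
    subprocess.run(
        build_espeak_argv(speed),
        input=text.encode("utf-8"),  # text travels via stdin, not argv
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```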

Tainted flow: 'image_url' from requests.get (line 85, network input) → requests.get (network output)

Medium

Category: Data Flow
Content: if "results" in query_result["output"] and len(query_result["output"]["results"]) > 0: image_url = query_result["output"]["results"][0].get("url", "") if image_url: img_response = requests.get(image_url, timeout=10) image = Image.open(BytesIO(img_response.content)) image = image.resize((320, 240))
Confidence: 89%
Finding: img_response = requests.get(image_url, timeout=10)

Tainted flow: 'audio_url' from requests.post (line 61, network input) → subprocess.run (code execution)

Critical

Category: Data Flow
Content: edu.lcd_clear() edu.lcd_text(5, 5, "播放中...", "GREEN", 14) subprocess.run(f'mplayer -really-quiet "{audio_url}"', shell=True, check=True) print(f"语音合成完成: {args.text}") print(f"音色: {args.voice} ({VOICE_OPTIONS.get(args.voice, '')})") else:
Confidence: 99%
Finding: subprocess.run(f'mplayer -really-quiet "{audio_url}"', shell=True, check=True)

Description-Behavior Mismatch

Low

Confidence: 81%
Finding: The API docs expose person-attribute inference such as age/sex estimation that is not declared in the skill description. This is a privacy-sensitive capability, and omitting it prevents meaningful consent and can cause the skill to be used for biometric or demographic profiling beyond expected robot-control functions.

Context-Inappropriate Capability

Medium

Confidence: 89%
Finding: The skill allows displaying images and playing audio from arbitrary URLs, which introduces unrestricted remote content retrieval into a robot-control skill. This expands the attack surface for tracking, malicious or inappropriate content delivery, and unexpected network access that is not necessary for core local robot operation.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: These helper methods execute shell commands built from attacker-influenced filenames for audio and video playback. In a skill that may expose these helpers through natural-language-triggered actions, this becomes more dangerous because seemingly harmless media requests can be turned into arbitrary command execution on a hardware-controlling device.

Intent-Code Divergence

Medium

Confidence: 94%
Finding: The constructor accepts a user-supplied serial port but ignores it and always opens /dev/ttyAMA0. In a hardware-control library, this can direct commands to the wrong attached device, defeating caller safety assumptions and potentially causing unintended robot movement or modification on a different serial-connected target.

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The constructor accepts a port argument but ignores it and always opens /dev/ttyAMA0. In a hardware-control library, this can direct commands to an unintended serial device, causing unauthorized or unsafe actuation and defeating caller-side safety controls that rely on explicit device selection.
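
For illustration, a constructor that honors its `port` argument could look like the sketch below. `RobotLink` is a hypothetical stand-in for the skill's serial wrapper, the baud rate and timeout are assumed values, and the `opener` parameter exists only so the example can run without hardware; by default it falls back to pyserial's `serial.Serial`.

```python
from typing import Any, Callable, Optional

DEFAULT_PORT = "/dev/ttyAMA0"  # the port the flagged constructor always opens


class RobotLink:
    """Hypothetical serial wrapper that actually uses the requested port."""

    def __init__(self, port: str = DEFAULT_PORT, baud: int = 115200,
                 opener: Optional[Callable[..., Any]] = None):
        if opener is None:
            import serial  # pyserial, imported lazily so the sketch loads without it
            opener = serial.Serial
        self.port = port
        # Open the port the caller asked for instead of a hard-coded path.
        self.conn = opener(port, baud, timeout=0.5)
```

Keeping `/dev/ttyAMA0` as the default preserves the current behavior for callers that pass nothing, while an explicit port is no longer silently ignored.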

Intent-Code Divergence

Medium

Confidence: 90%
Finding: The code exposes public upgrade methods that invoke a helper explicitly marked as test-stage/do-not-use and then streams an arbitrary local binary to the device. On a robot control interface, unsafe firmware flashing can brick the device or install unintended firmware, creating both availability and safety risks.

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The audio-playback purpose does not require invoking a shell, yet the code routes attacker-controlled input through one. Because the URL is inserted directly into a shell command string, this creates a command injection path that can be exploited to run arbitrary OS commands under the privileges of the script.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The TTS helper executes a shell command containing untrusted text derived from args.target via speech_text. Because this skill already controls a physical robot, arbitrary command execution on the host increases risk substantially: an attacker could run system commands, alter files, or chain into further unsafe robot behavior.

Vague Triggers

Medium

Confidence: 87%
Finding: The activation triggers are overly broad and overlap with normal conversation about movement, vision, and AI, increasing the chance of accidental invocation. In a skill that can move hardware, use sensors, access cloud AI, and interact with remote content, unintended activation materially raises safety and privacy risk.

Missing User Warnings

High

Confidence: 96%
Finding: The skill description does not warn that commands may immediately cause physical motion and actuator operation. For a robot dog with locomotion, arm, and claw features, omission of a safety warning increases the likelihood of surprise movement, collisions, pinching, or damage to nearby people, pets, and objects.

Missing User Warnings

Medium

Confidence: 92%
Finding: The skill describes camera, microphone, and AI capabilities without clear privacy or data-transmission warnings, despite documented use of cloud AI services via DashScope. This can lead users to unknowingly capture or transmit sensitive audio/images, making the issue more serious in a mobile robot with onboard sensing.

Missing User Warnings

Medium

Confidence: 90%
Finding: This method captures and saves a photo immediately without any built-in user confirmation, consent check, or visible pre-recording warning. In the context of a robot skill with camera access, silent image capture creates a real privacy risk even if it is not a code-execution bug.

Missing User Warnings

High

Confidence: 97%
Finding: The method records microphone input and writes it to disk without any explicit consent or warning mechanism. Because this skill targets a physical robot with audio capabilities, undisclosed recording is a meaningful surveillance/privacy issue and is compounded here by the separate shell-injection risk in command construction.

Missing User Warnings

High

Confidence: 96%
Finding: This method records video to disk programmatically without an explicit advance warning or consent gate. In a camera-equipped robot context, covert or surprising video capture materially increases privacy and safety concerns beyond a normal local utility script.
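
A consent gate of the kind these capture findings call for could be sketched as a decorator that asks the operator before any recording function runs. Everything here is hypothetical (`requires_consent`, `ConsentRequired`, the prompt text); the skill's real capture methods are not shown in this report.

```python
import functools


class ConsentRequired(Exception):
    """Raised when the operator declines a capture request."""


def requires_consent(kind: str, ask=input):
    """Wrap a capture function so it runs only after an explicit 'y' or 'yes'."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            answer = ask(f"Allow {kind} capture now? [y/N] ")
            if answer.strip().lower() not in ("y", "yes"):
                raise ConsentRequired(f"{kind} capture declined")
            return func(*args, **kwargs)
        return wrapper
    return decorator


@requires_consent("video")
def record_video(seconds: int) -> str:
    """Placeholder body; a real implementation would drive the camera."""
    return f"recorded {seconds}s"
```

The default answer is a refusal, so an empty or unexpected reply never starts a recording.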

Missing User Warnings

Medium

Confidence: 88%
Finding: The calibration method issues a safety-relevant hardware calibration command immediately based on a simple parameter, with no confirmation, interlock, or explicit precondition checks. In a robot-control context, accidental or scripted invocation could recalibrate actuators in an unsafe state, leading to misalignment, unstable motion, or degraded control behavior.

Missing User Warnings

Medium

Confidence: 91%
Finding: The upgrade path sends an upgrade command and then writes an arbitrary binary file directly to hardware without strong validation, confirmation, integrity checking, or rollback protections. In this robot-control skill, misuse or accidental invocation could corrupt firmware, brick the device, or place hardware into an unsafe or undefined operating state.
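
The missing integrity check and confirmation step could be approximated as follows. This is a sketch under assumptions: the expected SHA-256 digest is obtained out of band from the firmware publisher, and `send` is a hypothetical callable standing in for whatever routine writes bytes to the device. Neither `verify_firmware` nor `confirm_and_flash` exists in the skill.

```python
import hashlib
import hmac
from pathlib import Path


def verify_firmware(path, expected_sha256: str) -> bytes:
    """Read the image and return its bytes only if the SHA-256 digest matches."""
    data = Path(path).read_bytes()
    actual = hashlib.sha256(data).hexdigest()
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError("firmware digest mismatch; refusing to flash")
    return data


def confirm_and_flash(path, expected_sha256: str, send, confirm=input) -> bool:
    """Verify the image, ask the operator, then pass the bytes to `send`."""
    data = verify_firmware(path, expected_sha256)
    answer = confirm(f"Flash {len(data)} bytes from {path}? Type YES to continue: ")
    if answer.strip() != "YES":
        return False  # nothing was written to the device
    send(data)
    return True
```

A digest check does not provide rollback protection, but it does stop an arbitrary or corrupted local binary from being streamed to the robot unnoticed.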

VirusTotal

64/64 vendors flagged this skill as clean.
