xgorobot

Security checks across malware telemetry and agentic risk

Overview

This appears to be a real XGO robot-control skill, but it needs Review because it combines physical robot control, camera/audio/cloud features, and several unsafe command-execution paths.

Install only if you trust the publisher and intentionally want this environment to control an XGO robot. Review every command before execution, avoid untrusted filenames, URLs, prompts, or generated speech text, and treat camera/audio features as privacy-sensitive because data may be saved locally or sent to DashScope. Firmware, calibration, motor unload, and long-running motion should be limited to explicit operator-approved maintenance or supervised workflows.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • Behavioral AST: exec() Call, eval() Call, Dynamic Import
  • Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
Findings (47)

os.system() or os exec-family call

High
Category
Dangerous Code Execution
Content
'''
    def xgoSpeaker(self,filename):
        path="/home/pi/xgoMusic/"
        os.system("mplayer"+" "+path+filename)

    def xgoVideoAudio(self,filename):
        path="/home/pi/xgoVideos/"
Confidence
99% confidence
Finding
os.system("mplayer"+" "+path+filename)
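One way to close this path, sketched here under assumptions, is to drop the shell and hand mplayer an argument list after checking the file name. The helper names `build_mplayer_argv` and `play` are hypothetical and not part of the scanned skill; only the `/home/pi/xgoMusic/` directory comes from the excerpt above.

```python
import subprocess
from pathlib import PurePosixPath

# Directory taken from the flagged excerpt; everything else here is illustrative.
MUSIC_DIR = PurePosixPath("/home/pi/xgoMusic")


def build_mplayer_argv(filename, base_dir=MUSIC_DIR):
    """Return an argv list for mplayer, rejecting names that could leave base_dir."""
    name = PurePosixPath(filename).name
    # Accept only a bare file name: no directory parts, no "..", and no
    # leading dash that mplayer would read as an option.
    if not filename or name != filename or filename == ".." or filename.startswith("-"):
        raise ValueError(f"unsafe filename: {filename!r}")
    return ["mplayer", str(base_dir / filename)]


def play(filename):
    # No shell is involved, so ';' or '$(...)' in the name stays literal text.
    subprocess.run(build_mplayer_argv(filename), check=True)
```

With this shape, a name such as `a.mp3; reboot` reaches mplayer as one (nonexistent) file path instead of being split into two shell commands.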

os.system() or os exec-family call

High
Category
Dangerous Code Execution
Content
path="/home/pi/xgoVideos/"
        time.sleep(0.2)  # audio and video speeds are in sync, but the timelines may drift; tune here
        cmd="sudo mplayer "+path+filename+" -novideo"
        os.system(cmd)

    def xgoVideo(self,filename):
        path="/home/pi/xgoVideos/"
Confidence
99% confidence
Finding
os.system(cmd)

os.system() or os exec-family call

High
Category
Dangerous Code Execution
Content
command2 = "-f S32_LE -r 8000 -c 1 -t wav"
        cmd=command1+" "+str(seconds)+" "+command2+" "+path+filename
        print(cmd)
        os.system(cmd)

    def xgoCamera(self,switch):
        global camera_still
Confidence
98% confidence
Finding
os.system(cmd)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
edu.lcd_clear()
            edu.lcd_text(5, 5, "播放中...", "GREEN", 14)
            
            subprocess.run(f'mplayer -really-quiet "{audio_url}"', shell=True, check=True)
            print(f"语音合成完成: {args.text}")
            print(f"音色: {args.voice} ({VOICE_OPTIONS.get(args.voice, '')})")
        else:
Confidence
98% confidence
Finding
subprocess.run(f'mplayer -really-quiet "{audio_url}"', shell=True, check=True)

os.system() or os exec-family call

High
Category
Dangerous Code Execution
Content
parser.add_argument('--filename', type=str, required=True, help='音频文件名(位于/home/pi/Music/)')
    args = parser.parse_args()
    
    os.system(f"mplayer /home/pi/Music/{args.filename}")
    print(f"已播放音频: {args.filename}")

if __name__ == '__main__':
Confidence
99% confidence
Finding
os.system(f"mplayer /home/pi/Music/{args.filename}")

subprocess module call

Medium
Category
Dangerous Code Execution
Content
args = parser.parse_args()
    
    cmd = f'mplayer "{args.url}"'
    subprocess.run(cmd, shell=True, check=True)
    print(f"已播放HTTP音频: {args.url}")

if __name__ == '__main__':
Confidence
99% confidence
Finding
subprocess.run(cmd, shell=True, check=True)

subprocess module call

Medium
Category
Dangerous Code Execution
Content
# eSpeak Chinese voice; speed controls the speaking rate (default 120; smaller is slower and clumsier)
        # -v zh Chinese, -s speaking rate, -p pitch (50 = deep)
        cmd = f'espeak -v zh -s {speed} -p 50 "{text}"'
        subprocess.run(cmd, shell=True, check=True,
                      stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        return True
    except Exception as e:
Confidence
98% confidence
Finding
subprocess.run(cmd, shell=True, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
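A possible shell-free variant is sketched below. It assumes the installed espeak accepts `--stdin` for reading text from standard input; the function names and the 80 to 450 speed bounds are illustrative assumptions, not values taken from the skill.

```python
import subprocess


def build_espeak_argv(speed):
    """Build the espeak argv; the text itself is sent on stdin, not in argv."""
    speed = int(speed)  # raises ValueError for non-numeric input
    if not 80 <= speed <= 450:  # assumed bounds, adjust for the target device
        raise ValueError(f"speed out of range: {speed}")
    return ["espeak", "-v", "zh", "-s", str(speed), "-p", "50", "--stdin"]


def speak(text, speed=120):
    # Passing the text via stdin keeps quotes, backticks, and leading dashes inert.
    subprocess.run(
        build_espeak_argv(speed),
        input=text.encode("utf-8"),
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return True
```

Coercing `speed` to an integer matters as much as isolating `text`, since both values were interpolated into the original command string.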

Tainted flow: 'image_url' from requests.get (line 85, network input) → requests.get (network output)

Medium
Category
Data Flow
Content
if "results" in query_result["output"] and len(query_result["output"]["results"]) > 0:
                image_url = query_result["output"]["results"][0].get("url", "")
                if image_url:
                    img_response = requests.get(image_url, timeout=10)
                    image = Image.open(BytesIO(img_response.content))
                    image = image.resize((320, 240))
Confidence
89% confidence
Finding
img_response = requests.get(image_url, timeout=10)

Tainted flow: 'audio_url' from requests.post (line 61, network input) → subprocess.run (code execution)

Critical
Category
Data Flow
Content
edu.lcd_clear()
            edu.lcd_text(5, 5, "播放中...", "GREEN", 14)
            
            subprocess.run(f'mplayer -really-quiet "{audio_url}"', shell=True, check=True)
            print(f"语音合成完成: {args.text}")
            print(f"音色: {args.voice} ({VOICE_OPTIONS.get(args.voice, '')})")
        else:
Confidence
99% confidence
Finding
subprocess.run(f'mplayer -really-quiet "{audio_url}"', shell=True, check=True)
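The flow from network response to shell could be broken roughly as follows. This is a sketch only: `build_player_argv`, `play_remote`, and the caller-supplied `allowed_hosts` set are hypothetical, and the hosts the cloud service actually returns are not stated in this report.

```python
import subprocess
from urllib.parse import urlsplit


def build_player_argv(audio_url, allowed_hosts):
    """Validate a service-returned URL and return an argv list for mplayer."""
    parts = urlsplit(audio_url)
    if parts.scheme != "https":
        raise ValueError(f"refusing non-HTTPS URL: {audio_url!r}")
    if parts.hostname not in allowed_hosts:
        raise ValueError(f"unexpected host: {parts.hostname!r}")
    # The URL travels as a single argv element; no shell ever parses it.
    return ["mplayer", "-really-quiet", audio_url]


def play_remote(audio_url, allowed_hosts):
    subprocess.run(build_player_argv(audio_url, allowed_hosts), check=True)
```

The host allow-list limits where the player may connect, while the argument list removes the injection path even if a hostile URL passes the check.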

Description-Behavior Mismatch

Low
Confidence
81% confidence
Finding
The API docs expose person-attribute inference such as age/sex estimation that is not declared in the skill description. This is a privacy-sensitive capability, and omitting it prevents meaningful consent and can cause the skill to be used for biometric or demographic profiling beyond expected robot-control functions.

Context-Inappropriate Capability

Medium
Confidence
89% confidence
Finding
The skill allows displaying images and playing audio from arbitrary URLs, which introduces unrestricted remote content retrieval into a robot-control skill. This expands the attack surface for tracking, malicious or inappropriate content delivery, and unexpected network access that is not necessary for core local robot operation.

Context-Inappropriate Capability

High
Confidence
99% confidence
Finding
These helper methods execute shell commands built from attacker-influenced filenames for audio and video playback. In a skill that may expose these helpers through natural-language-triggered actions, this becomes more dangerous because seemingly harmless media requests can be turned into arbitrary command execution on a hardware-controlling device.

Intent-Code Divergence

Medium
Confidence
94% confidence
Finding
The constructor accepts a user-supplied serial port but ignores it and always opens /dev/ttyAMA0. In a hardware-control library, this can direct commands to the wrong attached device, defeating caller safety assumptions and potentially causing unintended robot movement or modification on a different serial-connected target.

Intent-Code Divergence

Medium
Confidence
96% confidence
Finding
The constructor accepts a port argument but ignores it and always opens /dev/ttyAMA0. In a hardware-control library, this can direct commands to an unintended serial device, causing unauthorized or unsafe actuation and defeating caller-side safety controls that rely on explicit device selection.
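For illustration, a constructor that honors its `port` argument might look like the sketch below. `RobotLink`, the `serial_factory` parameter, and the 115200 baud rate are assumptions made for this example; the real class and its serial settings are not shown in the report.

```python
class RobotLink:
    """Illustrative wrapper that opens exactly the port the caller asked for."""

    def __init__(self, port="/dev/ttyAMA0", baud=115200, serial_factory=None):
        if serial_factory is None:
            # pyserial is imported lazily so the class can be tested without it.
            import serial
            serial_factory = serial.Serial
        self.port = port
        # Use the caller's port rather than a hard-coded device path.
        self.ser = serial_factory(port, baud, timeout=0.5)
```

Keeping `/dev/ttyAMA0` only as the default preserves existing behavior for callers who pass nothing, while an explicit port is no longer silently discarded.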

Intent-Code Divergence

Medium
Confidence
90% confidence
Finding
The code exposes public upgrade methods that invoke a helper explicitly marked as test-stage/do-not-use and then streams an arbitrary local binary to the device. On a robot control interface, unsafe firmware flashing can brick the device or install unintended firmware, creating both availability and safety risks.

Context-Inappropriate Capability

Medium
Confidence
97% confidence
Finding
The audio-playback purpose does not require invoking a shell, yet the code routes attacker-controlled input through one. Because the URL is inserted directly into a shell command string, this creates a command injection path that can be exploited to run arbitrary OS commands under the privileges of the script.

Context-Inappropriate Capability

High
Confidence
99% confidence
Finding
The TTS helper executes a shell command containing untrusted text derived from args.target via speech_text. Because this skill already controls a physical robot, arbitrary command execution on the host increases risk substantially: an attacker could run system commands, alter files, or chain into further unsafe robot behavior.

Vague Triggers

Medium
Confidence
87% confidence
Finding
The activation triggers are overly broad and overlap with normal conversation about movement, vision, and AI, increasing the chance of accidental invocation. In a skill that can move hardware, use sensors, access cloud AI, and interact with remote content, unintended activation materially raises safety and privacy risk.

Missing User Warnings

High
Confidence
96% confidence
Finding
The skill description does not warn that commands may immediately cause physical motion and actuator operation. For a robot dog with locomotion, arm, and claw features, omission of a safety warning increases the likelihood of surprise movement, collisions, pinching, or damage to nearby people, pets, and objects.

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The skill describes camera, microphone, and AI capabilities without clear privacy or data-transmission warnings, despite documented use of cloud AI services via DashScope. This can lead users to unknowingly capture or transmit sensitive audio/images, making the issue more serious in a mobile robot with onboard sensing.

Missing User Warnings

Medium
Confidence
90% confidence
Finding
This method captures and saves a photo immediately without any built-in user confirmation, consent check, or visible pre-recording warning. In the context of a robot skill with camera access, silent image capture creates a real privacy risk even if it is not a code-execution bug.

Missing User Warnings

High
Confidence
97% confidence
Finding
The method records microphone input and writes it to disk without any explicit consent or warning mechanism. Because this skill targets a physical robot with audio capabilities, undisclosed recording is a meaningful surveillance/privacy issue and is compounded here by the separate shell-injection risk in command construction.

Missing User Warnings

High
Confidence
96% confidence
Finding
This method records video to disk programmatically without an explicit advance warning or consent gate. In a camera-equipped robot context, covert or surprising video capture materially increases privacy and safety concerns beyond a normal local utility script.
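A consent gate of the kind these capture findings call for could be as small as the following sketch. The names `confirmed` and `record_video`, and the `start_recording` callback standing in for the real capture routine, are hypothetical.

```python
def confirmed(action, ask=input):
    """Ask the operator to approve an action; only an explicit 'yes' passes."""
    answer = ask(f"About to {action}. Type 'yes' to continue: ")
    return answer.strip().lower() == "yes"


def record_video(seconds, start_recording, ask=input):
    """Start recording only after the operator has approved it."""
    if not confirmed(f"record {seconds}s of video to disk", ask):
        return False
    start_recording(seconds)
    return True
```

The same gate would apply unchanged to photo capture and microphone recording, since each only needs a different action description and callback.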

Missing User Warnings

Medium
Confidence
88% confidence
Finding
The calibration method issues a safety-relevant hardware calibration command immediately based on a simple parameter, with no confirmation, interlock, or explicit precondition checks. In a robot-control context, accidental or scripted invocation could recalibrate actuators in an unsafe state, leading to misalignment, unstable motion, or degraded control behavior.

Missing User Warnings

Medium
Confidence
91% confidence
Finding
The upgrade path sends an upgrade command and then writes an arbitrary binary file directly to hardware without strong validation, confirmation, integrity checking, or rollback protections. In this robot-control skill, misuse or accidental invocation could corrupt firmware, brick the device, or place hardware into an unsafe or undefined operating state.
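An integrity check before flashing might be sketched as below, assuming the publisher distributes a SHA-256 digest alongside each firmware image. `checked_firmware` and `upgrade` are hypothetical names, and `send` and `confirm` stand in for the device write and the operator prompt.

```python
import hashlib
import hmac


def checked_firmware(image, expected_sha256):
    """Return image unchanged if its SHA-256 matches, else raise ValueError."""
    actual = hashlib.sha256(image).hexdigest()
    if not hmac.compare_digest(actual, expected_sha256.strip().lower()):
        raise ValueError("firmware digest mismatch; refusing to flash")
    return image


def upgrade(path, expected_sha256, send, confirm):
    """Flash only a verified image, and only after the operator confirms."""
    with open(path, "rb") as fh:
        image = checked_firmware(fh.read(), expected_sha256)
    if not confirm(f"Flash {len(image)} bytes from {path}?"):
        return False
    send(image)
    return True
```

A digest check does not provide rollback, so it addresses only the validation and confirmation gaps named in the finding.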

VirusTotal

64/64 vendors flagged this skill as clean.
