Personal Guardian

Security checks across malware telemetry and agentic risk

Overview

This emergency guardian skill is openly rescue-focused, but it requests unusually broad autonomous power to record, track, notify, call, and broadcast sensitive information without sufficient controls.

Treat this as a Review item, not proven malware. Do not connect it to real microphone, camera, health, location, phone, SMS, social media, nearby broadcast, or drone integrations unless every channel is explicitly configured, public/social broadcasts are disabled by default, emergency-call authorization is verified, false-alarm cancellation is tested, and incident-data retention/deletion rules are clear.
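
The per-channel requirement above could be expressed as a default-deny policy. The following is a minimal sketch under stated assumptions: `ChannelPolicy`, its fields, and the channel names are hypothetical and do not come from the skill's code.

```python
from dataclasses import dataclass, field


@dataclass
class ChannelPolicy:
    """Default-deny policy: a channel is usable only if explicitly enabled."""
    # Hypothetical channel names such as "sms" or "social_media".
    enabled: set = field(default_factory=set)
    emergency_call_verified: bool = False

    def allows(self, channel: str) -> bool:
        if channel == "emergency_call":
            # Calls additionally require verified authorization.
            return channel in self.enabled and self.emergency_call_verified
        return channel in self.enabled


# Nothing public is enabled unless the operator lists it.
policy = ChannelPolicy(enabled={"sms", "emergency_call"})
```

With this shape, public or social broadcasts stay off unless an operator names them, and an emergency call stays blocked until authorization is recorded as verified.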

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (18)

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The self-learning and knowledge-base update capability expands the skill beyond immediate emergency response into post-incident model adaptation. In a safety-critical system handling highly sensitive medical, location, and incident data, this creates risk of unauthorized retention, profiling, and secondary use without clear minimization, review, or consent boundaries.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: Broadcasting structured rescue information to social media, nearby devices, drones, and public systems is an excessive dissemination of highly sensitive personal data. Even if framed as emergency assistance, this creates a large privacy and abuse surface, especially if triggers are mistaken, spoofed, or maliciously induced.

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The L5 path advertises emergency outreach to nearby devices, social media, and drone/low-altitude rescue networks, but the implementation only appends a log entry and prints status text. In a life-safety agent, this creates a dangerous false sense of rescue escalation: operators, users, or downstream systems may believe critical outreach occurred when it did not, delaying real intervention.
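
One way to avoid a false sense of escalation is to have every channel report what actually happened. This is a sketch only: the `Delivery` enum and the function names are hypothetical, not taken from the skill.

```python
from enum import Enum


class Delivery(Enum):
    SENT = "sent"
    FAILED = "failed"
    NOT_IMPLEMENTED = "not_implemented"


def broadcast_to_drone_network(message: str) -> Delivery:
    # Hypothetical stub: no real integration exists, so say so explicitly.
    return Delivery.NOT_IMPLEMENTED


def escalate(message: str, channels: dict) -> dict:
    """Run each channel and report the per-channel outcome."""
    return {name: send(message) for name, send in channels.items()}


def outreach_succeeded(results: dict) -> bool:
    # Only a confirmed SENT counts as outreach.
    return any(status is Delivery.SENT for status in results.values())
```

A caller that checks `outreach_succeeded` can then fall back to another channel instead of logging a success that never occurred.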

Intent-Code Divergence

Medium

Confidence: 86%
Finding: The CLI exposes a --simulate flag that promises no real messaging or calls, but execute() does not honor that flag and still performs the normal notification/call flow. In this file those actions are currently mocked by prints/logs, but the control-flow contract is unsafe: once real integrations are added, simulation mode could accidentally trigger real emergency outreach.
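
A sketch of a control flow that honors the flag is shown here. The names `--simulate` and `execute()` come from the finding; the signature of `execute`, the `place_call` helper, and the log format are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="guardian CLI (sketch)")
    parser.add_argument(
        "--simulate",
        action="store_true",
        help="plan actions without sending messages or placing calls",
    )
    return parser


def place_call(number: str) -> str:
    # Hypothetical side-effecting integration point.
    return f"called {number}"


def execute(actions: list, simulate: bool) -> list:
    """Return a log of actions; in simulate mode nothing is dispatched."""
    log = []
    for number in actions:
        if simulate:
            log.append(f"[simulated] would call {number}")
        else:
            log.append(place_call(number))
    return log


args = build_parser().parse_args(["--simulate"])
log = execute(["emergency-contact"], simulate=args.simulate)
```

Checking the flag at the single point where side effects are dispatched keeps the contract intact when real integrations replace the mocks.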

Context-Inappropriate Capability

Medium

Confidence: 87%
Finding: The escalation plan includes '社交媒体紧急广播（已授权账号）' ("social media emergency broadcast (authorized accounts)"), which expands disclosure of a user's emergency status and location beyond responders and trusted contacts. In this skill's context, the engine is explicitly autonomous and assumes the user cannot respond, so broad public broadcasting can expose highly sensitive location and health data without contemporaneous consent and can create stalking, doxxing, or reputational harm.

Missing User Warnings

Medium

Confidence: 97%
Finding: The README explicitly advertises autonomous recording, location tracking, contact-chain notification, emergency calling, and drone-network linkage, but provides no warning about privacy, consent, false positives, data handling, or real-world side effects. In this skill’s context, that omission is more dangerous because the agent claims 'fully autonomous decision-making power' while assuming the user is unable to respond, which materially increases the risk of unauthorized surveillance, unwanted disclosure of sensitive location/medical status, and accidental emergency-service activation.

Vague Triggers

High

Confidence: 97%
Finding: The metadata-level activation condition allows the skill to act when the device 'automatically detects danger signals' without precise thresholds or user-governed boundaries. In a skill that can autonomously record, locate, notify contacts, and call authorities, ambiguous triggers materially increase the risk of false activation and privacy-invasive actions.
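
A precise trigger could be stated as a named signal, a threshold, and a minimum duration. The sketch below assumes a hypothetical `TriggerRule` type; the numbers are placeholders for illustration, not validated clinical values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriggerRule:
    signal: str
    threshold: float
    min_duration_s: float


def should_trigger(rule: TriggerRule, value: float, duration_s: float) -> bool:
    """Fire only when the reading meets the threshold for long enough."""
    return value >= rule.threshold and duration_s >= rule.min_duration_s


# Placeholder numbers for illustration only.
hr_rule = TriggerRule(signal="heart_rate_bpm", threshold=180.0, min_duration_s=30.0)
```

Rules written this way can be reviewed, tuned by the user, and tested against recorded false alarms.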

Vague Triggers

High

Confidence: 97%
Finding: The phrase 'AI predicts an emergency' is overly broad and delegates major coercive actions to an unspecified inference process. Because the skill assumes the user is unable to respond and permits non-consensual action escalation, vague prediction language is especially dangerous in this context.

Natural-Language Policy Violations

Medium

Confidence: 75%
Finding: The emergency call script appears fixed to a single language/locale and does not document regional routing, terminology, or localization requirements. In emergency contexts, wrong language or region-specific assumptions can delay response, cause miscommunication, or misroute critical calls.

Missing User Warnings

High

Confidence: 98%
Finding: For CRITICAL/SEVERE states, the engine automatically calls emergency services, broadcasts to contacts, shares location, and starts continuous or hour-long recording with no visible confirmation, warning, or policy guardrails in code. In a safety-critical assistant that defaults to 'user incapacitated,' false positives can trigger non-consensual surveillance, emergency dispatch abuse, and large-scale privacy exposure at exactly the moment the system has maximum authority.
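
One possible guardrail is a short cancellation window before any dispatch. This is a minimal sketch using the standard library's `threading.Event`; the `confirm_or_cancel` function and the grace period are hypothetical.

```python
import threading


def confirm_or_cancel(cancel_event: threading.Event, grace_seconds: float) -> bool:
    """Return True if dispatch should proceed, False if cancelled in time."""
    # Event.wait returns True when the event is set before the timeout.
    cancelled = cancel_event.wait(timeout=grace_seconds)
    return not cancelled
```

A user-facing "I'm OK" button would call `cancel_event.set()`. The appropriate grace period depends on severity and is a policy decision, not something this sketch settles.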

Missing User Warnings

High

Confidence: 97%
Finding: The data collection plan includes audio recording, physiological streams, accelerometer data, GPS, and environment photos, all of which are highly sensitive and can reveal health status, surroundings, and bystanders. Because this module is designed for autonomous execution without interaction, the absence of visible disclosure, minimization, or consent checks materially increases privacy and misuse risk.

Missing User Warnings

Medium

Confidence: 95%
Finding: This file orchestrates disclosure of highly sensitive personal data, including name, precise location, situation details, emergency escalation, social posting, nearby device broadcast, and drone-network requests, across multiple channels. In the context of an emergency agent that assumes the user may be unable to respond and can act autonomously, the lack of explicit consent gates, data-minimization controls, channel-specific authorization, and visible user warning/confirmation materially increases privacy, safety, and abuse risks if triggered accidentally, maliciously, or on false positives.

Missing User Warnings

Medium

Confidence: 93%
Finding: The code stores the entire raw sensor payload in TriggerEvent.raw_data, which can include precise location, biometric readings, and other highly sensitive emergency-context data. In this skill's context, those events are likely to be forwarded to downstream autonomous rescue components, logged, or displayed, so retaining full raw data without minimization or explicit privacy controls materially increases exposure if the event is persisted, transmitted, or accessed by other components.
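
Minimization could be applied at the point where the event is built. In this sketch only the names `TriggerEvent` and `raw_data` come from the finding; the class body, the `from_payload` constructor, and the allowlist are assumptions.

```python
from dataclasses import dataclass, field

# Hypothetical allowlist; the fields a real event needs are an assumption.
ALLOWED_FIELDS = {"event_type", "severity", "timestamp"}


def minimize(payload: dict) -> dict:
    """Keep only allowlisted keys so location and biometrics are dropped."""
    return {k: v for k, v in payload.items() if k in ALLOWED_FIELDS}


@dataclass
class TriggerEvent:
    # Stand-in for the skill's TriggerEvent, reduced to the one reported field.
    raw_data: dict = field(default_factory=dict)

    @classmethod
    def from_payload(cls, payload: dict) -> "TriggerEvent":
        return cls(raw_data=minimize(payload))
```

An allowlist fails closed: a new sensor field is excluded until someone decides it is needed downstream.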

Natural-Language Policy Violations

Medium

Confidence: 90%
Finding: This module emits safety-critical descriptions and alerts only in Chinese, with no locale negotiation or documented restriction that all operators and end users must understand Chinese. In an emergency-response skill that may autonomously trigger rescue actions while the user is presumed incapacitated, language mismatch can cause operators, caregivers, or users to misunderstand warnings, delay intervention, or take incorrect action.

Natural-Language Policy Violations

Medium

Confidence: 95%
Finding: The alert messages generated by the monitoring logic are Chinese-only and are likely to be consumed by downstream autonomous decision engines, responders, or debugging operators. In this skill's context, the lack of language choice is more dangerous than in a normal app because misunderstood alerts can directly affect emergency escalation, medical interpretation, and the timeliness of rescue actions.
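
Locale selection for alerts could be as simple as a keyed message catalog with a documented fallback. The catalog, its keys, and `render_alert` below are hypothetical and only illustrate the idea.

```python
# Hypothetical message catalog; keys and wording are illustrative.
MESSAGES = {
    "en": {"fall_detected": "Fall detected"},
    "zh": {"fall_detected": "检测到跌倒"},
}


def render_alert(key: str, preferred: list, default: str = "en") -> str:
    """Return the alert in the first supported preferred locale, else the default."""
    for locale in preferred:
        if locale in MESSAGES and key in MESSAGES[locale]:
            return MESSAGES[locale][key]
    return MESSAGES[default][key]
```

Downstream decision engines would do better to consume the stable key (`fall_detected`) than the rendered text, so that language choice affects only human readers.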

Ssd 3

High

Confidence: 99%
Finding: Granting 'full autonomous decision-making' plus blanket access to all device data creates an overbroad authority model inconsistent with least privilege. In combination with default assumptions of user incapacity, it authorizes invasive collection and disclosure without sufficient scope limits, making misuse and accidental overreach far more likely.

Ssd 4

High

Confidence: 96%
Finding: The escalation model systematically broadens disclosure from contacts to authorities to wider audiences, normalizing progressively larger releases of sensitive incident data. Without strict necessity checks and channel-specific safeguards, a false positive or adversarial trigger can quickly become a mass privacy breach.

Ssd 3

High

Confidence: 99%
Finding: Publishing rescue information to nearby devices, social channels, and public broadcast systems in plain language exposes location, condition, and distress status to uncontrolled recipients. This is especially dangerous for stalking, domestic violence, coercion, or reputational harm, and the surrounding skill context explicitly enables automated dissemination at scale.

VirusTotal

64/64 vendors flagged this skill as clean.
