Google Voice Caller

Security checks across malware telemetry and agentic risk

Overview

This skill can automate Google Voice calls, but it also ships Google session cookies and records call audio without clear disclosure or consent controls.

Review carefully before installing. Do not use the bundled Google cookies; revoke any exposed Google sessions and require each user to supply their own securely stored credentials. Only run this with explicit confirmation before every call, verify the number and message/audio content, and treat any saved recordings under /tmp as sensitive data subject to consent and retention rules.
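
The "explicit confirmation before every call" recommendation could be enforced with a small gate like the one below. This is a minimal sketch, not code from the skill: `confirm_call` and its `ask` parameter are hypothetical names, and the rule that the operator must re-type the number is an assumption chosen so that a reflexive "y" cannot authorize a call.

```python
from typing import Callable


def confirm_call(number: str, message: str,
                 ask: Callable[[str], str] = input) -> bool:
    """Return True only if the operator re-types the exact destination number.

    `ask` is injectable so the gate can be tested without a terminal.
    """
    prompt = (
        f"About to call {number} and play: {message!r}\n"
        "Re-type the number to confirm, or press Enter to cancel: "
    )
    reply = ask(prompt).strip()
    # Anything other than an exact match (including "y" or "yes") cancels.
    return reply == number
```

A caller would place the call only when `confirm_call(...)` returns `True`, and would run the gate again for each new call rather than caching the answer.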

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (12)

Description-Behavior Mismatch

Medium

Confidence: 90%
Finding: The script records call audio to a predictable temporary location and then exports it as an MP3, which materially expands the skill from call placement/TTS injection into call interception and retention. In a telephony context, undisclosed recording is highly sensitive because it can capture private conversations and may violate consent laws, making the capability more dangerous than ordinary media processing.

Intent-Code Divergence

Low

Confidence: 76%
Finding: The comment says the post-call step only transcodes audio, but the code also prints a machine-readable FINISHED_FILE path, which exposes the existence and location of a captured recording to downstream automation. This mismatch increases the risk of silent collection or chaining into other workflows that move or process sensitive call recordings without clear operator awareness.

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The file contains live-looking Google and Google Voice authentication cookies, including high-value session tokens such as SID, SAPISID, and __Secure-* cookies. Bundling these in a skill enables direct reuse of an authenticated Google account session, which goes far beyond call automation and can allow account impersonation, access to Google Voice data, and potentially broader Google account actions.
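
A reviewer could check a package for this kind of material before installing it. The sketch below is illustrative only: it assumes the cookie file is a JSON array of objects with a `name` key (a common browser-automation export shape, not confirmed for this skill), and `find_sensitive_cookies` is a hypothetical helper. The names it looks for are the ones cited in the finding: `SID`, `SAPISID`, and the `__Secure-` prefix.

```python
import json

# Cookie names called out in the finding; the list is not exhaustive.
SENSITIVE_NAMES = {"SID", "SAPISID"}
SENSITIVE_PREFIX = "__Secure-"


def find_sensitive_cookies(cookies_json: str) -> list:
    """Return the sorted names of session-grade cookies found in a JSON export."""
    cookies = json.loads(cookies_json)
    hits = []
    for cookie in cookies:
        name = cookie.get("name", "")
        if name in SENSITIVE_NAMES or name.startswith(SENSITIVE_PREFIX):
            hits.append(name)
    return sorted(hits)
```

Any non-empty result for a file shipped inside a skill package is a reason to treat the corresponding session as exposed and revoke it.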

Description-Behavior Mismatch

High

Confidence: 98%
Finding: The declared purpose is Google Voice call automation, but the bundled data provides authenticated Google account session state for google.com and voice.google.com, which materially exceeds the stated scope. This mismatch makes the skill more dangerous because anyone using or inspecting the package could leverage the cookies for broader authenticated access, indicating covert credential inclusion rather than a narrowly scoped integration artifact.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The code records call audio in the browser, converts it to Base64, and writes it to /tmp/gv_recorded_incoming.webm, but the skill description only mentions placing calls with TTS or audio injection. This hidden data collection materially changes the privacy and compliance posture of the skill because users and callees may be recorded without informed disclosure.
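
If recording were kept at all, a safer shape would gate it on explicit consent and avoid the fixed `/tmp/gv_recorded_incoming.webm` path. The following is a sketch under stated assumptions, not the skill's code: `save_recording` and the `consent_given` flag are hypothetical, and the Base64 input simply mirrors the encoding step described in the finding. `tempfile.mkstemp` creates the file with an unpredictable name, readable and writable only by the creating user on POSIX systems.

```python
import base64
import os
import tempfile
from typing import Optional


def save_recording(b64_audio: str, consent_given: bool) -> Optional[str]:
    """Persist call audio only with recorded consent; return the path or None."""
    if not consent_given:
        # No consent: drop the audio rather than writing it anywhere.
        return None
    # mkstemp picks a random, non-guessable name instead of a fixed /tmp path.
    fd, path = tempfile.mkstemp(prefix="gv_call_", suffix=".webm")
    with os.fdopen(fd, "wb") as fh:
        fh.write(base64.b64decode(b64_audio))
    return path
```

The returned path would still need a retention policy; the caller is expected to delete the file once it is no longer needed.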

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: Recording and persisting call audio is not necessary for the stated purpose of automating Google Voice calls with TTS or local audio injection. This adds a surveillance capability that can capture sensitive conversations and creates legal, privacy, and retention risk far beyond the declared functionality.

Vague Triggers

Medium

Confidence: 91%
Finding: Broad triggers such as common words for 'call' or regex-like everyday phrases can cause accidental activation during normal conversation. In this skill's context, unintended activation is especially risky because the resulting action places real phone calls, potentially causing privacy harm, charges, or harassment from misfires.

Vague Triggers

Medium

Confidence: 88%
Finding: The documentation encourages a highly ambiguous natural-language trigger that blends with ordinary assistant usage. Because this skill performs an external real-world action, unclear activation boundaries materially raise the chance of unintended calls and unauthorized message delivery.

Vague Triggers

Medium

Confidence: 88%
Finding: The English example uses a generic conversational phrase that could easily occur in routine dialogue, making accidental triggering plausible. In a calling skill with automated speech injection, such ambiguity can lead to unintended outbound communications and associated cost and reputational impact.
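
One way to narrow the activation boundary described in the trigger findings is to require an explicit command prefix and a fully qualified number instead of matching conversational phrases. The sketch below is an assumption-laden illustration: the `/gv-call` prefix and `parse_trigger` are hypothetical and are not the skill's real trigger syntax, and the number pattern is only a rough E.164-style check (`+`, then 7 to 15 digits).

```python
import re

# Hypothetical explicit command: "/gv-call +<number> <message>"
TRIGGER = re.compile(r"^/gv-call\s+(\+[1-9]\d{6,14})\s+(.+)$")


def parse_trigger(text: str):
    """Return (number, message) for an explicit command, else None."""
    match = TRIGGER.match(text.strip())
    if match is None:
        # Ordinary conversation, or a malformed command: do not activate.
        return None
    return match.group(1), match.group(2)
```

With this shape, a sentence such as "can you call me later?" never activates the skill, because it lacks both the prefix and a qualified number.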

Missing User Warnings

Medium

Confidence: 87%
Finding: The script invokes telephony and media-processing steps that can place calls, inject audio, and process recordings without any explicit notice about whether the call is being recorded or where captured audio goes. In the context of an automated Google Voice caller, lack of disclosure materially increases privacy, legal, and abuse risk because operators may unknowingly record third parties or automate deceptive calls.

Missing User Warnings

High

Confidence: 98%
Finding: The skill silently starts MediaRecorder via getUserMedia, records audio during the call, and saves it to disk without any explicit warning, consent prompt, or confirmation flow. In many jurisdictions, recording calls without proper notice or consent can violate law and expose users to serious privacy and regulatory consequences.

Missing User Warnings

Medium

Confidence: 86%
Finding: The code reads Google Voice session cookies from disk and injects them into the browser session, enabling account access through bearer-style session material. While local cookie loading can be legitimate automation, failing to disclose this sensitive credential handling increases the chance of insecure storage, accidental reuse, or misuse of another user's authenticated session.
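
Local cookie loading can be made less error-prone by refusing files that other accounts on the machine can read. This is a minimal POSIX-only sketch: `load_user_cookies` is a hypothetical helper, the path is supplied by the user rather than bundled with the skill, and the JSON format is an assumption about how the cookies are stored.

```python
import json
import os
import stat


def load_user_cookies(path: str) -> list:
    """Load a user-supplied cookie file, rejecting group/world-accessible files."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        # Any group or other permission bit means the session could leak locally.
        raise PermissionError(
            f"{path} has mode {oct(mode)}; restrict it to the owner (chmod 600)"
        )
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)
```

The check does not replace a proper secret store; it only catches the most common mistake of leaving session material readable by other local users.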

VirusTotal

66/66 vendors reported this skill as clean.
