Amber — Phone-Capable Voice Agent

Security checks across malware telemetry and agentic risk

Overview

Amber is a real phone-agent skill, but it stores and reuses sensitive call, contact, transcript, and calendar data in ways that deserve careful review before installation.

Install only if you are prepared to operate Amber as a sensitive phone system: it can place calls, process audio through Twilio/OpenAI, store transcripts and CRM memory, read/write calendars, expose call history locally, and optionally export Apple Contacts. Before production use, disable outbound calling or Contacts sync if not needed, add caller notice/consent language, protect .env and local data files, set retention/deletion practices, and verify calendar creation requires explicit confirmation in every client path.

SkillSpector

By NVIDIA

Vulnerability Patterns

Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (54)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 93%
Finding: The skill advertises significant capabilities including network access, environment-variable use, webhook handling, and external provider integration, but the metadata does not declare corresponding permissions. This creates a transparency and policy-enforcement gap: users or a hosting platform may approve the skill under a less-privileged trust model than the implementation actually requires, increasing the chance of overbroad access to secrets and networked actions.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: The description frames the skill mainly as adding phone capabilities, but the document discloses a much broader feature set: calendar writes, CRM persistence, transcript processing, message forwarding, dashboard serving, Apple Contacts integration, and an MCP server with multiple tools. This mismatch can cause users and automated review systems to underestimate the data access and action surface, leading to uninformed installation of a skill that processes sensitive communications and personal data.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: The skill instructs the assistant to build persistent CRM profiles, store volunteered personal details, and maintain long-lived context notes across calls, which goes beyond a narrow phone-assistant function and creates a privacy/data-minimization risk. In a voice-call context, callers may reasonably expect conversational handling or message taking, not silent profiling and retention of behavioral and personal-history notes for future use.

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The calendar skill is explicitly documented with `capabilities: ["read", "act"]` and supports `action: "create"`, yet its manifest sets `confirmation_required: false` and the notes say event creation needs no confirmation. That creates a real safety gap: an LLM or misheard caller request could create or modify calendar data without an explicit user confirmation, despite the architecture elsewhere stating side-effecting actions should be programmatically gated.
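One way to close this gap is a programmatic gate in front of the calendar handler. The following is a minimal sketch, not Amber's actual code: the function name `gate_calendar_action`, the `GateResult` type, and the read-only action names are hypothetical, and it assumes `create` is the only action that writes.

```python
from dataclasses import dataclass

# Assumption: "create" is the only side-effecting calendar action.
SIDE_EFFECTING_ACTIONS = {"create"}


@dataclass
class GateResult:
    allowed: bool
    reason: str


def gate_calendar_action(action: str, confirmed: bool) -> GateResult:
    """Allow read-only actions; require explicit confirmation for writes."""
    if action not in SIDE_EFFECTING_ACTIONS:
        return GateResult(True, "read-only action")
    if confirmed:
        return GateResult(True, "explicit confirmation received")
    return GateResult(False, "confirmation required before creating an event")
```

The `confirmed` flag should come from a separate, explicit user step rather than from the model's own tool arguments, otherwise the gate can be bypassed by the same misheard request it is meant to stop.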

Intent-Code Divergence

High

Confidence: 97%
Finding: The file-level privacy claim says event titles and details are never returned, but the create path returns raw CLI output and writes title, location, and notes to call logs. That creates a clear confidentiality mismatch: downstream components or operators may rely on the privacy guarantee while sensitive calendar content is still exposed through responses and logging.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The create-event path logs full event content, including title, location, and notes, into call logs. Call logs are often retained, exported, or accessible to broader systems than the runtime path, so this unnecessarily expands exposure of sensitive personal or business calendar data.
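A common mitigation is to redact sensitive fields before an event reaches the call log. The sketch below assumes the event is a flat dictionary with `title`, `location`, and `notes` keys; those key names and the helper `redact_event_for_log` are illustrative, not taken from the skill's code.

```python
# Assumption: these are the keys that carry sensitive calendar content.
SENSITIVE_EVENT_FIELDS = ("title", "location", "notes")


def redact_event_for_log(event: dict) -> dict:
    """Return a copy of the event that is safe to write to call logs."""
    return {
        key: ("[redacted]" if key in SENSITIVE_EVENT_FIELDS else value)
        for key, value in event.items()
    }


if __name__ == "__main__":
    event = {"title": "Oncology consult", "start": "2025-01-10T09:00", "notes": "bring scans"}
    print(redact_event_for_log(event))
```

Timing fields such as `start` pass through unchanged, so the log still shows that an event was created without recording what it was about.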

Scope Creep

High

Confidence: 96%
Finding: The design specifies dynamic external CRM pull/push over the network while elsewhere declaring `permissions.network: false`. That mismatch is dangerous because it can lead reviewers and operators to trust a non-networked permission model while later enabling code paths that exfiltrate caller PII to third-party services.
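A reviewer could catch this kind of mismatch mechanically by comparing declared permissions against the capabilities a code path actually uses. This is a rough sketch under stated assumptions: the manifest is modeled as a plain dictionary of booleans, and both the helper name `undeclared_capabilities` and the capability labels other than `network` are hypothetical.

```python
from typing import Dict, Set


def undeclared_capabilities(declared: Dict[str, bool], used: Set[str]) -> Set[str]:
    """Return capabilities that are used but not declared as enabled."""
    return {cap for cap in used if not declared.get(cap, False)}


if __name__ == "__main__":
    manifest_permissions = {"network": False}  # mirrors `permissions.network: false`
    observed = {"network"}                     # e.g. an external CRM pull/push
    print(undeclared_capabilities(manifest_permissions, observed))
```

A non-empty result means the manifest understates what the skill does and should be corrected before review or approval.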

Description-Behavior Mismatch

Medium

Confidence: 79%
Finding: The skill metadata says it provides phone capabilities, but this file implements persistent CRM storage, contact search, tagging, and interaction history. That mismatch creates a transparency and consent problem: operators may enable the skill expecting call functionality while unknowingly granting persistent collection and retrieval of personal contact data.

Intent-Code Divergence

Medium

Confidence: 89%
Finding: The server explicitly enables `Access-Control-Allow-Origin: *` for all responses, including the unauthenticated `POST /api/sync` endpoint and static dashboard content. While the service binds only to loopback, any website visited by a local user could issue cross-origin requests to `127.0.0.1` and interact with the dashboard from the browser, weakening the intended local-only protection and potentially exposing call metadata or triggering processing actions.
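A typical fix is to replace the wildcard with an origin allowlist and to refuse state-changing requests from other origins. The sketch below is illustrative only: the port `8787` and the function names are assumptions, and the real server is not necessarily written in Python.

```python
from typing import Dict, Optional

# Assumption: the dashboard is served from these loopback origins (port is hypothetical).
ALLOWED_ORIGINS = {"http://127.0.0.1:8787", "http://localhost:8787"}
STATE_CHANGING_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def cors_headers(origin: Optional[str]) -> Dict[str, str]:
    """Echo the Origin back only when it is allowlisted; never send "*"."""
    if origin is not None and origin in ALLOWED_ORIGINS:
        return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    return {}


def should_reject(method: str, origin: Optional[str]) -> bool:
    """Reject state-changing requests that carry a non-allowlisted Origin."""
    if method.upper() not in STATE_CHANGING_METHODS:
        return False
    return origin is not None and origin not in ALLOWED_ORIGINS
```

Dropping the CORS header only stops a foreign page from reading the response; the browser may still send a simple `POST`, which is why the sketch also rejects the request itself. Requests without an `Origin` header, such as local command-line clients, pass this check, so an endpoint like `POST /api/sync` would still need authentication.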

Intent-Code Divergence

Medium

Confidence: 97%
Finding: This demo wizard explicitly states that validation is disabled and no real APIs are called, yet it presents fake success messages for credential validation, ngrok detection, .env creation, and install/build steps. That is a security-relevant deception issue because users may believe secrets were safely validated or stored when they were not, leading to unsafe deployment decisions and accidental credential mishandling.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The script exports a broad set of sensitive contact fields beyond what is needed to place calls, including emails, relationships, postal addresses, notes, job titles, and organization data. Even though the file is described as local-only and opt-in, concentrating this extra personal data into a JSON cache increases privacy risk, expands the blast radius of local compromise, and creates unnecessary exposure if other parts of the skill later read, log, package, or transmit the cache.
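Data minimization here could be as simple as a field allowlist applied before the cache is written. This sketch assumes each exported contact is a dictionary and that only a name and phone numbers are needed to place calls; the key names `name` and `phones` and the helper `minimize_contact` are hypothetical.

```python
# Assumption: only these fields are needed to place a call.
CALL_FIELDS = ("name", "phones")


def minimize_contact(contact: dict) -> dict:
    """Keep only allowlisted fields; everything else is dropped from the cache."""
    return {key: contact[key] for key in CALL_FIELDS if key in contact}


if __name__ == "__main__":
    raw = {
        "name": "Alex Example",
        "phones": ["+15555550100"],
        "emails": ["alex@example.com"],
        "notes": "gate code 1234",
    }
    print(minimize_contact(raw))
```

An allowlist is used rather than a blocklist so that any field added to the export later is excluded by default.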

Description-Behavior Mismatch

Medium

Confidence: 92%
Finding: The runtime automatically upserts contacts and logs call interactions from transcripts after every call, which expands the skill from telephony into persistent CRM data processing. In a skill advertised as providing phone capabilities, this hidden persistence materially changes the privacy and security posture because sensitive caller data is retained and enriched without an explicit, narrowly scoped user action in this flow.

Description-Behavior Mismatch

Low

Confidence: 90%
Finding: The code writes raw incoming webhook events and transcripts to disk, but the skill description only promises phone capabilities. This understatement is security-relevant because operators may deploy it without realizing it stores sensitive call metadata and conversation content locally.

Context-Inappropriate Capability

High

Confidence: 96%
Finding: This code sends full call transcripts to an external model to extract caller name, company, email, and personal context for future retention, then writes that data back into CRM. That is a substantive privacy-sensitive profiling feature beyond generic phone operation, and if enabled without explicit consent and strict data-governance controls it can create significant compliance, confidentiality, and misuse risk.

Intent-Code Divergence

High

Confidence: 93%
Finding: The docstring states that security relies on explicit caller-side binary allowlists, but this function accepts any `file` path and any argument vector without enforcing such a policy. In a skill that exposes phone-related actions through natural language, this becomes more dangerous because upstream prompt handling mistakes could let user-controlled input reach a generic local process launcher, enabling arbitrary local command execution via executable selection even without a shell.
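Enforcing the allowlist inside the launcher, rather than trusting every caller to do it, would remove this gap. The following is a minimal sketch, assuming an absolute-path allowlist passed in by configuration; `check_allowed`, `run_allowed`, and the example path are hypothetical and do not describe the skill's actual function.

```python
import os
import subprocess
from typing import Sequence, Set


def check_allowed(file: str, allowlist: Set[str]) -> str:
    """Return the normalized path if it is absolute and allowlisted, else raise."""
    normalized = os.path.normpath(file)
    if not os.path.isabs(normalized) or normalized not in allowlist:
        raise PermissionError(f"binary not allowlisted: {file!r}")
    return normalized


def run_allowed(file: str, args: Sequence[str], allowlist: Set[str]) -> subprocess.CompletedProcess:
    """Launch an allowlisted binary without a shell; arguments are passed as a list."""
    binary = check_allowed(file, allowlist)
    return subprocess.run([binary, *args], shell=False, capture_output=True, text=True, check=False)
```

For example, with `allowlist = {"/usr/bin/say"}` a request to run `/bin/sh` raises `PermissionError` before any process starts. This check matches paths exactly and does not resolve symlinks, so the allowlisted files themselves must be protected from modification.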

Description-Behavior Mismatch

Medium

Confidence: 89%
Finding: The MCP server exposes calendar, CRM, contacts lookup, and call-log access in addition to telephony, which materially expands the data and action surface beyond the phone-focused description. That mismatch can mislead operators and downstream policy systems, causing them to grant the skill access in contexts where they would not have consented to broad personal-data access.

Description-Behavior Mismatch

Medium

Confidence: 87%
Finding: `executeSkillHandler` dynamically loads and runs local skill handlers directly, bypassing the bridge and expanding the trust boundary to local binaries/files. In a skill advertised as a phone assistant, this hidden ability to execute non-phone handlers and access local data creates a capability-surprise problem that increases the chance of unauthorized data use.

Context-Inappropriate Capability

Medium

Confidence: 90%
Finding: The `contacts_lookup` tool reads a local Apple Contacts export and returns rich personal details including phone numbers, emails, relationships, addresses, and notes. This is sensitive address-book data exposure that goes beyond what is strictly necessary for placing calls, especially because it can disclose third-party information not obviously required by the skill’s stated purpose.

Missing User Warnings

High

Confidence: 98%
Finding: The instructions explicitly tell the assistant to silently save caller name, email, company, and other personal details to CRM without announcing it or asking permission. Covert collection and storage of PII in a telephony workflow increases privacy, consent, and compliance risk, especially because callers may not know they are being profiled beyond the immediate call handling purpose.

Missing User Warnings

High

Confidence: 98%
Finding: The skill directs the assistant to maintain personal context notes, log every call outcome, and update records at the end of every call without a clear notice to the caller. This creates a continuous surveillance/profile-building mechanism that can accumulate sensitive or intimate details over time and be reused in later interactions without informed consent.

Vague Triggers

Medium

Confidence: 78%
Finding: The function description says the skill can 'look up calendar events, check availability, or create a new calendar entry' with broad natural-language scope and weak trigger constraints. In a voice-agent context, overly broad tool semantics increase the chance the model invokes the tool on ambiguous utterances, which is especially risky because the same tool also performs side effects.

Missing User Warnings

High

Confidence: 98%
Finding: This is a direct unsafe design choice: the skill can create calendar entries, a user-data-modifying action, while the spec states no confirmation is required. In a voice workflow, transcription errors, prompt misalignment, or adversarial caller phrasing could cause unwanted calendar modifications with no hard confirmation barrier.

Missing User Warnings

Medium

Confidence: 87%
Finding: The document explicitly describes forwarding live call audio to OpenAI Realtime and also mentions call monitoring, but it does not include any warning about caller consent, handling of call content, metadata exposure, or retention practices. In a telephony skill, this omission is security/privacy-relevant because users may deploy call flows that process sensitive conversations without understanding disclosure and compliance requirements.

Missing User Warnings

Medium

Confidence: 91%
Finding: The roadmap includes 'Add call logging and monitoring' without any corresponding statement about what data is logged, how long it is retained, who can access it, or whether sensitive call content is included. That creates a real privacy risk because implementers may add broad logging by default in a voice assistant handling potentially sensitive phone conversations.

Vague Triggers

Medium

Confidence: 83%
Finding: Describing the skill as usable with 'one natural-language prompt' encourages broad, ambiguous invocation without clear trigger boundaries or confirmation requirements at the skill-selection layer. In an agent environment, this can cause accidental activation of telephony or related side-effecting workflows from loosely related user requests.

VirusTotal

66/66 vendors flagged this skill as clean.