Sogni Gen

Security checks across malware telemetry and agentic risk

Overview

This skill is an image and video generation integration that uses Sogni credentials and optional local media in ways that are mostly disclosed and aligned with its purpose.

Install only if you are comfortable giving this skill access to your Sogni credentials and letting it send chosen prompts and reference media, including face photos for photobooth, to Sogni. Review or change the credentials, last-render, inbound media, and downloads paths; set SOGNI_MCP_SAVE_DOWNLOADS=0 if you do not want MCP result copies saved locally.
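One way to apply the download setting is to put it in the environment the MCP server is launched with. The following is a minimal Python sketch, assuming the server reads SOGNI_MCP_SAVE_DOWNLOADS from its process environment at startup. The helper name build_server_env is hypothetical, and the launch command is omitted because it is not stated here.

```python
import os


def build_server_env(base_env=None):
    """Return a copy of the environment with local MCP result saving turned off."""
    env = dict(os.environ if base_env is None else base_env)
    # "0" is the value named in the guidance above for not saving result copies.
    env["SOGNI_MCP_SAVE_DOWNLOADS"] = "0"
    return env


env = build_server_env({"PATH": "/usr/bin"})
print(env["SOGNI_MCP_SAVE_DOWNLOADS"])
```

The returned mapping can be passed as the env argument of subprocess.Popen when starting the server, which leaves the parent process environment unchanged.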

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (14)

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The skill exposes local media and file-management-style utilities, such as listing recent local media files, extracting frames from local video files, and concatenating local clips. These capabilities go beyond the advertised image/video generation purpose and expand the agent’s access to local user data and filesystem workflows, increasing the risk of unintended data exposure or misuse.

Vague Triggers

Medium

Confidence: 90%
Finding: The skill advertises very broad activation phrases like 'draw', 'generate', and 'make a video/animate', which can cause the agent to invoke this skill in contexts the user did not specifically intend. Because the skill can read local credential/config files and process local media, accidental activation increases the chance of unintended file access, external API calls, and privacy-sensitive media handling.

Vague Triggers

Medium

Confidence: 88%
Finding: The photobooth guidance says to use face-transfer mode whenever a user 'mentions photobooth' or wants a stylized portrait, without requiring clear confirmation that the provided image is the user's and that they consent to biometric-style face processing. In a skill that handles local files and face images, overly broad triggering can lead to privacy-invasive processing of sensitive images.

Vague Triggers

Medium

Confidence: 94%
Finding: The mandatory rule to always activate photobooth mode whenever the word 'photobooth' appears is an unsafe keyword trigger with no contextual safeguards. This can force face-transfer processing even in ambiguous, quoted, hypothetical, or non-consensual contexts, raising privacy and misuse risks for biometric-like image manipulation.
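The kind of contextual safeguard this finding says is missing could look like the Python sketch below. It is an illustration only, not the skill's actual logic: the function name and the has_face_image and user_confirmed flags are hypothetical, and ignoring quoted mentions is just one possible heuristic.

```python
import re

# Text inside straight or curly double quotes is treated as a quoted mention.
_QUOTED = re.compile(r'"[^"]*"|“[^”]*”')
_KEYWORD = re.compile(r"\bphoto\s?booth\b", re.IGNORECASE)


def should_use_photobooth(message, has_face_image, user_confirmed):
    """Gate face-transfer mode on more than the bare keyword."""
    # Drop quoted spans so a quoted or discussed "photobooth" does not trigger.
    unquoted = _QUOTED.sub(" ", message)
    mentions = _KEYWORD.search(unquoted) is not None
    # Require an attached face image and explicit consent as well as the keyword.
    return mentions and has_face_image and user_confirmed
```

With this gate, a message that only quotes the word, or a request made without explicit confirmation, would not start face-transfer processing.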

Missing User Warnings

Medium

Confidence: 91%
Finding: The face-transfer feature processes face photos but does not prominently warn about privacy, consent, retention, or the sensitivity of biometric-related imagery. In this context, the skill sends media to an external decentralized network and may save local outputs, so the missing warnings meaningfully increase the risk of unauthorized or unexpected processing of sensitive personal data.

Missing User Warnings

Medium

Confidence: 92%
Finding: The manifest accepts a username/password and injects them into the server process environment, but the user-facing metadata does not clearly warn that these credentials will be read by the local extension and potentially transmitted to Sogni services for authentication. Environment variables are commonly exposed to subprocesses, logs, crash reports, and local inspection, so handling secrets this way increases the chance of credential disclosure beyond the minimum necessary scope.
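To illustrate how the exposure to subprocesses described here can be narrowed, the Python sketch below builds a child-process environment with the credential variables removed. The variable names SOGNI_USERNAME and SOGNI_PASSWORD are assumptions made for the example, since the names the manifest actually uses are not given here, and env_without_secrets is a hypothetical helper.

```python
import os

# Hypothetical variable names; substitute the names the manifest really injects.
SECRET_NAMES = ("SOGNI_USERNAME", "SOGNI_PASSWORD")


def env_without_secrets(env=None, secret_names=SECRET_NAMES):
    """Return a copy of the environment that omits the credential variables."""
    source = os.environ if env is None else env
    return {key: value for key, value in source.items() if key not in secret_names}
```

Passing the result as the env argument when spawning helper processes keeps the secrets in the one process that needs them for authentication.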

Missing User Warnings

Medium

Confidence: 90%
Finding: Generated media URLs are automatically fetched and, by default, written to a local downloads directory, but the tool descriptions do not warn users about this behavior. Silent local persistence can surprise users, create unwanted artifacts on disk, and leak sensitive prompts or generated content into the local filesystem.

Missing User Warnings

Medium

Confidence: 91%
Finding: The video-generation tool can cause outputs to be automatically downloaded and saved locally, but this is not communicated in the tool description. Because video files are larger and often more sensitive, undisclosed persistence increases privacy and storage risks in normal use.

Missing User Warnings

Medium

Confidence: 90%
Finding: The image editing tool accepts local images and may also auto-download and save result media locally without user-facing disclosure. This combination of local input handling and undisclosed output persistence can expose private content and create data-retention surprises.

Missing User Warnings

Medium

Confidence: 90%
Finding: The photobooth tool processes face images, which are especially sensitive biometric-related inputs, yet the description omits that generated portraits may be automatically saved locally. In this context, undisclosed persistence is more dangerous because it involves personal likeness data and potentially private portraits.

Missing User Warnings

Medium

Confidence: 88%
Finding: The skill automatically reads Sogni credentials from local files and environment variables without any explicit user-facing notice at the point of use. In an agent-skill context, silent credential harvesting from host state is sensitive because users may not realize the skill is accessing persisted secrets to authenticate outbound actions.

Missing User Warnings

Medium

Confidence: 94%
Finding: The skill can read local files as media buffers and then upload those buffers to the external Sogni service as reference images, audio, video, or context, but it does not prominently warn users that local content may leave the machine. In an agent setting, this creates a real privacy and data-exfiltration risk if a user supplies sensitive local paths or the agent chooses them implicitly.

Ssd 3

Medium

Confidence: 90%
Finding: Error payloads include the full prompt, timestamp, cwd, and related context, which can expose sensitive natural-language inputs in logs, stdout captures, orchestration layers, or downstream telemetry. In agent environments, prompts often contain private user data, making this a meaningful data-leak channel even without traditional code-execution impact.
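A possible mitigation for this leak channel is to redact the payload before it is emitted. The Python sketch below assumes the payload is a dictionary whose keys are named prompt and cwd. Those key names mirror the fields listed in the finding but are not confirmed, and redact_error_payload is a hypothetical helper.

```python
import hashlib


def redact_error_payload(payload):
    """Return a copy of an error payload with the prompt text and cwd removed."""
    redacted = {key: value for key, value in payload.items() if key not in ("prompt", "cwd")}
    prompt = payload.get("prompt")
    if prompt is not None:
        # Keep only coarse, non-reversible hints that still help correlate errors.
        redacted["prompt_chars"] = len(prompt)
        digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        redacted["prompt_sha256_prefix"] = digest[:12]
    return redacted
```

The hash prefix lets two errors from the same prompt be matched without storing the text. Very short or guessable prompts could still be recovered by brute force, so dropping the hash entirely is the stricter option.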

Ssd 3

Medium

Confidence: 95%
Finding: The skill persists full prompts, reference paths, URLs, context images, and related generation metadata to a last-render file in the user's home directory. This creates a durable local record of potentially sensitive content that other local users, tools, backups, or later agent actions could read and misuse.
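Exposure to other local users can be reduced by creating such a file with owner-only permissions. The Python sketch below shows this on POSIX systems. It is not the skill's code, write_private_json is a hypothetical helper, and it does not address backups or later reads by the same user's tools.

```python
import json
import os


def write_private_json(path, data):
    """Write JSON to path so that only the owning user can read or write it."""
    # Create with mode 0o600 so the file is never briefly readable by others.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
    # Tighten the mode in case the file already existed with looser permissions.
    os.chmod(path, 0o600)
```

On Windows, os.chmod only toggles the read-only flag, so this sketch gives no comparable protection there.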

VirusTotal

66/66 vendors flagged this skill as clean.
