senseaudio-video-gen

Security checks across malware telemetry and agentic risk

Overview

This is a coherent video-generation skill, but it needs review because it can combine website capture, local credentials or browser sessions, and external AI/API calls with limited privacy containment.

Install only if you are comfortable with a media-production tool that can call external AI/media providers and store generated project artifacts locally. Avoid using real browser profiles or exported cookies unless the site is approved for capture; prefer a dedicated temporary profile. Use `--llm none` or saved/offline inputs for sensitive material, and do not enable vision audit or site-asset downloads for private or authenticated pages unless you have confirmed the data can leave the machine.

SkillSpector

By NVIDIA

Vulnerability Patterns

Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (13)

Tainted flow: 'url' from pathlib.Path.read_text (line 2399, file read) → urllib.request.urlopen (network output)

High

Category: Data Flow
Content:
    def download_url(url: str, output: str) -> str:
        out = Path(output)
        out.parent.mkdir(parents=True, exist_ok=True)
        with urllib.request.urlopen(url, timeout=300) as resp:
            out.write_bytes(resp.read())
        return str(out)
Confidence: 97%
Finding: with urllib.request.urlopen(url, timeout=300) as resp:
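
One way to narrow this flow is to validate a URL read from disk before it reaches `urllib.request.urlopen`. The sketch below is a minimal illustration, not the skill's code: `ALLOWED_HOSTS`, `is_allowed_url`, and the example host are hypothetical, and an HTTPS-only allowlist is an assumed policy.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would define its own hosts.
ALLOWED_HOSTS = {"media.example.com"}


def is_allowed_url(url: str, allowed_hosts: set[str] = ALLOWED_HOSTS) -> bool:
    """Return True only for HTTPS URLs whose host is explicitly allowlisted."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
    except ValueError:
        # Malformed URLs (for example, a bad IPv6 literal) are rejected.
        return False
    return parts.scheme == "https" and host in allowed_hosts
```

A caller would check `is_allowed_url(url)` before downloading and skip or report anything that fails. Note that `urlopen` follows redirects by default, so a check on the initial URL alone does not constrain where a redirect may lead.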

Tainted flow: 'req' from pathlib.Path.read_bytes (line 992, file read) → urllib.request.urlopen (network output)

High

Category: Data Flow
Content:
        headers=chat_headers(api_key, provider),
    )
    try:
        with urllib.request.urlopen(req, timeout=120) as resp:
            raw = resp.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        body = exc.read().decode("utf-8", errors="replace")
Confidence: 91%
Finding: with urllib.request.urlopen(req, timeout=120) as resp:

Tainted flow: 'req' from pathlib.Path.read_bytes (line 992, file read) → urllib.request.urlopen (network output)

High

Category: Data Flow
Content:
        },
    )
    try:
        with urllib.request.urlopen(req, timeout=180) as resp:
            raw = resp.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        body = exc.read().decode("utf-8", errors="replace")
Confidence: 96%
Finding: with urllib.request.urlopen(req, timeout=180) as resp:

Tainted flow: 'req' from pathlib.Path.read_bytes (line 992, file read) → urllib.request.urlopen (network output)

High

Category: Data Flow
Content:
    for attempt in range(3):
        req = urllib.request.Request(url, headers=headers_map)
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                raw = resp.read()
                charset = resp.headers.get_content_charset() or "utf-8"
                return raw.decode(charset, errors="replace")
Confidence: 95%
Finding: with urllib.request.urlopen(req, timeout=timeout) as resp:

Context-Inappropriate Capability

Medium

Confidence: 98%
Finding: The skill explicitly supports loading an external Chrome profile and imported cookies to capture websites, including authenticated or region-gated pages. That can expose session cookies, private account content, and internal pages to subsequent processing, screenshots, asset downloads, and third-party AI uploads, which is a real privacy and data-exfiltration risk in an agent context.
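
The exposure described here shrinks if captures run against a throwaway browser profile rather than a real one. The following is a minimal standard-library sketch: `temporary_profile` is a hypothetical helper, and how the directory is handed to the browser (for example via Chrome's `--user-data-dir` flag) is an assumption about the capture setup, not something this report confirms.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def temporary_profile(prefix: str = "capture-profile-") -> Iterator[Path]:
    """Yield an empty profile directory that is deleted on exit."""
    with tempfile.TemporaryDirectory(prefix=prefix) as tmp:
        # The directory starts empty, so no existing cookies or sessions
        # are available to the capture.
        yield Path(tmp)
```

Anything the browser writes during the capture, including cookies set by the visited site, is removed when the `with` block exits.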

Context-Inappropriate Capability

Medium

Confidence: 94%
Finding: The code can automatically download and archive up to 80 third-party assets discovered on inspected sites. In practice this may collect copyrighted, private, or authenticated resources into the local workspace without clear scoping or consent, expanding the blast radius of any capture beyond what the user expected.
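
Scoping could take the form of a filter applied before any asset is fetched. The sketch below assumes a policy of keeping only assets that share the inspected page's scheme and network location, with a small explicit cap; `select_assets` and its default of 10 are hypothetical and not taken from the skill.

```python
from urllib.parse import urlsplit


def select_assets(page_url: str, asset_urls: list[str], max_assets: int = 10) -> list[str]:
    """Keep same-origin HTTP(S) assets only, in order, up to max_assets."""
    page = urlsplit(page_url)
    origin = (page.scheme, page.netloc.lower())
    selected: list[str] = []
    for url in asset_urls:
        if len(selected) >= max_assets:
            break
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https"):
            continue  # skip data:, file:, and other non-web schemes
        if (parts.scheme, parts.netloc.lower()) != origin:
            continue  # skip third-party hosts
        if url not in selected:
            selected.append(url)
    return selected
```

Comparing the full network location is deliberately strict: a different port or subdomain counts as a different origin and is skipped.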

Context-Inappropriate Capability

Medium

Confidence: 97%
Finding: The vision-audit path uploads local render frames and screenshots to OpenRouter-compatible remote models. Those frames may contain proprietary website content, internal project material, or authenticated captures, so this is a direct external transmission channel for sensitive visual data.
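
A gate in front of the upload step is one way to keep sensitive frames on the machine. The sketch below is illustrative only: `frames_for_upload`, the `allow_upload` opt-in, and the idea of marking capture directories as private are all assumptions, not features of the skill.

```python
from pathlib import Path
from typing import Iterable


def frames_for_upload(
    frames: Iterable[Path],
    allow_upload: bool,
    private_roots: Iterable[Path] = (),
) -> list[Path]:
    """Return the frames cleared for remote review; empty without opt-in."""
    if not allow_upload:
        return []
    roots = [Path(root).resolve() for root in private_roots]
    cleared: list[Path] = []
    for frame in frames:
        resolved = Path(frame).resolve()
        # Drop any frame stored under a directory marked as private.
        if any(resolved.is_relative_to(root) for root in roots):
            continue
        cleared.append(Path(frame))
    return cleared
```

With `allow_upload=False` nothing is returned, so the safe behavior is the default and sending frames off the machine requires a deliberate choice. `Path.is_relative_to` needs Python 3.9 or later.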

Natural-Language Policy Violations

Medium

Confidence: 94%
Finding: The README states the skill defaults generated video copy to Chinese unless the user explicitly requests another language. That can override user expectations and silently transform content into a different language, which is a security-relevant integrity and consent issue in an agent setting because outputs may become misleading, unusable, or socially engineered for an unintended audience. The surrounding context makes this more plausible because the skill is designed to autonomously generate customer-facing media from briefs and websites.

Missing User Warnings

Medium

Confidence: 91%
Finding: The documentation explicitly says the CLI will read `SENSEAUDIO_API_KEY` from local workspace credential files if the environment variable is absent, but it provides no warning about secure storage, least-privilege access, or the risk of unintended credential discovery from other local contexts. In an agent skill that may process repositories and workspaces, this increases the chance of silent credential use and accidental cross-project secret exposure.
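
A stricter alternative to the file fallback is to accept the key from the environment only and fail loudly when it is missing. The sketch below shows that policy; `api_key_from_env` is a hypothetical helper and does not reflect how the CLI is actually implemented.

```python
import os
from typing import Mapping, Optional


def api_key_from_env(
    env: Optional[Mapping[str, str]] = None,
    name: str = "SENSEAUDIO_API_KEY",
) -> str:
    """Return the key from the environment or raise; never search files."""
    source = os.environ if env is None else env
    key = source.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; refusing to search credential files")
    return key
```

Failing instead of falling back means a key stored for one project cannot be picked up silently while the tool runs in another workspace.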

Missing User Warnings

Medium

Confidence: 94%
Finding: The one-pass website pipeline describes ingesting site evidence, screenshots, audio, and optional vision review by external LLM/media providers, but it does not warn users that website content, media, and derived artifacts may be transmitted off-box. Because this skill is designed to inspect websites, repos, and media, the omission can lead users to unknowingly send proprietary or sensitive content to third parties.

Natural-Language Policy Violations

Medium

Confidence: 87%
Finding: Defaulting generated copy to `zh-CN` without explicit user choice can create misleading or unusable output, especially in workflows that may publish captions, narration, or marketing content automatically. While not a classic security flaw, it can cause integrity and trust issues by producing content in an unintended language.
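
Making the output language an explicit choice removes the silent default. The sketch below uses `argparse` with a required option; the `--language` flag and the `video-gen-example` program name are hypothetical and are not part of the skill's documented CLI.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that refuses to run without an explicit language."""
    parser = argparse.ArgumentParser(prog="video-gen-example")
    # Hypothetical flag: required, so no language is chosen implicitly.
    parser.add_argument(
        "--language",
        required=True,
        help="language tag for generated copy, e.g. en-US or zh-CN",
    )
    return parser
```

Running the parser without `--language` exits with a usage error instead of falling back to `zh-CN`, so captions and narration are never produced in a language nobody selected.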

Missing User Warnings

Medium

Confidence: 95%
Finding: The workflow explicitly tells operators to provide a public `--audio-url` so narration or music can influence generation, but it does not warn that making audio publicly accessible may expose sensitive voice content, proprietary media, or private URLs to third parties. In a media-production skill, users may upload drafts, client recordings, or internal assets, so normalizing public hosting without privacy guidance creates a real data-exposure risk.

Natural-Language Policy Violations

Medium

Confidence: 84%
Finding: Defaulting generated copy to `zh-CN` without explicit user opt-in can cause unintended language output, misleading narration, and incorrect captions, especially for users expecting another language. While this is primarily a product safety and usability issue rather than a classic security flaw, in this context it can still cause accidental publication of incorrect or inaccessible content.

VirusTotal

VirusTotal findings are pending for this skill version.
