Security audit

Video Podcast Maker

Security checks across malware telemetry and agentic risk

Overview

The skill mostly matches a video-production workflow, but it includes several high-impact local actions and unrelated deployment files that need review before installation.

Review carefully before installing. Use it only in a disposable or well-backed-up project workspace, do not run the bundled Onyx deployment unless you intentionally want that separate stack, avoid granting Docker socket access, confirm before any git pull or process kill, and do not send confidential scripts or prompts to cloud TTS/search providers without checking their data policies.

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Output Handling: Unvalidated Output Injection, Cross-Context Output, Unbounded Output
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (24)

Context-Inappropriate Capability

Medium

Confidence: 98%
Finding: The delete flow combines a user-controlled output directory with a user-controlled reference ID and passes the joined path to shutil.rmtree without constraining it to an expected base directory. An attacker can supply values such as a crafted ref_id or alternate output root to delete arbitrary directories accessible to the process, causing destructive data loss.
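
One common mitigation for this class of issue is to resolve the joined path and refuse anything that does not sit strictly inside the expected base directory. The sketch below is illustrative only: the function names `resolve_reference_dir` and `delete_reference` are hypothetical and do not come from the audited skill, and it assumes the output root itself is taken from trusted configuration rather than from the caller.

```python
import shutil
from pathlib import Path


def resolve_reference_dir(output_root: str, ref_id: str) -> Path:
    """Return the reference directory, or raise if it escapes output_root."""
    base = Path(output_root).resolve()
    target = (base / ref_id).resolve()
    # Reject the base itself and anything outside it, such as "../x"
    # or an absolute path that replaces the base when joined.
    if target == base or base not in target.parents:
        raise ValueError(f"refusing to delete outside {base}: {target}")
    return target


def delete_reference(output_root: str, ref_id: str) -> None:
    """Delete one reference directory only after the containment check."""
    shutil.rmtree(resolve_reference_dir(output_root, ref_id))
```

With this check, a `ref_id` such as `../other` or `/etc` raises `ValueError` instead of reaching `shutil.rmtree`.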

Context-Inappropriate Capability

High

Confidence: 99%
Finding: The compose stack includes a code-interpreter container that runs as root and mounts the host Docker socket, which effectively grants control over the host Docker daemon. Any compromise of that service can lead to container escape by design, arbitrary container creation, host filesystem access, and theft of secrets from sibling containers. In a video-podcast-making skill, this capability is not clearly justified and substantially increases risk.

Context-Inappropriate Capability

Medium

Confidence: 95%
Finding: The workflow instructs killing any process bound to port 3000 using `lsof -ti:3000 | xargs kill -9`, which is unrelated to the specific project process and can terminate arbitrary local services. In a skill context, this exceeds the minimum scope needed for video generation and can disrupt development servers, dashboards, or other user workloads without consent.
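
A narrower alternative is for the workflow to keep a handle to the process it started and stop only that process, rather than whatever happens to be listening on the port. The following is a minimal sketch under that assumption; `start_dev_server` and `stop_dev_server` are hypothetical names, and the audited skill may launch its server differently.

```python
import subprocess
from typing import List, Optional


def start_dev_server(cmd: List[str]) -> subprocess.Popen:
    """Start the project's own server and keep the handle for later cleanup."""
    return subprocess.Popen(cmd)


def stop_dev_server(proc: subprocess.Popen, timeout: float = 5.0) -> Optional[int]:
    """Stop only the process this workflow started.

    Tries a graceful terminate first and escalates to kill only if the
    process has not exited within `timeout` seconds.
    """
    if proc.poll() is not None:
        return proc.returncode  # already exited, nothing to do
    proc.terminate()
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        return proc.wait()
```

Because the handle identifies one specific child process, unrelated services bound to port 3000 are never touched.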

Missing User Warnings

Low

Confidence: 92%
Finding: The README advertises web research and content gathering but does not disclose that user prompts, URLs, or gathered content may be sent to external websites or services. In an agent-driven workflow, this can lead users to unknowingly expose sensitive topics or proprietary material during automated research.

Missing User Warnings

Medium

Confidence: 97%
Finding: The README lists multiple third-party TTS and AI providers but does not warn that narration scripts, prompts, subtitles, or other content may be transmitted to those external APIs for processing. This creates a real data-handling risk because users may assume generation is local while actually sending sensitive content to cloud vendors with separate retention and logging practices.

Missing User Warnings

Low

Confidence: 89%
Finding: The README states that the tool auto-learns and reuses user style preferences across projects, but it does not clearly warn that this creates persistent stored preference data. In shared environments or multi-project workflows, silent persistence can expose behavioral preferences or project metadata beyond user expectations.

Missing User Warnings

Low

Confidence: 92%
Finding: The README advertises web research and content collection as part of the workflow, but it does not clearly warn users that their prompts, topics, URLs, and possibly collected content may be transmitted to third-party services or APIs. In a coding-agent skill, this matters because users may provide sensitive topics or proprietary source material, and the omission can lead to unintended data disclosure.

Missing User Warnings

Low

Confidence: 95%
Finding: The README states that the system automatically learns and persists user style preferences, but it does not clearly disclose retention, storage location, contents, or how to disable or delete that data. Persistent profiling is a privacy/security concern because preference files may reveal behavioral patterns and can accumulate sensitive creative or operational metadata over time.

Missing User Warnings

Medium

Confidence: 95%
Finding: The skill instructs the agent to perform a daily remote `git fetch` automatically and to offer `git pull` updates, which causes network access to an external repository before the user has explicitly consented to contacting that remote. In a skill context, silent remote access expands the trust boundary and can leak repository metadata or expose users to supply-chain risk if the upstream is compromised.

Missing User Warnings

Medium

Confidence: 92%
Finding: The prerequisites check tests for `AZURE_SPEECH_KEY` presence by reading a sensitive environment variable without first warning the user that secret-bearing environment state will be inspected. Even though it only checks existence, secret access patterns should be minimized and explicitly disclosed because environment variables often hold high-value credentials.

Missing User Warnings

Medium

Confidence: 89%
Finding: The CLI performs immediate destructive deletion of reference data with no confirmation, dry-run, or safety interlock. In an automation-oriented skill, accidental or manipulated invocation can permanently remove local assets and associated metadata more easily than intended.
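
The kind of safety interlock this finding asks for might look like the sketch below: deletion defaults to a dry run and requires an explicit confirmation before anything is removed. The names `ask_on_terminal` and `delete_tree` are hypothetical, and the actual CLI's arguments are not shown in this report.

```python
import shutil
from pathlib import Path
from typing import Callable


def ask_on_terminal(prompt: str) -> bool:
    """Default confirmation: anything other than 'y' counts as no."""
    return input(f"{prompt} [y/N] ").strip().lower() == "y"


def delete_tree(
    target: Path,
    *,
    dry_run: bool = True,
    confirm: Callable[[str], bool] = ask_on_terminal,
) -> bool:
    """Delete `target` only when dry_run is off and the user confirms.

    Returns True if the directory was actually removed.
    """
    if not target.is_dir():
        raise FileNotFoundError(target)
    if dry_run:
        print(f"[dry-run] would delete {target}")
        return False
    if not confirm(f"Permanently delete {target}?"):
        return False
    shutil.rmtree(target)
    return True
```

Making the dry run the default means an agent that invokes the command without extra flags cannot remove data by accident.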

Missing User Warnings

Medium

Confidence: 97%
Finding: The README tells users to fetch a shell script over the network and execute it immediately, which is a classic supply-chain risk. If the remote source, repository, branch, or transport path is ever compromised, users could run attacker-controlled code on their systems without reviewing it first.
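
A safer pattern is to download the script to disk, compare it against a checksum published through a separate channel, and run it only after review. The sketch below covers just the verification step; `verify_sha256` and `save_verified_script` are hypothetical helpers, and it assumes the publisher provides a SHA-256 digest, which this report does not confirm.

```python
import hashlib
import hmac
from pathlib import Path


def verify_sha256(data: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 digest of data matches expected_hex."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_hex.strip().lower())


def save_verified_script(data: bytes, expected_hex: str, dest: Path) -> Path:
    """Write the downloaded script to dest only if its checksum matches.

    The file is saved for inspection and is never executed here.
    """
    if not verify_sha256(data, expected_hex):
        raise ValueError("checksum mismatch: refusing to save script")
    dest.write_bytes(data)
    return dest
```

Separating download, verification, and execution gives the user a chance to read the script before it runs.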

Missing User Warnings

Medium

Confidence: 96%
Finding: This section assigns insecure default credentials for sensitive services, including the OpenSearch admin password and S3/MinIO access keys. Default secrets are commonly known, frequently scanned for, and can enable unauthorized access if the deployment is exposed or later misconfigured. The skill context does not justify shipping known credentials for a media-generation workflow.

Missing User Warnings

Medium

Confidence: 98%
Finding: The database service defaults POSTGRES_PASSWORD to a trivial value of 'password'. If the database becomes reachable from another container, developer workstation, or accidentally exposed port, an attacker could authenticate with predictable credentials and access or modify application data.

Missing User Warnings

Medium

Confidence: 99%
Finding: MinIO is configured with well-known default root credentials ('minioadmin'/'minioadmin'), which are widely abused when object storage is reachable. Because this stack also uses MinIO for file storage, compromise could expose uploaded content, generated assets, and potentially other sensitive application data.
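
For the three default-credential findings above, one possible guard is a startup check that refuses to proceed when a secret is missing, too short, or equal to a known default. This is a sketch only: `require_secret` is a hypothetical helper, the minimum length of 16 is an arbitrary assumption, and the deny-list contains just the two default values named in the findings.

```python
import os
from typing import Mapping

# Default values named in the findings above; a real deny-list would be longer.
KNOWN_DEFAULTS = {"password", "minioadmin"}


def require_secret(
    name: str,
    env: Mapping[str, str] = os.environ,
    min_length: int = 16,
) -> str:
    """Return the secret stored under `name`, or raise if it looks unsafe."""
    value = env.get(name, "")
    if not value:
        raise RuntimeError(f"{name} is not set")
    if value.lower() in KNOWN_DEFAULTS:
        raise RuntimeError(f"{name} uses a well-known default value")
    if len(value) < min_length:
        raise RuntimeError(f"{name} is shorter than {min_length} characters")
    return value
```

For example, calling `require_secret("POSTGRES_PASSWORD")` before starting the stack would fail fast when the variable still holds `password`.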

Missing User Warnings

Medium

Confidence: 85%
Finding: The reset command blindly copies the template over the live preferences file, which can irreversibly destroy user configuration and learned settings if triggered accidentally or ambiguously. In an agent context, natural-language commands can be misinterpreted, so destructive file overwrite behavior without confirmation increases the risk of unintended data loss.
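
One way to make such a reset recoverable is to copy the live file to a timestamped backup before overwriting it. The sketch below assumes the reset is a plain file copy, as the finding describes; `reset_prefs` and the backup naming scheme are hypothetical.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def reset_prefs(template: Path, prefs: Path) -> Optional[Path]:
    """Overwrite prefs with template, keeping a backup of the old file.

    Returns the backup path, or None if there was nothing to back up.
    """
    backup = None
    if prefs.exists():
        stamp = time.strftime("%Y%m%d-%H%M%S")
        backup = prefs.with_name(f"{prefs.name}.{stamp}.bak")
        shutil.copy2(prefs, backup)  # preserve the learned settings
    shutil.copy2(template, prefs)
    return backup
```

Returning the backup path lets the agent tell the user exactly where the previous preferences were saved.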

Missing User Warnings

Medium

Confidence: 87%
Finding: The skill auto-creates and later migrates `user_prefs.json` without explicit user confirmation, causing persistent local state changes. Silent writes to configuration files can surprise users, overwrite expected workflow assumptions, and normalize unauthorized persistence in an agent workflow.

Missing User Warnings

Medium

Confidence: 84%
Finding: The workflow automatically performs WebSearch/WebFetch research, which may transmit user topics or sensitive request context to external services without a clear warning. In a research step this is plausible functionality, but lack of disclosure can expose confidential ideas, internal project names, or personal data.

Missing User Warnings

High

Confidence: 97%
Finding: Force-killing any process on port 3000 without warning is a concrete destructive action against the local environment. Because the command uses `kill -9` and does not verify ownership or purpose of the process, it can terminate unrelated applications and cause data loss or service interruption.

Missing User Warnings

Medium

Confidence: 93%
Finding: This backend sends user-provided text chunks to Alibaba DashScope's remote TTS service via SpeechSynthesizer.streaming_call without any indication in this file that users are informed their content leaves the local environment. In a podcast-generation skill, inputs may contain sensitive drafts, private notes, or proprietary material, so undisclosed third-party transmission creates a real privacy and data-governance risk even if the code is functioning as intended.

Missing User Warnings

Medium

Confidence: 92%
Finding: This backend sends raw text chunks and a user identifier to a third-party TTS service, which can expose sensitive prompts, names, or proprietary content if users are unaware of the transfer. In a video-podcast generation skill, large amounts of user-provided script content are likely to be transmitted, so the privacy impact is real even if the behavior is functionally expected.

Missing User Warnings

Medium

Confidence: 89%
Finding: The code sends chunk text to ElevenLabs, which may contain user-provided or sensitive content, without any visible consent, warning, minimization, or policy gate in this component. In a podcast-generation skill, prompts and scripts can include proprietary, personal, or confidential material, so silent third-party transmission creates a real privacy and data-governance risk.

Missing User Warnings

Medium

Confidence: 94%
Finding: The code sends raw text chunks to Google Cloud TTS over the network along with an API key, but there is no indication in this component of user consent, disclosure, redaction, or policy gating for potentially sensitive content. In a skill that may process arbitrary user-supplied podcast scripts or learned content from external sources, this can expose private or regulated data to a third-party service unexpectedly.
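
The four TTS backend findings above share a remedy: a policy gate that runs before any text leaves the machine. A minimal sketch follows, assuming the caller tracks which providers the user has approved; `send_with_consent`, the provider labels, and the `send` callback are hypothetical stand-ins for the backends' real upload calls.

```python
from typing import Callable, Iterable, Set


def send_with_consent(
    provider: str,
    chunks: Iterable[str],
    send: Callable[[str], None],
    consented_providers: Set[str],
) -> int:
    """Send text chunks to a cloud TTS provider only if the user approved it.

    Returns the number of chunks sent. Nothing is transmitted when the
    provider is missing from consented_providers.
    """
    if provider not in consented_providers:
        raise PermissionError(
            f"user has not approved sending content to {provider!r}"
        )
    count = 0
    for chunk in chunks:
        send(chunk)
        count += 1
    return count
```

Keeping the check in one shared function means every backend is covered by the same rule instead of each reimplementing it.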

Unvalidated Output Injection

High

Category: Output Handling
Content:
    for pf in part_files:
        f.write(f"file '{os.path.basename(pf)}'\n")
    concat_result = subprocess.run(
        ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", concat_list, "-c", "copy", output_wav],
        capture_output=True, text=True, cwd=args.output_dir)
    if concat_result.returncode != 0:
Confidence: 80%
Finding: subprocess.run( ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", concat_list, "-c", "copy", output_wav], capture_output=True, text=True, cwd=args.output
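
The finding text above is cut off, so the exact concern is not stated. One plausible reading is that file names are written into the concat list without escaping, so a name containing a single quote or a newline could alter the list that ffmpeg parses. Under that assumption, the sketch below escapes each entry using ffmpeg's quoting rule, where a quote inside a quoted string is written as `'\''`. The helper name `concat_list_line` is hypothetical.

```python
import os


def concat_list_line(path: str) -> str:
    """Build one `file '...'` line for an ffmpeg concat list.

    Only the base name is used, matching the flagged code. Line breaks
    are rejected because they would start a new directive in the list.
    """
    name = os.path.basename(path)
    if "\n" in name or "\r" in name:
        raise ValueError(f"unsafe file name: {name!r}")
    # Close the quote, emit an escaped quote, then reopen the quote.
    escaped = name.replace("'", "'\\''")
    return f"file '{escaped}'\n"
```

In the flagged loop, this would replace the direct f-string, for example `f.write(concat_list_line(pf))`.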

VirusTotal

66/66 vendors flagged this skill as clean.
