Doc Process

Security checks across malware telemetry and agentic risk

Overview

This is a useful document-processing skill, but it needs review: it can automatically run a setup script that installs Python packages and system binaries, and it handles sensitive documents.

Review setup.sh and requirements.txt before first use, and do not allow automatic setup or sudo installs unless you are comfortable changing the host environment. Use the skill only on intended documents, be careful with IDs, bank statements, medical records, and resumes, and enable timeline or expense logging only after confirming the exact destination files.

SkillSpector

By NVIDIA

Vulnerability Patterns

Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands

Findings (28)

Intent-Code Divergence

Medium

Confidence: 92%
Finding: The skill claims the timeline contains no personal data, yet it also says the log stores filenames. Filenames frequently contain names, account identifiers, case numbers, or medical references, so this can silently persist sensitive metadata contrary to the privacy promise.
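One possible mitigation, sketched here in Python under stated assumptions, is to log a stand-in for each filename instead of the filename itself. The function name `redact_filename` and the `doc-` prefix are hypothetical and do not come from the skill. A truncated hash of a short, guessable filename can still be matched by someone who already suspects the name, so this reduces casual exposure rather than guaranteeing anonymity.

```python
import hashlib
from pathlib import PurePath


def redact_filename(filename: str) -> str:
    """Return a log-safe stand-in: a short SHA-256 digest plus the extension.

    Hypothetical helper for illustration; not part of the reviewed skill.
    """
    path = PurePath(filename)
    # Hash only the final path component so directory names are never logged.
    digest = hashlib.sha256(path.name.encode("utf-8")).hexdigest()[:12]
    # Keep the (lowercased) extension so the document type remains visible.
    return f"doc-{digest}{path.suffix.lower()}"
```

The same input always maps to the same stand-in, so repeated entries for one document can still be correlated in the timeline without storing the original name.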

Description-Behavior Mismatch

Medium

Confidence: 98%
Finding: The eval file declares a different skill name ('open-claw') than the provided manifest context ('doc-process'), which means the test suite is validating the wrong capability set. This can cause incorrect deployment, misleading assurance, and policy bypass if a benign-looking manifest is paired with evals for another skill.

Intent-Code Divergence

Medium

Confidence: 96%
Finding: The guide's general rules say to never echo full passport or ID numbers and to mask middle digits when displaying them, but the example output shows a full passport number. In a document-autofill skill that routinely handles highly sensitive identity data, this contradiction can cause implementations to reveal full credentials in chat output, increasing the risk of shoulder surfing, transcript exposure, logging leakage, or downstream retention of identity documents.
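The masking rule described in this finding can be illustrated with a minimal Python sketch. The function `mask_id_number` and its defaults (two characters kept at each end) are assumptions made for illustration; the guide's actual masking format is not shown in this report.

```python
def mask_id_number(value: str, keep_start: int = 2, keep_end: int = 2) -> str:
    """Mask the middle characters of an ID number before display.

    Hypothetical helper; the number of characters kept is an assumption.
    """
    compact = value.replace(" ", "")
    # Very short values are masked entirely rather than partially revealed.
    if len(compact) <= keep_start + keep_end:
        return "*" * len(compact)
    hidden = len(compact) - keep_start - keep_end
    # Slice from an explicit index so keep_end == 0 yields an empty tail.
    tail = compact[len(compact) - keep_end:]
    return compact[:keep_start] + "*" * hidden + tail
```

Applied to a made-up value such as `X1234567`, this returns `X1****67`, which is the form an example output would need to show to stay consistent with the general rule.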

Context-Inappropriate Capability

High

Confidence: 93%
Finding: This script adds audio transcription functionality to a skill whose declared scope is document processing, creating a capability mismatch that can bypass user and platform expectations about what data types the skill handles. Even though the code itself is straightforward, out-of-scope media ingestion expands access to sensitive content such as meetings, calls, or voice notes, which is especially risky in a skill already positioned to process highly sensitive documents like IDs, bank statements, and medical records.

Description-Behavior Mismatch

Medium

Confidence: 93%
Finding: The file implements a standalone expense-log manager rather than document-intelligence processing. In an agent skill advertised for document analysis, this capability expands the skill's effective authority into persistent financial record manipulation, which can enable unauthorized creation, alteration, or deletion of local expense data if the skill is invoked in broader workflows.

Context-Inappropriate Capability

Medium

Confidence: 96%
Finding: These CLI handlers expose add, delete, edit, and export operations over expense records, giving the skill general-purpose database modification behavior unrelated to the declared document-processing role. In an agent setting, such hidden write/delete capabilities are dangerous because extracted document content or user prompts could be transformed into file mutations, resulting in silent tampering with financial logs or exfiltration via export.

Description-Behavior Mismatch

Medium

Confidence: 95%
Finding: The setup script installs and enables ffmpeg for audio transcription even though the skill metadata focuses on document processing and does not disclose audio capabilities. This expands the skill's effective capability surface beyond what a reviewer or user would reasonably expect, which can undermine consent, auditing, and policy controls around media handling.

Vague Triggers

High

Confidence: 89%
Finding: The trigger phrases are extremely broad, including common expressions like 'analyze this' and 'what is this,' which can cause the skill to activate in unintended contexts. In a skill with Bash, file read/write, and optional package installation, overbroad activation increases the chance of surprising document access or script execution from ambiguous user requests.

Vague Triggers

Medium

Confidence: 85%
Finding: The ambiguous fallback path routes unclear requests into document categorization, which still moves the interaction toward reading user files after only minimal clarification. In a sensitive-data skill, weak intent disambiguation can lead to accidental processing of personal, legal, financial, or medical documents the user did not clearly intend to submit for analysis.

Missing User Warnings

Medium

Confidence: 96%
Finding: The skill instructs automatic execution of `bash skills/doc-process/setup.sh` with no prompting, and that setup installs Python packages plus system binaries via package managers. Automatic shell-based installation materially expands host-side risk: it can modify the environment, pull remote code, and perform privileged or semi-privileged changes without informed user consent.
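A consent gate in front of the setup step might look like the following Python sketch. The command path is taken from the finding above; the function name `run_setup_if_approved`, the required reply `install`, and the injectable `runner` parameter are assumptions added for illustration and testing.

```python
import subprocess

# Command quoted in the finding; everything else in this sketch is hypothetical.
SETUP_COMMAND = ["bash", "skills/doc-process/setup.sh"]


def run_setup_if_approved(answer: str, runner=subprocess.run) -> bool:
    """Run the setup script only after an explicit, unambiguous approval.

    Returns True if setup was started, False if the user did not approve.
    """
    # Require a specific word rather than a generic "yes" or "ok".
    if answer.strip().lower() != "install":
        return False
    # check=True makes a failing setup raise instead of passing silently.
    runner(SETUP_COMMAND, check=True)
    return True
```

Passing the runner as a parameter keeps the gate testable without touching the host, and makes it easy to substitute a dry-run that only prints the command.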

Vague Triggers

Medium

Confidence: 86%
Finding: The trigger examples include broad everyday phrasing such as 'what is this' and 'analyze this', which can cause the skill to activate on unrelated user content. In a document-processing skill that handles sensitive records, over-broad triggering increases the chance of unintended processing of private data or execution in the wrong context.

Missing User Warnings

Medium

Confidence: 93%
Finding: This guide instructs processing highly sensitive financial documents containing account details, balances, transaction history, and potentially identifying information, but it provides no privacy notice, minimization guidance, retention limits, or safe-handling requirements. In this skill context, that omission is more dangerous because the workflow explicitly encourages broad extraction, categorization, anomaly detection, and report generation on bank statements and credit card data, increasing the chance of over-collection, insecure handling, and unintended disclosure.

Vague Triggers

Medium

Confidence: 95%
Finding: The categorizer is designed to run when a user uploads a file with vague prompts like 'analyze this' or even with no text at all, which increases the chance of unintended activation on sensitive documents. In a document-processing skill that can handle IDs, bank statements, medical records, and contracts, overly broad triggering can lead to privacy-invasive processing without sufficiently specific user intent.

Vague Triggers

Medium

Confidence: 92%
Finding: Accepting generic confirmations such as 'yes' as authorization to read the first 1–2 pages is too ambiguous in a multi-step interaction, because the user's 'yes' may refer to another question or may not reflect informed consent to inspect sensitive content. Given this skill's document types include medical, financial, and identity records, vague consent language materially raises the risk of unauthorized reading of private data.
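One way to tighten consent, sketched in Python as an assumption rather than the skill's actual flow, is to require a reply that names both the action and the file. The function `consent_matches` and the `read <filename>` phrasing are hypothetical.

```python
def consent_matches(reply: str, filename: str) -> bool:
    """Accept only a reply that names the action and the specific file.

    Hypothetical check: a bare "yes" never counts as consent to read a file.
    """
    expected = f"read {filename}".lower()
    return reply.strip().lower() == expected
```

With this check, a "yes" that was meant for an earlier question is rejected, while "Read scan.pdf" is accepted for `scan.pdf` only.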

Natural-Language Policy Violations

Medium

Confidence: 89%
Finding: The rule 'Document not in English and user communicates in English → Document Translator' assumes translation into English without first asking the user whether they want translation, which can change the intended processing mode and expose sensitive content unnecessarily. In this context, non-English documents may be passports, medical records, or contracts, so auto-routing to translation can expand data handling beyond what the user requested.

Missing User Warnings

Medium

Confidence: 94%
Finding: The JSON export writes the full timeline to a second file in the user's Documents folder without any warning or confirmation, unlike the Markdown save flow. Even though the timeline is described as PII-minimized, it still contains filenames, timestamps, document types, and summaries, which can be sensitive and create additional persistence and disclosure risk.

Natural-Language Policy Violations

Medium

Confidence: 90%
Finding: The guide explicitly instructs the skill to translate any non-English medical document to English before summarizing, without checking user preference or obtaining consent. In a medical-document context, automatic translation can expose highly sensitive health information to additional processing, create privacy/compliance issues, and risk meaning distortion in clinical content if the user expected the original language to be preserved.

Missing User Warnings

Medium

Confidence: 93%
Finding: The guide explicitly instructs using a local script to add entries to `expenses.csv` and defaults to creating or modifying that file in the current directory if the user has not specified a path. That can cause unintended local state changes or silent data creation without clear confirmation before execution, which is risky for an agent skill that may be invoked from natural-language prompts.
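The silent default could be replaced by a check along these lines. This is a minimal Python sketch: `resolve_expense_log` and its `confirmed` flag are hypothetical names, and the choice of exception types is an assumption.

```python
from pathlib import Path
from typing import Optional


def resolve_expense_log(user_path: Optional[str], confirmed: bool) -> Path:
    """Return the expense log path only if the user named and confirmed it.

    Hypothetical guard; it refuses to fall back to ./expenses.csv.
    """
    if not user_path:
        raise ValueError("no expense log path given; ask the user for one")
    if not confirmed:
        raise PermissionError(f"writing to {user_path} was not confirmed")
    # Expand "~" so the confirmed path and the written path are the same file.
    return Path(user_path).expanduser()
```

Raising instead of defaulting forces the calling workflow to ask the user, so no file is created or modified in the current directory as a side effect of an ambiguous prompt.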

Missing User Warnings

Medium

Confidence: 92%
Finding: The guidance explicitly instructs extraction of extensive resume data, including contact details, location, employment history, education, certifications, and potentially URLs and other identifiers, but it contains no privacy notice, minimization guidance, retention limits, or instructions for handling sensitive personal data safely. In a document-processing skill, this increases the risk of over-collection, downstream disclosure, and inappropriate use of PII, especially if users submit third-party resumes or regulated data.

Missing User Warnings

Medium

Confidence: 94%
Finding: The script stores document names, types, and summaries in a persistent file under the user's home directory without a clear warning or explicit consent flow. In this skill context, summaries may contain sensitive financial, legal, medical, or identity-document metadata, so silent local retention increases privacy and data exposure risk on shared or compromised systems.

Missing User Warnings

Medium

Confidence: 95%
Finding: The save/export commands copy potentially sensitive timeline contents to user-specified Markdown or JSON files without warning about the sensitivity of the data being written. Because this skill processes receipts, contracts, bank statements, IDs, and medical records, exported filenames and summaries can leak high-value metadata into less protected locations, backups, or shared folders.

Sudo/Root Execution

Medium

Category: Privilege Escalation
Content:
  else
    echo " ensurepip unavailable — trying package manager..."
    if command -v apt-get &>/dev/null; then
      sudo apt-get install -y python3-pip
    elif command -v yum &>/dev/null; then
      sudo yum install -y python3-pip
    elif command -v dnf &>/dev/null; then
Confidence: 84%
Finding: sudo

Sudo/Root Execution

Medium

Category: Privilege Escalation
Content:
  if command -v apt-get &>/dev/null; then
    sudo apt-get install -y python3-pip
  elif command -v yum &>/dev/null; then
    sudo yum install -y python3-pip
  elif command -v dnf &>/dev/null; then
    sudo dnf install -y python3-pip
  elif command -v brew &>/dev/null; then
Confidence: 84%
Finding: sudo

Sudo/Root Execution

Medium

Category: Privilege Escalation
Content:
  elif command -v yum &>/dev/null; then
    sudo yum install -y python3-pip
  elif command -v dnf &>/dev/null; then
    sudo dnf install -y python3-pip
  elif command -v brew &>/dev/null; then
    brew install python # pip is bundled with Homebrew Python
  else
Confidence: 84%
Finding: sudo

Sudo/Root Execution

Medium

Category: Privilege Escalation
Content:
  if command -v brew &>/dev/null; then
    brew install tesseract
  elif command -v apt-get &>/dev/null; then
    sudo apt-get install -y tesseract-ocr
  elif command -v dnf &>/dev/null; then
    sudo dnf install -y tesseract
  else
Confidence: 87%
Finding: sudo
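An alternative to running privileged installs automatically is to detect the package manager and show the user the command to run themselves. The Python sketch below reuses the three tesseract commands quoted in the finding above; the names `INSTALL_HINTS` and `suggest_tesseract_install`, and the injectable `which` parameter, are hypothetical.

```python
import shutil
from typing import Callable, Optional

# Commands as quoted in the scanned setup script, in the same order.
INSTALL_HINTS = {
    "brew": "brew install tesseract",
    "apt-get": "sudo apt-get install -y tesseract-ocr",
    "dnf": "sudo dnf install -y tesseract",
}


def suggest_tesseract_install(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return an install command for the user to run, or None if none applies.

    Nothing is executed here; the caller only displays the suggestion.
    """
    for manager, command in INSTALL_HINTS.items():
        # shutil.which returns the executable's path, or None if not found.
        if which(manager):
            return command
    return None
```

Because the function only returns a string, the decision to run anything with sudo stays with the user.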

VirusTotal

64/64 vendors flagged this skill as clean.
