Doc Process

Security checks across malware telemetry and agentic risk

Overview

This is a useful document-processing skill, but it needs review because it can automatically run setup that installs packages and system binaries while handling sensitive documents.

Review setup.sh and requirements.txt before first use, and do not allow automatic setup or sudo installs unless you are comfortable changing the host environment. Use the skill only on intended documents, be careful with IDs, bank statements, medical records, and resumes, and enable timeline or expense logging only after confirming the exact destination files.

SkillSpector

By NVIDIA
Vulnerability Patterns
  • Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access
  • Excessive Agency: Unrestricted Tool Access, Autonomous Decision Making, Scope Creep
  • Trigger Abuse: Overly Broad Trigger, Shadow Command Trigger, Keyword Baiting Trigger
  • MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
  • Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Findings (28)

Intent-Code Divergence

Medium
Confidence
92% confidence
Finding
The skill claims the timeline contains no personal data, yet it also says the log stores filenames. Filenames frequently contain names, account identifiers, case numbers, or medical references, so this can silently persist sensitive metadata contrary to the privacy promise.
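One possible mitigation, sketched below, is to log an opaque label instead of the raw filename. This is a minimal illustration, not the skill's actual logging code: the function name `redact_filename` and the `salt` parameter are hypothetical, and the sketch assumes only the file extension needs to stay readable in the timeline.

```python
import hashlib
from pathlib import Path


def redact_filename(filename: str, salt: str = "") -> str:
    """Return a log-safe label: a short hash of the stem plus the extension.

    Hypothetical helper. The stem (which may hold names or account
    numbers) never reaches the log; only its truncated SHA-256 does.
    """
    path = Path(filename)
    digest = hashlib.sha256((salt + path.stem).encode("utf-8")).hexdigest()[:8]
    return f"doc-{digest}{path.suffix.lower()}"


# The same input always maps to the same label, so log entries stay linkable.
label = redact_filename("jane_doe_passport.pdf")
```

Without a secret salt, a short hash of a guessable filename can still be matched by someone who tries candidate names, so a per-user salt would be a reasonable addition.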

Description-Behavior Mismatch

Medium
Confidence
98% confidence
Finding
The eval file declares a different skill name ('open-claw') than the provided manifest context ('doc-process'), which means the test suite is validating the wrong capability set. This can cause incorrect deployment, misleading assurance, and policy bypass if a benign-looking manifest is paired with evals for another skill.

Intent-Code Divergence

Medium
Confidence
96% confidence
Finding
The guide's general rules say to never echo full passport or ID numbers and to mask middle digits when displaying them, but the example output shows a full passport number. In a document-autofill skill that routinely handles highly sensitive identity data, this contradiction can cause implementations to reveal full credentials in chat output, increasing the risk of shoulder surfing, transcript exposure, logging leakage, or downstream retention of identity documents.
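The masking rule described in this finding could be enforced with a small helper like the one below. It is a sketch under assumptions: the name `mask_id_number` and the choice to keep two characters at each end are illustrative and do not come from the skill's guide, which only says to mask the middle digits.

```python
def mask_id_number(value: str, keep_start: int = 2, keep_end: int = 2,
                   mask_char: str = "*") -> str:
    """Mask the middle of an ID or passport number before display.

    Hypothetical helper. Values too short to mask partially are
    masked in full rather than shown.
    """
    compact = value.strip()
    if len(compact) <= keep_start + keep_end:
        return mask_char * len(compact)
    hidden = len(compact) - keep_start - keep_end
    # Slice from an explicit index so keep_end=0 does not return the whole string.
    tail = compact[len(compact) - keep_end:]
    return compact[:keep_start] + mask_char * hidden + tail
```

Applied to example output, `mask_id_number("X1234567")` yields `X1****67`, which keeps enough context for the user to recognize the document without echoing the full number.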

Context-Inappropriate Capability

High
Confidence
93% confidence
Finding
This script adds audio transcription functionality to a skill whose declared scope is document processing, creating a capability mismatch that can bypass user and platform expectations about what data types the skill handles. Even though the code itself is straightforward, out-of-scope media ingestion expands access to sensitive content such as meetings, calls, or voice notes, which is especially risky in a skill already positioned to process highly sensitive documents like IDs, bank statements, and medical records.

Description-Behavior Mismatch

Medium
Confidence
93% confidence
Finding
The file implements a standalone expense-log manager rather than document-intelligence processing. In an agent skill advertised for document analysis, this capability expands the skill's effective authority into persistent financial record manipulation, which can enable unauthorized creation, alteration, or deletion of local expense data if the skill is invoked in broader workflows.

Context-Inappropriate Capability

Medium
Confidence
96% confidence
Finding
These CLI handlers expose add, delete, edit, and export operations over expense records, giving the skill general-purpose database modification behavior unrelated to the declared document-processing role. In an agent setting, such hidden write/delete capabilities are dangerous because extracted document content or user prompts could be transformed into file mutations, resulting in silent tampering with financial logs or exfiltration via export.

Description-Behavior Mismatch

Medium
Confidence
95% confidence
Finding
The setup script installs and enables ffmpeg for audio transcription even though the skill metadata focuses on document processing and does not disclose audio capabilities. This expands the skill's effective capability surface beyond what a reviewer or user would reasonably expect, which can undermine consent, auditing, and policy controls around media handling.

Vague Triggers

High
Confidence
89% confidence
Finding
The trigger phrases are extremely broad, including common expressions like 'analyze this' and 'what is this,' which can cause the skill to activate in unintended contexts. In a skill with Bash, file read/write, and optional package installation, overbroad activation increases the chance of surprising document access or script execution from ambiguous user requests.

Vague Triggers

Medium
Confidence
85% confidence
Finding
The ambiguous fallback path routes unclear requests into document categorization, which still moves the interaction toward reading user files after only minimal clarification. In a sensitive-data skill, weak intent disambiguation can lead to accidental processing of personal, legal, financial, or medical documents the user did not clearly intend to submit for analysis.

Missing User Warnings

Medium
Confidence
96% confidence
Finding
The skill instructs automatic execution of `bash skills/doc-process/setup.sh` with no prompting, and that setup installs Python packages plus system binaries via package managers. Automatic shell-based installation materially expands host-side risk: it can modify the environment, pull remote code, and perform privileged or semi-privileged changes without informed user consent.
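A consent gate in front of the setup step might look like the following sketch. The wrapper `run_setup_with_consent`, its prompt wording, and the required phrase `run setup` are assumptions made for illustration; only the command `bash skills/doc-process/setup.sh` comes from the finding.

```python
import subprocess
from typing import Callable, List

# The setup command named in the finding.
SETUP_COMMAND: List[str] = ["bash", "skills/doc-process/setup.sh"]


def run_setup_with_consent(ask: Callable[[str], str] = input,
                           runner: Callable[..., object] = subprocess.run) -> bool:
    """Run setup only after the user types an explicit phrase.

    Hypothetical wrapper. `ask` and `runner` are injectable so the gate
    can be exercised without touching the host.
    """
    prompt = ("Setup will install Python packages and system binaries via "
              "`bash skills/doc-process/setup.sh`. Type 'run setup' to proceed: ")
    answer = ask(prompt).strip().lower()
    if answer != "run setup":
        return False  # anything else, including a bare "yes", declines
    runner(SETUP_COMMAND, check=True)
    return True
```

Requiring a specific phrase rather than any affirmative reply makes it harder for an unrelated "yes" in the conversation to authorize host changes.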

Vague Triggers

Medium
Confidence
86% confidence
Finding
The trigger examples include broad everyday phrasing such as 'what is this' and 'analyze this', which can cause the skill to activate on unrelated user content. In a document-processing skill that handles sensitive records, over-broad triggering increases the chance of unintended processing of private data or execution in the wrong context.

Missing User Warnings

Medium
Confidence
93% confidence
Finding
This guide instructs processing highly sensitive financial documents containing account details, balances, transaction history, and potentially identifying information, but it provides no privacy notice, minimization guidance, retention limits, or safe-handling requirements. In this skill context, that omission is more dangerous because the workflow explicitly encourages broad extraction, categorization, anomaly detection, and report generation on bank statements and credit card data, increasing the chance of over-collection, insecure handling, and unintended disclosure.

Vague Triggers

Medium
Confidence
95% confidence
Finding
The categorizer is designed to run when a user uploads a file with vague prompts like 'analyze this' or even with no text at all, which increases the chance of unintended activation on sensitive documents. In a document-processing skill that can handle IDs, bank statements, medical records, and contracts, overly broad triggering can lead to privacy-invasive processing without sufficiently specific user intent.

Vague Triggers

Medium
Confidence
92% confidence
Finding
Accepting generic confirmations such as 'yes' as authorization to read the first 1–2 pages is too ambiguous in a multi-step interaction, because the user's 'yes' may refer to another question or may not reflect informed consent to inspect sensitive content. Given this skill's document types include medical, financial, and identity records, vague consent language materially raises the risk of unauthorized reading of private data.
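One way to make consent unambiguous is to require a reply that names both the action and the file. The sketch below assumes a hypothetical convention in which the user must answer `read <filename>`; the function name and the convention are illustrative and are not part of the skill.

```python
def is_explicit_read_consent(reply: str, filename: str) -> bool:
    """Accept only a reply of the form 'read <filename>'.

    Hypothetical check. A bare 'yes' or 'ok' is rejected because it
    does not identify which file the user agreed to have read.
    """
    normalized = " ".join(reply.strip().lower().split())
    return normalized == f"read {filename.lower()}"
```

An exact-match rule is deliberately strict: it also rejects replies such as "don't read statement.pdf", which a looser substring check would accept.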

Natural-Language Policy Violations

Medium
Confidence
89% confidence
Finding
The rule 'Document not in English and user communicates in English → Document Translator' assumes translation into English without first asking the user whether they want translation, which can change the intended processing mode and expose sensitive content unnecessarily. In this context, non-English documents may be passports, medical records, or contracts, so auto-routing to translation can expand data handling beyond what the user requested.

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The JSON export writes the full timeline to a second file in the user's Documents folder without any warning or confirmation, unlike the Markdown save flow. Even though the timeline is described as PII-minimized, it still contains filenames, timestamps, document types, and summaries, which can be sensitive and create additional persistence and disclosure risk.

Natural-Language Policy Violations

Medium
Confidence
90% confidence
Finding
The guide explicitly instructs the skill to translate any non-English medical document to English before summarizing, without checking user preference or obtaining consent. In a medical-document context, automatic translation can expose highly sensitive health information to additional processing, create privacy/compliance issues, and risk meaning distortion in clinical content if the user expected the original language to be preserved.

Missing User Warnings

Medium
Confidence
93% confidence
Finding
The guide explicitly instructs using a local script to add entries to `expenses.csv` and defaults to creating or modifying that file in the current directory if the user has not specified a path. That can cause unintended local state changes or silent data creation without clear confirmation before execution, which is risky for an agent skill that may be invoked from natural-language prompts.
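The silent default could be replaced with a path check that refuses to guess. The helper below is a minimal sketch: `resolve_expense_log` and its `allow_create` flag are hypothetical names, and it assumes the caller asks the user for a path and for permission to create a new file before writing anything.

```python
from pathlib import Path
from typing import Optional


def resolve_expense_log(user_path: Optional[str], allow_create: bool = False) -> Path:
    """Return the expense log path only when the user has supplied one.

    Hypothetical helper. It never falls back to the current directory
    and never creates the file itself.
    """
    if not user_path:
        raise ValueError("No expense log path given; ask the user before writing.")
    path = Path(user_path).expanduser()
    if not path.exists() and not allow_create:
        raise FileNotFoundError(
            f"{path} does not exist; confirm with the user before creating it."
        )
    return path
```

Raising instead of defaulting forces the calling workflow to surface the destination to the user, which addresses the missing confirmation this finding describes.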

Missing User Warnings

Medium
Confidence
92% confidence
Finding
The guidance explicitly instructs extraction of extensive resume data, including contact details, location, employment history, education, certifications, and potentially URLs and other identifiers, but it contains no privacy notice, minimization guidance, retention limits, or instructions for handling sensitive personal data safely. In a document-processing skill, this increases the risk of over-collection, downstream disclosure, and inappropriate use of PII, especially if users submit third-party resumes or regulated data.

Missing User Warnings

Medium
Confidence
94% confidence
Finding
The script stores document names, types, and summaries in a persistent file under the user's home directory without a clear warning or explicit consent flow. In this skill context, summaries may contain sensitive financial, legal, medical, or identity-document metadata, so silent local retention increases privacy and data exposure risk on shared or compromised systems.

Missing User Warnings

Medium
Confidence
95% confidence
Finding
The save/export commands copy potentially sensitive timeline contents to user-specified Markdown or JSON files without warning about the sensitivity of the data being written. Because this skill processes receipts, contracts, bank statements, IDs, and medical records, exported filenames and summaries can leak high-value metadata into less protected locations, backups, or shared folders.

Sudo/Root Execution

Medium
Category
Privilege Escalation
Content
else
        echo "  ensurepip unavailable — trying package manager..."
        if command -v apt-get &>/dev/null; then
            sudo apt-get install -y python3-pip
        elif command -v yum &>/dev/null; then
            sudo yum install -y python3-pip
        elif command -v dnf &>/dev/null; then
Confidence
84% confidence
Finding
sudo

Sudo/Root Execution

Medium
Category
Privilege Escalation
Content
if command -v apt-get &>/dev/null; then
            sudo apt-get install -y python3-pip
        elif command -v yum &>/dev/null; then
            sudo yum install -y python3-pip
        elif command -v dnf &>/dev/null; then
            sudo dnf install -y python3-pip
        elif command -v brew &>/dev/null; then
Confidence
84% confidence
Finding
sudo

Sudo/Root Execution

Medium
Category
Privilege Escalation
Content
elif command -v yum &>/dev/null; then
            sudo yum install -y python3-pip
        elif command -v dnf &>/dev/null; then
            sudo dnf install -y python3-pip
        elif command -v brew &>/dev/null; then
            brew install python   # pip is bundled with Homebrew Python
        else
Confidence
84% confidence
Finding
sudo

Sudo/Root Execution

Medium
Category
Privilege Escalation
Content
if command -v brew &>/dev/null; then
        brew install tesseract
    elif command -v apt-get &>/dev/null; then
        sudo apt-get install -y tesseract-ocr
    elif command -v dnf &>/dev/null; then
        sudo dnf install -y tesseract
    else
Confidence
87% confidence
Finding
sudo
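A reviewer who wants the same package-manager fallback without unattended privilege escalation could plan the command first and run it only after approval. The sketch below mirrors the brew, apt-get, and dnf branches in the tesseract excerpt above; the function `plan_tesseract_install` and its `allow_sudo` flag are hypothetical and are not part of setup.sh.

```python
import shutil
from typing import Callable, List, Optional


def plan_tesseract_install(allow_sudo: bool = False,
                           which: Callable[[str], Optional[str]] = shutil.which
                           ) -> Optional[List[str]]:
    """Return the install command to run, or None if nothing may be run.

    Hypothetical planner. None means either no supported package manager
    was found or the only option needs sudo and sudo was not permitted.
    """
    candidates = [
        ("brew", ["brew", "install", "tesseract"], False),
        ("apt-get", ["apt-get", "install", "-y", "tesseract-ocr"], True),
        ("dnf", ["dnf", "install", "-y", "tesseract"], True),
    ]
    for manager, command, needs_sudo in candidates:
        if which(manager) is None:
            continue
        if needs_sudo and not allow_sudo:
            return None  # do not escalate without explicit permission
        return ["sudo"] + command if needs_sudo else command
    return None
```

Returning the command instead of executing it lets the caller show the exact line to the user before anything changes on the host.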

VirusTotal

64/64 vendors flagged this skill as clean.
