Hwp Extract Pipeline

Security checks across malware telemetry and agentic risk

Overview

This is a legitimate local document-extraction skill, but it saves extracted document text to disk in a file whose name is built from an unsanitized user-provided ID.

Install only if you are comfortable with extracted document text being written to local JSON files. Use simple IDs containing only letters, numbers, dashes, or underscores; run it from a directory intended for outputs; delete generated JSON files when no longer needed; and only use trusted local hwp-reader, pyhwp, venv, and strings binaries.
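The ID guidance above can be enforced before any file is written. The following is a minimal sketch, not the skill's actual code: the function name is_safe_record_id and the 64-character cap are assumptions made for illustration.

```python
import re

# Hypothetical allow-list: letters, digits, dashes, and underscores only.
# The 1-to-64 length bound is an assumption, not something the skill defines.
_SAFE_ID = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_safe_record_id(rec_id: str) -> bool:
    """Return True only if the whole ID matches the allow-list."""
    return _SAFE_ID.fullmatch(rec_id) is not None


if __name__ == "__main__":
    print(is_safe_record_id("invoice_2024-01"))   # True
    print(is_safe_record_id("../../etc/passwd"))  # False
```

Using fullmatch rather than match means a trailing newline or any other stray character causes rejection instead of being silently ignored.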

SkillSpector

By NVIDIA

Vulnerability Patterns

MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
MCP Tool Poisoning: Hidden Instructions, Unicode Deception, Parameter Description Injection
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (5)

Lp3

Medium

Category: MCP Least Privilege
Confidence: 88%
Finding: The skill advertises shell execution and file-writing behavior without declaring permissions, which weakens any permission-based safety model and can surprise callers into granting the skill more trust than it deserves. In this context, the skill processes local files and invokes external tooling, so undeclared capabilities increase the risk of unintended file modification or command execution paths.

Tp4

High

Category: MCP Tool Poisoning
Confidence: 95%
Finding: The documented behavior materially differs from the apparent implementation: claimed PDF/OCR support is absent, while undocumented parsing methods and local JSON writes are present. This is dangerous because operators may rely on the description to make security and workflow decisions, leading to unexpected file writes, unsafe handling of untrusted inputs, or missed review of actual extraction paths.

Description-Behavior Mismatch

Medium

Confidence: 91%
Finding: The extractor writes full document contents to a predictable local JSON file named from the user-supplied record ID, which can persist sensitive attachment data on disk beyond the immediate extraction task. In an agent environment handling private documents, this creates an unnecessary local data exposure surface and can also allow path manipulation if rec_id contains path separators.
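One way to close the path-manipulation risk described in this finding is to resolve the output path and confirm it still sits inside the intended directory. This is a sketch under stated assumptions: resolve_output_path and its output_dir parameter are hypothetical names, and only rec_id comes from the report.

```python
from pathlib import Path


def resolve_output_path(output_dir: str, rec_id: str) -> Path:
    """Build <output_dir>/<rec_id>.json and refuse paths that escape output_dir.

    Hypothetical helper for illustration; not taken from the skill's code.
    """
    base = Path(output_dir).resolve()
    candidate = (base / f"{rec_id}.json").resolve()
    try:
        # relative_to raises ValueError when candidate is not under base.
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"record ID escapes output directory: {rec_id!r}")
    return candidate
```

A containment check of this kind still permits IDs that name subdirectories inside the output folder, so it complements a strict character allow-list rather than replacing it.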

Missing User Warnings

Low

Confidence: 82%
Finding: The skill states that it writes extracted JSON into a local data folder but does not clearly warn users that running it modifies the filesystem. While the write appears related to the skill's purpose, undocumented or insufficiently disclosed local writes can still cause accidental overwrites, data leakage into shared directories, or misuse in more privileged environments.
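The accidental-overwrite risk noted here can be reduced by creating the output file exclusively, so that an existing file is never replaced. The sketch below assumes a hypothetical helper named write_json_exclusive; the owner-only 0o600 mode is an additional assumption intended to limit exposure in shared directories, and it has little effect on Windows.

```python
import json
import os


def write_json_exclusive(path: str, payload: dict) -> None:
    """Write payload as JSON, failing if the file already exists.

    Hypothetical helper for illustration; not taken from the skill's code.
    """
    # O_EXCL with O_CREAT makes creation fail with FileExistsError
    # instead of overwriting an existing file.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    fd = os.open(path, flags, 0o600)  # owner read/write only on POSIX
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, ensure_ascii=False)
```

A caller can catch FileExistsError and ask the user whether to pick a different ID or remove the earlier output first.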

Missing User Warnings

Medium

Confidence: 94%
Finding: The code silently writes extracted document text to a local JSON artifact without explicit user consent or disclosure, which can leak sensitive source material into the workspace, logs, backups, or later agent steps. Because this skill is specifically designed to process attachments, the context makes undisclosed persistence more dangerous than in a generic utility.

VirusTotal

64/64 vendors flagged this skill as clean.
