Office Document Extractor

AdvisoryAudited by Static analysis on May 4, 2026.

Overview

No suspicious patterns detected.

Findings (0)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

Running batch mode on a sensitive folder may create Markdown copies of multiple private documents.

Why it was flagged

Batch mode reads every supported Office file in the user-selected directory and writes Markdown outputs. This is expected for the converter, but users should be aware of the directory-wide local file operation.

Skill content
files = [f for f in input_path.iterdir() if f.suffix.lower() in supported]
...
out_file = output_path / f"{file.stem}.md"
Recommendation

Run it only on intended files or folders, and choose an output directory you are comfortable storing extracted document text in.

What this means

If the bundled dependency code were altered upstream or packaged incorrectly, the converter could behave differently than expected.

Why it was flagged

The skill discloses bundled third-party dependencies. Bundling is purpose-aligned and avoids pip/network installs, but it still means users rely on the integrity of included vendored code.

Skill content
- **openpyxl/** — Pure Python Excel library (v3.1.5)
- **et_xmlfile/** — openpyxl dependency (pure Python)
Recommendation

Prefer versions from a trusted publisher, and review or verify bundled dependency provenance when using the skill on sensitive documents.

What this means

Private document contents may be placed into Markdown and then reused in analysis or LLM context; malicious document text could also be mistaken for instructions if not treated as data.

Why it was flagged

The extracted document text may later be used as model context or indexed. That is the stated purpose, but Office documents can contain sensitive data or untrusted instructions.

Skill content
Use when the user needs to extract text from Word documents, Excel spreadsheets, or PowerPoint presentations for analysis, indexing, or LLM processing.
Recommendation

Treat extracted Markdown as sensitive document data, and do not let instructions inside converted documents override the user's actual request.