pdf-ocr-layout

PassAudited by VirusTotal on May 12, 2026.

Overview

Type: OpenClaw Skill Name: pdf-ocr-layout Version: 1.0.2 The skill is classified as suspicious due to potential prompt injection vulnerabilities against the backend LLMs (GLM-4.7, GLM-4.6V) and potential path traversal vulnerabilities. The `script/glm_understanding.py` directly embeds content (`full_markdown_context`, `detected_title`) derived from the input document into the LLM prompts without sanitization, which could allow a malicious input document to inject instructions to the backend models. Additionally, the scripts perform file system operations using `file_path` and `output_dir` (e.g., in `script/glm_ocr_extract.py`), which, while necessary for functionality, could be exploited for path traversal if the OpenClaw agent is tricked into providing malicious paths. There is no evidence of intentional malicious behavior such as data exfiltration or backdoor installation; the identified issues are vulnerabilities rather than deliberate malice.

Findings (0)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

Note

ASI07: Insecure Inter-Agent Communication

What this means

Private or confidential documents processed by this skill may be transmitted to Zhipu's cloud service for OCR and analysis.

Why it was flagged

The selected input file is base64-encoded and sent to the Zhipu OCR API, which is central to the skill's purpose but means document contents leave the local environment.

Skill content

response = client.layout_parsing.create(
        model="glm-ocr",
        file=image_to_base64(file_path)
    )

Recommendation

Use only with documents you are allowed to send to Zhipu, and review the provider's data retention and privacy terms.

Note

ASI03: Identity and Privilege Abuse

What this means

The configured API key may incur usage charges and grants access to the Zhipu account associated with it.

Why it was flagged

The skill uses a Zhipu API key from the environment to call external models. This is expected for the stated integration, but it is a credential users should manage carefully.

Skill content

client = ZhipuAiClient(api_key=os.getenv("ZHIPU_API_KEY"))

Recommendation

Use a dedicated or least-privilege API key where possible, monitor usage, and consider updating the registry metadata to declare the required environment variable.

Note

ASI06: Memory and Context Poisoning

What this means

Extracted document text and cropped images can remain on disk after processing.

Why it was flagged

The script stores the source path, full OCR markdown context, and extracted elements in a JSON file under the output directory.

Skill content

payload = {
        "source_file": str(file_path),
        "full_markdown_context": full_context_md,
        "elements": extracted_elements
    }

Recommendation

Choose an appropriate output directory and delete generated JSON/images if the source document is sensitive.

Note

ASI01: Agent Goal Hijack

What this means

A maliciously crafted document could make the generated interpretation misleading or off-task.

Why it was flagged

OCR-extracted document text and table content are inserted directly into model prompts for analysis. This is purpose-aligned, but hostile text inside a document could try to steer the model's response.

Skill content

{full_context[:6000]}
...
{markdown_content}

Recommendation

Treat model-generated analysis as advisory, review results for unusual instructions or claims, and consider adding prompt guidance that document text is untrusted content.