pdf-ocr-layout
PassAudited by VirusTotal on May 12, 2026.
Overview
Type: OpenClaw Skill Name: pdf-ocr-layout Version: 1.0.2 The skill is classified as suspicious due to potential prompt injection vulnerabilities against the backend LLMs (GLM-4.7, GLM-4.6V) and potential path traversal vulnerabilities. The `script/glm_understanding.py` directly embeds content (`full_markdown_context`, `detected_title`) derived from the input document into the LLM prompts without sanitization, which could allow a malicious input document to inject instructions to the backend models. Additionally, the scripts perform file system operations using `file_path` and `output_dir` (e.g., in `script/glm_ocr_extract.py`), which, while necessary for functionality, could be exploited for path traversal if the OpenClaw agent is tricked into providing malicious paths. There is no evidence of intentional malicious behavior such as data exfiltration or backdoor installation; the identified issues are vulnerabilities rather than deliberate malice.
Findings (0)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Private or confidential documents processed by this skill may be transmitted to Zhipu's cloud service for OCR and analysis.
The selected input file is base64-encoded and sent to the Zhipu OCR API, which is central to the skill's purpose but means document contents leave the local environment.
response = client.layout_parsing.create(
model="glm-ocr",
file=image_to_base64(file_path)
)Use only with documents you are allowed to send to Zhipu, and review the provider's data retention and privacy terms.
The configured API key may incur usage charges and grants access to the Zhipu account associated with it.
The skill uses a Zhipu API key from the environment to call external models. This is expected for the stated integration, but it is a credential users should manage carefully.
client = ZhipuAiClient(api_key=os.getenv("ZHIPU_API_KEY"))Use a dedicated or least-privilege API key where possible, monitor usage, and consider updating the registry metadata to declare the required environment variable.
Extracted document text and cropped images can remain on disk after processing.
The script stores the source path, full OCR markdown context, and extracted elements in a JSON file under the output directory.
payload = {
"source_file": str(file_path),
"full_markdown_context": full_context_md,
"elements": extracted_elements
}Choose an appropriate output directory and delete generated JSON/images if the source document is sensitive.
A maliciously crafted document could make the generated interpretation misleading or off-task.
OCR-extracted document text and table content are inserted directly into model prompts for analysis. This is purpose-aligned, but hostile text inside a document could try to steer the model's response.
{full_context[:6000]}
...
{markdown_content}Treat model-generated analysis as advisory, review results for unusual instructions or claims, and consider adding prompt guidance that document text is untrusted content.
