pdf-ocr-layout
PassAudited by ClawScan on May 1, 2026.
Overview
This appears to be a legitimate OCR/document-analysis skill, but it uploads selected documents/images to Zhipu’s cloud models and saves extracted content locally.
Before installing, confirm you are comfortable sending the chosen PDFs/images to Zhipu's services and storing extracted text/images in the output folder. Use a dedicated Zhipu API key if possible, verify the Python dependency name, and review generated analyses rather than treating them as authoritative.
Findings (4)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
Private or confidential documents processed by this skill may be transmitted to Zhipu's cloud service for OCR and analysis.
The selected input file is base64-encoded and sent to the Zhipu OCR API, which is central to the skill's purpose but means document contents leave the local environment.
response = client.layout_parsing.create(
model="glm-ocr",
file=image_to_base64(file_path)
)Use only with documents you are allowed to send to Zhipu, and review the provider's data retention and privacy terms.
The configured API key may incur usage charges and grants access to the Zhipu account associated with it.
The skill uses a Zhipu API key from the environment to call external models. This is expected for the stated integration, but it is a credential users should manage carefully.
client = ZhipuAiClient(api_key=os.getenv("ZHIPU_API_KEY"))Use a dedicated or least-privilege API key where possible, monitor usage, and consider updating the registry metadata to declare the required environment variable.
Extracted document text and cropped images can remain on disk after processing.
The script stores the source path, full OCR markdown context, and extracted elements in a JSON file under the output directory.
payload = {
"source_file": str(file_path),
"full_markdown_context": full_context_md,
"elements": extracted_elements
}Choose an appropriate output directory and delete generated JSON/images if the source document is sensitive.
A maliciously crafted document could make the generated interpretation misleading or off-task.
OCR-extracted document text and table content are inserted directly into model prompts for analysis. This is purpose-aligned, but hostile text inside a document could try to steer the model's response.
{full_context[:6000]}
...
{markdown_content}Treat model-generated analysis as advisory, review results for unusual instructions or claims, and consider adding prompt guidance that document text is untrusted content.
