Azure Document OCR

Pass. Audited by ClawScan on May 1, 2026.

Overview

This is a coherent Azure OCR helper, but it uploads selected documents to Azure and uses an Azure API key, so sensitive files and credentials need care.

Before installing or using it, confirm you are comfortable sending the selected documents to Azure Document Intelligence, keep the Azure key secret, verify the endpoint is your Azure resource, and avoid running batch mode on folders that contain unrelated sensitive files.

Findings (3)

Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.

What this means

Documents submitted for OCR, including sensitive business, identity, tax, or health-related files, may be processed by Azure and the extracted content may be written locally.

Why it was flagged

The script reads the selected local document and sends its bytes to the configured Azure Document Intelligence endpoint. This is purpose-aligned for OCR, but it means document contents leave the local machine.

Skill content
with open(file_path, "rb") as f: body = f.read(); response = requests.post(analyze_url, params=params, headers=headers, data=body)
Recommendation

Use this only with documents you are allowed to send to Azure, verify the Azure endpoint, and handle output files as sensitive data.
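
One way to act on the "verify the Azure endpoint" advice is a small pre-flight check before any upload. The sketch below is illustrative and not part of the skill: the function name `is_expected_endpoint` and the allowed-suffix list are assumptions, and the suffix shown is the common pattern for Azure AI services resources, so adjust it if your resource uses a custom domain or a sovereign cloud.

```python
from urllib.parse import urlparse

# Assumption: typical Azure AI services hosts end with this suffix.
# Extend or replace the tuple for custom domains or sovereign clouds.
ALLOWED_SUFFIXES = (".cognitiveservices.azure.com",)


def is_expected_endpoint(endpoint, allowed_suffixes=ALLOWED_SUFFIXES):
    """Return True if the endpoint uses HTTPS and its host ends with an allowed suffix."""
    parsed = urlparse(endpoint)
    host = (parsed.hostname or "").lower()
    # Require HTTPS so document bytes and the key are not sent in clear text.
    return parsed.scheme == "https" and host.endswith(tuple(allowed_suffixes))
```

Checking the parsed hostname instead of searching the raw string means a look-alike host that merely contains the Azure domain as a prefix is rejected.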

What this means

The Azure key can authorize use of the Document Intelligence resource and may incur charges or expose service access if mishandled.

Why it was flagged

The script requires an Azure Document Intelligence endpoint and subscription key from environment variables. This is expected for the service, but it is a credential requirement users should notice.

Skill content
endpoint = os.environ.get("AZURE_DOC_INTEL_ENDPOINT"); key = os.environ.get("AZURE_DOC_INTEL_KEY")
Recommendation

Store the key securely, use a dedicated, least-privilege Azure resource when possible, do not write the key into committed files, and rotate the key if it may have been exposed.
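
As a rough illustration of careful key handling, the following sketch reads the two environment variables named in the finding, fails early if either is missing, and masks the key before it could reach a log. The helpers `load_azure_settings` and `mask_secret` are hypothetical names introduced here; the skill's own code may handle this differently.

```python
import os


def load_azure_settings(env=os.environ):
    """Read the endpoint and key from the environment; never echo the key itself."""
    endpoint = env.get("AZURE_DOC_INTEL_ENDPOINT", "").strip()
    key = env.get("AZURE_DOC_INTEL_KEY", "").strip()
    pairs = (("AZURE_DOC_INTEL_ENDPOINT", endpoint), ("AZURE_DOC_INTEL_KEY", key))
    missing = [name for name, value in pairs if not value]
    if missing:
        # Report only variable names, not their values.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return endpoint, key


def mask_secret(secret, visible=4):
    """Return a log-safe form that shows only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

If a key must appear in diagnostics at all, logging `mask_secret(key)` is usually enough to tell two keys apart without exposing either.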

What this means

A broad batch run could upload many local documents to Azure and create many extracted-output files.

Why it was flagged

Batch mode finds all matching files in the user-specified directory and processes them concurrently. This is disclosed and purpose-aligned, but the scope can be broad if the wrong folder is chosen.

Skill content
documents = find_documents(input_path, extensions); ThreadPoolExecutor(max_workers=args.workers)
Recommendation

Point batch mode only at intended folders, narrow extensions when needed, and choose a worker count appropriate for rate limits and sensitivity.
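
Before a batch run, it can help to preview which files would be picked up. The sketch below is a stand-alone dry run and does not call the skill: `preview_batch` and its `max_files` cap are hypothetical, and it searches subfolders recursively as a conservative upper bound, which may or may not match how the skill's `find_documents` behaves.

```python
from pathlib import Path


def preview_batch(input_dir, extensions, max_files=50):
    """List the files a batch run might pick up, without uploading anything."""
    wanted = {ext.lower() for ext in extensions}
    root = Path(input_dir)
    # Recursive, case-insensitive match on the file suffix.
    matches = sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() in wanted
    )
    if len(matches) > max_files:
        # Stop early so an overly broad folder is noticed before any upload.
        raise ValueError(
            f"{len(matches)} files match; narrow the folder or extensions "
            f"(limit {max_files})"
        )
    return matches
```

Reviewing the returned list, then running the real batch with a modest worker count, keeps both the upload scope and the request rate under deliberate control.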