Azure Document OCR

Security checks across malware telemetry and agentic risk

Overview

This is a coherent Azure OCR skill, but users should treat documents, extracted output, and the Azure key as sensitive.

Install only if you are comfortable sending chosen documents or document URLs to Azure Document Intelligence. Verify the Azure endpoint is your own trusted resource, keep the API key out of source control and logs, avoid broad batch runs over mixed private folders, and handle generated text or JSON outputs as sensitive when the source documents are sensitive.
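The endpoint and key advice above can be sketched in a few lines of Python. This is an illustrative sketch, not the skill's own code: the environment variable names, the trusted host suffix, and the helper names load_azure_config and redact are all assumptions, and a resource on a custom domain would need a different allow-list.

```python
import os
from urllib.parse import urlparse

# Assumed variable names; the skill's actual names may differ.
ENDPOINT_VAR = "AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT"
KEY_VAR = "AZURE_DOCUMENT_INTELLIGENCE_KEY"

# Assumed allow-list of host suffixes for your own Azure resource.
TRUSTED_SUFFIXES = (".cognitiveservices.azure.com",)


def load_azure_config(env=os.environ):
    """Read endpoint and key from the environment and reject untrusted hosts."""
    endpoint = env.get(ENDPOINT_VAR, "").strip()
    key = env.get(KEY_VAR, "").strip()
    if not endpoint or not key:
        raise ValueError(f"Set {ENDPOINT_VAR} and {KEY_VAR} before running")

    parsed = urlparse(endpoint)
    host = (parsed.hostname or "").lower()
    if parsed.scheme != "https" or not host.endswith(TRUSTED_SUFFIXES):
        raise ValueError(f"Untrusted endpoint host: {host!r}")
    return endpoint.rstrip("/"), key


def redact(secret):
    """Return a log-safe form of a secret, showing at most the last 4 characters."""
    return "****" + secret[-4:] if len(secret) > 8 else "****"
```

Logging redact(key) instead of the key itself keeps the credential out of terminal history and CI logs while still letting you tell two keys apart.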

SkillSpector

By NVIDIA

Vulnerability Patterns

Data Exfiltration: External Transmission, Env Variable Harvesting, File System Enumeration
Taint Tracking: Direct Taint Flow, Variable-Mediated Taint Flow, Credential Exfiltration Chain
MCP Least Privilege: Underdeclared Capability, Wildcard Permission, Missing Permission Declaration
Prompt Injection: Instruction Override, Hidden Instructions, Exfiltration Commands
Privilege Escalation: Excessive Permissions, Sudo/Root Execution, Credential Access

Findings (4)

Tainted flow: 'operation_location' from os.environ.get (line 246, credential/environment) → requests.get (network output)

Critical

Category: Data Flow
Content: print(f"Error: Polling timeout after {MAX_POLL_TIME} seconds", file=sys.stderr) sys.exit(1) response = requests.get(operation_location, headers=headers) if response.status_code != 200: print(f"Error: Failed to poll status (HTTP {response.status_code})", file=sys.stderr)
Confidence: 76%
Finding: response = requests.get(operation_location, headers=headers)
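One way to reduce the risk in this flow is to confirm that the polling URL points at the same origin as the configured endpoint before sending any request that carries the API key. The sketch below is a possible mitigation, not the skill's code: same_origin and checked_poll_url are hypothetical helpers, and it assumes the service returns polling URLs on the same host as the resource endpoint.

```python
from urllib.parse import urlparse


def same_origin(trusted_endpoint, candidate_url):
    """True only if both URLs are HTTPS and share host and port."""
    a, b = urlparse(trusted_endpoint), urlparse(candidate_url)
    return (
        a.scheme == "https"
        and b.scheme == "https"
        and bool(b.hostname)
        and (a.hostname or "").lower() == b.hostname.lower()
        and a.port == b.port
    )


def checked_poll_url(trusted_endpoint, operation_location):
    """Return operation_location unchanged, or raise if it leaves the trusted origin."""
    if not same_origin(trusted_endpoint, operation_location):
        raise ValueError("Refusing to send credentials to an unexpected host")
    return operation_location
```

A caller would pass the result of checked_poll_url(endpoint, operation_location) to requests.get in place of the raw operation_location value, so the key-bearing headers never reach a host other than the configured resource.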

Lp3

Medium

Category: MCP Least Privilege
Confidence: 92%
Finding: The skill clearly instructs users to set environment variables, invoke Python scripts, access local files, write outputs, and call Azure's external REST API, but it does not declare permissions for those capabilities. This creates a transparency and governance gap: users or platforms may authorize or run the skill without understanding that it can read documents, transmit data externally, and write extracted results to disk.

Missing User Warnings

Medium

Confidence: 96%
Finding: The skill encourages processing PDFs, invoices, IDs, receipts, and URLs through Azure Document Intelligence but does not prominently warn that document contents and remote URLs are transmitted to an external third-party cloud service. Because these inputs often contain sensitive personal, financial, or identity data, omission of that disclosure can lead to unintended data exfiltration, privacy violations, or policy noncompliance.
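A prominent disclosure could take the form of an explicit confirmation step before any upload. The following is a minimal sketch under assumptions: confirm_upload is a hypothetical helper, the prompt wording is illustrative, and the skill itself may have no such hook.

```python
def confirm_upload(source, endpoint, ask=input):
    """Ask the user to approve sending a document or URL to an external service.

    `ask` defaults to input() and can be replaced in tests or non-interactive runs.
    Only an explicit "y" or "yes" counts as consent.
    """
    prompt = (
        f"About to send {source!r} to {endpoint}.\n"
        "Its contents will leave this machine and be processed by an external "
        "cloud service. Continue? [y/N] "
    )
    answer = ask(prompt).strip().lower()
    return answer in ("y", "yes")
```

Defaulting to "no" on an empty answer means an accidental Enter keypress does not transmit the document.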

Missing User Warnings

Medium

Confidence: 90%
Finding: This reference explicitly guides processing of highly sensitive identity, tax, and health-insurance documents but provides no warning about handling PII/PHI, retention, access control, or compliance obligations. In a document-OCR skill, omission of privacy and security guidance can lead users to process regulated data without safeguards, increasing risk of unauthorized disclosure, noncompliance, and unsafe downstream use.
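For the access-control part of this finding, one small safeguard is to create extracted text or JSON files readable only by the current user. The sketch below assumes a POSIX system (on Windows the mode argument has limited effect), and write_private is a hypothetical helper name rather than part of the skill.

```python
import os


def write_private(path, text):
    """Write text to path, creating the file with owner-only permissions (0o600).

    The mode applies only when the file is created; an existing file keeps
    its current permissions, so this sketch tightens them explicitly afterwards.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    os.chmod(path, 0o600)
```

This does not address retention or compliance obligations; it only narrows who can read the output on a shared machine.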

VirusTotal

66/66 vendors flagged this skill as clean.