Upstage Ocr
Pass
Audited by ClawScan on May 6, 2026.
Overview
This skill appears to be a straightforward Upstage OCR helper, but users should note two things: it sends documents to Upstage, and it uses an API key that is not declared in the registry metadata.
This looks safe to install if you intend to use Upstage's cloud OCR service. Before using it, set UPSTAGE_API_KEY carefully, avoid submitting documents you cannot share with Upstage, and remember that OCR output may be written to a temp file or a path you specify.
Findings (3)
Artifact-based informational review of SKILL.md, metadata, install specs, static scan signals, and capability signals. ClawScan does not execute the skill or run runtime probes.
The user must provide an Upstage API key. Requests made with that key are billed to the user's Upstage account and can access it within the key's permissions.
The skill uses a provider API key from the environment. This is expected for Upstage OCR, but it is sensitive account access and the registry metadata does not declare a required env var or primary credential.
**API Key**: Always use `os.environ["UPSTAGE_API_KEY"]`. Get your key at [console.upstage.ai](https://console.upstage.ai).
Use a dedicated, least-privileged Upstage key if available, store it only in the environment, and rotate it if it is exposed.
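The key-handling advice above could be sketched roughly as follows. This is a minimal illustration and not code from the skill: `load_upstage_api_key` and `mask_key` are hypothetical helper names, and only the `UPSTAGE_API_KEY` variable name comes from the quoted SKILL.md text.

```python
import os


def load_upstage_api_key(env=os.environ):
    """Return the Upstage API key from the environment, failing early if unset.

    Hypothetical helper: `env` is injectable so the lookup can be tested
    without touching the real process environment.
    """
    key = env.get("UPSTAGE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("UPSTAGE_API_KEY is not set; export it before running OCR.")
    return key


def mask_key(key, visible=4):
    """Return a log-safe form of the key that shows only its last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Reading the key at call time and logging only the masked form keeps the secret out of source files and log output, which makes it easier to rotate if it is ever exposed.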
Documents submitted for OCR may contain private or regulated information and will be processed by Upstage's service.
The OCR workflow sends the document file to an external Upstage API. This is purpose-aligned and disclosed, but users should understand that OCR is not performed locally.
requests.post("https://api.upstage.ai/v1/document-digitization", ... files={"document": open("scan.pdf", "rb")}, data={"model": "ocr"})
Only OCR documents you are comfortable sending to Upstage, and check Upstage's retention, privacy, and compliance terms before using sensitive files.
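One way to act on this recommendation is to gate uploads behind an explicit allowlist of directories whose contents have been cleared for sharing. The sketch below is an assumption about how a caller might do that; `is_cleared_for_upload` and the directory names are hypothetical, and nothing like it is part of the skill itself.

```python
from pathlib import Path


def is_cleared_for_upload(document, cleared_dirs):
    """Return True only if `document` resolves to a path inside a cleared directory.

    Hypothetical pre-upload check: paths are resolved first, so "../" segments
    cannot be used to step outside a cleared directory.
    """
    resolved = Path(document).resolve()
    for directory in cleared_dirs:
        if Path(directory).resolve() in resolved.parents:
            return True
    return False
```

A caller would run this check before opening the file for the request and skip the upload when it returns False.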
OCR results for async jobs may remain available on the provider side for a period after processing.
The async workflow involves provider-side storage and temporary download URLs. This is disclosed and relevant to OCR processing, but it affects data retention and access exposure.
Documents are processed in batches of 10 pages; results are stored for 30 days, individual download URLs expire after 15 minutes.
Use sync mode for smaller sensitive documents when possible, and manage request IDs and download URLs as sensitive information.
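The retention figures quoted above can be turned into simple bookkeeping on the caller's side. The sketch below assumes the 15-minute window starts when a download URL is issued and the 30-day window starts when the job completes; the source does not state either starting point. The helper names are hypothetical, and `redact_url` additionally assumes that any access token sits in the URL's query string.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlsplit

DOWNLOAD_URL_TTL = timedelta(minutes=15)  # figure quoted from SKILL.md
RESULT_RETENTION = timedelta(days=30)     # figure quoted from SKILL.md


def download_url_expired(issued_at, now=None):
    """Return True once a download URL is at least 15 minutes old (assumed start: issue time)."""
    now = now or datetime.now(timezone.utc)
    return now - issued_at >= DOWNLOAD_URL_TTL


def provider_copy_expires(completed_at):
    """Estimate when the provider-side result is dropped (assumed start: job completion)."""
    return completed_at + RESULT_RETENTION


def redact_url(url):
    """Strip the query string and fragment so a URL can be logged without its token."""
    return urlsplit(url)._replace(query="", fragment="").geturl()
```

Logging only the redacted URL, and recording the estimated expiry alongside each request ID, makes it easier to tell which results may still be retrievable from the provider.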
