PDF Parse

v1.0.0

Parse a PDF into structured JSON: text, layout-aware blocks with bounding boxes, tables, and image metadata.

by Rishabh Dugar (@rishabhdugar)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for rishabhdugar/pdf-parse.

Install the skill "PDF Parse" (rishabhdugar/pdf-parse) from ClawHub.
Skill page: https://clawhub.ai/rishabhdugar/pdf-parse
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

openclaw skills install pdf-parse

ClawHub CLI

npx clawhub@latest install pdf-parse

Security Scan

Capability signals: Requires sensitive credentials
These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.

VirusTotal: Benign
OpenClaw: Benign (medium confidence)

Purpose & Capability
Name/description match the runtime instructions and example: the skill forwards PDFs (by URL or multipart upload) to pdfapihub.com for parsing and expects an API key in the CLIENT-API-KEY header — this is appropriate for a PDF parsing integration.
Instruction Scope
SKILL.md only instructs POSTing to https://pdfapihub.com/api/v1/pdf/parse with either a public URL or multipart file and an API key. That stays within the stated purpose, but the instructions do not warn that PDFs (which may contain sensitive PII or secrets) will be transmitted to a third-party service — a privacy/data-exfiltration risk users should be aware of.
Install Mechanism
Instruction-only skill with no install spec or code files; nothing is written to disk or downloaded by the skill itself, which minimizes installation risk.
Credentials
The skill requires an API key (header-based) according to SKILL.md and skill.json, but no required environment variables are declared in the registry metadata — this is not dangerous but is a minor inconsistency in how credentials are represented. Requesting an API key for the external service is proportionate to the functionality.
Persistence & Privilege
always:false and no install-time modifications or system paths requested. The skill does perform outbound network calls to pdfapihub.com when invoked (expected for its purpose).
Assessment
This skill forwards PDFs to a third-party service (pdfapihub.com) and requires an API key in the CLIENT-API-KEY header. Before installing: (1) confirm you trust pdfapihub.com (review their privacy policy and retention practices); (2) avoid sending sensitive or regulated documents unless you’ve verified security/contractual protections; (3) supply the API key via the platform's secure credential storage (do not paste it into chat); (4) test with non-sensitive samples first; and (5) if you need stronger assurance, ask the skill author for a homepage, company identity, and privacy/terms links — the owner is currently unknown.

Like a lobster shell, security has layers — review code before you run it.

latest vk97bqpq60zk1faa1ch9bw627nn850wbc
94 downloads
0 stars
1 version
Updated 1w ago
v1.0.0
MIT-0

PDF Parse

What It Does

Parses a PDF into structured JSON with text content, layout-aware blocks (with normalized bounding boxes), tables, and image metadata.

When to Use

  • Extract structured data from PDFs (text, tables, images)
  • Get layout-aware content with bounding box coordinates
  • Parse invoices, forms, or reports into a machine-readable format

Parsing Modes

Mode      Description
text      Text only
layout    Text + text blocks with bounding boxes
tables    Text + table blocks
full      Text + blocks + tables + images (default)
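
The modes listed above are passed in the mode field of the JSON request body. As a minimal sketch, a POSIX shell helper can reject unknown modes before any request is sent and fall back to full when no mode is given. The function name build_parse_payload is hypothetical, and the helper does no JSON escaping, so it assumes the URL contains no double quotes or backslashes.

```shell
# Hypothetical helper: build the JSON body for a URL-based parse request.
# Usage: build_parse_payload <pdf_url> [mode]
build_parse_payload() {
  payload_url="$1"
  payload_mode="${2:-full}"   # "full" is the documented default mode

  # Accept only the four modes listed on this page.
  case "$payload_mode" in
    text|layout|tables|full) ;;
    *)
      echo "unsupported mode: $payload_mode" >&2
      return 1
      ;;
  esac

  # No JSON escaping is performed; see the assumption stated above.
  printf '{ "url": "%s", "mode": "%s" }\n' "$payload_url" "$payload_mode"
}
```

The output can be handed to curl with -d "$(build_parse_payload "$pdf_url" tables)", which avoids hand-editing the JSON string for each mode.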

Required Inputs

Provide one of:

  • url — public URL to a PDF
  • Multipart upload with the PDF in a field named file
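
For a local file, the upload alternative can be sketched with curl's -F option, which sends the request as multipart/form-data. This is an illustration under assumptions rather than the service's documented client: the function name parse_pdf_file and the PDFAPIHUB_API_KEY environment variable are hypothetical. Only the endpoint, the CLIENT-API-KEY header, and the file field name come from this page.

```shell
# Hypothetical helper: upload a local PDF in the multipart "file" field.
# Usage: parse_pdf_file <path-to-pdf>
parse_pdf_file() {
  upload_path="$1"

  # Fail early, before any network traffic, if the input is unusable.
  if [ ! -f "$upload_path" ]; then
    echo "file not found: $upload_path" >&2
    return 1
  fi
  if [ -z "${PDFAPIHUB_API_KEY:-}" ]; then
    echo "PDFAPIHUB_API_KEY is not set" >&2
    return 1
  fi

  # -F implies a POST with Content-Type: multipart/form-data, so no
  # explicit Content-Type header is set here.
  curl https://pdfapihub.com/api/v1/pdf/parse \
    -H "CLIENT-API-KEY: ${PDFAPIHUB_API_KEY}" \
    -F "file=@${upload_path}"
}
```

Whether the parsing mode can also be sent as a form field alongside the file is not stated on this page, so the sketch leaves it out.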

Authentication

Send your API key in the CLIENT-API-KEY header.

Get your free API key at https://pdfapihub.com. Full API documentation is available at https://pdfapihub.com/docs.
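
One way to keep the key out of command history and process listings is to read it from the environment and pass the header to curl on standard input. The sketch below assumes curl 7.55.0 or later, which accepts -H @- to read header lines from stdin. The function name parse_pdf_url and the PDFAPIHUB_API_KEY variable are hypothetical, and the URL is inserted into the JSON body without escaping.

```shell
# Hypothetical helper: parse a PDF by public URL without putting the
# API key on curl's command line.
# Usage: parse_pdf_url <pdf_url>
parse_pdf_url() {
  request_url="$1"

  if [ -z "${PDFAPIHUB_API_KEY:-}" ]; then
    echo "PDFAPIHUB_API_KEY is not set" >&2
    return 1
  fi

  # printf is a builtin in common shells, so the key is piped to curl
  # instead of appearing in its argument list. "-H @-" makes curl read
  # one header per line from stdin.
  printf 'CLIENT-API-KEY: %s\n' "$PDFAPIHUB_API_KEY" |
    curl -X POST https://pdfapihub.com/api/v1/pdf/parse \
      -H @- \
      -H "Content-Type: application/json" \
      -d "{ \"url\": \"${request_url}\" }"
}
```

With the variable exported from a secret store or a protected shell profile, a call looks like parse_pdf_url "https://example.com/report.pdf", where the URL is a placeholder.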

Use Cases

  • Invoice Parsing — Extract line items, totals, and vendor info from PDF invoices
  • Resume Parsing — Extract structured data (name, experience, skills) from PDF resumes
  • Contract Analysis — Extract clauses, dates, and parties from legal PDF contracts
  • Form Data Extraction — Pull filled form fields and values from PDF forms
  • Research Paper Analysis — Extract text, tables, and figures from academic PDFs
  • Document Indexing — Parse PDFs into structured JSON for search engine indexing

Example Usage

curl -X POST https://pdfapihub.com/api/v1/pdf/parse \
  -H "CLIENT-API-KEY: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{ "url": "https://pdfapihub.com/sample-pdfinvoice-with-image.pdf", "mode": "full", "pages": "1-3" }'
