Pdf Parser Agent

v1.0.1

Parses local PDF files into structured Markdown and JSON using opendataloader-pdf for deterministic, local document content extraction.

by Ezequiel Techera @trshdesigns

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for trshdesigns/pdf-parser-agent.

Prompt Preview: Install & Setup
Install the skill "Pdf Parser Agent" (trshdesigns/pdf-parser-agent) from ClawHub.
Skill page: https://clawhub.ai/trshdesigns/pdf-parser-agent
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install pdf-parser-agent

ClawHub CLI


npx clawhub@latest install pdf-parser-agent

Security Scan

VirusTotal: Benign
OpenClaw: Benign (high confidence)

Purpose & Capability
Name and description match the included script and SKILL.md: the skill runs a local Python-based converter (opendataloader-pdf) on local PDF files. The only minor oddity is an included package.json (Node metadata) despite this being a Python script; this appears cosmetic and does not contradict the stated purpose.
Instruction Scope
SKILL.md instructs the agent to run a local script against local PDFs and references a single dependency (opendataloader-pdf). The runtime instructions do not request unrelated files, environment variables, or external endpoints. The script does append the user's site-packages path to sys.path to locate a --user pip installation, which is reasonable for a dependency lookup but means it will import whatever opendataloader-pdf is installed in the user's site.
Install Mechanism
No install spec is provided by the skill (instruction-only), so nothing is downloaded or written by the skill itself. Dependency installation is left to the user (pip install --user opendataloader-pdf). This is low-risk for the skill bundle, though the external Python package remains a separate trust decision.
Credentials
The skill declares no environment variables, credentials, or config paths and its code does not read secrets. It only reads a user-supplied local file path (validated to be inside the current workspace) and writes output to a specified directory — which is proportionate to the stated function.
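
The workspace check described above can be illustrated with a short sketch. This is not the skill's actual code: the function name `is_inside_workspace` and its parameters are hypothetical, and the approach (resolving both paths and comparing ancestors) is only one common way to implement such a check.

```python
from pathlib import Path


def is_inside_workspace(candidate: str, workspace: str) -> bool:
    """Return True if `candidate` resolves to a path below `workspace`.

    Hypothetical helper for illustration; the skill's own validation
    may differ in detail.
    """
    root = Path(workspace).resolve()
    # Joining with an absolute candidate discards `root`, so absolute
    # paths outside the workspace are rejected by the ancestor check.
    target = (root / candidate).resolve()
    # resolve() collapses ".." segments, so traversal attempts such as
    # "../secret.pdf" end up outside `root`.
    return root in target.parents
```

Resolving before comparing is the important design choice here: a plain string prefix comparison would accept paths that escape the workspace through `..` segments or symlinks.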
Persistence & Privilege
The skill does not request persistent or elevated presence (always:false). It does not modify other skills or system-wide agent settings. Autonomous invocation is allowed by default but is not combined with any broad credential access or unusual privileges.
Assessment
This skill appears to do what it says: convert local PDFs using the opendataloader-pdf package. Before installing or running it:

- Inspect and vet the external dependency (opendataloader-pdf) you will pip-install; that package will execute on your machine and is the primary runtime risk. Prefer to install it in a fresh virtualenv rather than system-wide.
- Note the script adds your user-site-packages to sys.path, so whatever is installed there will be imported. If you share an environment, ensure no untrusted packages are present in user-site.
- The script validates that input files are inside the current working directory; still run it in a controlled workspace to avoid accidental processing of sensitive files.
- The included package.json is unexpected for a Python-only skill but appears harmless; it may be leftover metadata.
- If you need stronger isolation, run this tool in a container or VM and audit opendataloader-pdf's behavior (it may spawn Java or other subprocesses according to the tests/notes).

Like a lobster shell, security has layers — review code before you run it.

latestvk973zr9vj7m02r7e5dkqfpzg7n839826
151 downloads
0 stars
2 versions
Updated 1mo ago
v1.0.1
MIT-0

SKILL.md - pdf-parser-agent

Purpose

Parses local PDF files into structured Markdown and JSON formats using the opendataloader-pdf library, providing deterministic, local data extraction that bypasses LLM context limits for document content ingestion.

Core Technology Attribution

This skill is built upon opendataloader-pdf, originally developed by bundolee and claude.

Dependencies

This skill requires the following Python package, installed either system-wide or in the user site-packages directory:

  1. opendataloader-pdf
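
Before running the skill, it can help to confirm that the dependency is importable. The sketch below uses only the standard library; the helper name `is_module_available` is hypothetical, and the import name `opendataloader_pdf` is an assumption (the usual underscore form of the pip package name), not something this page confirms.

```python
import importlib.util


def is_module_available(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "opendataloader_pdf" is the assumed import name for opendataloader-pdf.
    name = "opendataloader_pdf"
    status = "found" if is_module_available(name) else "missing"
    print(f"{name}: {status}")
```

Because `find_spec` only locates the module, this check does not execute any of the package's code.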

Usage Example

The skill's execution script dynamically finds the correct user-site packages path, assuming the user has installed the dependency via pip install --user opendataloader-pdf.

# Assuming a PDF exists at 'Files for testing/sample-local-pdf.pdf'
openclaw skill pdf-parser-agent --run --args "Files for testing/sample-local-pdf.pdf"

Implementation Notes

The underlying logic now uses site.getusersitepackages() to dynamically locate the installed package, maximizing portability across different OS/Python minor versions.
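
A minimal sketch of this lookup, assuming the script simply appends the user site directory to `sys.path` as the security notes describe. The function name `add_user_site_to_path` is hypothetical; only `site.getusersitepackages()` comes from the skill's own notes.

```python
import site
import sys


def add_user_site_to_path() -> str:
    """Append the per-user site-packages directory to sys.path if missing.

    Returns the directory so callers can log or inspect it.
    """
    user_site = site.getusersitepackages()
    if user_site not in sys.path:
        # Appending (rather than inserting at index 0) keeps packages
        # already on the path ahead of user-site ones.
        sys.path.append(user_site)
    return user_site


if __name__ == "__main__":
    print(add_user_site_to_path())
```

The returned path varies by operating system and Python version, which is why it is computed at runtime instead of being hard-coded.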
