pdf-miner
v1.0.0
Extract text and tables from PDF files with reliable Chinese (CJK) support. Use when: (1) User asks to read/extract content from a PDF file, (2) User needs t...
by baichen @baichenwzj
MIT-0
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
OpenClaw
Benign
high confidence
Purpose & Capability
Name/description align with the included script and SKILL.md: the code uses pdfplumber to extract text/tables, supports search/metrics/TOC/diff/chunking, and the requested dependency (pdfplumber) is appropriate for CJK PDF extraction.
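As a rough illustration of the kind of extraction described above, here is a minimal sketch built on pdfplumber's `open`, `extract_text`, and `extract_tables` calls. It is not the skill's bundled script: the function names `clean_table` and `extract_pages` and the shape of the returned dicts are assumptions made for this example.

```python
from typing import List, Optional


def clean_table(table: List[List[Optional[str]]]) -> List[List[str]]:
    """Replace empty (None) cells with "" and trim surrounding whitespace."""
    return [[(cell or "").strip() for cell in row] for row in table]


def extract_pages(path: str) -> List[dict]:
    """Return one dict per page holding its text and its cleaned tables.

    Hypothetical helper for illustration; the real script may be organised
    differently.
    """
    # Third-party dependency, imported lazily so clean_table works without it.
    import pdfplumber

    pages = []
    with pdfplumber.open(path) as pdf:
        for number, page in enumerate(pdf.pages, start=1):
            pages.append({
                "page": number,
                # Image-only pages have no text layer, so fall back to "".
                "text": page.extract_text() or "",
                "tables": [clean_table(t) for t in page.extract_tables()],
            })
    return pages
```

Calling `extract_pages("report.pdf")` (a placeholder path) would give a list that can be searched or chunked afterwards. Pages with empty text are a hint that the PDF is scanned and would need OCR.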
Instruction Scope
Instructions stay within the PDF-extraction scope (running the script on user-supplied PDFs). They include an example of downloading a PDF via urllib.request.urlretrieve(url, ...) — normal for fetching remote PDFs, but it means the agent or user may download arbitrary remote files, so avoid fetching from untrusted sources.
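One way to reduce the risk noted above is to restrict downloads to HTTPS URLs on hosts you trust and to confirm that the file starts with the `%PDF-` signature before parsing it. The sketch below assumes that approach. The helper names and the `trusted_hosts` parameter are hypothetical and not part of the skill.

```python
import os
import urllib.request
from urllib.parse import urlparse

PDF_MAGIC = b"%PDF-"  # signature at the start of a well-formed PDF file


def is_allowed_url(url: str, trusted_hosts: set) -> bool:
    """Accept only https URLs whose host is in the caller's allow-list."""
    parts = urlparse(url)
    return parts.scheme == "https" and parts.hostname in trusted_hosts


def looks_like_pdf(head: bytes) -> bool:
    """Check the leading bytes of a file for the PDF signature."""
    return head.startswith(PDF_MAGIC)


def fetch_pdf(url: str, dest: str, trusted_hosts: set) -> str:
    """Download url to dest, rejecting untrusted URLs and non-PDF content."""
    if not is_allowed_url(url, trusted_hosts):
        raise ValueError(f"refusing to download from untrusted URL: {url}")
    urllib.request.urlretrieve(url, dest)
    with open(dest, "rb") as fh:
        head = fh.read(len(PDF_MAGIC))
    if not looks_like_pdf(head):
        os.remove(dest)  # do not leave an unexpected file on disk
        raise ValueError("downloaded file does not start with %PDF-")
    return dest
```

The signature check is deliberately strict. Some PDF readers tolerate a few junk bytes before the header, so a looser check may suit files from older tools.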
Install Mechanism
No automated install spec (instruction-only). The SKILL.md tells the user to pip install pdfplumber manually, which is low-risk but does require installing a third-party Python package before use.
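Because the dependency is installed by hand, a caller may want to check for it before running anything. A minimal sketch using only the standard library follows. The function names are hypothetical; the install command in the error message is the one quoted in the assessment.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def require(module_name: str) -> None:
    """Raise a readable error with an install hint if the module is missing."""
    if not is_installed(module_name):
        raise RuntimeError(
            f"{module_name} is not installed; "
            f"run: python -m pip install {module_name}"
        )
```

Calling `require("pdfplumber")` at the top of a wrapper script turns a late `ImportError` into an early, actionable message.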
Credentials
The skill declares no environment variables, no credentials, and the code does not reference any secret/config paths — requested privileges are minimal and consistent with the stated purpose.
Persistence & Privilege
The skill does not set always:true, does not request system-wide changes, and does not modify other skills or global agent settings. It runs on demand and writes outputs only to locations the user or flags specify.
Assessment
This skill appears internally consistent and focused on PDF extraction. Before using: (1) install pdfplumber in a controlled environment (python -m pip install pdfplumber); (2) avoid downloading PDFs from untrusted URLs (the instructions show how to fetch remote PDFs; parsing itself should be safe, but a downloaded file may still carry a malicious payload in another format); (3) review the included script, or run it in an isolated environment, if you plan to let an agent download and process arbitrary documents; and (4) remember that the tool does not perform OCR on scanned/image PDFs.
Like a lobster shell, security has layers: review code before you run it.
latest: vk97d12p67zzmkkx2vte4nxp3pd849h65
