Nutrient Document Processing (Universal Agent Skill)

v1.1.2

Universal (non-OpenClaw) Nutrient DWS document-processing skill for Agent Skills-compatible products. Best for Claude Code, Codex CLI, Gemini CLI, Cursor, Wi...

by Jonathan Rhyne (@jdrhyne)
Security Scan
VirusTotal: Benign
OpenClaw: Benign (medium confidence)
Purpose & Capability
Name, description, and operations (convert, OCR, redact, extract, sign, etc.) align with the declared requirements: the skill needs Nutrient API keys plus either curl (direct API mode) or npx (MCP server mode). Requesting an npm package for an MCP server and showing curl examples is coherent with a document-processing integration.
Instruction Scope
SKILL.md stays on-topic (shows curl API calls and MCP server setup). Two practical caveats: (1) the MCP config example embeds NUTRIENT_DWS_API_KEY in a JSON config, which would persist the key in a config file if users paste it there, contradicting the doc's claim that keys aren't stored beyond the session; (2) MCP server uses a SANDBOX_PATH and may read/write files in that directory (expected for processing, but worth noting).
Install Mechanism
Install uses npx to fetch/run @nutrient-sdk/dws-mcp-server from npm at runtime. This is an expected mechanism for an optional MCP server, but npx downloads and executes code from the public npm registry each time — moderate operational risk if you haven't reviewed the package source or don't trust the publisher.
Credentials
Requires two env vars (NUTRIENT_API_KEY for direct API calls and NUTRIENT_DWS_API_KEY for MCP server). Both are plausible for supporting both modes. The SKILL.md's guidance to put keys into persistent MCP config or environment files can cause long-lived storage of secrets — consider using short-lived or scoped keys and avoid pasting secrets into shared config files.
Persistence & Privilege
always:false and standard autonomous invocation behavior are appropriate. The skill does not request system-wide privileges. The only notable persistence vector is the user-populated MCP config (which is outside the skill itself) and the fact that npx will fetch and run code from npm.
Assessment
This skill appears to do what it says (document processing via Nutrient) and requires the Nutrient API keys and either curl or npx. Before installing: (1) review the @nutrient-sdk/dws-mcp-server package source on npm/GitHub if you plan to use MCP mode (npx will download and run it), (2) avoid pasting API keys into shared/persistent config files unless you trust the environment — prefer scoped/limited keys or temporary credentials, (3) set SANDBOX_PATH to a restricted directory to limit filesystem exposure, and (4) if you only need direct calls, prefer curl mode to avoid executing remote npm code. If you need higher assurance, ask the publisher for a signed release or include a local vetted binary instead of npx.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

📄 Clawdis
Any bin: npx, curl
Env: NUTRIENT_API_KEY, NUTRIENT_DWS_API_KEY

Install

Install Nutrient DWS MCP Server (optional):

npm i -g @nutrient-sdk/dws-mcp-server
Tags: agent-skills, document-processing, latest, mcp, nutrient, ocr, pdf, redaction
567 downloads · 0 stars · 5 versions · Updated 1mo ago
License: MIT-0

Nutrient Document Processing (Universal Agent Skill)

Best for Claude Code/Codex/Gemini/Cursor/Windsurf and other non-OpenClaw agents. Process, convert, extract, OCR, redact, sign, and manipulate documents using the Nutrient DWS Processor API.

Setup

You need a Nutrient DWS API key. Get one free at https://dashboard.nutrient.io/sign_up/?product=processor.

Option 1: MCP Server (Recommended)

If your agent supports MCP (Model Context Protocol), use the Nutrient DWS MCP Server. It provides all operations as native tools.

Configure your MCP client (e.g., claude_desktop_config.json or .mcp.json):

{
  "mcpServers": {
    "nutrient-dws": {
      "command": "npx",
      "args": ["-y", "@nutrient-sdk/dws-mcp-server"],
      "env": {
        "NUTRIENT_DWS_API_KEY": "YOUR_API_KEY",
        "SANDBOX_PATH": "/path/to/working/directory"
      }
    }
  }
}

Then use the MCP tools directly (e.g., convert_to_pdf, extract_text, redact, etc.).
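
Because the config above stores the API key on disk, it can help to generate the file with restrictive permissions and a dedicated sandbox directory. The following is a minimal sketch, not part of the Nutrient tooling: write_mcp_config is a hypothetical helper name, and the config path and sandbox path are whatever you pass in. It reads the key from NUTRIENT_DWS_API_KEY so the key never appears in your shell history.

```shell
# write_mcp_config CONFIG_FILE SANDBOX_DIR  (hypothetical helper, not a Nutrient tool)
# Writes the MCP config shown above, taking the key from the environment.
write_mcp_config() (
  config=$1; sandbox=$2
  : "${NUTRIENT_DWS_API_KEY:?NUTRIENT_DWS_API_KEY is not set}"
  umask 077              # new files get mode 600, new directories mode 700
  mkdir -p "$sandbox"    # directory the MCP server is allowed to read and write
  cat > "$config" <<EOF
{
  "mcpServers": {
    "nutrient-dws": {
      "command": "npx",
      "args": ["-y", "@nutrient-sdk/dws-mcp-server"],
      "env": {
        "NUTRIENT_DWS_API_KEY": "$NUTRIENT_DWS_API_KEY",
        "SANDBOX_PATH": "$sandbox"
      }
    }
  }
}
EOF
)
```

The umask only affects files that do not exist yet; if the config file is already present, its permissions stay as they were. The key still persists in that file until you delete it.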

Option 2: Direct API (curl)

For agents without MCP support, call the API directly:

export NUTRIENT_API_KEY="your_api_key_here"

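All processing requests go to https://api.nutrient.io/build as a multipart POST: each input file is one form field, and an instructions field carries the JSON that describes the parts, actions, and output. The one exception in this document is the credits check, which is a GET to https://api.nutrient.io/credits.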
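
Every example below repeats the same request shape, so a small wrapper can cut the repetition. This is a sketch under assumptions: nutrient_build and the DRY_RUN switch are hypothetical names introduced here, not part of the skill. It uses curl's --fail so HTTP errors produce a non-zero exit status, and --form-string so the instructions JSON is sent literally.

```shell
# nutrient_build INPUT_FILE INSTRUCTIONS_JSON OUTPUT_FILE  (hypothetical helper)
# With DRY_RUN=1 it prints what would be sent and makes no network call.
nutrient_build() (
  input=$1; instructions=$2; output=$3
  name=$(basename "$input")   # form field name that the instructions refer to
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'POST https://api.nutrient.io/build %s=@%s instructions=%s -> %s\n' \
      "$name" "$input" "$instructions" "$output"
    exit 0
  fi
  : "${NUTRIENT_API_KEY:?NUTRIENT_API_KEY is not set}"
  # --fail: exit non-zero on HTTP errors such as 401 or 413
  # --form-string: pass the JSON as-is, with no special handling of ';' or '@'
  curl --fail -sS -X POST https://api.nutrient.io/build \
    -H "Authorization: Bearer $NUTRIENT_API_KEY" \
    -F "$name=@$input" \
    --form-string "instructions=$instructions" \
    -o "$output"
)
```

For example, DRY_RUN=1 nutrient_build document.docx '{"parts":[{"file":"document.docx"}]}' output.pdf prints the request without sending it.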

Safety Boundaries

  • This skill sends documents to the Nutrient DWS API (api.nutrient.io) for processing. Documents may contain sensitive data — ensure your Nutrient account's data handling policies are acceptable.
  • It does NOT access local files beyond those explicitly passed for processing.
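  • The skill itself does NOT store API keys or credentials beyond the current session. A key pasted into an MCP config file or exported from a shell profile persists there until you remove it.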
  • MCP server mode (npx @nutrient-sdk/dws-mcp-server) downloads the official Nutrient MCP server package from npm at runtime.
  • All API calls require an explicit API key — no anonymous access is possible.

Operations

1. Convert Documents

Convert between PDF, DOCX, XLSX, PPTX, HTML, and image formats.

HTML to PDF:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "index.html=@index.html" \
  -F 'instructions={"parts":[{"html":"index.html"}]}' \
  -o output.pdf

DOCX to PDF:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.docx=@document.docx" \
  -F 'instructions={"parts":[{"file":"document.docx"}]}' \
  -o output.pdf

PDF to DOCX/XLSX/PPTX:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"docx"}}' \
  -o output.docx
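
The example above only shows docx. A sketch of how the instructions might be parameterized for the other Office targets follows. office_instructions is a hypothetical helper; xlsx appears as an output type elsewhere in this document, while pptx is an assumption inferred from the heading and should be checked against the API reference.

```shell
# office_instructions FILE_FIELD_NAME TYPE  (hypothetical helper)
# Prints instructions JSON for converting a PDF to docx, xlsx, or pptx.
office_instructions() {
  case "$2" in
    docx|xlsx|pptx) ;;   # pptx is assumed from the heading, not from an example
    *) echo "unsupported type: $2" >&2; return 1 ;;
  esac
  printf '{"parts":[{"file":"%s"}],"output":{"type":"%s"}}' "$1" "$2"
}
```

The printed JSON can be passed as the instructions form field in the curl command above.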

Image to PDF:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "image.jpg=@image.jpg" \
  -F 'instructions={"parts":[{"file":"image.jpg"}]}' \
  -o output.pdf

2. Extract Text and Data

Extract plain text:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"text"}}' \
  -o output.txt

Extract tables (as JSON, CSV, or Excel):

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"xlsx"}}' \
  -o tables.xlsx

Extract key-value pairs:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"extraction","strategy":"key-values"}]}' \
  -o result.json

3. OCR Scanned Documents

Apply OCR to scanned PDFs or images, producing searchable PDFs with selectable text.

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "scanned.pdf=@scanned.pdf" \
  -F 'instructions={"parts":[{"file":"scanned.pdf"}],"actions":[{"type":"ocr","language":"english"}]}' \
  -o searchable.pdf

Supported languages: english, german, french, spanish, italian, portuguese, dutch, swedish, danish, norwegian, finnish, polish, czech, turkish, japanese, korean, chinese-simplified, chinese-traditional, arabic, hebrew, thai, hindi, russian, and more.

4. Redact Sensitive Information

Pattern-based redaction (preset patterns):

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"redaction","strategy":"preset","preset":"social-security-number"}]}' \
  -o redacted.pdf

Available presets: social-security-number, credit-card-number, email-address, north-american-phone-number, international-phone-number, date, url, ipv4, ipv6, mac-address, us-zip-code, vin, time.
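
To redact several preset patterns in one request, the actions array can hold one redaction action per preset. The sketch below assumes that multiple redaction actions may be combined this way (the Best Practices section says the API accepts multiple actions per call). preset_actions is a hypothetical helper that also rejects names missing from the list above.

```shell
# preset_actions PRESET...  (hypothetical helper)
# Prints a JSON actions array with one preset redaction per argument.
preset_actions() (
  actions=""
  for p in "$@"; do
    case "$p" in
      social-security-number|credit-card-number|email-address) ;;
      north-american-phone-number|international-phone-number) ;;
      date|url|ipv4|ipv6|mac-address|us-zip-code|vin|time) ;;
      *) echo "unknown preset: $p" >&2; exit 1 ;;
    esac
    # Append a comma only when the list already has an entry.
    actions="$actions${actions:+,}{\"type\":\"redaction\",\"strategy\":\"preset\",\"preset\":\"$p\"}"
  done
  printf '[%s]' "$actions"
)
```

The output slots into the instructions as the value of "actions", for example: instructions="{\"parts\":[{\"file\":\"document.pdf\"}],\"actions\":$(preset_actions email-address ipv4)}".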

Regex-based redaction:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"redaction","strategy":"regex","regex":"\\b[A-Z]{2}\\d{6}\\b"}]}' \
  -o redacted.pdf
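
Note the doubled backslashes in the regex above: the pattern \b[A-Z]{2}\d{6}\b has to be escaped once for JSON. When the pattern comes from a variable, a small escaping step avoids mistakes. This is a minimal sketch with a hypothetical json_escape helper; it handles backslashes and double quotes only, not control characters or newlines.

```shell
# json_escape STRING  (hypothetical helper)
# Escapes backslashes and double quotes so STRING can sit inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

regex='\b[A-Z]{2}\d{6}\b'   # the pattern as you would write it in a regex tool
instructions=$(printf '{"parts":[{"file":"document.pdf"}],"actions":[{"type":"redaction","strategy":"regex","regex":"%s"}]}' "$(json_escape "$regex")")
```

After this runs, $instructions holds the same JSON as the curl example above and can be passed as the instructions form field.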

AI-powered PII redaction:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"ai_redaction","criteria":"All personally identifiable information"}]}' \
  -o redacted.pdf

The criteria field accepts natural language (e.g., "Names and phone numbers", "Protected health information", "Financial account numbers").

5. Add Watermarks

Text watermark:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"watermark","text":"CONFIDENTIAL","fontSize":48,"fontColor":"#FF0000","opacity":0.5,"rotation":45,"width":"50%","height":"50%"}]}' \
  -o watermarked.pdf

Image watermark:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F "logo.png=@logo.png" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"watermark","imagePath":"logo.png","width":"30%","height":"30%","opacity":0.3}]}' \
  -o watermarked.pdf

6. Digital Signatures

Sign a PDF with a CMS signature:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"sign","signatureType":"cms","signerName":"John Doe","reason":"Approval","location":"New York"}]}' \
  -o signed.pdf

Sign with CAdES-B-LT (long-term validation):

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"sign","signatureType":"cades","cadesLevel":"b-lt","signerName":"Jane Smith"}]}' \
  -o signed.pdf

7. Form Filling (Instant JSON)

Fill PDF form fields using Instant JSON format:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "form.pdf=@form.pdf" \
  -F 'instructions={"parts":[{"file":"form.pdf"}],"actions":[{"type":"fillForm","fields":[{"name":"firstName","value":"John"},{"name":"lastName","value":"Doe"},{"name":"email","value":"john@example.com"}]}]}' \
  -o filled.pdf

8. Merge and Split PDFs

Merge multiple PDFs:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "doc1.pdf=@doc1.pdf" \
  -F "doc2.pdf=@doc2.pdf" \
  -F 'instructions={"parts":[{"file":"doc1.pdf"},{"file":"doc2.pdf"}]}' \
  -o merged.pdf
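
For more than two files, the parts array can be generated instead of typed by hand. The sketch below uses a hypothetical merge_instructions helper and assumes, as in the example above, that each form field is named after the file's base name and that parts are merged in the order listed. File names containing double quotes or backslashes are not handled.

```shell
# merge_instructions FILE...  (hypothetical helper)
# Prints instructions JSON with one part per file, in argument order.
merge_instructions() (
  parts=""
  for f in "$@"; do
    parts="$parts${parts:+,}{\"file\":\"$(basename "$f")\"}"
  done
  printf '{"parts":[%s]}' "$parts"
)

# Usage sketch (not executed here): add one -F per input file, for example
#   curl ... -F "doc1.pdf=@doc1.pdf" -F "doc2.pdf=@doc2.pdf" \
#     -F "instructions=$(merge_instructions doc1.pdf doc2.pdf)" -o merged.pdf
```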

Extract specific pages:

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf","pages":{"start":0,"end":4}}]}' \
  -o pages1-5.pdf
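
The example above selects pages 1 to 5 with start 0 and end 4, which suggests zero-based, inclusive page indices. Assuming that reading is right, a hypothetical page_range helper can convert the page numbers a person would say into the JSON the API expects.

```shell
# page_range FIRST LAST  (hypothetical helper)
# Converts 1-based inclusive page numbers to the zero-based "pages" object.
page_range() {
  if [ "$1" -lt 1 ] || [ "$2" -lt "$1" ]; then
    echo "page_range: need 1 <= FIRST <= LAST" >&2
    return 1
  fi
  printf '{"start":%d,"end":%d}' "$(($1 - 1))" "$(($2 - 1))"
}
```

For instance, page_range 1 5 prints {"start":0,"end":4}, matching the instructions in the curl command above.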

9. Render PDF Pages as Images

curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf","pages":{"start":0,"end":0}}],"output":{"type":"png","dpi":300}}' \
  -o page1.png

10. Check Credits

curl -X GET https://api.nutrient.io/credits \
  -H "Authorization: Bearer $NUTRIENT_API_KEY"

Best Practices

  1. Use the MCP server when your agent supports it — it handles file I/O, error handling, and sandboxing automatically.
  2. Set SANDBOX_PATH to restrict file access to a specific directory.
  3. Check credit balance before batch operations to avoid interruptions.
  4. Use AI redaction for complex PII detection; use preset/regex redaction for known patterns (faster, cheaper).
  5. Chain operations — the API supports multiple actions in a single call (e.g., OCR then redact).
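
As an illustration of item 5, the OCR and preset redaction actions shown earlier can be placed in one actions array. This is a sketch that assumes actions are applied in the order listed, so the text layer exists before redaction searches it. ocr_then_redact is a hypothetical function name and the output file name is arbitrary.

```shell
# OCR first, then redact social security numbers, in a single request.
instructions='{"parts":[{"file":"scanned.pdf"}],"actions":[{"type":"ocr","language":"english"},{"type":"redaction","strategy":"preset","preset":"social-security-number"}]}'

# Hypothetical wrapper; defining it sends nothing until you call it.
ocr_then_redact() {
  curl --fail -sS -X POST https://api.nutrient.io/build \
    -H "Authorization: Bearer $NUTRIENT_API_KEY" \
    -F "scanned.pdf=@scanned.pdf" \
    -F "instructions=$instructions" \
    -o redacted-searchable.pdf
}
```

One chained request uploads the document once, where two separate calls would upload it twice.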

Troubleshooting

Issue                      | Solution
401 Unauthorized           | Check your API key is valid and has credits
413 Payload Too Large      | Files must be under 100 MB
Slow AI redaction          | AI analysis takes 60–120 seconds; this is normal
OCR quality poor           | Try a different language parameter or improve scan quality
Missing text in extraction | Run OCR first on scanned documents
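
Two of the issues above can be caught in a script. The following sketch uses hypothetical helpers: under_limit checks the file size before upload, and explain_status maps an HTTP status code to the hint from the table. The 100 MB limit is treated here as 100,000,000 bytes, which is an assumption (the stricter of the two common readings).

```shell
# under_limit FILE  (hypothetical helper)
# Succeeds if FILE exists and is smaller than 100 MB (assumed 100,000,000 bytes).
under_limit() {
  [ -f "$1" ] || return 1
  size=$(( $(wc -c < "$1") ))   # arithmetic expansion strips any padding from wc
  [ "$size" -lt 100000000 ]
}

# explain_status HTTP_CODE  (hypothetical helper)
# Prints the matching hint from the troubleshooting table.
explain_status() {
  case "$1" in
    2??) echo "ok" ;;
    401) echo "401 Unauthorized: check your API key is valid and has credits" ;;
    413) echo "413 Payload Too Large: files must be under 100 MB" ;;
    *)   echo "HTTP $1: not covered by the troubleshooting table" ;;
  esac
}

# Usage sketch (not executed here): curl can report the status code with -w
#   status=$(curl -sS -o output.pdf -w '%{http_code}' -X POST ... )
#   explain_status "$status"
```

For AI redaction, which the table says can take 60 to 120 seconds, make sure any timeout you set on curl (for example with --max-time) is comfortably above that.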
