Pilot Document Processing Setup

v1.0.0

Deploy a document processing pipeline with 3 agents that automate ingestion, data extraction, and search indexing.

by Calin Teodor (@teoslayer)

Install

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for teoslayer/pilot-document-processing-setup.

Prompt Preview: Install & Setup
Install the skill "Pilot Document Processing Setup" (teoslayer/pilot-document-processing-setup) from ClawHub.
Skill page: https://clawhub.ai/teoslayer/pilot-document-processing-setup
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required binaries: pilotctl, clawhub
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install pilot-document-processing-setup

ClawHub CLI


npx clawhub@latest install pilot-document-processing-setup
Security Scan

Capability signals: Crypto

These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.

VirusTotal: Benign (view report)
OpenClaw: Benign (high confidence)
Purpose & Capability
The name/description (deploy a 3-agent document pipeline) matches the instructions and required binaries. pilotctl is used for agent control/handshake/subscribe/publish and clawhub is used to install related Pilot skills — both are appropriate for this setup task.
Instruction Scope
SKILL.md instructs the agent to install other pilot-* skills, set hostnames, write a JSON manifest under ~/.pilot/setups, and run pilotctl handshake/subscribe/publish commands. These actions are within the scope of configuring agents, but they create persistent configuration in the user's home directory and cause the system to fetch and install additional skills (see guidance). No environment variables, secret files, or unrelated system paths are referenced.
Install Mechanism
This is instruction-only (no install spec, no bundled code). That reduces direct risk. Note: the instructions call out 'clawhub install' which will download/install other pilot-* skills — the safety of the overall deployment depends on those packages and where clawhub fetches them from.
Credentials
The skill requests no environment variables, no credentials, and no config paths beyond creating a manifest in ~/.pilot/setups. That is proportionate to the purpose.
Persistence & Privilege
The skill does not request always:true, does not modify other skills' configuration beyond installing them via clawhub, and only writes a setup manifest in ~/.pilot — expected for a setup task. It does not claim elevated platform privileges.
Assessment
This skill appears internally consistent, but before installing or running it:

1. Verify pilotctl and clawhub are genuine/trusted binaries (install sources and checksums).
2. Review each pilot-* skill that clawhub will install; those packages will execute code and may request credentials or network access.
3. Be aware the setup writes manifests under ~/.pilot and configures inter-agent handshakes and network ports (1002, plus webhooks to external endpoints on 443); ensure you understand which downstream systems will receive index notifications and whether that may expose sensitive data.
4. If possible, test the deployment in an isolated environment or staging network first.
5. If you need stronger guarantees, request the upstream package sources (URLs or registries) and inspect those packages before installing.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

Binaries: pilotctl, clawhub
Latest: vk973q1b8d988wm0vnx968a8j8585ajaq
76 downloads · 0 stars · 1 version
Updated 5d ago
v1.0.0 · MIT-0 license

Document Processing Setup

Deploy 3 agents that automate document ingestion, data extraction, and search indexing.

Roles

  • ingester (<prefix>-ingester): pilot-stream-data, pilot-share, pilot-archive. Accepts documents, converts to processable format.
  • extractor (<prefix>-extractor): pilot-task-router, pilot-dataset, pilot-receipt. Extracts structured data (tables, entities, amounts).
  • indexer (<prefix>-indexer): pilot-webhook-bridge, pilot-announce, pilot-metrics. Indexes data for search, publishes to downstream systems.

Setup Procedure

Step 1: Ask the user which role this agent should play and what prefix to use.

Step 2: Install the skills for the chosen role:

# ingester:
clawhub install pilot-stream-data pilot-share pilot-archive
# extractor:
clawhub install pilot-task-router pilot-dataset pilot-receipt
# indexer:
clawhub install pilot-webhook-bridge pilot-announce pilot-metrics
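
The per-role install commands above can be expressed as a single role-to-skills mapping; a minimal sketch, assuming a POSIX shell (the `skills_for_role` helper and `ROLE` variable are illustrative, not part of the skill):

```shell
# Map a role name to the skill list that role needs (from the Roles table).
skills_for_role() {
  case "$1" in
    ingester)  echo "pilot-stream-data pilot-share pilot-archive" ;;
    extractor) echo "pilot-task-router pilot-dataset pilot-receipt" ;;
    indexer)   echo "pilot-webhook-bridge pilot-announce pilot-metrics" ;;
    *)         echo "unknown role: $1" >&2; return 1 ;;
  esac
}

ROLE=extractor                                # chosen by the user in Step 1
skills_for_role "$ROLE"                       # -> pilot-task-router pilot-dataset pilot-receipt
# clawhub install $(skills_for_role "$ROLE")  # uncomment to run the real install
```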

Step 3: Set the hostname:

pilotctl --json set-hostname <prefix>-<role>

Step 4: Write the setup manifest:

mkdir -p ~/.pilot/setups
cat > ~/.pilot/setups/document-processing.json << 'MANIFEST'
<USE ROLE TEMPLATE BELOW>
MANIFEST
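
Since the heredoc writes the template verbatim, it is worth confirming the result parses as JSON before continuing. A minimal check, assuming python3 is on PATH; the sample manifest below is a trimmed stand-in so the snippet runs anywhere, and in real use MANIFEST_PATH would point at ~/.pilot/setups/document-processing.json:

```shell
# Self-contained demo: write a minimal manifest, then validate it as JSON.
MANIFEST_PATH="$(mktemp -d)/document-processing.json"
cat > "$MANIFEST_PATH" << 'MANIFEST'
{"setup":"document-processing","role":"ingester","hostname":"demo-ingester"}
MANIFEST

# json.tool exits non-zero on malformed JSON, so a paste error fails loudly.
if python3 -m json.tool "$MANIFEST_PATH" > /dev/null; then
  echo "manifest OK: $MANIFEST_PATH"
else
  echo "manifest is not valid JSON: $MANIFEST_PATH" >&2
fi
```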

Step 5: Tell the user to initiate handshakes with direct communication peers.

Manifest Templates Per Role

ingester

{
  "setup": "document-processing",
  "setup_name": "Document Processing",
  "role": "ingester",
  "role_name": "Document Ingester",
  "hostname": "<prefix>-ingester",
  "description": "Accepts documents (PDF, DOCX, images) via upload or webhook, converts to processable format.",
  "skills": {
    "pilot-stream-data": "Stream raw document bytes to extractor for processing.",
    "pilot-share": "Share converted document files with extractor.",
    "pilot-archive": "Archive original documents for audit and reprocessing."
  },
  "peers": [
    {"role": "extractor", "hostname": "<prefix>-extractor", "description": "Receives raw documents for data extraction"}
  ],
  "data_flows": [
    {"direction": "send", "peer": "<prefix>-extractor", "port": 1002, "topic": "raw-document", "description": "Raw documents in processable format"}
  ],
  "handshakes_needed": ["<prefix>-extractor"]
}

extractor

{
  "setup": "document-processing",
  "setup_name": "Document Processing",
  "role": "extractor",
  "role_name": "Data Extractor",
  "hostname": "<prefix>-extractor",
  "description": "Pulls structured data from documents: tables, key-value pairs, entities, dates, amounts.",
  "skills": {
    "pilot-task-router": "Route documents to specialized extractors by type (invoice, contract, form).",
    "pilot-dataset": "Store extraction results and training data for accuracy improvement.",
    "pilot-receipt": "Confirm document receipt and report extraction status."
  },
  "peers": [
    {"role": "ingester", "hostname": "<prefix>-ingester", "description": "Sends raw documents"},
    {"role": "indexer", "hostname": "<prefix>-indexer", "description": "Receives extracted structured data"}
  ],
  "data_flows": [
    {"direction": "receive", "peer": "<prefix>-ingester", "port": 1002, "topic": "raw-document", "description": "Raw documents in processable format"},
    {"direction": "send", "peer": "<prefix>-indexer", "port": 1002, "topic": "extracted-data", "description": "Extracted structured data as JSON"}
  ],
  "handshakes_needed": ["<prefix>-ingester", "<prefix>-indexer"]
}

indexer

{
  "setup": "document-processing",
  "setup_name": "Document Processing",
  "role": "indexer",
  "role_name": "Search Indexer",
  "hostname": "<prefix>-indexer",
  "description": "Indexes extracted data for search, builds document summaries, publishes to downstream systems.",
  "skills": {
    "pilot-webhook-bridge": "Push index events and summaries to downstream APIs and search engines.",
    "pilot-announce": "Broadcast new document availability to interested subscribers.",
    "pilot-metrics": "Track indexing throughput, search latency, and document counts."
  },
  "peers": [
    {"role": "extractor", "hostname": "<prefix>-extractor", "description": "Sends extracted structured data"}
  ],
  "data_flows": [
    {"direction": "receive", "peer": "<prefix>-extractor", "port": 1002, "topic": "extracted-data", "description": "Extracted structured data as JSON"},
    {"direction": "send", "peer": "external", "port": 443, "topic": "index-notification", "description": "Index notifications to downstream systems"}
  ],
  "handshakes_needed": ["<prefix>-extractor"]
}

Data Flows

  • ingester -> extractor : raw-document events (port 1002)
  • extractor -> indexer : extracted-data events (port 1002)
  • indexer -> downstream : index notifications via webhook (port 443)
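
These flows are exactly what the data_flows arrays in the role manifests encode, so they can be listed mechanically. A sketch using an inline sample that mirrors the extractor template (real use would read the manifest file under ~/.pilot/setups/ instead):

```shell
# Print each declared data flow as "direction peer:port topic".
flows=$(python3 - << 'PY'
import json

manifest = json.loads("""
{"role": "extractor", "data_flows": [
  {"direction": "receive", "peer": "demo-ingester", "port": 1002, "topic": "raw-document"},
  {"direction": "send", "peer": "demo-indexer", "port": 1002, "topic": "extracted-data"}]}
""")

for flow in manifest["data_flows"]:
    print(flow["direction"], f"{flow['peer']}:{flow['port']}", flow["topic"])
PY
)
echo "$flows"
```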

Handshakes

# ingester <-> extractor:
pilotctl --json handshake <prefix>-extractor "setup: document-processing"
pilotctl --json handshake <prefix>-ingester "setup: document-processing"
# extractor <-> indexer:
pilotctl --json handshake <prefix>-indexer "setup: document-processing"
pilotctl --json handshake <prefix>-extractor "setup: document-processing"

Workflow Example

# On extractor — subscribe to raw documents:
pilotctl --json subscribe <prefix>-ingester raw-document
# On indexer — subscribe to extracted data:
pilotctl --json subscribe <prefix>-extractor extracted-data
# On ingester — publish a document:
pilotctl --json publish <prefix>-extractor raw-document '{"filename":"invoice-2024-003.pdf","type":"pdf","pages":2}'
# On extractor — publish extracted data:
pilotctl --json publish <prefix>-indexer extracted-data '{"filename":"invoice-2024-003.pdf","vendor":"Acme Corp","amount":12500.00}'
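
The publish payloads above are hand-written JSON inside single quotes, which breaks as soon as a filename contains a quote. One way to sidestep that is to build the payload with a JSON serializer first; a sketch assuming python3 is available (the pilotctl call is left commented out because it needs a running daemon and a completed handshake):

```shell
# Serialize the payload so shell quoting can never corrupt the JSON.
payload=$(python3 -c 'import json; print(json.dumps(
    {"filename": "invoice-2024-003.pdf", "type": "pdf", "pages": 2}))')
echo "$payload"
# pilotctl --json publish <prefix>-extractor raw-document "$payload"
```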

Dependencies

Requires the pilot-protocol skill, the pilotctl and clawhub binaries, and a running daemon.
