Install
openclaw skills install @anyforge/anyparse-skill

Use the AnyParse API to extract structured text content from many document formats: PDF, Word, Excel, CSV, TSV, images, PPT, HTML, Markdown, Epub, ipynb, RST, EML, and others. Supports document orientation classification, layout analysis, and layout preservation. Use this skill when users need to parse documents, extract document content, or convert documents into structured text.
⛔ Mandatory Restrictions - Do Not Violate ⛔
Use python scripts/api.py to complete parsing.

Deploy the AnyParse service (see the official project documentation)
Install dependencies:
pip install -r scripts/requirements.txt
Configure: edit scripts/config.json and enter your API URL and key:
{
"anyparse_api_url": "http://your-api-host:port/anyparse/invoke/v1",
"anyparse_api_key": "your-api-key-here"
}
Or configure them through environment variables:
export anyparse_api_url=http://your-api-host:port/anyparse/invoke/v1
export anyparse_api_key=your-api-key-here
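
The two configuration routes above can be combined in one loader. The following is a minimal sketch, not the skill's actual code: the function name `load_config` is hypothetical, and the choice to let environment variables override `config.json` is an assumption, since the source does not state which one takes precedence.

```python
import json
import os
from pathlib import Path

CONFIG_KEYS = ("anyparse_api_url", "anyparse_api_key")


def load_config(config_path="scripts/config.json", environ=None):
    """Return both settings as a dict, or raise ValueError naming what is missing."""
    environ = os.environ if environ is None else environ
    settings = {}

    # Read the JSON file first, if it exists.
    path = Path(config_path)
    if path.is_file():
        with path.open(encoding="utf-8") as fh:
            settings.update(json.load(fh))

    # Assumption: a non-empty environment variable overrides the file value.
    for key in CONFIG_KEYS:
        if environ.get(key):
            settings[key] = environ[key]

    missing = [key for key in CONFIG_KEYS if not settings.get(key)]
    if missing:
        raise ValueError("not configured: " + ", ".join(missing))
    return {key: settings[key] for key in CONFIG_KEYS}
```

Raising an error that names the missing key makes it straightforward to tell the user which of the two settings still needs to be configured.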
python scripts/api.py --file /path/to/document.pdf
python scripts/api.py --file /path/to/image.jpg --use_doc_cls
python scripts/api.py --file /path/to/image.jpg --use_doc_rectifier
python scripts/api.py --file /path/to/document.pdf --no_doc_layout
python {baseDir}/scripts/api.py --file PATH [--use_doc_cls] [--use_doc_rectifier] [--no_doc_layout]
| Parameter | Required | Description |
|---|---|---|
| --file | Yes | Local file path |
| --use_doc_cls | No | Use document orientation classification |
| --use_doc_rectifier | No | Use document rectification |
| --no_doc_layout | No | Do not use document layout analysis |
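
To make the flag semantics concrete, here is a rough sketch of how the parameters in the table could be declared with `argparse`. It is an illustration only: `build_parser` is a hypothetical name, and treating the three optional flags as boolean switches is an assumption based on the example commands, not a copy of `scripts/api.py`.

```python
import argparse


def build_parser():
    """Declare the four documented parameters (hypothetical mirror of api.py)."""
    parser = argparse.ArgumentParser(prog="api.py")
    parser.add_argument("--file", required=True, help="Local file path")
    # Assumption: the optional flags take no value and default to off.
    parser.add_argument("--use_doc_cls", action="store_true",
                        help="Use document orientation classification")
    parser.add_argument("--use_doc_rectifier", action="store_true",
                        help="Use document rectification")
    parser.add_argument("--no_doc_layout", action="store_true",
                        help="Do not use document layout analysis")
    return parser


args = build_parser().parse_args(["--file", "/path/to/image.jpg", "--use_doc_cls"])
print(args.file, args.use_doc_cls, args.no_doc_layout)
# prints: /path/to/image.jpg True False
```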
{
"code": 2000,
"msg": "success",
"data": {
"metadata": {
"file_md5": "f484351567161df1e5e4d9d4b861c594",
"file_type": "jpg",
"file_name": "image.jpg",
"file_size": "8.10/KB"
},
"pages": [
{
"id": 1,
"content": "$10^{9}/L$",
"layout": [
{
"order_id": 0,
"label_name": "text",
"box": [0, 0, 196, 80],
"task": "text",
"parse_text": "$10^{9}/L$"
}
],
"elapse_times": 1.9343271255493164
}
],
"content": "$10^{9}/L$",
"elapse_times": 1.9524128437042236
}
}
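
A response of the shape shown above can be walked region by region. The sketch below uses a trimmed copy of that sample. The helper name `iter_regions` is hypothetical, and sorting by `order_id` assumes that field encodes reading order, which the source does not confirm. The `box` values are passed through untouched because their coordinate convention is not documented here.

```python
# Trimmed copy of the sample response shown above.
response = {
    "code": 2000,
    "msg": "success",
    "data": {
        "pages": [
            {
                "id": 1,
                "content": "$10^{9}/L$",
                "layout": [
                    {
                        "order_id": 0,
                        "label_name": "text",
                        "box": [0, 0, 196, 80],
                        "task": "text",
                        "parse_text": "$10^{9}/L$",
                    }
                ],
            }
        ],
        "content": "$10^{9}/L$",
    },
}


def iter_regions(response):
    """Yield (page_id, label_name, box, parse_text) for every layout region."""
    for page in response["data"]["pages"]:
        # Assumption: order_id gives the reading order within a page.
        for region in sorted(page["layout"], key=lambda r: r["order_id"]):
            yield page["id"], region["label_name"], region["box"], region["parse_text"]


# Merged text for the whole document, then one line per region.
print(response["data"]["content"])
for page_id, label, box, text in iter_regions(response):
    print(page_id, label, box, text)
```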
Key fields:
- code: API status code; 2000 means success
- msg: Status message
- data.metadata: File metadata
- data.pages[].layout[]: Layout information for each region, including position and parsing result
- data.content: Merged text result for the entire document

API URL not configured:
→ Tell the user to configure anyparse_api_url in config.json or environment variables
API key not configured:
→ Tell the user to configure anyparse_api_key in config.json or environment variables
File does not exist: → Check whether the file path is correct
Non-2000 status code: → Show the API error message to the user
Missing dependencies:
→ Tell the user to run pip install -r scripts/requirements.txt to install dependencies
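
The error cases above can be collapsed into a single check that returns the message to show the user. This is a sketch under stated assumptions: `diagnose` and its `required_modules` parameter are hypothetical, the order of the checks is a design choice rather than something the source specifies, and the module names to probe would have to come from scripts/requirements.txt.

```python
import importlib.util
import os


def diagnose(file_path, config, response=None, required_modules=()):
    """Return a user-facing message for the first problem found, or None."""
    # Missing dependencies: probe each (hypothetical) top-level module name.
    for name in required_modules:
        if importlib.util.find_spec(name) is None:
            return "Run pip install -r scripts/requirements.txt to install dependencies"
    if not config.get("anyparse_api_url"):
        return "Configure anyparse_api_url in config.json or environment variables"
    if not config.get("anyparse_api_key"):
        return "Configure anyparse_api_key in config.json or environment variables"
    if not os.path.isfile(file_path):
        return f"File does not exist, check the path: {file_path}"
    # Any code other than 2000 is treated as a failure; surface the API message.
    if response is not None and response.get("code") != 2000:
        return f"API error {response.get('code')}: {response.get('msg')}"
    return None
```

Returning a message instead of raising keeps the caller free to decide how the problem is presented to the user.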