Skill flagged — suspicious patterns detected

ClawHub Security flagged this skill as suspicious. Review the scan results before using.

ifly-pdf-image-ocr

ifly-pdf&image-ocr skill supporting both image OCR (AI-powered LLM OCR) and PDF document recognition. Use when user asks to OCR images, extract text from ima...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 · 83 · 0 current installs · 0 all-time installs
by Iflytek AIcloud @qingzhe2020
Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The skill name/description (image and PDF OCR via iFlytek) matches the included scripts and runtime instructions: both scripts call iFlytek endpoints and implement the described HMAC/MD5 signing and result handling. The functionality requested (uploading PDFs/images to OCR service) is legitimate for this purpose.
Instruction Scope
SKILL.md and scripts instruct the agent to read local image/PDF files, read API credentials from environment variables, send files to iFlytek endpoints, and poll for results — all consistent with OCR. There is no evidence the instructions ask for unrelated system files or credentials, but the skill will transmit user files to external servers (iocr.xfyun.cn and cbm01.cn-huabei-1.xf-yun.com), which is expected for a cloud OCR service but has privacy implications.
Install Mechanism
No install spec (instruction-only + shipped scripts). Nothing is downloaded or executed automatically by an installer. This lowers risk, but the included scripts will be executed if run.
Credentials
Registry metadata claims no required env vars or credentials, but both SKILL.md and the scripts require IFLY_APP_ID and IFLY_API_SECRET; image OCR also requires IFLY_API_KEY. The omission is an inconsistency: the skill legitimately needs these secrets, but the registry entry does not declare them. Requesting API credentials for the OCR provider itself is reasonable, and the skill does not ask for unrelated credentials. The missing declaration and unknown source increase risk.
Persistence & Privilege
always is false and the skill does not request persistent system-wide privileges or modify other skills. It only requires environment variables and network access to the OCR endpoints.
What to consider before installing
This skill's code implements iFlytek OCR and will upload images/PDFs to iFlytek servers and requires three environment variables (IFLY_APP_ID, IFLY_API_KEY, IFLY_API_SECRET) — but the registry metadata incorrectly listed no required credentials. Before installing, verify the skill source and owner (origin is unknown), confirm you trust iFlytek or the specific endpoints in SKILL.md, and avoid sending sensitive or regulated documents unless you control the account and understand the provider's data retention/privacy policy. Also ensure you set the declared environment variables only for a dedicated iFlytek account (do not reuse other secrets), and consider running the scripts manually in a sandbox to inspect behavior before granting it to an autonomous agent.

Like a lobster shell, security has layers — review code before you run it.

Current version: v1.0.0
latest: vk97970mvack2jyxdhkp8acc2d1835z78

SKILL.md

ifly-pdf&image-ocr

AI-powered OCR service for images and PDF documents using iFlytek's advanced recognition APIs.

Quick Start

Image OCR (LLM OCR)

# OCR an image and extract text
python3 scripts/image_ocr.py /path/to/image.jpg

# Save result to file
python3 scripts/image_ocr.py /path/to/image.jpg -o output.txt

# Specify output format
python3 scripts/image_ocr.py /path/to/image.jpg --format json
python3 scripts/image_ocr.py /path/to/image.jpg --format markdown

PDF OCR

# Convert PDF to Word (default)
python3 scripts/pdf_ocr.py document.pdf

# Convert PDF to Markdown
python3 scripts/pdf_ocr.py document.pdf --format markdown

# Convert PDF to JSON
python3 scripts/pdf_ocr.py document.pdf --format json

# From public URL
python3 scripts/pdf_ocr.py --pdf-url "https://example.com/doc.pdf" --format word

Setup

API Credentials

Get credentials from iFlytek Open Platform:

For Image OCR:

  • APP_ID: Application ID
  • API_KEY: API key for authentication
  • API_SECRET: API secret for signing requests

For PDF OCR:

  • APP_ID: Application ID
  • API_SECRET: Application secret (for signature generation)

Environment Variables

# Required for both Image OCR and PDF OCR
export IFLY_APP_ID="your_app_id"
export IFLY_API_SECRET="your_api_secret"

# Required for Image OCR only
export IFLY_API_KEY="your_api_key"
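
A script could check these variables up front so that a missing credential fails fast with a clear message. The following is a minimal sketch, not the skill's actual code: the function name `load_credentials` and its `mode` argument are hypothetical, and the required sets simply follow the credential lists above.

```python
import os

# Required variables per mode, following the credential lists above.
REQUIRED = {
    "image": ("IFLY_APP_ID", "IFLY_API_KEY", "IFLY_API_SECRET"),
    "pdf": ("IFLY_APP_ID", "IFLY_API_SECRET"),
}


def load_credentials(mode, env=None):
    """Return the credentials needed for `mode`, or raise if any are unset.

    `mode` is "image" or "pdf" (hypothetical names for this sketch).
    `env` defaults to os.environ; pass a dict to make testing easier.
    """
    env = os.environ if env is None else env
    names = REQUIRED[mode]
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}
```

Calling `load_credentials("image")` before any network request keeps credential errors separate from API errors.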

Features

Image OCR (LLM OCR)

  • AI-powered: Advanced LLM-based OCR for high accuracy
  • Multi-format output: JSON, Markdown, or both
  • Layout understanding: Preserves document structure
  • Multi-language: Supports text extraction in multiple languages
  • Image preprocessing: Automatic rotation correction, noise removal

PDF OCR

  • AI-powered OCR: Advanced AI model for accurate text extraction
  • Multiple output formats:
    • Word (.docx) - Editable Word document
    • Markdown - Plain text with formatting
    • JSON - Structured data
  • Large PDF support: Up to 100 pages per document
  • Page-by-page results: Access individual page results
  • Download URLs: Direct links to processed files

API Parameters

Image OCR Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| image_path | string | Yes | Path to image file |
| --format | string | No | Output format: json, markdown, json,markdown (default: json,markdown) |
| --output | string | No | Save result to file |

PDF OCR Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| pdf_path | string | Yes* | Path to PDF file |
| --pdf-url | string | No* | Public URL of PDF file |
| --format | string | No | Output format: word, markdown, json (default: word) |
| --no-poll | flag | No | Return task ID without polling |
| --poll-interval | int | No | Polling interval in seconds (min 5, default: 5) |
| --max-wait | int | No | Maximum wait time in seconds (default: 300) |

*Either pdf_path or --pdf-url must be provided

Authentication

Image OCR (HMAC-SHA256)

Uses HMAC-SHA256 signature authentication:

  1. Generate RFC1123 format date: EEE, dd MMM yyyy HH:mm:ss GMT
  2. Create signature origin: host: {host}\ndate: {date}\nPOST {path} HTTP/1.1
  3. Calculate signature: HMAC-SHA256(signature_origin, apiSecret)
  4. Build authorization: hmac username="{apiKey}", algorithm="hmac-sha256", headers="host date request-line", signature="{signature}"
  5. Encode authorization in base64
  6. Send as query parameters: ?authorization={auth}&host={host}&date={date}
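
The six steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the skill's own script: the function name `build_auth_url` is hypothetical, the host and path are whatever endpoint SKILL.md specifies, and the HMAC digest is assumed to be Base64-encoded before it is placed in the authorization string (the steps do not say so explicitly, but a raw binary digest cannot be embedded as text).

```python
import base64
import hashlib
import hmac
from email.utils import formatdate
from urllib.parse import urlencode


def build_auth_url(host, path, api_key, api_secret, date=None):
    """Build a signed request URL following the six steps above."""
    # Step 1: RFC 1123 date in GMT, e.g. "Mon, 01 Jan 2024 00:00:00 GMT".
    date = date or formatdate(usegmt=True)
    # Step 2: the string that gets signed (real newlines, not literal "\n").
    signature_origin = f"host: {host}\ndate: {date}\nPOST {path} HTTP/1.1"
    # Step 3: HMAC-SHA256 keyed with the API secret.
    digest = hmac.new(
        api_secret.encode("utf-8"),
        signature_origin.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    # Assumption: the digest is Base64-encoded before embedding.
    signature = base64.b64encode(digest).decode("ascii")
    # Step 4: assemble the authorization string.
    authorization_origin = (
        f'hmac username="{api_key}", algorithm="hmac-sha256", '
        f'headers="host date request-line", signature="{signature}"'
    )
    # Step 5: Base64-encode the whole authorization string.
    authorization = base64.b64encode(authorization_origin.encode("utf-8")).decode("ascii")
    # Step 6: send everything as URL-encoded query parameters.
    query = urlencode({"authorization": authorization, "host": host, "date": date})
    return f"https://{host}{path}?{query}"
```

Passing `date` explicitly makes the output deterministic, which helps when comparing against a known-good signature while debugging authentication errors.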

PDF OCR (MD5 + HMAC-SHA1)

Uses MD5 + HMAC-SHA1 signature authentication:

  1. Generate timestamp (Unix epoch in seconds)
  2. Calculate auth = MD5(appId + timestamp)
  3. Calculate signature = Base64(HMAC-SHA1(auth, apiSecret))
  4. Send headers:
    • appId: Application ID
    • timestamp: Timestamp in seconds
    • signature: Generated signature

Important: Timestamp must be within 5 minutes of server time.
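
A minimal Python sketch of the PDF OCR signing steps is shown below. It is not taken from the skill's scripts: the function name `build_pdf_headers` is hypothetical, the MD5 result is assumed to be a lowercase hex digest, and the API secret is assumed to be the HMAC key with `auth` as the message.

```python
import base64
import hashlib
import hmac
import time


def build_pdf_headers(app_id, api_secret, timestamp=None):
    """Build the appId / timestamp / signature headers from the steps above."""
    # Step 1: Unix epoch seconds, sent as a string.
    ts = str(int(time.time()) if timestamp is None else timestamp)
    # Step 2: MD5 over appId + timestamp (hex digest is an assumption).
    auth = hashlib.md5((app_id + ts).encode("utf-8")).hexdigest()
    # Step 3: HMAC-SHA1 keyed with the API secret, then Base64.
    digest = hmac.new(
        api_secret.encode("utf-8"), auth.encode("utf-8"), hashlib.sha1
    ).digest()
    signature = base64.b64encode(digest).decode("ascii")
    # Step 4: the three headers to attach to the request.
    return {"appId": app_id, "timestamp": ts, "signature": signature}
```

Because the timestamp must be close to server time, the headers are best generated immediately before each request instead of being cached.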

Response Format

Image OCR Response

{
  "header": {
    "code": 0,
    "message": "success"
  },
  "payload": {
    "result": {
      "text": "Base64-encoded OCR text..."
    }
  }
}
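
Since the `text` field is Base64-encoded, a caller has to decode it before use. The sketch below shows one way to do that, assuming the decoded bytes are UTF-8 and that a non-zero `header.code` signals failure; the function name `extract_ocr_text` is hypothetical.

```python
import base64
import json


def extract_ocr_text(response_body):
    """Decode the OCR text from an Image OCR response body (a JSON string)."""
    response = json.loads(response_body)
    header = response.get("header", {})
    # Assumption: code 0 means success, as in the sample response above.
    if header.get("code") != 0:
        raise RuntimeError(f"OCR failed: {header.get('code')} {header.get('message')}")
    encoded = response["payload"]["result"]["text"]
    # Assumption: the decoded payload is UTF-8 text.
    return base64.b64decode(encoded).decode("utf-8")
```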

PDF OCR Start Response

{
  "flag": true,
  "code": 0,
  "desc": "成功",
  "data": {
    "taskNo": "25082744936879",
    "status": "CREATE",
    "tip": "任务创建成功"
  }
}

PDF OCR Status Response

{
  "flag": true,
  "code": 0,
  "desc": "成功",
  "data": {
    "taskNo": "25082759289333",
    "exportFormat": "word",
    "status": "FINISH",
    "downUrl": "http://bjcdn.openstorage.cn/...",
    "tip": "已完成",
    "pageList": [...]
  }
}

Task Status (PDF OCR)

| Status | Description |
| --- | --- |
| CREATE | Task created successfully |
| WAITING | Waiting in queue |
| DOING | Processing |
| FINISH | Completed |
| FAILED | Failed |
| ANY_FAILED | Partially completed (some pages failed) |
| STOP | Paused |
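
A polling loop built on these statuses might look like the sketch below. It is an illustration, not the skill's implementation: `poll_task` and the `fetch_status` callable are hypothetical, `fetch_status(task_no)` is assumed to return a parsed status response like the one shown earlier, and treating CREATE, WAITING, and DOING as the only pending states is a judgment call. Any other status is returned to the caller to interpret.

```python
import time

# Statuses that mean "keep waiting" (an assumption based on the table above).
PENDING = {"CREATE", "WAITING", "DOING"}


def poll_task(fetch_status, task_no, poll_interval=5, max_wait=300,
              sleep=time.sleep, clock=time.monotonic):
    """Poll until the task leaves a pending state; return the final `data` dict.

    `fetch_status` is a caller-supplied function (hypothetical here) that
    performs the status request and returns the parsed JSON response.
    """
    poll_interval = max(poll_interval, 5)  # documented minimum is 5 seconds
    deadline = clock() + max_wait
    while True:
        data = fetch_status(task_no)["data"]
        if data["status"] not in PENDING:
            return data
        # Give up if another wait would run past the deadline.
        if clock() + poll_interval > deadline:
            raise TimeoutError(f"Task {task_no} still {data['status']} after {max_wait}s")
        sleep(poll_interval)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays; the caller still needs to check whether the returned status is FINISH, ANY_FAILED, FAILED, or STOP.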

Error Codes

(。・ω・。) Hi, ran into an error code? Let's see how to fix it ✧⁺⸜(●˙▾˙●)⸝⁺✧

Platform Common Error Codes

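| Code | Description | Hint | Solution |
| --- | --- | --- | --- |
| 10009 | input invalid data | (◎_◎;) Oops, the data format doesn't look right | Check that the input data meets the requirements |
| 10010 | service license not enough | (╯°□°)╯︵ ┻━┻ Not enough licenses, or the license has expired! | Submit a ticket to contact customer support |
| 10019 | service read buffer timeout | (。-`ω´-) The session timed out | Check whether all data was sent but the connection was left open |
| 10043 | Syscall AudioCodingDecode error | (◎_◎;) Audio decoding failed... | Check the aue parameter; if it is speex, make sure the audio is speex-encoded, compressed in segments, and consistent with the frame size |
| 10114 | session timeout | (。-`ω´-) The session ran past its time limit | Check whether sending data took longer than 60 s |
| 10139 | invalid param | (◎_◎;) A parameter doesn't look right | Check that the parameters are correct |
| 10160 | parse request json error | (◎_◎;) The request data is malformed | Check that the request data is valid JSON |
| 10161 | parse base64 string error | (◎_◎;) Base64 decoding failed | Check that the data sent is Base64-encoded |
| 10163 | param validate error | (◎_◎;) Parameter validation failed | See the detailed description for the specific cause |
| 10200 | read data timeout | (。-`ω´-) Reading data timed out | Check whether no data was sent for a cumulative 10 s while the connection stayed open |
| 10222 | context deadline exceeded | (╯°□°)╯︵ ┻━┻ Something went wrong! | 1. Check whether the uploaded data exceeds the API limit; 2. If the SSL certificate is invalid, submit a ticket |
| 10223 | RemoteLB: can't find valued addr | (◎_◎;) Can't find a service node | Submit a ticket to contact technical staff |
| 10313 | invalid appid | (◎_◎;) The appid and apikey don't match | Check that the appid is valid |
| 10317 | invalid version | (◎_◎;) There's a problem with the version number | Submit a ticket from the console to contact technical staff |
| 10700 | not authority | (╯°□°)╯︵ ┻━┻ Insufficient permissions! | Check against the developer documentation based on the error reason; if still unresolved, submit a ticket with the sid and error message |
| 11200 | auth no license | (╯°□°)╯︵ ┻━┻ Feature not authorized! | Check that the appid is correct, confirm the relevant service has been added, and check whether the call quota is exceeded or the license has expired |
| 11201 | auth no enough license | (╯°□°)╯︵ ┻━┻ Daily interaction limit exceeded! | Submit the app for review to raise the quota, or contact sales to purchase an enterprise-grade API |
| 11503 | server error: atmos return error | (。-`ω´-) The server returned bad data... | Submit a ticket |
| 11502 | server error: too many datas | (。-`ω´-) There's a problem with the server configuration | Submit a ticket |
| 100001~100010 | WrapperInitErr | (◎_◎;) Engine call failed! | See the engine error code reference for the errno in message |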

Original API Error Codes

| Code | Description | Solution |
| --- | --- | --- |
| 10000 | System error | Check auth info, request method, parameters |
| 10001 | Signature authentication failed | Check credentials |
| 10002 | Business processing error | Check error message |
| 10003 | Quota/insufficient balance | Check account balance |
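
One way to surface these codes is to map them to their solutions when a response comes back. The sketch below assumes the codes arrive in the `code` field of the PDF OCR response envelope shown earlier (the table does not state this explicitly) and that success means `flag` is true with `code` equal to 0; `check_pdf_response` and `PDF_ERROR_HINTS` are hypothetical names.

```python
# Solutions copied from the table above, keyed by response code.
PDF_ERROR_HINTS = {
    10000: "Check auth info, request method, parameters",
    10001: "Check credentials",
    10002: "Check error message",
    10003: "Check account balance",
}


def check_pdf_response(response):
    """Return `data` on success; otherwise raise with a hint from the table."""
    code = response.get("code")
    if response.get("flag") and code == 0:
        return response.get("data")
    hint = PDF_ERROR_HINTS.get(code, "See the platform error code table")
    raise RuntimeError(f"PDF OCR error {code}: {response.get('desc')} ({hint})")
```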

Limitations

Image OCR

  • Format: Common image formats (JPG, PNG, etc.)
  • Size: Reasonable file sizes for web upload
  • Rate limiting: Follow API rate limits

PDF OCR

  • Max pages: 100 pages per PDF
  • Protected PDFs: Not supported (password/encrypted)
  • Rate limiting: Status query limited to once per 5 seconds
  • Time limit: Timestamp must be within ±5 minutes of server time

Tips

Image OCR

  1. High-quality images: Use clear, high-resolution images for best results
  2. Multiple formats: Use json,markdown to get both structured and formatted output
  3. Save results: Use -o flag to save OCR results to file

PDF OCR

  1. Math formulas: Use markdown format for PDFs with mathematical formulas
  2. Large PDFs: Split into sections if > 100 pages
  3. Polling interval: Minimum 5 seconds between status queries
  4. Network URLs: Ensure PDF URLs are publicly accessible
  5. Download URLs: Download files promptly as URLs may expire
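
For the large-PDF tip, the page ranges for each section can be computed before splitting. The helper below is a small sketch with a hypothetical name, `page_chunks`; it only computes 1-based inclusive ranges and leaves the actual splitting to whichever PDF library you use.

```python
def page_chunks(total_pages, max_pages=100):
    """Return 1-based inclusive (start, end) page ranges of at most `max_pages`."""
    if total_pages < 0 or max_pages < 1:
        raise ValueError("total_pages must be >= 0 and max_pages must be >= 1")
    return [
        (start, min(start + max_pages - 1, total_pages))
        for start in range(1, total_pages + 1, max_pages)
    ]
```

For example, a 250-page document yields three ranges, each of which fits within the 100-page limit and can be submitted as a separate task.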
