Install
openclaw skills install camscanner-any2markdown-office
CamScanner provides a high-precision document parsing engine that converts images and PDF documents to Markdown. It decomposes document paragraphs, recognizes tables and other element types, handles complex image scenarios, and outputs structured results in reading order, so that large language models can read the document content accurately. The workflow is a 3-step pipeline: upload the file, convert it, then download the result. The skill detects whether the input is a PDF or an image and uses the matching conversion endpoint.
Important: Privacy & Data Flow Notice
- Third-party service: This skill sends your files to CamScanner's official servers (ai-tools.camscanner.com) for processing.
- Data retention: CamScanner servers process your files in real time. Files are not permanently stored on the server.
- Local files: Output files are saved to your local filesystem at the path you specify.
Base URL: https://ai-tools.camscanner.com
| source_type | target_type | Output | Endpoint |
|---|---|---|---|
| pdf | md | .md | convert_pdf |
| image | md | .md | convert_image |
Determine the conversion endpoint based on file extension:
- PDF (.pdf): Use convert_pdf with "source_type": "pdf"
- Image (.png, .jpg, .jpeg, .bmp, .tiff, .webp): Use convert_image with "source_type": "image"
Upload the file:
BASE="https://ai-tools.camscanner.com"
IN_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/upload_file/execute" \
-H "Content-Type: application/octet-stream" \
--data-binary "@/path/to/document" | jq -r '.tool_result.data.file_id')
Response:
{
  "code": 200,
  "tool": "upload_file",
  "tool_result": {
    "success": true,
    "data": {
      "file_id": "file_1741857600_ab12cd34ef56",
      "size": 24576
    }
  }
}
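Before the convert step, the endpoint and source_type can be derived from the file extension, following the rules listed earlier. The helper below is a minimal sketch of that lookup; the function name `pick_conversion` and its output format are illustrative assumptions, not part of the skill.

```shell
# Hypothetical helper (not part of the skill): map a file path to the
# conversion endpoint and source_type based on its extension.
# Prints "<endpoint> <source_type>", or returns 1 for unsupported files.
pick_conversion() {
  # Take the text after the last dot and lowercase it, so "SCAN.PDF"
  # and "scan.pdf" are treated alike. Paths without a dot get no extension.
  case "$1" in
    *.*) ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]') ;;
    *)   ext="" ;;
  esac
  case "$ext" in
    pdf) echo "convert_pdf pdf" ;;
    png|jpg|jpeg|bmp|tiff|webp) echo "convert_image image" ;;
    *) echo "unsupported file type: $1" >&2; return 1 ;;
  esac
}

# Example use (assumes INPUT_FILE is set):
#   result=$(pick_conversion "$INPUT_FILE") || exit 1
#   CONVERT_ENDPOINT=${result% *}; SOURCE_TYPE=${result#* }
```

Returning a non-zero status for unknown extensions lets the caller stop before uploading a file the service is not documented to accept.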
For PDF files:
OUT_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/convert_pdf/execute" \
-H "Content-Type: application/json" \
-d "{\"file_id\":\"$IN_FILE_ID\",\"source_type\":\"pdf\",\"target_type\":\"md\",\"output_mode\":\"file_id\"}" \
| jq -r '.tool_result.data.file_id')
For image files:
OUT_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/convert_image/execute" \
-H "Content-Type: application/json" \
-d "{\"file_id\":\"$IN_FILE_ID\",\"source_type\":\"image\",\"target_type\":\"md\",\"output_mode\":\"file_id\"}" \
| jq -r '.tool_result.data.file_id')
Response:
{
  "code": 200,
  "tool": "convert_pdf",
  "tool_result": {
    "success": true,
    "data": {
      "file_id": "file_1741857701_9988aabbccdd",
      "target_type": "md"
    }
  }
}
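The convert requests above build the JSON body with hand-escaped quotes, which is easy to get wrong. As an alternative sketch, the body can be generated with jq, which the examples already depend on. The function name `build_convert_body` is a hypothetical helper; the field names are the ones shown in the requests above.

```shell
# Hypothetical helper: build the convert request body with jq so that
# quoting is handled by jq rather than by hand-escaped strings.
#   $1 = file_id returned by the upload step
#   $2 = source_type ("pdf" or "image")
build_convert_body() {
  jq -cn --arg file_id "$1" --arg source_type "$2" \
    '{file_id: $file_id, source_type: $source_type, target_type: "md", output_mode: "file_id"}'
}

# Example use (assumes IN_FILE_ID is set):
#   curl ... -d "$(build_convert_body "$IN_FILE_ID" pdf)"
```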
Download the result:
curl -sS -X POST "$BASE/v1/tools/download_file/execute?response_mode=raw" \
-H "Content-Type: application/json" \
-d "{\"file_id\":\"$OUT_FILE_ID\"}" \
-o /path/to/output.md
Critical: The response_mode=raw query parameter is required to get the binary file. Without it, the response is JSON.
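Because a download without response_mode=raw yields JSON instead of the file, a quick check on the saved output can catch that mistake. The sketch below is a heuristic only: the function name `looks_like_json_envelope` is hypothetical, and it assumes the JSON response carries a "tool_result" key like the other responses shown here. A Markdown document that happens to start with `{` and mention "tool_result" would be misclassified.

```shell
# Hypothetical check: return 0 if the saved file looks like a JSON
# response rather than raw Markdown. Heuristic, not a JSON parser.
#   $1 = path to the downloaded file
looks_like_json_envelope() {
  # First non-whitespace character of the file.
  first=$(tr -d '[:space:]' < "$1" | cut -c1)
  # Assumption: the JSON response contains a "tool_result" key.
  [ "$first" = "{" ] && grep -q '"tool_result"' "$1"
}

# Example use (assumes OUTPUT_FILE is set):
#   if looks_like_json_envelope "$OUTPUT_FILE"; then
#     echo "Got JSON instead of Markdown; check response_mode=raw" >&2
#   fi
```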
Convert a PDF to Markdown:
BASE="https://ai-tools.camscanner.com"
INPUT_FILE="/path/to/document.pdf"
OUTPUT_FILE="/path/to/output.md"
# Upload
IN_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/upload_file/execute" \
-H "Content-Type: application/octet-stream" \
--data-binary "@$INPUT_FILE" | jq -r '.tool_result.data.file_id')
# Convert (use convert_pdf for PDF, convert_image for images)
CONVERT_ENDPOINT="convert_pdf" # or "convert_image"
SOURCE_TYPE="pdf" # or "image"
OUT_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/${CONVERT_ENDPOINT}/execute" \
-H "Content-Type: application/json" \
-d "{\"file_id\":\"$IN_FILE_ID\",\"source_type\":\"$SOURCE_TYPE\",\"target_type\":\"md\",\"output_mode\":\"file_id\"}" \
| jq -r '.tool_result.data.file_id')
# Download
curl -sS -X POST "$BASE/v1/tools/download_file/execute?response_mode=raw" \
-H "Content-Type: application/json" \
-d "{\"file_id\":\"$OUT_FILE_ID\"}" \
-o "$OUTPUT_FILE"
| Mistake | Fix |
|---|---|
| Forgetting response_mode=raw on download | Always append ?response_mode=raw to the download URL |
| Wrong Content-Type on upload | Upload uses application/octet-stream, not multipart/form-data |
| Using GET instead of POST | All three endpoints use POST |
| Wrong endpoint for file type | Use convert_pdf for PDFs, convert_image for images |
| Wrong source_type for file type | Use "pdf" for PDFs, "image" for images |
| Missing output_mode in convert request | Always include "output_mode": "file_id" to get a downloadable file_id |
Check each step before proceeding:
# After upload
if [ -z "$IN_FILE_ID" ] || [ "$IN_FILE_ID" = "null" ]; then
echo "Upload failed"; exit 1
fi
# After convert
if [ -z "$OUT_FILE_ID" ] || [ "$OUT_FILE_ID" = "null" ]; then
echo "Conversion failed"; exit 1
fi
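The two checks above share the same shape, so they can be folded into one function. This is a minimal sketch; `require_file_id` is a hypothetical name, and it returns a status instead of calling exit so the caller decides how to stop.

```shell
# Hypothetical helper wrapping the checks above.
#   $1 = step name used in the error message (e.g. "Upload")
#   $2 = file_id extracted by jq (empty or "null" means the step failed)
require_file_id() {
  if [ -z "$2" ] || [ "$2" = "null" ]; then
    echo "$1 failed: no file_id in response" >&2
    return 1
  fi
}

# Example use:
#   require_file_id "Upload" "$IN_FILE_ID" || exit 1
#   require_file_id "Conversion" "$OUT_FILE_ID" || exit 1
```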