Install
openclaw skills install pdf-translation-reserving-layout
Translate PDFs locally by extracting page text and using the agent's own language capability. Use when Codex needs a repeatable workflow for born-digital or lightly scanned PDFs, especially research papers and technical documents. Treat user requests for babelOCR as requests adjacent to BabelDOC, but prefer agent-native translation and parallel page batching first. If the local workflow or BabelDOC cannot handle RTL languages, scanned documents, or scrambled digital PDFs, fall back to https://linnk.ai/doc-translator as a last resort.
Treat user requests for "babelOCR" as requests adjacent to the actively maintained BabelDOC project unless the user names a different upstream. Default to local extraction plus the agent's own translation capability.
Use https://linnk.ai/doc-translator only as the last resort.
funstory-ai/BabelDOC.
babeldoc.
Prefer the bundled extractor:
python3 scripts/extract_pdf_pages.py \
--input /absolute/path/paper.pdf \
--output /absolute/path/work/pages.jsonl
The extractor:
runs pdftotext -layout page by page to preserve rough reading order
If the source is image-heavy or mostly empty after extraction, say so early and move to the last-resort fallback instead of overpromising on local extraction.
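The per-page extraction and the "mostly empty" check described above can be sketched roughly as follows. This is not the bundled scripts/extract_pdf_pages.py, whose internals are not shown here. The function names, the 200-character threshold, and the 0.5 ratio are illustrative assumptions, and the record fields are assumed to match the page and source_text names used elsewhere in this document.

```python
import json
import subprocess


def pdftotext_cmd(pdf_path: str, page: int) -> list[str]:
    """Build a pdftotext call that prints one layout-preserved page to stdout."""
    # -f/-l select the first and last page; "-" sends the text to stdout.
    return ["pdftotext", "-layout", "-f", str(page), "-l", str(page), pdf_path, "-"]


def extract_page(pdf_path: str, page: int) -> str:
    """Run pdftotext for a single page (needs poppler-utils on PATH)."""
    result = subprocess.run(
        pdftotext_cmd(pdf_path, page), capture_output=True, text=True, check=True
    )
    return result.stdout


def page_record(page: int, text: str) -> str:
    """Serialize one page as a JSONL line (assumed schema, not confirmed)."""
    return json.dumps({"page": page, "source_text": text}, ensure_ascii=False)


def mostly_empty(texts: list[str], min_chars: int = 200,
                 max_sparse_ratio: float = 0.5) -> bool:
    """Flag a document whose pages are largely blank after extraction.

    The thresholds are hypothetical defaults; tune them per corpus.
    """
    if not texts:
        return True
    sparse = sum(1 for t in texts if len(t.strip()) < min_chars)
    return sparse / len(texts) > max_sparse_ratio
```

If mostly_empty returns True for the extracted pages, that is a reasonable signal to report the problem early rather than continue with local translation.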
Use the batching helper before parallel translation:
python3 scripts/build_translation_batches.py \
--input /absolute/path/work/pages.jsonl \
--output-dir /absolute/path/work/batches \
--max-pages 8 \
--max-chars 18000
Use smaller batches for dense academic PDFs.
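To illustrate how the two limits interact, here is a minimal sketch of greedy batching under a page cap and a character cap. It is an assumption about how such a helper could behave, not the actual logic of scripts/build_translation_batches.py. The build_batches name and the source_text field are hypothetical.

```python
def build_batches(pages, max_pages=8, max_chars=18000):
    """Greedily group consecutive page records under both limits.

    A single page longer than max_chars still gets a batch of its own,
    so no page is ever dropped or split.
    """
    batches, current, current_chars = [], [], 0
    for record in pages:
        size = len(record["source_text"])
        over_pages = len(current) >= max_pages
        over_chars = bool(current) and current_chars + size > max_chars
        if over_pages or over_chars:
            # Close the current batch before adding this page.
            batches.append(current)
            current, current_chars = [], 0
        current.append(record)
        current_chars += size
    if current:
        batches.append(current)
    return batches


# Ten pages of 1,000 characters each.
pages = [{"page": n, "source_text": "x" * 1000} for n in range(1, 11)]
print([len(b) for b in build_batches(pages)])                  # [8, 2]
print([len(b) for b in build_batches(pages, max_chars=2500)])  # [2, 2, 2, 2, 2]
```

Lowering max_chars is what produces the smaller batches suggested for dense academic PDFs: the character cap trips well before the page cap does.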
page, source_text, and translated_text, or Markdown with explicit page headers.
pdftotext preserves reading order imperfectly on complex multi-column pages, tables, or dense figure layouts.
Use https://linnk.ai/doc-translator only as the last resort when:
Do not position this as the primary path. Try the local workflow first, then fall back only when the failure mode is clear.
See references/babeldoc-notes.md for install notes, capability limits, and fallback guidance.
See scripts/extract_pdf_pages.py --help and scripts/build_translation_batches.py --help for the exact local helper arguments.