Install
openclaw skills install image-to-text-pdf
Convert a finished raster image, especially a generated poster or visual resume, into an image-based PDF with an additional selectable, copyable, searchable text layer. Use when the image is the final visual layout, when recreating that layout in PPT, HTML, or LaTeX would be fragile, and when the user needs both a final invisible-text PDF and a visible inspection PDF for checking OCR or text-layer placement.
Turn a finished raster image into a PDF while preserving the image exactly and adding a transparent text layer for selection, copying, and search.
See references/ocr-alignment.md when you need guidance for extracting, correcting, or prompting for box-level text.
See references/layout-json.md for the exact JSON schema.
python scripts/compose_image_text_pdf.py \
--image /path/to/image.png \
--layout /path/to/layout.json \
--output /path/to/image-text.pdf \
--debug-output /path/to/image-text-check.pdf
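To make the idea of placing a text layer over an image concrete, here is a minimal sketch of the coordinate conversion that such a step typically needs. It assumes layout boxes are given in image pixels with a top-left origin, while PDF pages use points with a bottom-left origin. The function name and parameters are hypothetical and are not taken from compose_image_text_pdf.py; the real field names are defined in references/layout-json.md.

```python
def pixel_box_to_pdf(x, y, width, height, image_w, image_h, page_w, page_h):
    """Map a pixel box (top-left origin) to PDF points (bottom-left origin).

    Hypothetical helper for illustration; not part of the skill's scripts.
    """
    scale_x = page_w / image_w
    scale_y = page_h / image_h
    pdf_x = x * scale_x
    pdf_w = width * scale_x
    pdf_h = height * scale_y
    # PDF y grows upward, so measure from the bottom edge of the box.
    pdf_y = page_h - (y + height) * scale_y
    return pdf_x, pdf_y, pdf_w, pdf_h


# A 1536x2048 image placed on a 768x1024 pt page (scale 0.5 on both axes).
box = pixel_box_to_pdf(100, 200, 400, 50, 1536, 2048, 768.0, 1024.0)
print(box)  # (50.0, 899.0, 200.0, 25.0)
```

If a box looks vertically mirrored in the inspection PDF, a missing y-axis flip like the one above is a likely cause.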
For CJK or other non-Latin text, pass a Unicode font:
python scripts/compose_image_text_pdf.py \
--image /path/to/image.png \
--layout /path/to/layout.json \
--output /path/to/image-text.pdf \
--debug-output /path/to/image-text-check.pdf \
--font-file /path/to/NotoSansCJK-Regular.ttc
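One way to decide whether --font-file is needed is to check the layout text before composing. The sketch below assumes that a default PDF font covers roughly the Windows-1252 repertoire; that is an assumption about typical built-in fonts, not a documented property of this script, and needs_unicode_font is a hypothetical helper.

```python
def needs_unicode_font(text: str) -> bool:
    """Return True if text has characters outside Windows-1252.

    Hypothetical check; assumes the default font covers only cp1252.
    """
    try:
        text.encode("cp1252")
    except UnicodeEncodeError:
        return True
    return False


print(needs_unicode_font("Résumé"))  # False
print(needs_unicode_font("简历"))     # True
```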
text field in layout JSON, not the OCR image.
When an OCR tool returns word-level boxes, convert them to line-level layout items:
python scripts/ocr_words_to_layout.py \
--ocr /path/to/ocr.json \
--output /path/to/layout.json \
--image-width 1536 \
--image-height 2048 \
--source-text /path/to/source.txt
The converter accepts common JSON shapes containing words, items, textAnnotations, or nested page/line/word objects. It groups nearby words into lines and can replace OCR text with the closest line from the source text when the match is strong.
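The grouping and source-matching behavior described above can be sketched as follows. This is a simplified illustration, not the implementation in ocr_words_to_layout.py: the field names (text, x, y, width, height), the y_tolerance and threshold values, and both function names are assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def group_words_into_lines(words, y_tolerance=0.5):
    """Group word boxes whose vertical centers are close into lines.

    words: dicts with text, x, y, width, height (pixels, top-left origin).
    """
    lines = []
    for word in sorted(words, key=lambda b: (b["y"] + b["height"] / 2, b["x"])):
        center = word["y"] + word["height"] / 2
        for line in lines:
            if abs(center - line["center"]) <= y_tolerance * line["height"]:
                line["words"].append(word)
                break
        else:
            # No existing line is close enough, so start a new one.
            lines.append({"center": center, "height": word["height"], "words": [word]})

    result = []
    for line in lines:
        boxes = sorted(line["words"], key=lambda b: b["x"])
        x0 = min(b["x"] for b in boxes)
        y0 = min(b["y"] for b in boxes)
        x1 = max(b["x"] + b["width"] for b in boxes)
        y1 = max(b["y"] + b["height"] for b in boxes)
        result.append({
            "text": " ".join(b["text"] for b in boxes),
            "x": x0, "y": y0, "width": x1 - x0, "height": y1 - y0,
        })
    return result


def snap_to_source(ocr_text, source_lines, threshold=0.85):
    """Replace OCR text with the closest source line if the match is strong."""
    best, best_ratio = ocr_text, 0.0
    for candidate in source_lines:
        ratio = SequenceMatcher(None, ocr_text, candidate).ratio()
        if ratio > best_ratio:
            best, best_ratio = candidate, ratio
    return best if best_ratio >= threshold else ocr_text


words = [
    {"text": "Senior", "x": 100, "y": 200, "width": 120, "height": 40},
    {"text": "Enginer", "x": 230, "y": 203, "width": 140, "height": 40},
    {"text": "Skills", "x": 100, "y": 300, "width": 110, "height": 36},
]
source = ["Senior Engineer", "Skills"]
lines = group_words_into_lines(words)
for line in lines:
    line["text"] = snap_to_source(line["text"], source)
print([line["text"] for line in lines])  # ['Senior Engineer', 'Skills']
```

A high threshold keeps the OCR text whenever no source line is a close match, which avoids silently swapping in the wrong line.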