Install
openclaw skills install ui-element-ops
Parse UI screenshots into structured element JSON (type, OCR text, bbox) and operate desktop UI from parsed elements. Use when a user asks to detect/locate U...
Parse one or more screenshots into a machine-readable JSON schema with:
type (normalized UI element type)
bbox_px and bbox_norm
text (OCR/caption content when available)
clickable flag

scripts/operate_ui.py (click/type/key/hotkey/screenshot)
scripts/operate_ui.py (find, wait)
scripts/operate_ui.py (calibrate)

skills/ui-element-ops/scripts/bootstrap_omniparser_env.sh "$PWD"
skills/ui-element-ops/scripts/run_parse_ui.sh /abs/path/to/1.jpeg
<image>.elements.json
<image>.overlay.png

skills/ui-element-ops/scripts/capture_and_parse.sh
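As a rough illustration of how the parsed element output might be consumed, the sketch below tallies elements by total, clickable, and type. The in-memory sample and the `summarize` helper are hypothetical; only the field names `id`, `type`, `text`, and `clickable` come from this document, and the real file layout may differ.

```python
from collections import Counter


def summarize(elements):
    """Tally parsed elements: total, clickable, and a per-type breakdown."""
    by_type = Counter(el.get("type", "unknown") for el in elements)
    return {
        "total": len(elements),
        "clickable": sum(1 for el in elements if el.get("clickable")),
        "by_type": dict(by_type),
    }


# Hypothetical sample shaped like the element fields named in this document.
sample = [
    {"id": "e_0001", "type": "button", "text": "Login", "clickable": True},
    {"id": "e_0002", "type": "text", "text": "Welcome", "clickable": False},
    {"id": "e_0003", "type": "button", "text": "Continue", "clickable": True},
]

counts = summarize(sample)
print(counts)
```

In practice the list would come from loading the generated JSON file with `json.load` rather than from an inline sample.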
Run scripts/bootstrap_omniparser_env.sh when .venv or OmniParser weights are missing.
Run scripts/run_parse_ui.sh for standard parsing.
total, clickable, by_type.

python3 skills/ui-element-ops/scripts/operate_ui.py list --elements <json>
python3 skills/ui-element-ops/scripts/operate_ui.py find --elements <json> --type button --text-contains login
python3 skills/ui-element-ops/scripts/operate_ui.py wait --elements <json> --state appear --text-contains continue
python3 skills/ui-element-ops/scripts/operate_ui.py click --elements <json> --id e_0001
python3 skills/ui-element-ops/scripts/operate_ui.py screenshot (defaults to user tmp dir)
python3 skills/ui-element-ops/scripts/operate_ui.py calibrate --parsed-size <w> <h> --actual-size <w> <h>

references/type_rules.example.json.
scripts/parse_ui.py --help.
Use --use-paddleocr only when paddleocr/paddlepaddle are installed.

schema_version, pipeline, image, counts, elements
id, type, bbox_px, bbox_norm, text, clickable

$HOME: keep temporary caches under /tmp (handled by run script).
elements.json when possible.
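The `calibrate` command takes a parsed size and an actual size. One plausible reading is that coordinates found in the parsed image are scaled linearly to the real screen before clicking. The sketch below shows only that arithmetic. The function names, the `[x1, y1, x2, y2]` layout of `bbox_px`, and the linear scaling itself are assumptions, not the script's confirmed behavior.

```python
def bbox_center(bbox_px):
    """Return the center of a box given as [x1, y1, x2, y2] (assumed layout)."""
    x1, y1, x2, y2 = bbox_px
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def scale_point(point, parsed_size, actual_size):
    """Map a point from parsed-image pixels to actual-screen pixels linearly."""
    px, py = point
    parsed_w, parsed_h = parsed_size
    actual_w, actual_h = actual_size
    return (round(px * actual_w / parsed_w), round(py * actual_h / parsed_h))


# Hypothetical element parsed from a 1280x720 image of a 2560x1440 screen.
element = {"id": "e_0001", "type": "button", "bbox_px": [100, 200, 300, 260]}
center = bbox_center(element["bbox_px"])
click_at = scale_point(center, parsed_size=(1280, 720), actual_size=(2560, 1440))
print(click_at)
```

Scaling each axis independently keeps the mapping correct even when the parsed image and the screen differ in aspect ratio.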