!
Purpose & Capability
The skill's stated purpose (automating the DingTalk GUI on macOS) matches the code and instructions. However, the registry metadata declares no required binaries or environment variables, while SKILL.md and the scripts expect peekaboo, cliclick, swift, and screencapture, plus Screen Recording and Accessibility permissions. The declared requirements therefore do not match the actual runtime needs.
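One way to see the gap between declared and actual requirements is to check which of the named tools are present on the machine. The following is a minimal sketch, not part of the skill; the tool list is taken from this review, and whether every code path needs all four is an assumption. It does not check the macOS Screen Recording or Accessibility permissions.

```python
import shutil

# Tool names taken from the review text; whether the skill needs all of
# them on every code path is an assumption.
REQUIRED_TOOLS = ("peekaboo", "cliclick", "swift", "screencapture")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```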
!
Instruction Scope
The instructions and the included Python/Swift scripts direct the agent to take full-screen and window screenshots and OCR them locally. If an API key is available, the script can also POST base64-encoded screenshots to a remote vision endpoint (dashscope.aliyuncs.com). To find that key, the script reads a user configuration file (~/.openclaw/openclaw.json). Capturing and transmitting screenshots can expose unrelated sensitive data on your screen, and reading the user's config file is not declared in the skill metadata.
✓
Install Mechanism
Instruction-only skill (no install spec). The skill bundle itself performs no installs or downloads, which limits disk-write/install risk. However, it depends on third-party tools that the user must install separately (peekaboo, cliclick, Swift).
!
Credentials
The registry lists no required env vars or config paths, but the script accesses ~/.openclaw/openclaw.json and the QWEN_API_KEY environment variable to enable optional remote 'vision' functionality. This is disproportionate to 'send a DingTalk message' and is not clearly declared: the script will look for a model API key in your personal config and use it, even though the metadata never discloses this.
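Before running the skill, it may help to know whether a key is discoverable at either location. The sketch below is an illustration under stated assumptions: the review does not describe the layout of openclaw.json, so the helper walks the whole document and flags any non-empty string field whose name contains "key". The function names are hypothetical, and the key values themselves are never returned.

```python
import json
import os
from pathlib import Path

CONFIG_PATH = Path.home() / ".openclaw" / "openclaw.json"  # path named in the review
ENV_VAR = "QWEN_API_KEY"  # variable named in the review


def key_like_paths(node, prefix=""):
    """Yield dotted paths of fields named like a key that hold a non-empty string.

    The real layout of openclaw.json is unknown here, so this walks the
    whole document instead of assuming a specific field.
    """
    if isinstance(node, dict):
        for name, value in node.items():
            path = f"{prefix}.{name}" if prefix else name
            if isinstance(value, str) and value and "key" in name.lower():
                yield path
            else:
                yield from key_like_paths(value, path)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from key_like_paths(value, f"{prefix}[{index}]")


def discoverable_credentials(config_path=CONFIG_PATH, environ=None):
    """Report where a key could be picked up, without exposing its value."""
    environ = os.environ if environ is None else environ
    found = []
    if environ.get(ENV_VAR):
        found.append(f"env:{ENV_VAR}")
    try:
        config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        config = None  # missing, unreadable, or not valid JSON
    if config is not None:
        found.extend(f"config:{path}" for path in key_like_paths(config))
    return found


if __name__ == "__main__":
    for location in discoverable_credentials():
        print("Possible credential at", location)
```

An empty result suggests the optional remote path has no key to use, though it does not prove the script cannot obtain one some other way.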
✓
Persistence & Privilege
The skill does not request permanent inclusion (always:false) and does not modify other skills or system-wide settings. It stores transient files in /tmp/dingtalk-gui and requires Screen Recording/Accessibility permissions as expected for GUI automation.
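Since the transient files are screenshots, it may be worth inspecting and clearing the directory after a run. This is a minimal sketch, assuming the files sit directly under /tmp/dingtalk-gui or in its subdirectories; the helper names are hypothetical and nothing here comes from the skill itself.

```python
from pathlib import Path

SCRATCH_DIR = Path("/tmp/dingtalk-gui")  # directory named in the review


def list_leftovers(directory=SCRATCH_DIR):
    """Return files under `directory`, sorted by path; empty if it does not exist."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.rglob("*") if p.is_file())


def remove_leftovers(directory=SCRATCH_DIR):
    """Delete the files found by list_leftovers and return how many were removed."""
    removed = 0
    for path in list_leftovers(directory):
        path.unlink()
        removed += 1
    return removed


if __name__ == "__main__":
    # List first so the deletion is a deliberate second step.
    for path in list_leftovers():
        print(path)
```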
What to consider before installing
This skill appears to implement the advertised DingTalk GUI automation, but there are several things to check before installing or running it:
- Expect to grant macOS Screen Recording and Accessibility permissions; the script will take full-screen and app-window screenshots, which can include QR codes and any other visible content, and save them in /tmp/dingtalk-gui.
- The script will look for a QWEN API key in ~/.openclaw/openclaw.json and in the QWEN_API_KEY env var. If it finds one, it will send base64-encoded screenshots to dashscope.aliyuncs.com (qwen-vl-max) for optional vision analysis. If you do not want screenshots leaving your machine, remove the API key(s) or avoid using the --vision option.
- Metadata/registry fields do not declare the required local tools (peekaboo, cliclick, swift), and the script reads a user config path that is not advertised. This mismatch is a red flag; review the files locally before running.
- If you decide to use it: audit the included scripts (send_message.py and ocr_screen.swift) yourself, run them in a controlled environment or VM, and avoid running the vision feature unless you trust the remote endpoint and the API key in use.
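A manual audit of the scripts can start with a plain text search for the strings this review associates with config reads and uploads. The sketch below is a rough aid, not a substitute for reading the code: the marker patterns are taken from names mentioned in this review, the labels are hypothetical, and obfuscated or dynamically built strings would not be caught.

```python
import re
import sys
from pathlib import Path

# Strings the review associates with config reads and network uploads.
MARKERS = {
    "config read": re.compile(r"openclaw\.json"),
    "env key": re.compile(r"QWEN_API_KEY"),
    "remote endpoint": re.compile(r"dashscope\.aliyuncs\.com"),
    "base64 encoding": re.compile(r"b64encode|base64"),
}


def scan_text(text):
    """Return (line_number, label, stripped_line) for each marker match."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in MARKERS.items():
            if pattern.search(line):
                hits.append((number, label, line.strip()))
    return hits


def scan_file(path):
    """Scan one file, replacing undecodable bytes instead of failing."""
    return scan_text(Path(path).read_text(encoding="utf-8", errors="replace"))


if __name__ == "__main__":
    # Example: python scan.py send_message.py ocr_screen.swift
    for name in sys.argv[1:]:
        for number, label, line in scan_file(name):
            print(f"{name}:{number}: [{label}] {line}")
```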
If you want, I can: (a) point out the exact lines where the script reads your config and where it sends network requests, (b) suggest edits to disable remote uploads, or (c) produce a minimal checklist to run this safely.