Midscene Automations Skills for Computer

v1.0.3

Vision-driven desktop automation using Midscene. Control your desktop (macOS, Windows, Linux) with natural language commands. Operates entirely from screensh...

3 · 2.4k · 10 current · 11 all-time
by Leyang @quanru
License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Suspicious (medium confidence)
Purpose & Capability
The SKILL.md describes a vision-driven desktop automation tool that uses remote LLM/vision models and a CLI (npx @midscene/computer). The environment variables it documents (MIDSCENE_MODEL_API_KEY, MIDSCENE_MODEL_NAME, MIDSCENE_MODEL_BASE_URL, MIDSCENE_MODEL_FAMILY, etc.) are coherent with that purpose. However, the registry metadata claims no required env vars and no primary credential, which is inconsistent with the runtime instructions.
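As a rough illustration of the mismatch described above, the following Python sketch checks whether the variables named in SKILL.md are present before the CLI is invoked. It assumes all four named variables are needed, which the review does not confirm (its list ends in "etc."), and the helper name missing_midscene_vars is hypothetical. It reports names only and never prints values.

```python
import os

# Variable names as documented in SKILL.md. The source list ends in "etc.",
# so this tuple may be incomplete; treating all four as required is an assumption.
DOCUMENTED_VARS = (
    "MIDSCENE_MODEL_API_KEY",
    "MIDSCENE_MODEL_NAME",
    "MIDSCENE_MODEL_BASE_URL",
    "MIDSCENE_MODEL_FAMILY",
)


def missing_midscene_vars(env=None):
    """Return the documented variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in DOCUMENTED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_midscene_vars()
    if missing:
        print("Not set:", ", ".join(missing))
    else:
        print("All documented variables are set.")
```

Passing a plain dict instead of the real environment makes the check easy to try without exposing any credentials.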
Instruction Scope
Instructions are explicit: run synchronous npx commands, take screenshots, read the saved image files, use act to perform UI interactions, and summarize results. All actions described are within the scope of a desktop automation tool. Note that the workflow inherently captures screen contents and controls input devices; these are high-sensitivity operations, but they are expected for this skill's purpose.
Install Mechanism
This is an instruction-only skill with no install spec and no code files present, so nothing is written to disk by the skill itself. The runtime depends on running an external package via npx (@midscene/computer), which fetches code at runtime. That is normal, but it means the provenance of that code matters.
Credentials
The SKILL.md requires multiple environment variables including API keys and base URLs for third-party model providers. That is proportionate to using remote models, but it contradicts the registry's 'no required env vars' fields. Requesting API keys for cloud models is legitimate here, but these are sensitive credentials; the skill asks for them without declaring them in metadata, and the source/homepage is missing, which raises concern about where those credentials are used and who can access them.
Persistence & Privilege
No persistent installation, no always: true flag, and the skill is user-invocable only. The skill will run commands that give it active control of the desktop while invoked, but it does not request elevated persistent platform privileges in the metadata.
What to consider before installing
This skill will take screenshots and control your mouse and keyboard while it runs, and it requires remote-model API keys (MIDSCENE_MODEL_API_KEY and related vars). However, the registry metadata does not list those env vars, and the skill has no declared source or homepage. That mismatch is a red flag. Before installing or using it:
1) Verify the package source (official midscenejs site or GitHub repo and publisher identity); request the skill's code or homepage and confirm the exact npx package and version it runs.
2) Use dedicated, limited-scope API keys (not your primary cloud account keys), and consider creating test-only model accounts.
3) Run the tool in a contained environment or VM and avoid exposing sensitive apps/documents while testing.
4) Ask the publisher why registry metadata omits required env vars and request a full manifest (package.json, exact npx package SHA).
5) Be prepared to rotate/revoke any model API keys after testing.
If you cannot verify the source or provenance of the npx package, treat the skill as risky and avoid providing high-privilege credentials or running it on machines with sensitive data.
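The advice above to confirm the exact npx package and version can be sketched as a small Python helper that refuses to build a command unless an exact version is given. The function name pinned_npx_command is hypothetical, and the version "1.2.3" in the usage note is a placeholder, not a known release of the package. The only CLI behavior relied on is npm's package@version specifier; any further arguments are passed through unchanged, and no particular subcommands or flags are assumed.

```python
import re

PACKAGE = "@midscene/computer"  # package name taken from the review above

# Accept only an exact MAJOR.MINOR.PATCH version; tags such as "latest"
# and ranges such as "^1.0.0" are rejected. Pre-release versions are also
# rejected by this simplified pattern.
_EXACT_VERSION = re.compile(r"\d+\.\d+\.\d+")


def pinned_npx_command(version, *args):
    """Build an npx argument list pinned to one exact package version."""
    if not _EXACT_VERSION.fullmatch(version):
        raise ValueError(f"expected an exact version like 1.2.3, got {version!r}")
    return ["npx", f"{PACKAGE}@{version}", *args]
```

For example, pinned_npx_command("1.2.3") yields an argument list suitable for subprocess.run, while pinned_npx_command("latest") raises ValueError. Pinning a version narrows what npx will fetch, but it does not replace verifying the publisher and the package contents.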


latest: vk97713ykdexj0vahkw9f01d4j182e3yh
