E2B Desktop
v1.0.0
Control E2B Desktop sandboxes (virtual Linux desktops) for computer-use agents. Use when you need to create/manage sandboxed desktop environments, take scree...
Security Scan
OpenClaw
Suspicious
medium confidence
Purpose & Capability
Name/description match the included scripts and SDK usage: scripts provide mouse/keyboard, screenshots, run commands, and VNC streaming as advertised. However, the registry metadata lists no required environment variables while the SKILL.md and every script require an E2B_API_KEY (and optionally E2B_SANDBOX_ID). That metadata omission is an inconsistency that could mislead reviewers or automation.
Instruction Scope
Runtime instructions and scripts stay within the sandbox domain: they read/write ~/.e2b_state, use E2B_API_KEY, and call the e2b-desktop SDK to control the VM. They also expose sandbox screenshots, stream URLs and (when requested) stream auth keys, and provide a run_command.sh that executes arbitrary shell commands inside the sandbox. Those behaviors are expected for a desktop-control skill but raise data-exfiltration risk (screenshots/streams/printed auth keys) which the SKILL.md demonstrates by sending screenshots to an LLM in an example.
Install Mechanism
No install spec is included (instruction-only); the SKILL.md asks users to 'pip install e2b-desktop'. The skill itself does not download arbitrary code or use obscure URLs. Risk depends on the external 'e2b-desktop' package provenance (not included here).
Credentials
The scripts require E2B_API_KEY (and may use E2B_SANDBOX_ID / ~/.e2b_state), but the registry metadata declares no required env vars or primary credential. Requiring a service API key is proportionate to the purpose, but the missing declaration is a transparency problem. Also, the skill prints stream auth keys and URLs (sensitive) to stdout which could be captured by whatever calls these scripts.
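One way to reduce the stdout exposure described above is to filter script output before it reaches a log. The sketch below is illustrative only: `redact_secrets` is a hypothetical helper, and it assumes the scripts print secrets as `KEY=VALUE` lines (optionally prefixed with `export`), which this page does not confirm.

```shell
# Hypothetical filter: mask secret values in KEY=VALUE lines read from stdin.
# Assumption: secrets are printed as AUTH_KEY=... or E2B_API_KEY=... lines;
# the real output format of the scripts may differ.
redact_secrets() {
    sed -E 's/^((export[[:space:]]+)?(AUTH_KEY|E2B_API_KEY)=).*/\1[REDACTED]/'
}
```

A caller could pipe a script's output through the filter before logging it, for example `stream_start.sh --auth | redact_secrets >> agent.log`. Stream URLs are not masked by this sketch and would need their own handling.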
Persistence & Privilege
always is false and the skill writes only its own state file (~/.e2b_state). It does not request permanent platform-wide privileges or modify other skills. Note: because disable-model-invocation is false (normal default), an agent allowed to invoke skills autonomously could use this skill to run commands in sandboxes and start streams; combine that with the other concerns when granting autonomous permissions.
What to consider before installing
This skill appears to implement the advertised sandbox control functions, but check these before installing:
- Verify the publisher/source and the 'e2b-desktop' Python package on PyPI (or the expected distribution) — the repo/homepage is missing in the metadata.
- Provide an API key (E2B_API_KEY) with minimal privileges and rotate it if you later remove the skill; the registry metadata failing to declare it is an oversight.
- Be cautious about screenshots, VNC stream URLs, and printed AUTH_KEY values — these are sensitive and could leak data if logged or sent to external services (e.g., the example that sends screenshots to an LLM).
- Review who/what can call the scripts (especially run_command.sh) — they allow arbitrary command execution inside the sandbox; ensure the sandboxing/isolation meets your threat model.
- If you plan to allow autonomous agent invocation, limit that agent's permissions and monitor for unexpected stream starts or state changes.
If you want stronger assurance, ask the publisher for an official homepage/repo and a signed release of the 'e2b-desktop' SDK, or ask them to provide the SDK source for review.
Like a lobster shell, security has layers: review code before you run it.
E2B Desktop Skill
Control a headless Linux desktop (Ubuntu + XFCE) via the e2b-desktop Python SDK.
All scripts live in scripts/ and wrap the SDK in bash for easy agent use.
Prerequisites
pip install e2b-desktop
export E2B_API_KEY=e2b_***
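Since every script depends on `E2B_API_KEY`, a caller may want to fail early when it is missing. This is a minimal sketch, not part of the skill: `require_api_key` is a hypothetical helper name.

```shell
# Hypothetical preflight check: fail with a clear message if the API key is unset or empty.
require_api_key() {
    if [ -z "${E2B_API_KEY:-}" ]; then
        echo "E2B_API_KEY is not set; export it before calling the scripts" >&2
        return 1
    fi
    return 0
}
```

Calling `require_api_key || exit 1` at the top of a wrapper script stops the run before any sandbox is created.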
State Management
- start_sandbox.sh saves the sandbox ID to ~/.e2b_state
- All other scripts auto-load it from there
- Override anytime with export E2B_SANDBOX_ID=<id>
- Sandboxes survive script exit; reconnect with Sandbox.connect(sandbox_id)
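The lookup order described above (environment variable first, then the state file) can be sketched as follows. This is an illustration under stated assumptions, not the skill's actual code: `resolve_sandbox_id` and the `E2B_STATE_FILE` override are hypothetical names, and the sketch assumes the state file holds the sandbox ID on its first line, which is not documented here.

```shell
# Hypothetical helper: print the sandbox ID that a script should use.
# Assumption: the state file stores the sandbox ID on its first line.
resolve_sandbox_id() {
    # E2B_STATE_FILE is a made-up override so the sketch can be tried safely.
    state_file="${E2B_STATE_FILE:-$HOME/.e2b_state}"
    saved_id=""

    # 1. An explicit environment variable always wins.
    if [ -n "${E2B_SANDBOX_ID:-}" ]; then
        printf '%s\n' "$E2B_SANDBOX_ID"
        return 0
    fi

    # 2. Otherwise fall back to the saved state file.
    if [ -r "$state_file" ]; then
        IFS= read -r saved_id < "$state_file" || true
        if [ -n "$saved_id" ]; then
            printf '%s\n' "$saved_id"
            return 0
        fi
    fi

    echo "no sandbox ID: set E2B_SANDBOX_ID or run start_sandbox.sh" >&2
    return 1
}
```

Keeping the environment variable ahead of the file lets one shell session target a different sandbox without touching the shared state file.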
Scripts
| Script | Usage | Description |
|---|---|---|
| start_sandbox.sh | [--resolution 1280x800] [--timeout 300] [--stream] | Create sandbox; optionally start VNC stream |
| kill_sandbox.sh | [SANDBOX_ID] | Kill sandbox and remove state |
| screenshot.sh | [OUTPUT_FILE] | Take screenshot → PNG (default: /tmp/e2b_screenshot.png) |
| click.sh | X Y | Left click at coordinates |
| right_click.sh | X Y | Right click |
| double_click.sh | X Y | Double click |
| middle_click.sh | X Y | Middle click |
| move_mouse.sh | X Y | Move cursor (no click) |
| drag.sh | X1 Y1 X2 Y2 | Click-drag between two points |
| scroll.sh | AMOUNT | Scroll (positive=up, negative=down) |
| type_text.sh | "text" | Type text at current cursor |
| press_key.sh | KEY [KEY2...] | Press key or combo (e.g. ctrl c) |
| run_command.sh | "cmd" | Run shell command inside sandbox |
| open_url.sh | URL_OR_PATH | Open URL or file in default app |
| launch_app.sh | APP_NAME | Launch app (e.g. firefox, vscode) |
| stream_start.sh | [--auth] | Start VNC stream; --auth for password-protected |
| stream_stop.sh | (none) | Stop VNC stream |
| get_cursor.sh | (none) | Print CURSOR_X and CURSOR_Y |
| get_screen_size.sh | (none) | Print SCREEN_WIDTH and SCREEN_HEIGHT |
| list_windows.sh | [APP_NAME] | List app windows or show active window |
| wait.sh | MILLISECONDS | Wait N ms (sandbox-side) |
Computer-Use Agent Loop Pattern
SCRIPTS="skills/e2b-desktop/scripts"

# 1. Start sandbox (the script's output is sourced to set SANDBOX_ID and STREAM_URL)
source <("$SCRIPTS/start_sandbox.sh" --resolution 1280x800 --stream)
echo "Sandbox: $SANDBOX_ID"
echo "View at: $STREAM_URL"

# 2. Agent loop
while true; do
    # Capture screen
    "$SCRIPTS/screenshot.sh" /tmp/screen.png
    # Send to LLM, parse action... (your code; llm_decide is a placeholder)
    ACTION=$(llm_decide /tmp/screen.png)
    case "$ACTION" in
        click:*) IFS=: read -r _ x y <<< "$ACTION"; "$SCRIPTS/click.sh" "$x" "$y" ;;
        type:*)  "$SCRIPTS/type_text.sh" "${ACTION#type:}" ;;
        # Left unquoted on purpose so "ctrl c" splits into separate key arguments
        key:*)   "$SCRIPTS/press_key.sh" ${ACTION#key:} ;;
        done)    break ;;
    esac
done

# 3. Clean up
"$SCRIPTS/kill_sandbox.sh"
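Because the action string in the loop above comes from a model, it may be worth validating it before dispatching to a script. The following is a minimal sketch under assumptions: `is_uint` and `plan_action` are hypothetical helpers, and the `click:X:Y`, `type:TEXT`, `key:KEYS`, `done` grammar is taken from the loop as written. Rather than calling the scripts, it prints the invocation that would run, so it can be tried without a sandbox.

```shell
# Hypothetical helper: succeed only for a non-empty string of digits.
is_uint() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical validator: print the script call an action maps to,
# or fail (non-zero exit) on malformed or unknown actions.
plan_action() {
    action="$1"
    case "$action" in
        click:*)
            rest="${action#click:}"
            # Require exactly the X:Y shape before splitting.
            case "$rest" in
                *:*) x="${rest%%:*}"; y="${rest#*:}" ;;
                *)   x=""; y="" ;;
            esac
            if is_uint "$x" && is_uint "$y"; then
                printf 'click.sh %s %s\n' "$x" "$y"
            else
                echo "rejecting malformed click action: $action" >&2
                return 1
            fi
            ;;
        type:*) printf 'type_text.sh %s\n' "${action#type:}" ;;
        key:*)  printf 'press_key.sh %s\n' "${action#key:}" ;;
        done)   printf 'done\n' ;;
        *)
            echo "unknown action: $action" >&2
            return 1
            ;;
    esac
}
```

Unlike the loop's `case` statement, which silently ignores unrecognized actions and keeps looping, this sketch reports them as errors so the caller can decide whether to retry or stop.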
Key Notes
- scroll.sh AMOUNT: positive = scroll up, negative = scroll down (matches desktop.scroll(amount) API)
- press_key.sh ctrl c: multiple args become a key combo via desktop.press(["ctrl", "c"])
- run_command.sh exits with the sandbox command's exit code
- All mouse coordinate scripts accept integer pixel coordinates matching sandbox resolution
- VNC stream: only one active stream at a time; stop before switching windows
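Since the mouse scripts expect integer pixel coordinates within the sandbox resolution, a caller might guard clicks with a bounds check. This is a sketch only: `in_bounds` is a hypothetical helper, and it assumes zero-based coordinates (valid x runs from 0 to width minus 1), which the notes above do not state.

```shell
# Hypothetical guard: succeed only if (x, y) lies inside a width x height screen.
# Assumption: coordinates are zero-based, so valid x is 0..width-1.
in_bounds() {
    x="$1"; y="$2"; width="$3"; height="$4"
    for n in "$x" "$y" "$width" "$height"; do
        case "$n" in
            ''|*[!0-9]*) return 1 ;;   # reject empty, negative, or non-integer input
        esac
    done
    [ "$x" -lt "$width" ] && [ "$y" -lt "$height" ]
}
```

For a 1280x800 sandbox, `in_bounds 1279 799 1280 800` succeeds while `in_bounds 1280 0 1280 800` fails. The width and height could come from get_screen_size.sh, though how its output is best parsed depends on a format not shown here.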
