Install
openclaw skills install ssh-labManage and monitor remote GPU servers via SSH with GPU, disk, process status, alerts, log tailing, file sync, and health diagnostics.
openclaw skills install ssh-labStructured SSH operations for managing remote GPU servers. Standardizes what's universal (GPU/disk/process), delegates interpretation to the LLM.
ssh-lab, gpu status, server status, nvidia-smi, remote servercd /path/to/ssh-lab && npm run build
After building, use the CLI via any of:
ssh-lab status all # if npm link'd or installed globally
npx ssh-lab status all # via npx
node dist/cli.js status all # direct invocation
The CLI reads ~/.ssh/config automatically — zero configuration needed for existing SSH hosts.
ssh-lab hosts [--json]
ssh-lab status [host|all] [--json] [--timeout ms] [--heartbeat] [--quiet]
Returns structured probe data: GPU utilization/VRAM/temp, disk usage, active processes.
Includes automatic alerts: GPU idle, disk full (>90%), high temp (>85°C).
--heartbeat gives compact one-liner output for agent integration. --quiet suppresses all-clear output (for cron).
ssh-lab run <host|all> <command...> [--json] [--timeout ms]
Supports all for parallel execution across all hosts.
ssh-lab tail <host> <path> [-n lines] [--json]
Default: last 50 lines. Smart truncation at 50KB.
ssh-lab ls <host> <path> [--sort time|size|name] [--json]
Lists files with size, date, permissions. Great for checkpoint directories.
ssh-lab df <host|all> [path] [--json]
Shows disk usage for all filesystems. Optional path runs du -sh on a specific directory.
ssh-lab sync <host> <src> <dst> [--direction up|down] [--dry-run] [--exclude pat1,pat2]
Direction: down (remote→local, default), up (local→remote). Always use --dry-run first!
ssh-lab watch <host> <path> [-n lines] [--prev-hash hash] [--json]
Captures file state + metadata. Returns changed: true/false when --prev-hash is provided.
Agent calls this repeatedly via cron/heartbeat — the tool itself is stateless.
ssh-lab compare <host1> <host2> [--probes gpu,disk,process] [--json]
ssh-lab compare all [--probes gpu]
ssh-lab compare GMI4 GMI5 GMI6 --probes gpu,disk
Side-by-side comparison of GPU, disk, and process data across hosts.
Probes default to gpu,disk,process. Supports any number of hosts.
Uses bounded concurrency pool for parallel data collection.
ssh-lab doctor <host> [--json] [--timeout ms]
Runs connectivity and health diagnostics on a host. Checks SSH reachability, authentication, command execution, and basic system health.
# List configured rules
ssh-lab alert list [--json]
# Add a rule
ssh-lab alert add <kind> <host> [--threshold N] [--process-pattern regex]
# Remove a rule
ssh-lab alert remove <rule-id>
# Evaluate alerts against live status
ssh-lab alert check [host|all] [--json] [--quiet]
Alert kinds: gpu_idle, disk_full, process_died, ssh_unreachable, oom_detected, high_temp
Examples:
ssh-lab alert add disk_full all --threshold 90
ssh-lab alert add process_died GMI4 --process-pattern "torchrun|deepspeed"
ssh-lab alert check all --json
ssh-lab add <name> <user@host> [--port N] [--tags a,b] [--notes "desc"]
All commands support three output modes:
--json or -j): Structured JSON for programmatic use--raw or -r): Raw command output onlyCommands use per-type timeout defaults (user --timeout always overrides):
hosts, add, doctor — connectivity checksstatus, run, ls, df, alert, compare — normal probestail, watch — streaming/log tailingsync — rsync bulk transfersssh — inherits user's config, keys, ProxyJump/tmp/ssh-lab-%r@%h:%p sockets; auto-cleanup detects stale sockets (ssh -O check), gracefully exits (ssh -O exit), then removes dead socket files before retry--timeout always overridesstatus all / run all queries all hosts concurrently via withPool (default: 5)~/.ssh/config, SSH alias is used directly so all ProxyJump/Match/Include rules are respectedIn HEARTBEAT.md:
- [ ] Run: ssh-lab status all --heartbeat
Check for: GPU idle, disk >90%, processes died
- [ ] Run: ssh-lab alert check all --json
Act on any critical firings
Custom hosts stored in ~/.config/ssh-lab/config.json (auto-created, XDG-compliant).
SSH hosts from ~/.ssh/config discovered automatically via ssh -G.
Override config path with SSH_LAB_CONFIG env var.