Install
openclaw skills install multi-agent-status
Cross-agent health monitoring for multi-host OpenClaw deployments. Each agent pushes a structured status report (JSON) to a central location. A PM/monitoring agent reads the reports and alerts on failures. Works across Windows, Linux, and mixed environments.
In a multi-agent OpenClaw deployment, each agent monitors itself but has blind spots. This skill closes that gap: every agent pushes a structured health report to a shared location, and a monitoring agent reads the reports and alerts on issues.
Agent A (Host 1) --push--> /shared/agent-status/agent-a.json
Agent B (Host 2) --push--> /shared/agent-status/agent-b.json
Agent C (Host 3) --push--> /shared/agent-status/agent-c.json
↓
Monitor Agent reads all
→ alerts on failures
→ updates dashboard
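Each agent pushes a JSON status report containing the agent name, a timestamp, the gateway state (healthy, failed, or unknown), and cron job counts (total and errors).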
Scripts available in the Collective Skills repo
On your central/shared host:
mkdir -p /path/to/agent-status
chmod 777 /path/to/agent-status
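Mode 777 lets every local user write to and delete from the directory. If all agents on the host can share a Unix group, a tighter setup is possible. The following is a minimal sketch under that assumption; the function name setup_status_dir and the group name in the usage note are hypothetical, not part of the skill.

```shell
# setup_status_dir DIR [GROUP]: create the shared report directory with
# group-write access instead of world-write. (Hypothetical helper.)
setup_status_dir() {
    dir=$1
    group=${2:-}
    mkdir -p "$dir" || return 1
    if [ -n "$group" ]; then
        # Hand the directory to the shared group when one is given
        chgrp "$group" "$dir" || return 1
    fi
    # 2775: rwx for owner and group, r-x for others; the setgid bit makes
    # new report files inherit the directory's group
    chmod 2775 "$dir"
}
```

Usage, assuming a group named openclaw-agents exists on the host: setup_status_dir /path/to/agent-status openclaw-agents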
Scripts are in references/
Copy the script from references/agent-status-report.sh to your preferred location and make it executable:
#!/bin/bash
# agent-status-report.sh
AGENT_NAME="my-agent"
STATUS_DIR="/path/to/agent-status"
REPORT="$STATUS_DIR/$AGENT_NAME.json"
# Get gateway status
GW_STATUS=$(openclaw gateway status 2>&1)
if echo "$GW_STATUS" | grep -q "RPC probe: ok"; then
    GATEWAY="healthy"
elif echo "$GW_STATUS" | grep -q "RPC probe: failed"; then
    GATEWAY="failed"
else
    GATEWAY="unknown"
fi
# Count cron errors
CRON_LIST=$(openclaw cron list 2>&1)
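# grep -c already prints 0 when nothing matches (and exits non-zero), so
# "|| echo 0" would emit a second 0 and break the JSON; only mask the exit code
TOTAL=$(echo "$CRON_LIST" | grep -c "ok\|error" || true)
ERRORS=$(echo "$CRON_LIST" | grep -c "error" || true)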
# Write report
cat > "$REPORT" << EOF
{
  "agent": "$AGENT_NAME",
  "timestamp": "$(date -Iseconds)",
  "gateway": "$GATEWAY",
  "crons": {
    "total": $TOTAL,
    "errors": $ERRORS
  }
}
EOF
echo "Status report pushed at $(date)"
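Because the script writes directly to the final path, a monitor that reads at the wrong moment could see a half-written file. One common way around this is to write to a temporary file in the same directory and rename it into place. The sketch below illustrates that idea; write_report_atomically is a hypothetical helper, not part of the shipped script.

```shell
# write_report_atomically FILE: read the report body from stdin, write it
# to a temp file next to FILE, then rename it over FILE. (Hypothetical helper.)
write_report_atomically() {
    target=$1
    # Same directory as the target, so the final mv is a rename on one filesystem
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if cat > "$tmp"; then
        # mktemp creates the file with mode 600; make it readable by the monitor
        chmod 644 "$tmp"
        mv -f "$tmp" "$target"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

In the report script this would replace the redirect: write_report_atomically "$REPORT" << EOF, followed by the same JSON body. The temporary name ends in a random suffix rather than .json, so a monitor that globs *.json does not pick it up.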
For remote agents (different hosts), use SCP to push:
# Add to the end of the script:
scp "$REPORT" user@central-host:/path/to/agent-status/
openclaw cron add \
--name "agent-status-report" \
--every "4h" \
--message "Run the agent status report script" \
--no-deliver
The monitoring agent's HEARTBEAT.md should include:
## Agent Status Check
1. Read all files in /path/to/agent-status/*.json
2. For each agent:
   - Is gateway healthy? If "failed" → alert immediately
   - Any cron errors? If errors > 0 → ping the agent
   - Is timestamp recent (within 8 hours)? If stale → agent may be down
3. Update DASHBOARD.md with findings
| Condition | Action |
|---|---|
| Gateway failed | Alert human immediately |
| Cron errors ≥ 2 | Ping owning agent for ETA on fix |
| Report stale (>8h) | Ping agent — might be down |
| Report missing | Agent never pushed — check if configured |
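The rules in this table can also be checked mechanically. The sketch below classifies a single report file; it assumes the flat JSON layout written by agent-status-report.sh, uses the file's modification time as a stand-in for the report timestamp, and takes the cron error threshold from the table. The function name classify_report and both variables are hypothetical.

```shell
MAX_AGE_MIN=480          # 8 hours, per the staleness rule
CRON_ERROR_THRESHOLD=2   # per the alert table; set to 1 to flag any error

# classify_report FILE: print one of
#   missing, gateway-failed, stale, cron-errors, ok
classify_report() {
    file=$1
    [ -f "$file" ] || { echo "missing"; return; }

    # Pull single values out of the report with sed (no JSON parser needed
    # for this flat layout, but it is not a general JSON reader)
    gateway=$(sed -n 's/.*"gateway": *"\([^"]*\)".*/\1/p' "$file")
    errors=$(sed -n 's/.*"errors": *\([0-9][0-9]*\).*/\1/p' "$file")
    errors=${errors:-0}   # reports without a crons block count as zero errors

    if [ "$gateway" = "failed" ]; then
        echo "gateway-failed"
    elif [ -n "$(find "$file" -mmin +"$MAX_AGE_MIN")" ]; then
        # find prints the path only if it was modified more than 8 hours ago
        echo "stale"
    elif [ "$errors" -ge "$CRON_ERROR_THRESHOLD" ]; then
        echo "cron-errors"
    else
        echo "ok"
    fi
}
```

A monitor could loop over /path/to/agent-status/*.json, call classify_report on each file, and map the result to the actions in the table. Detecting a missing report requires a list of expected agent names, since a file that was never pushed does not appear in the glob.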
# Agent Health Dashboard
*Last updated: 2026-04-02 14:00*
| Agent | Host | Gateway | Crons | Errors | Last Report |
|-------|------|---------|-------|--------|-------------|
| Hyjack | OPT1 | ✅ healthy | 16 | 2 | 10m ago |
| Rook | PC-147 | ✅ healthy | 9 | 0 | 2h ago |
| Dozer | Vigo | ✅ healthy | 3 | 0 | 1h ago |
⚠️ Hyjack: 2 cron errors (Research Scout, sunday-self-compassion)
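A row of this dashboard can be generated from a report file. The sketch below is one way to do it, assuming the flat JSON layout written by agent-status-report.sh. The host name is not part of the report, so it is passed in as an argument, and the last column shows the raw timestamp rather than a relative age. json_field and dashboard_row are hypothetical helpers.

```shell
# json_field FILE KEY: print the value of a top-level or nested scalar KEY.
# Works only for the simple one-key-per-line layout of the status reports.
json_field() {
    sed -n "s/.*\"$2\": *\"\{0,1\}\([^\",]*\)\"\{0,1\}.*/\1/p" "$1" | head -n 1
}

# dashboard_row FILE HOST: print one markdown table row for the dashboard.
dashboard_row() {
    file=$1
    host=$2
    gateway=$(json_field "$file" gateway)
    case $gateway in
        healthy) icon="✅" ;;
        failed)  icon="❌" ;;
        *)       icon="❓" ;;
    esac
    printf '| %s | %s | %s %s | %s | %s | %s |\n' \
        "$(json_field "$file" agent)" "$host" "$icon" "${gateway:-unknown}" \
        "$(json_field "$file" total)" "$(json_field "$file" errors)" \
        "$(json_field "$file" timestamp)"
}
```

Usage: dashboard_row /path/to/agent-status/my-agent.json my-host appends cleanly after the table header lines shown above.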
For Windows agents, copy references/agent-status-report.ps1 and run it. The script is:
# agent-status-report.ps1
$timestamp = Get-Date -Format "o"
$tempFile = "$env:TEMP\agent-status.json"
# Gateway check
$gwStatus = openclaw gateway status 2>&1 | Out-String
if ($gwStatus -match "RPC probe: ok") { $gw = "healthy" }
elseif ($gwStatus -match "RPC probe: failed") { $gw = "failed" }
else { $gw = "unknown" }
# Build report
@{
    agent     = "my-agent"
    timestamp = $timestamp
    gateway   = $gw
} | ConvertTo-Json | Out-File $tempFile -Encoding utf8
# Push to central host
scp $tempFile user@central-host:/path/to/agent-status/my-agent.json