Install
openclaw skills install infra-monitoring

Monitor server health, uptime, resource utilization, SSL certificate expiry, and incident detection for small teams and self-hosters. Delivers plain-language status reports prioritizing what needs attention, not metric dumps. Supports single-server health checks, HTTP/TCP uptime pings, and incident timelines without enterprise tooling overhead.
Monitor your servers and endpoints like a sharp ops engineer who tells you what needs attention, not a dashboard that dumps 47 numbers.
Activate this skill when the user:
top, htop, df, free, uptime, vmstat, iostat

Do NOT activate when:
Understand the scope — before running any checks, understand what the user is monitoring and why:
Gather the data — collect or parse the infrastructure data:
Assess health — evaluate each metric against thresholds:
references/metrics-thresholds.md

Build the status report, using structured output that follows the default format below:
Recommend actions — concrete next steps prioritized by urgency:
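As a rough illustration of the "gather the data" step, the sketch below parses pasted `df -P` output into a per-mount usage percentage. The function name `parse_df` and the sample text are hypothetical and exist only for demonstration; the sample is not real server data, and the skill itself may parse command output differently.

```python
def parse_df(output: str) -> dict[str, int]:
    """Map each mount point to its used-space percentage from `df -P` text."""
    usage: dict[str, int] = {}
    for line in output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        # POSIX format: filesystem, blocks, used, available, capacity, mount point
        if len(fields) < 6 or not fields[4].endswith("%"):
            continue  # ignore rows that do not carry a percentage
        mount = " ".join(fields[5:])  # mount points may contain spaces
        usage[mount] = int(fields[4].rstrip("%"))
    return usage


# Made-up sample input, used only to show the expected shape of the text.
SAMPLE_DF = """\
Filesystem     1024-blocks     Used Available Capacity Mounted on
/dev/sda1         41152736 37037462   4115274      90% /
tmpfs              2014320        0   2014320       0% /dev/shm
/dev/sdb1        103081248 51540624  51540624      50% /var/lib/data
"""

if __name__ == "__main__":
    for mount, percent in parse_df(SAMPLE_DF).items():
        print(f"{mount}: {percent}% used")
```

Parsing into a plain dictionary keeps the later assessment step independent of which command produced the numbers.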
Use this structure unless the user clearly wants a different format:
Attention required — the critical and warning items, sorted by severity then urgency. Each item:
If nothing needs attention: "All systems healthy. No action required."
Server health summary — per-server overview:
Endpoint status — per-endpoint overview:
Resource trends — directional indicators for key metrics:
Incident timeline — recent events if any:
Recommended actions — 3 concrete next steps:
System details — raw metric values for reference:
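The "Attention required" section described above could be assembled roughly as follows. This is a minimal sketch: the `Finding` class, its field names, and the bullet layout are assumptions made for illustration, and urgency is approximated by keeping the input order within each severity level.

```python
from dataclasses import dataclass

# Only critical and warning items belong in the attention section.
SEVERITY_ORDER = {"critical": 0, "warning": 1}


@dataclass
class Finding:
    target: str    # server or endpoint name
    severity: str  # "critical", "warning", "healthy", or "unknown"
    summary: str   # plain-language description of the issue


def attention_section(findings: list[Finding]) -> str:
    """Render critical and warning findings, most severe first."""
    urgent = [f for f in findings if f.severity in SEVERITY_ORDER]
    if not urgent:
        return "All systems healthy. No action required."
    # sorted() is stable, so items of equal severity keep their input order.
    urgent = sorted(urgent, key=lambda f: SEVERITY_ORDER[f.severity])
    lines = ["Attention required"]
    lines += [f"- [{f.severity.upper()}] {f.target}: {f.summary}" for f in urgent]
    return "\n".join(lines)
```

A caller would pass every finding from the assessment step and print the returned text ahead of the per-server and per-endpoint summaries.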
Apply thresholds from references/metrics-thresholds.md with these principles:
Read references/alert-severity.md for the full classification system. Summary:
| Severity | Meaning | Response |
|---|---|---|
| Critical | Service impacted or imminent failure | Act now |
| Warning | Approaching threshold or degraded but functional | Schedule fix this week |
| Healthy | Within normal operating parameters | No action needed |
| Unknown | Insufficient data to classify | Investigate or provide more data |
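
One way to map a metric onto the four severities in the table is sketched below. The function `classify` and its parameters are hypothetical, and the threshold numbers in the usage line are placeholders; the real values live in references/metrics-thresholds.md, which is not reproduced here. The sketch also assumes a metric where higher is worse, such as disk or memory usage.

```python
from typing import Optional

# Response text copied from the severity table.
RESPONSE = {
    "Critical": "Act now",
    "Warning": "Schedule fix this week",
    "Healthy": "No action needed",
    "Unknown": "Investigate or provide more data",
}


def classify(value: Optional[float], warning_at: float, critical_at: float) -> str:
    """Classify a higher-is-worse metric; a missing value is Unknown."""
    if value is None:
        return "Unknown"
    if value >= critical_at:
        return "Critical"
    if value >= warning_at:
        return "Warning"
    return "Healthy"


if __name__ == "__main__":
    # Placeholder thresholds, not taken from the reference file.
    severity = classify(85.0, warning_at=80.0, critical_at=90.0)
    print(severity, "->", RESPONSE[severity])
```

Returning "Unknown" for missing data, rather than guessing, matches the table's instruction to investigate or ask for more data.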
When checking HTTPS endpoints:
Auto-renew certs (Let's Encrypt, managed cloud certs, etc.):
Manual renewal certs (purchased certs, enterprise CA, self-managed):
Unknown renewal type (cannot determine auto vs. manual):
How to determine renewal type: check the certificate issuer. Let's Encrypt, AWS ACM, Cloudflare, and Google-managed certs are auto-renew. Enterprise CAs (DigiCert, Sectigo, internal PKI) and self-signed certs are typically manual. When in doubt, classify as unknown and note the ambiguity.
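The issuer-based rule above can be approximated with simple substring matching. In this sketch the function name `classify_renewal` and the hint strings are assumptions: real issuer fields vary in wording, so the matching is a heuristic, and an internal PKI cannot be recognized by name, which is why it falls through to "unknown".

```python
# Substrings assumed to appear in issuer names for the providers named above.
AUTO_HINTS = ("let's encrypt", "amazon", "aws", "cloudflare", "google")
MANUAL_HINTS = ("digicert", "sectigo")


def classify_renewal(issuer: str, subject: str = "") -> str:
    """Return "auto", "manual", or "unknown" for a certificate issuer."""
    # A certificate whose issuer equals its subject is self-signed.
    if subject and issuer == subject:
        return "manual"
    name = issuer.lower()
    if any(hint in name for hint in AUTO_HINTS):
        return "auto"
    if any(hint in name for hint in MANUAL_HINTS):
        return "manual"
    # When in doubt, report the ambiguity instead of guessing.
    return "unknown"


if __name__ == "__main__":
    for issuer in ("Let's Encrypt", "DigiCert Inc", "Example Corp Internal CA"):
        print(issuer, "->", classify_renewal(issuer))
```

Because enterprise CA certificates are only typically manual, a result of "manual" from this heuristic is best treated as a default that the user can correct.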
When multiple alerts fire for the same root cause:
When the user provides incomplete data:
When the user asks for monitoring but provides no server details or metrics:
references/monitoring-checklists.md

Do not generate fictional server metrics or pretend to check nonexistent infrastructure.