Self Monitor

Proactive self-monitoring of infrastructure, services, and health. Tracks disk/memory/load, service health, cron job status, recent errors. Auto-fixes safe i...

MIT-0 · Free to use, modify, and redistribute. No attribution required.
0 · 668 · 4 current installs · 4 all-time installs
MIT-0
Security Scan
VirusTotalVirusTotal
Benign
View report →
OpenClawOpenClaw
Suspicious
medium confidence
Purpose & Capability
Name and description match the provided instructions: the SKILL.md contains commands to check disk, memory, load, services, cron, and logs, and to produce reports. Optional checks (systemctl, docker, tailscale) are reasonable for a monitoring skill and are used only as examples.
!
Instruction Scope
Instructions tell the agent to read system logs (/var/log, journalctl, dmesg) and application logs via a broad wildcard (~/workspace/projects/*/logs/*.log) and to perform destructive auto-fixes (find /var/log -name "*.log" -mtime +7 -delete, rm -rf ~/.cache/pip, rm -f /tmp/agent-temp-*). 'Attempt restart' is vague (no concrete safe restart procedure, no confirmation or rollback). These actions are within a monitoring tool's domain but lack safeguards, scoping, and explicit permission steps — increasing risk of data exposure or accidental deletion.
Install Mechanism
Instruction-only skill; no install spec and no code files. This minimizes supply-chain risk because nothing is downloaded or written by a package installer.
Credentials
The skill declares no required environment variables or credentials. It references optional local tools (docker, tailscale, systemctl) but does not request unrelated secrets. No disproportionate credential access is requested.
Persistence & Privilege
The skill does not request 'always' inclusion and has no installer, but the README/SKILL.md suggests adding a cron entry to run health checks. Combined with the skill's auto-fix commands, allowing autonomous invocation (platform default) or running as a privileged user could enable potentially destructive operations without human oversight.
What to consider before installing
This skill is functionally aligned with monitoring, but it includes sweeping log reads and file-deletion commands and gives no safeguards for auto-fixes or restarts. Before installing or enabling it: 1) Run all commands in a safe staging environment to verify behavior; 2) Remove or modify any destructive commands (find -delete, rm -rf) or make them require explicit confirmation or a dry-run flag; 3) Narrow log-file paths instead of using wildcards like ~/workspace/projects/*/logs/*.log to avoid exposing unrelated data; 4) Ensure the agent runs as an unprivileged user (not root) so deletions and service restarts are limited; 5) If you allow autonomous invocation or add it to cron, require alerting/manual approval before any auto-fix; 6) Backup logs before enabling auto-cleanup and inform stakeholders about retention policies. If you want a lower-risk deployment, use the scripts in read-only/dry-run mode and add explicit approval steps for any corrective action.

Like a lobster shell, security has layers — review code before you run it.

Current versionv1.0.1
Download zip
latestvk978yyebezvvx8ztqfxwrx1vm981ngb1

License

MIT-0
Free to use, modify, and redistribute. No attribution required.

SKILL.md

Compatible with Claude Code, Codex CLI, Cursor, Windsurf, and any SKILL.md-compatible agent.

Self Monitor

Proactive self-monitoring: infrastructure, services, and health.

Usage

Run during heartbeats or scheduled checks.

1. Infrastructure Health

# Disk usage
df -h / | awk 'NR==2 {print $5}' | tr -d '%'

# Memory usage  
free -m | awk 'NR==2 {printf "%.0f", $3/$2*100}'

# Load average
uptime | awk -F'load average:' '{print $2}' | awk -F',' '{print $1}'

# Top processes by memory
ps aux --sort=-%mem | head -10

# Top processes by CPU
ps aux --sort=-%cpu | head -10

Thresholds:

MetricWarningCritical
Disk> 80%> 90%
Memory> 85%> 95%
Load> 2.0> 4.0

2. Service Health

# Check if a process is running
pgrep -f "your_process_name" >/dev/null && echo "OK" || echo "FAIL"

# Check HTTP endpoint
curl -s -o /dev/null -w "%{http_code}" http://localhost:8080/health

# Check systemd service
systemctl is-active --quiet nginx && echo "OK" || echo "FAIL"

# Check Docker container
docker ps --filter "name=mycontainer" --filter "status=running" -q | grep -q . && echo "OK" || echo "FAIL"

# Tailscale (if using)
tailscale status --json 2>/dev/null | jq -r '.Self.Online' || echo "FAIL"

3. Cron Job Health

# Check recent cron executions
grep CRON /var/log/syslog | tail -20

# Count failures in last 24h
grep -c "CRON.*error\|CRON.*fail" /var/log/syslog

# List scheduled jobs
crontab -l

4. Recent Errors

# Check system logs for errors
journalctl -p err --since "1 hour ago" 2>/dev/null | tail -20

# Check application logs
tail -50 ~/workspace/projects/*/logs/*.log 2>/dev/null | grep -i "error"

# Check dmesg for hardware/kernel issues
dmesg | tail -20 | grep -i "error\|fail\|warn"

Quick Health Check (for heartbeat)

#!/bin/bash
# Quick health snapshot

DISK=$(df -h / | awk 'NR==2 {print $5}' | tr -d '%')
MEM=$(free -m | awk 'NR==2 {printf "%.0f", $3/$2*100}')
LOAD=$(uptime | awk -F'load average:' '{print $2}' | awk -F',' '{print $1}' | xargs)

echo "Disk: ${DISK}% | Mem: ${MEM}% | Load: ${LOAD}"

# Alert if thresholds exceeded
[ "$DISK" -gt 90 ] && echo "⚠️ Disk critical!"
[ "$MEM" -gt 95 ] && echo "⚠️ Memory critical!"

Proactive Actions

When issues detected:

IssueAuto-ActionAlert?
Disk > 90%Clean temp files, old logsYes
Key process downAttempt restartYes
Cron 3+ failuresGenerate reportYes
Memory > 95%List top processesYes

Auto-fixable (safe):

# Clean old logs (> 7 days)
find /var/log -name "*.log" -mtime +7 -delete 2>/dev/null
find ~/.cache -type f -mtime +7 -delete 2>/dev/null

# Clean temp files
rm -f /tmp/agent-temp-* 2>/dev/null
rm -rf ~/.cache/pip 2>/dev/null

Report Format

## 🔍 Self-Monitor Report - [TIME]

### Health Summary
| Metric | Value | Status |
|--------|-------|--------|
| Disk | XX% | ✅/⚠️/🔴 |
| Memory | XX% | ✅/⚠️/🔴 |
| Load | X.X | ✅/⚠️/🔴 |
| Services | X/Y up | ✅/⚠️ |

### Issues Found
- [Issue 1]: [Action taken or recommended]

### Top Resource Consumers
| Process | CPU% | MEM% |
|---------|------|------|
| ... | ... | ... |

Integration with Scheduled Tasks

Add to your crontab or task scheduler:

# Run health check every 30 minutes
*/30 * * * * /path/to/health-check.sh >> /var/log/health-check.log 2>&1

Or run manually as part of your workflow:

./health-check.sh

Files

2 total
Select a file
Select a file to preview.

Comments

Loading comments…