Cron Sentinel

Automation

Make scheduled jobs bulletproof and get alerted when one SILENTLY fails. Use this skill whenever the user is setting up, debugging, or worrying about recurring/scheduled tasks, including phrases like 'schedule this job,' 'set up a cron job,' 'run this every night,' 'my cron job isn't running,' 'how do I know if my scheduled task failed,' 'my backup didn't run and nobody told me,' 'add retries to my cron,' 'why did my job stop running,' 'monitor my scheduled jobs,' 'alert me when a job fails,' 'is my nightly task still working,' or 'make this schedule reliable.' Wraps any scheduled command so every run is recorded with retries and timeouts, then a watchdog catches BOTH crashed jobs and the more dangerous silent failures - jobs that simply never ran (machine asleep, cron misconfigured, command renamed) and produced no error for anyone to notice. The dead-man's switch for your cron.

Install

openclaw skills install cron-sentinel

Everyone monitors the job that crashes. Almost nobody catches the job that just... stops. The machine was asleep at 3am, someone renamed the script, the cron daemon wasn't reloaded - the task never ran, threw no error, and the first you hear of it is when the backup you needed isn't there. Cron Sentinel is a dead-man's switch for your scheduled tasks: it records every run and alerts you both when a job fails loudly and when it goes silent.

Four subcommands:

  1. wrap - run your scheduled command through Sentinel so each run is recorded (start, end, exit code, duration, output tail), with optional retries and a per-attempt timeout.
  2. check - the watchdog. Reports any job that crashed (non-zero exit) or is overdue (expected to have run by now but hasn't). Exits non-zero if anything is wrong, so it can drive an alert.
  3. status - a quick table of every tracked job: last run, health, when it's next expected.
  4. crontab - print a ready-to-paste crontab line that wraps a command, plus a watchdog line.
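To make the wrap step concrete, here is a minimal sketch of what recording a run with retries and a per-attempt timeout could look like. It is not the actual cron_sentinel.py implementation: the function name run_wrapped, the record field names, and the use of exit code 124 for a timed-out attempt are all assumptions made for illustration.

```python
import subprocess
import sys
import time
from datetime import datetime, timezone


def run_wrapped(command, retries=0, timeout=None, tail_lines=20):
    """Run `command` up to 1 + retries times; return a record of the last attempt.

    Hypothetical sketch: field names and the 124 timeout code are assumptions.
    """
    record = {}
    for attempt in range(1, retries + 2):
        started = datetime.now(timezone.utc)
        clock = time.monotonic()
        try:
            proc = subprocess.run(
                command,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,  # capture stderr alongside stdout
                text=True,
                timeout=timeout,  # applies to each attempt separately
            )
            exit_code, output = proc.returncode, proc.stdout
        except subprocess.TimeoutExpired as exc:
            # A timed-out attempt counts as a failure.
            exit_code, output = 124, exc.stdout or ""
            if isinstance(output, bytes):  # partial output may arrive as bytes
                output = output.decode(errors="replace")
        record = {
            "start": started.isoformat(),
            "end": datetime.now(timezone.utc).isoformat(),
            "exit_code": exit_code,
            "duration_s": round(time.monotonic() - clock, 3),
            "attempts": attempt,
            "output_tail": "\n".join(output.splitlines()[-tail_lines:]),
        }
        if exit_code == 0:
            break  # success: no further retries needed
    return record


# Stand-in for a real scheduled command such as /path/backup.sh.
record = run_wrapped([sys.executable, "-c", "print('backup ok')"], retries=2, timeout=60)
print(record["exit_code"], record["attempts"], record["output_tail"])
```

Note that the record always reflects the final attempt, so a command that fails on every retry is still stored as a failure rather than being hidden by the retry loop.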

When to use this

Whenever recurring tasks come up: "schedule this," "run it every night," "my cron job isn't running," "how would I even know if it failed," "add retries," "my backup didn't run and nothing warned me," "monitor my jobs." If the user is creating a schedule, set it up wrapped from the start. If they're debugging a schedule that misbehaved, status and check tell you what actually happened on the last run.

This is complementary to OpenClaw's own scheduler and to system cron - Sentinel doesn't replace what triggers the job, it makes whatever triggers it observable and self-reporting.

The tool

# Wrap a command (this is what cron actually runs):
python cron_sentinel.py wrap --name backup --expect-every 1d --retries 2 -- /path/backup.sh

# The watchdog (run this on its own short schedule):
python cron_sentinel.py check          # exits 1 if any job failed or is overdue
python cron_sentinel.py status         # human-readable table

# Generate the crontab lines for the user:
python cron_sentinel.py crontab --name backup --schedule "0 3 * * *" --expect-every 1d -- /path/backup.sh

The command to run always goes after --. --expect-every accepts human durations (30m, 12h, 1d, 1w) and is what makes silent-failure detection possible: it's how Sentinel knows a job should have run by now. State is stored in ~/.cron-sentinel/state.json (override with --state or $CRON_SENTINEL_STATE); all timestamps are UTC, so overdue calculations stay correct across time zones and DST changes.
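As an illustration of how durations like 30m or 1d might be interpreted, the sketch below converts them to timedelta values. The grammar is an assumption based only on the four examples above (a whole number followed by one of m, h, d, w); the tool itself may accept more forms, and parse_duration is a hypothetical name.

```python
import re
from datetime import timedelta

# Assumed unit letters, taken from the examples 30m, 12h, 1d, 1w.
_UNITS = {"m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def parse_duration(text):
    """Convert a string such as '30m' or '1d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhdw])", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


print(parse_duration("12h"))
```

Rejecting anything that does not match outright is deliberate in this sketch: a mistyped interval that was silently misread would undermine the overdue check that depends on it.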

The pattern to set up

The whole design is two scheduled entries:

  1. The wrapped job - the real task, run through wrap, on its normal schedule.
  2. The watchdog - check on a short schedule (e.g. every 30 min) that pipes its output to wherever the user gets notified.

The crontab subcommand prints both lines. Walk the user through pasting them into their crontab or, in OpenClaw, register the wrapped command as the scheduled task and check as a second, short-interval task whose output routes to their channel.

How to help

  1. Setting up a new schedule: ask for the command, how often it should run, and whether retries make sense (yes for anything network-dependent). Then produce the wrapped crontab line via crontab, and explain the watchdog line. Always set --expect-every - without it, silent failures can't be detected, which is the whole point.
  2. "Is my job still working?": run status and read back the last run time and health. If it shows overdue, that's your silent failure.
  3. "My job failed / isn't running": run check. A 💥 failed means it ran and errored - show the captured output tail. A 🔇 overdue means it never ran - the problem is upstream (the trigger, the machine, the path), not the command itself. That distinction saves a lot of wasted debugging.
  4. Wiring up alerts: the check exit code and output are designed to feed a notifier. In OpenClaw, schedule check and route its output to the user's channel so they only hear from it when something is actually wrong.
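One way to wire the watchdog to a notifier is sketched below: run check, and forward its output only when the exit code is non-zero. The helper name run_watchdog, the notify callback, and the relative path to cron_sentinel.py are assumptions; swap in whatever delivery mechanism (email, chat webhook, OpenClaw channel) the user actually has.

```python
import subprocess
import sys

# Assumed invocation; adjust the path to wherever cron_sentinel.py lives.
CHECK_COMMAND = [sys.executable, "cron_sentinel.py", "check"]


def run_watchdog(check_command, notify):
    """Run the watchdog and call notify(report) only if it exits non-zero."""
    proc = subprocess.run(
        check_command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    if proc.returncode != 0:
        notify(proc.stdout)  # something failed or is overdue
    return proc.returncode


if __name__ == "__main__":
    # print stands in for a real notifier here.
    sys.exit(run_watchdog(CHECK_COMMAND, notify=print))
```

Because the notifier is only called on a non-zero exit, a healthy check stays quiet, which matches the goal of hearing from it only when something is wrong.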

Honest interpretation

  • overdue uses a grace window (default 50% of the interval) so a job that's merely a little late doesn't cry wolf. Tune with --grace if a job's timing is naturally loose.
  • A check that reports all healthy is a real green light - say so plainly.
  • Retries help with transient failures (a flaky network call). They won't fix a broken command, and Sentinel still records the final failure - so don't let retries mask a job that's genuinely broken.
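The interpretation rules above can be sketched as a small classifier. This is an assumption-laden illustration, not Sentinel's actual logic: it measures the deadline from the last recorded start, adds the grace window as a fraction of the interval, and reports overdue ahead of failed when both apply. The function name health and its parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def health(last_start, last_exit_code, expect_every, now, grace=0.5):
    """Return 'overdue', 'failed', or 'ok' for a tracked job.

    Assumed rule: a job is overdue once `now` passes
    last_start + expect_every * (1 + grace).
    """
    deadline = last_start + expect_every * (1 + grace)
    if now > deadline:
        return "overdue"  # never ran when expected: look upstream
    if last_exit_code != 0:
        return "failed"  # ran and errored: look at the output tail
    return "ok"


last = datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc)
late_morning = datetime(2024, 1, 2, 10, 0, tzinfo=timezone.utc)
print(health(last, 0, timedelta(days=1), late_morning))
```

With a one-day interval and the default 50% grace, the deadline in this sketch falls 36 hours after the last start, so a run that is a few hours late is still reported as ok.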