Cron job monitoring

With Cronitor you can create and integrate a cron job monitor in minutes, no coding required. When you create a monitor with a rule like Not on Schedule */15 * * * 1-5, we will alert you to any deviation from the schedule, including:

  • The job does not start on time
  • The job runs longer than expected
  • The job runs outside the expected schedule
  • The job overlaps itself
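
As a reading aid for the example rule above, the following minimal sketch splits a five-field cron expression into its named fields. The function name describe_schedule is hypothetical and is not part of Cronitor; it only illustrates how an expression such as */15 * * * 1-5 (every 15 minutes, Monday through Friday) breaks down.

```shell
#!/bin/sh
# describe_schedule EXPR
# Hypothetical helper: split a five-field cron expression and label each field.
describe_schedule() {
    set -f        # disable globbing so "*" is not expanded to file names
    set -- $1     # split the expression on whitespace into $1..$5
    set +f
    if [ "$#" -ne 5 ]; then
        echo "expected 5 fields, got $#" >&2
        return 1
    fi
    printf 'minute=%s hour=%s day_of_month=%s month=%s day_of_week=%s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

describe_schedule "*/15 * * * 1-5"
# prints: minute=*/15 hour=* day_of_month=* month=* day_of_week=1-5
```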

Creating a Not on Schedule monitor

Monitoring your cron job is easy:

  1. From your dashboard, click the button to create a new monitor and select Cron job
  2. Paste the schedule expression from your crontab.
  3. If your server clock timezone is not set to UTC, select the appropriate timezone. A default timezone can be set from your Account Settings page.
  4. Add at least one notification method, such as Email, SMS, Slack, HipChat, or PagerDuty.
  5. Give the monitor a descriptive name like Offsite Database Backup and add optional tags or notes
  6. Save the monitor and add the unique ping URL provided to your crontab. See the integration guide for details.

Your new monitor is created in a paused state; monitoring will not begin until the first ping is received.

Expected duration alerts

It's great to know that your cron job started, but it's really valuable to know that it completed successfully. To alert you as quickly as possible when that's not the case, we've developed a duration prediction algorithm. The predictions grow more accurate as we learn more about your job's performance, but they will not work perfectly for every job. If it's normal for job runtime to vary significantly, or if you are receiving "has not completed" alerts earlier than you would like, you can provide a fixed runtime duration. Keep in mind:

  • With less data available on a new monitor, we use wider grace periods, and we will not send "has not completed" alerts until the job has completed successfully once.
  • API users: Specify a fixed runtime duration by adding a custom ran_longer_than rule along with the not_on_schedule rule.
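
To make the combination of rules concrete, here is a rough sketch that only prints a JSON body and sends nothing. The rule names not_on_schedule and ran_longer_than and the grace_seconds setting come from this page; the surrounding field names (rules, rule_type, value, time_unit) and the function name build_rules_payload are assumptions made for illustration and should be checked against the API reference before use.

```shell
#!/bin/sh
# build_rules_payload SCHEDULE GRACE_SECONDS MAX_MINUTES
# Hypothetical helper: print a JSON body pairing a schedule rule with a
# fixed-duration rule. Field names other than the rule names and
# grace_seconds are assumed, not confirmed by this page.
build_rules_payload() {
    cat <<EOF
{
  "rules": [
    {"rule_type": "not_on_schedule", "value": "$1", "grace_seconds": $2},
    {"rule_type": "ran_longer_than", "value": $3, "time_unit": "minutes"}
  ]
}
EOF
}

# Example: every 15 minutes on weekdays, 120 s of grace, alert after 30 minutes.
build_rules_payload "*/15 * * * 1-5" 120 30
```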

To send duration alerts, Cronitor must know when your job starts running and when it completes successfully. To report these events, ping your /run endpoint as your job starts and your /complete endpoint as it exits successfully. For more details on ping URL endpoints and where to integrate your ping requests, see the integration guide.
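
The /run and /complete reporting described above can be sketched as a small wrapper function. This is a minimal illustration, not an official integration: the function names ping_cronitor and run_monitored are hypothetical, and the base URL reuses the example monitor code d3x0 that appears in the crontab examples on this page.

```shell
#!/bin/sh
# Base ping URL; d3x0 is the example monitor code used on this page.
PING_BASE="${PING_BASE:-https://cronitor.link/d3x0}"

# ping_cronitor ENDPOINT
# Send one ping. Errors are swallowed so a network problem never fails the job.
ping_cronitor() {
    curl -m 10 -fsS "$PING_BASE/$1" >/dev/null 2>&1 || true
}

# run_monitored COMMAND [ARGS...]
# Ping /run, execute the command, and ping /complete only on exit code 0.
# The command's own exit code is returned unchanged.
run_monitored() {
    ping_cronitor run
    "$@"
    exit_code=$?
    if [ "$exit_code" -eq 0 ]; then
        ping_cronitor complete
    fi
    return "$exit_code"
}
```

A script that defines these functions could then end with a line such as run_monitored /path/to/db_backup.sh, so that the crontab entry only needs to call the script.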

Overlap alerts

A cron job that starts before its previous run has completed is a common problem, especially for jobs that run frequently. A few seconds of overlap may be harmless, but a pile-up of jobs can take down your server. Cronitor will alert you if your job does not terminate before running again.
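
The idea of an overlap can be illustrated with a small sketch that walks through an ordered list of ping events. This is not Cronitor's actual detection logic, only an assumption-based illustration; the function name detect_overlap is hypothetical.

```shell
#!/bin/sh
# detect_overlap EVENT...
# Hypothetical illustration: given ping events in arrival order ("run" or
# "complete"), print "overlap" if a run starts while the previous run has
# not yet reported completion, otherwise print "ok".
detect_overlap() {
    running=0
    for event in "$@"; do
        case "$event" in
            run)
                if [ "$running" -eq 1 ]; then
                    echo "overlap"
                    return 0
                fi
                running=1
                ;;
            complete)
                running=0
                ;;
        esac
    done
    echo "ok"
}

detect_overlap run complete run complete   # prints: ok
detect_overlap run run complete            # prints: overlap
```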

Troubleshooting Not on Schedule Rules

  • If your jobs do not start exactly at the scheduled time for any reason, you can provide a longer grace period through our API by setting grace_seconds on your not_on_schedule rule.
  • If you receive unexpected alerts for a new monitor, review your ping history from the dashboard to ensure it matches your expectations. If there are missing pings, double check your integration. Verify that you are pinging the /run endpoint before your job and your /complete endpoint after. A missing or misplaced /run or /complete ping is the most common integration mistake.
  • Use the date command on your Linux server, or tzutil /g on Windows, to verify that your server's timezone matches the timezone set in your account settings.
  • If you've verified that you're pinging the /run and /complete endpoints appropriately, try changing the separator in your commands from && (run the next command only if the previous one succeeded) to ; (run the next command regardless of the result).

    Switch from only pinging /complete if your command finishes successfully:

    * */2 * * * curl -m 10 https://cronitor.link/d3x0/run ; /path/to/db_backup.sh && curl -m 10 https://cronitor.link/d3x0/complete

    To always pinging your /complete URL:

    * */2 * * * curl -m 10 https://cronitor.link/d3x0/run ; /path/to/db_backup.sh ; curl -m 10 https://cronitor.link/d3x0/complete

    By changing to ; you will ping your Cronitor URL even if your command exits with a non-zero exit code. If this works, you should investigate why your command is exiting with an error code.
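
When investigating a non-zero exit code as suggested above, it can help to record the exit status of each run. The following is a minimal sketch under stated assumptions: the function name run_and_log_status and the log file path in the usage note are hypothetical and are not part of Cronitor.

```shell
#!/bin/sh
# run_and_log_status LOGFILE COMMAND [ARGS...]
# Hypothetical helper: run a command, append a UTC timestamp, its exit
# status, and the command line to LOGFILE, then return the same status.
run_and_log_status() {
    logfile=$1
    shift
    "$@"
    exit_code=$?
    printf '%s exit_status=%s command=%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$exit_code" "$*" >> "$logfile"
    return "$exit_code"
}
```

For example, calling run_and_log_status /tmp/db_backup_status.log /path/to/db_backup.sh from a wrapper script leaves one line per run that shows whether the job exited with 0 or with an error code.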