Cron Job Time Tracking

Cronitor will alert you if your job runs longer than expected. These alerts provide value beyond simple pass/fail alerts, for example:

  • System resource planning. As you accumulate new scheduled jobs, it can be difficult to predict if and when running jobs will overlap. With only a few jobs, you're able to schedule them far enough apart that one should be done before the next starts. But as time goes on and the performance characteristics of your jobs change, cron scheduling becomes a moving target.
  • Monitoring performance creep. It's an issue many people can relate to. You've configured a cron job that runs as quickly as expected, only to see that performance deteriorate over time as your database gets bigger or your customer list grows longer. Often this happens unnoticed, out of sight, until your once-per-hour job starts taking more than an hour to run.

How duration alerts work

To send duration alerts, Cronitor must know when your job starts running and when it completes successfully. To report these events, you should ping your /run endpoint as your job starts and your /complete endpoint when it completes. For more details on ping URL endpoints and where you should integrate your ping requests, see:

With /run and /complete pings, Cronitor is able to calculate an expected duration range for your job. The predictions grow more accurate as we learn more about your job performance but will not work perfectly for every job. If it's normal for job runtime to vary significantly, or if you are receiving "has not completed" alerts earlier than you would like, you can provide a fixed runtime duration:

Understanding When Alerts are Sent

When using Cron Job monitoring, or if you define a "ran longer than" heartbeat rule, we start the timer when you ping your /run URL and send the alert once your defined threshold is reached. For example, if you define a rule with a 5-minute duration and your job pings /run at 06:00, then dies due to a syntax error in your code, your alert will be triggered after 06:05. We usually send alerts within 1 minute of failure.

A "ran less than" heartbeat rule is available. This rule is not evaluated until we receive a /run ping and subsequent /complete ping. In most cases your rule is evaluated within 1 minute of your /complete ping.
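The timing of the two rule types can be sketched with a little shell arithmetic. This is only an illustration of the behavior described above, not Cronitor's implementation: the function names longer_than_due_at and ran_less_than are hypothetical, and times are written as seconds since midnight to keep the example portable.

```shell
# Hypothetical helper: when a "ran longer than" alert becomes due.
# The timer starts at the /run ping and does not wait for /complete.
longer_than_due_at() {
  started=$1       # time of the /run ping, in seconds
  limit_min=$2     # rule threshold, in minutes
  echo $(( started + limit_min * 60 ))
}

# Hypothetical helper: a "ran less than" rule needs both pings.
# Exits 0 when the job finished faster than the minimum duration.
ran_less_than() {
  started=$1       # time of the /run ping, in seconds
  finished=$2      # time of the /complete ping, in seconds
  min_min=$3       # rule minimum, in minutes
  [ $(( finished - started )) -lt $(( min_min * 60 )) ]
}

# Example from the text: /run at 06:00 with a 5 minute rule.
run_at=21600       # 06:00 as seconds since midnight
due=$(longer_than_due_at "$run_at" 5)
printf 'alert due at %02d:%02d\n' $(( due / 3600 )) $(( due % 3600 / 60 ))
# prints: alert due at 06:05

# A job that completes 30 seconds after /run violates a 1 minute minimum.
if ran_less_than "$run_at" $(( run_at + 30 )) 1; then
  echo "job finished faster than expected"
fi
```

Note that the first helper needs only the /run time, while the second cannot be evaluated until a /complete time is available, which mirrors the difference between the two rules.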

Troubleshooting Duration Based Rules

  • If you receive unexpected alerts for a new monitor, review your ping history from the dashboard to ensure it matches your expectations. If there are missing pings, double check your integration. Verify that you are pinging the /run endpoint before your job and your /complete endpoint after. A missing or misplaced /run or /complete ping is the most common integration mistake.
  • If you've verified that you're pinging /run and /complete appropriately, try changing the separator in your commands from && (run the next command only if the previous one succeeded) to ; (run the next command regardless of the previous exit code).

    Switch from only pinging /complete if your command finishes successfully:

    * */2 * * * curl -m 10 ; /path/to/ && curl -m 10

    To always pinging /complete:

    * */2 * * * curl -m 10 ; /path/to/ ; curl -m 10

    By changing to ; you will ping your Cronitor URL even if your command exits with a non-zero exit code. If this works, you should investigate why your command is exiting with an error code.
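The difference between the two separators can be reproduced locally without sending any pings. In the sketch below, ping_endpoint is a hypothetical stand-in for the curl call that only records which endpoint would have been hit, and failing_job is a made-up job that exits with a non-zero code.

```shell
# Hypothetical stand-in for the curl ping: records the endpoint, sends nothing.
pings=""
ping_endpoint() { pings="$pings $1"; }

# Made-up job that fails with a non-zero exit code.
failing_job() { return 3; }

# "&&" form: /complete is skipped because the job failed.
pings=""
ping_endpoint /run ; failing_job && ping_endpoint /complete
and_form=$pings

# ";" form: /complete is sent regardless of the job's exit code.
pings=""
ping_endpoint /run ; failing_job ; ping_endpoint /complete
seq_form=$pings

echo "with &&:$and_form"   # prints: with &&: /run
echo "with ; :$seq_form"   # prints: with ; : /run /complete
```

If your real job behaves like the first case, the /complete ping never goes out, so the job looks as if it never finished even though it ran and exited.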