Heartbeat Monitoring Guide

Heartbeat monitoring is a time-tested method of tracking the health of a device or software system by sending regular heartbeat events to a remote monitoring service. Alerts are triggered when:

  • The heartbeats stop being sent.
  • The rhythm (schedule) of the heartbeats changes.
  • Data included with the heartbeat event fails user-defined assertions.
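
The three alert conditions above can be illustrated with a small monitor-side sketch. This is not Cronitor's implementation; the class name, the grace period, and the timestamps in seconds are all hypothetical, chosen only to show how missing heartbeats, a changed rhythm, and a failed assertion differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HeartbeatCheck:
    """Hypothetical monitor-side state for a single heartbeat monitor."""
    interval_seconds: float            # expected gap between heartbeats
    grace_seconds: float               # tolerated drift before alerting (assumed)
    max_error_count: int               # assertion: error_count must stay below this
    last_seen: Optional[float] = None  # timestamp of the most recent heartbeat

    def record(self, now: float, error_count: int = 0) -> List[str]:
        """Record a heartbeat and return the alerts it triggers, if any."""
        alerts = []
        if self.last_seen is not None:
            gap = now - self.last_seen
            # Rhythm change: the heartbeat arrived well off the expected schedule.
            if abs(gap - self.interval_seconds) > self.grace_seconds:
                alerts.append("schedule changed")
        # Assertion on data sent with the event (here: error_count < max_error_count).
        if error_count >= self.max_error_count:
            alerts.append("assertion failed")
        self.last_seen = now
        return alerts

    def is_missing(self, now: float) -> bool:
        """Return True when no heartbeat arrived within interval + grace."""
        if self.last_seen is None:
            return False  # no first heartbeat yet, so nothing to miss
        return now - self.last_seen > self.interval_seconds + self.grace_seconds
```

With a 60-second interval and a 10-second grace period, a heartbeat that arrives 65 seconds after the previous one passes the rhythm check, while one that arrives 5 seconds after the previous one does not.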

Creating Heartbeat Monitors

Heartbeat monitors can be configured in a variety of ways. A complete list of monitor attributes can be found in the Monitor API docs.

# the unique identifier (key)
airflow-heartbeat:
    # configure monitor as a heartbeat
    type: 'heartbeat'
    # when events will be sent to Cronitor (also supports cron and HH:MM 24hr time expressions)
    schedule: 'every 60 seconds'
    # assertions evaluated against data sent along with the heartbeat event
    assertions:
        - 'metric.error_count < 3'
    # who to notify when Cronitor detects a problem
    notify:
        - 'devops-alerts'

Cronitor's open-source SDKs support managing the configuration of all your monitors from a single YAML configuration file, which can be synced with Cronitor as part of your build or deploy process.

The following Python example is representative of how this works across our SDKs.

import cronitor
cronitor.api_key = '{{api_key}}'
cronitor.config = '/path/to/cronitor.yaml'
cronitor.apply_config() # send monitor configuration to Cronitor

Sending Heartbeat Events

Once a monitor has been created, it will remain in an inactive state until it receives its first heartbeat event. Events are sent using Cronitor's Telemetry API.

A basic heartbeat is any GET, POST or HEAD request to the monitor's telemetry URL:

curl https://cronitor.link/p/{{api_key}}/airflow-heartbeat
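
The same request can be sent from Python using only the standard library. The sketch below is a minimal, unofficial equivalent of the curl command: the function names are hypothetical, and the `metric=error_count:<n>` query parameter is an assumption about how metric data is attached to an event, so confirm the exact format in the Telemetry API docs before relying on it.

```python
import urllib.request
from urllib.parse import quote, urlencode

TELEMETRY_BASE = "https://cronitor.link/p"  # base URL taken from the curl example


def heartbeat_url(api_key, monitor_key, error_count=None):
    """Build the telemetry URL for one heartbeat event."""
    path = "/".join(quote(part, safe="") for part in (api_key, monitor_key))
    url = f"{TELEMETRY_BASE}/{path}"
    if error_count is not None:
        # Assumed parameter format: metric=error_count:<n> (URL-encoded below).
        url += "?" + urlencode({"metric": f"error_count:{error_count}"})
    return url


def send_heartbeat(api_key, monitor_key, error_count=None, timeout=10):
    """Send a GET heartbeat; return True on a 2xx response, False otherwise."""
    url = heartbeat_url(api_key, monitor_key, error_count)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers connection errors, timeouts, and HTTP error responses.
        return False


# Example call (performs a real network request):
# send_heartbeat("<your api key>", "airflow-heartbeat", error_count=0)
```

Returning False instead of raising keeps a failed ping from crashing the job being monitored; whether that trade-off suits your system is a design choice.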

Alert Settings

Cronitor can send alerts via many different channels, including email, Slack, PagerDuty, Microsoft Teams, SMS, webhooks, and more.

Notification lists are configurable sets of channels that are attached to your monitors. When a monitor fails or recovers, an alert will be sent to every channel/recipient on the attached list. You can create or modify a list from the alert settings page.

Tip: When your Cronitor account is created, a list named "Default" is automatically created and your email address is added to it. When no list is specified, the default list is used.

FAQs

  • If you have enabled Telemetry Events API authentication on your account settings page, all telemetry ping requests will require your Telemetry Events API key as a parameter. If you're using CronitorCLI, save your Telemetry Events API key to your configuration file using cronitor configure --ping-api-key <KEY> or set the CRONITOR_PING_API_KEY environment variable. If you're sending pings using curl or a native HTTP client, provide the key via an auth_key query parameter.
  • A ten-second timeout is used in our examples, but we do not expect pings to take nearly that long. A lost ping will often mean a false alarm, so we use a long timeout to avoid that. It's not a magic number and certainly should be tweaked if your use case demands it. Our goal is a 90th-percentile ping time of 1.0 second.
  • We recommend using HTTPS to keep your ping URL private and server traffic encrypted, but you may also use HTTP. If you are troubleshooting SSL errors from your Cronitor ping, the cause is commonly a timeout issue unrelated to SSL. Note: unencrypted HTTP pings must be made using the cronitor.link hostname. Ping requests to cronitor.io are deprecated and will redirect any unencrypted traffic.
  • Your monitor is not initialized, and alerts will not be sent, until your first ping.
  • Ping endpoints support both HTTP GET and POST, though any POST body is ignored.
  • If brevity is more important to you than clarity, you can abbreviate endpoints to their first initial: /r, /c and /f.
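
When Telemetry Events API authentication is enabled, a native HTTP client needs the auth_key parameter added to every ping URL. The helper below is a hedged sketch of one way to do that without breaking a URL that already carries a query string; the function name is hypothetical, and only the auth_key parameter name comes from the first FAQ item above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_auth_key(ping_url, auth_key):
    """Return ping_url with auth_key set, preserving other query parameters."""
    parts = urlsplit(ping_url)
    # Drop any existing auth_key so the parameter is never duplicated.
    query = [(name, value)
             for name, value in parse_qsl(parts.query, keep_blank_values=True)
             if name != "auth_key"]
    query.append(("auth_key", auth_key))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Pass the resulting URL to whichever HTTP client you use, keeping a generous timeout such as the ten seconds discussed above.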