Monitor MySQL Slave Lag

Configuring a MySQL slave is a relatively straightforward process. However, one often-overlooked detail is monitoring the replication lag time, that is, how far behind the master database the slave is. In this tutorial we will use Cronitor's heartbeat monitoring (which provides a simple mechanism to report ad-hoc failures of any type of service) to alert us when the lag between the MySQL slave and master passes a pre-defined threshold, or when replication halts.

What causes MySQL Slave Lag?

Lag time in MySQL replication can be caused by a variety of issues, and can often be very difficult to debug. MySQL replication is single threaded, and it locks tables just like any other query execution. Thus, long-running queries and queries that set many locks during execution are among the most common causes of slave lag. The Percona team has written an excellent set of articles that explain the topic in more depth:

Bash script to alert when MySQL replication lags

The script below will report failure if the MySQL slave either stops running or falls more than a minute behind the master in replication. Your monitor can be configured to alert you to failures in a variety of ways, including email, SMS, and third-party integrations.
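As a rough sketch of such a check (not necessarily the exact script this tutorial describes), the Bash below reads `Slave_IO_Running`, `Slave_SQL_Running`, and `Seconds_Behind_Master` from `SHOW SLAVE STATUS` and reports the result to a monitor. Several things here are assumptions rather than details from the article: the `CRONITOR_PING_URL` environment variable, the `state` and `message` query parameters (check your monitor's setup page for the exact ping URL format), the helper names `status_field` and `replication_state`, and a `mysql` client that can log in without prompting, for example via `~/.my.cnf`.

```shell
#!/usr/bin/env bash
# Sketch: report MySQL replication health to a heartbeat monitor.
# Assumptions (not from the article): CRONITOR_PING_URL holds your monitor's
# ping URL, it accepts `state` and `message` query parameters, and the mysql
# client can log in without prompting (for example via ~/.my.cnf).

MAX_LAG_SECONDS="${MAX_LAG_SECONDS:-60}"   # "more than a minute behind"

# Print the value of one field from `SHOW SLAVE STATUS\G` output.
#   $1 = field name, $2 = full status text
status_field() {
  printf '%s\n' "$2" |
    awk -F': ' -v key="$1" '{ sub(/^[ \t]+/, "", $1) } $1 == key { print $2; exit }'
}

# Print "ok" or "fail: <reason>" for the given status text.
#   $1 = full status text, $2 = maximum allowed lag in seconds
replication_state() {
  local io sql lag
  io=$(status_field Slave_IO_Running "$1")
  sql=$(status_field Slave_SQL_Running "$1")
  lag=$(status_field Seconds_Behind_Master "$1")

  if [[ "$io" != "Yes" || "$sql" != "Yes" ]]; then
    echo "fail: replication is not running"
  elif ! [[ "$lag" =~ ^[0-9]+$ ]]; then
    echo "fail: lag is unknown"
  elif (( lag > $2 )); then
    echo "fail: slave is ${lag}s behind master"
  else
    echo "ok"
  fi
}

main() {
  local status result state
  status=$(mysql -e 'SHOW SLAVE STATUS\G' 2>/dev/null)
  result=$(replication_state "$status" "$MAX_LAG_SECONDS")
  if [[ "$result" == "ok" ]]; then state="complete"; else state="fail"; fi
  # Hypothetical query parameters; adjust to your monitor's ping URL format.
  curl -fsS -m 10 -G "$CRONITOR_PING_URL" \
    --data-urlencode "state=$state" \
    --data-urlencode "message=$result" >/dev/null
}

# Only contact MySQL and the monitor when a ping URL has been configured.
if [[ -n "${CRONITOR_PING_URL:-}" ]]; then
  main
fi
```

Keeping the decision logic in `replication_state` separate from the `mysql` and `curl` calls makes it possible to try the check against saved `SHOW SLAVE STATUS` output before pointing it at a live server. An empty result (for example, when the server is not configured as a slave or the `mysql` call fails) is treated as a failure.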

The best way to run this script is as a cron job on the server that your slave is running on. The frequency at which it runs is entirely up to you; we run ours every 10 minutes.
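For illustration, a crontab entry for a 10-minute schedule could look like the fragment below. The script path and the URL are placeholders, and the `CRONITOR_PING_URL` variable is an assumption of this sketch rather than something the monitor requires; substitute the values from your own setup.

```
# Hypothetical ping URL and script path; replace both with your own values.
CRONITOR_PING_URL=https://example.com/your-monitor-ping-url
*/10 * * * * /usr/local/bin/check-replication-lag.sh
```

Whatever interval you choose, the monitor's expected heartbeat interval should be set to match or slightly exceed it, so that a missed run is also noticed.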