If you have MySQL replication running in your environment, you very likely have encountered this error before.
Why does this happen?
Before I explain that, we have to understand exactly how MySQL Master<>Slave replication works:
Two threads are created on the SLAVE to the MASTER.
Armed with this information, you now know WHY this error happens. But thats just half the battle. You have to check the relevant logs to see which event/transaction is
attempting to be executed. The logs will tell you if the next transaction is one that has already been commited to the Slave. If that is the case, it's safe to SKIP the next transaction.
Now lets look at the contents of the relevant files:
The master.info file looks like the following:
18 - number of lines in the file
mysql-bin.000001 - name of the current master binary log being read from the master
589 - current position within the master binary log that has been read by the master.
The relay-log.info file looks like this:
/var/log/mariadb/mysql-relay-bin.000002 - filename of current relay log
873 - current position within the relay log, events up to this position have been applied to the slave DB
mysql-bin.000001 - name of the master binary log, which is read by relay log
589 - the position within the master binary log that have already been applied on the slave.
[root@server1 mysql]# cat relay-log.info
Using the above example, we see that mysql-relay-bin.000004 is the file containing the events/tranactions. So next, we view the contents of that file:
We can see that there is an INSERT statement executed on the datav2 table. The next logical step would be to login to the slave DB, and verify if this statement has already been exectuted.
mysql> DESC datav2; (to find the structure of the table, so we have something to query.)
Lets assume the table consistes of a timestamp, and and an integer variable.
Next, we'd query the DB/Table to see if a duplicate entry exists.
mysql> SELECT * from datav2 where timesteamp like '2017-02-08 12:00:01';
If this query returns a result, and the integer following the timestamp is '66' then you have your reason for the duplicate key error.
The mysql-relay-bin.000004 log is trying to INSERT duplicate data into the Slave.
You can safely SKIP this duplicate entry with the following command:
mysql> SET GLOBAL SQL_SKIP_COUNTER=1;
Now try to restart the slave with the START SLAVE; command.
If there are more than one events/transactions that have been duplicated, the error will re-occur when trying start the slave.
There are those that say 'NEVER SKIP! YOU'LL LOSE DATA!' and you may, but if you follow this guide, you will know if that data has already been commited to the DB. Skipping a transaction, that would simply write a record that already exists in the DB will not cause any data loss.
If have any any questions, or comments, feel free to ping me!