from master when reading data from binary log: 'unknown error reading log event on the master; the first event 'binlog.004215' at 8610479, the last event read from './binlog.004226' at 154, the last byte read from './binlog.004226' at 154.‘”. We can see this error by “show slave status”. - When you execute “START SLAVE”, position slightly advances and replication stops with the same error again. - Recovery by executing “START SLAVE” on the slaves after binary log rotation on the master. - No HW error, Not disk full. Abstract of the issue
Request binary log The master referenced MYSQL_BIN_LOG::binlog_end_pos for checking sendable position of the latest binary log to the slave. Send binary log event MYSQL_BIN_LOG ::binlog_end_pos
of SYNC_BINLOG=1, end position of the binlog is held by THD::m_trans_end_pos on FLUSH stage, then it’s copied to MYSQL_BIN_LOG::binlog_end_pos on SYNC stage. COMMIT stage THD:: m_trans_end_pos SYNC stage MYSQL_BIN_LOG ::binlog_end_pos
(Physical writes are not guaranteed on this stage) - SYNC stage Write the binlog events physically. - COMMIT stage Finalize of the COMMIT on each storage engine. (Transactions have PREPARE status before this stage) Each stage can work independently like a transaction is on COMMIT stage, next transaction is on SYNC stage, another transaction is on FLUSH stage. The abstract of stages at COMMIT
a binlog rotation is occurred before updating MYSQL_BIN_LOG::binlog_end_pos on SYNC stage, this issue occurs. COMMIT stage THD:: m_trans_end_pos SYNC stage MYSQL_BIN_LOG ::binlog_end_pos Rotate a binlog Caused by Previous commits at COMMIT stage Or Flush (binary) logs
wrong. So, replication can restart by command, read and send binarylog event if it is written after previous error position, then stop by causing same error. - When the binlog is rotated again, MYSQL_BIN_LOG::binlog_end_pos is resetted for new binlog file. If this time no transaction is on SYNC stage, the problem is gone. - If transaction has update tables that supports XA transaction, MYSQL_BIN_LOG::m_prep_xids is incremented on FLUSH stage. It makes binlog rotation waiting when it is decremented on COMMIT stage. So this case does not cause this issue. Behaviors after causing the issue
etc. Amazon RDS? Aurora? No source code available. MariaDB doesn’t have this issue) - SYNC_BINLOG=1 - Outputting binary logs The binlog is rotated at the same time of the following actions. (It includes rotation by FLUSH command) - Updating to tables that does not support XA transaction like MyISAM and MEMORY. - Executing DDL. (Except atomic DDL at MySQL 8.0) (“CREATE TABLE … SELECT …” does not include atomic DDL) - Executing commands like ANALYZE TABLE. The detail condition of causing the issue
TABLE, FLUSH TABLES, FLUSH PRIVILEGES, FLUSH ENGINE LOGS, FLUSH ERROR LOGS, FLUSH GENERAL LOGS, FLUSH HOSTS, FLUSH OPTIMIZER_COSTS, FLUSH RELAY LOGS, FLUSH SLOW LOGS, FLUSH SLOW LOGS, FLUSH STATUS, FLUSH USER_RESOURCES, ALTER INSTANCE, CREATE TABLE ... SELECT ... MySQL 5.7 & MySQL 8.0 (Tables that does not support atomic DDL) CREATE TABLE, DROP TABLE, ALTER TABLE, RENAME TABLE, TRUNCATE, CREATE INDEX, DROP INDEX MySQL 5.7 & MySQL 8.0 (Tables that does not support XA transaction) INSERT, UPDATE, DELETE, REPLACE, LOAD DATA The statements of causing the issue (No.1)
SET PASSWORD, GRANT, REVOKE (ALL), CREATE DATABASE, DROP DATABASE, ALTER DATABASE, CREATE VIEW, DROP VIEW, CREATE TABLESPACE, DROP TABLESPACE, ALTER TABLESPACE, CREATE FUNCTION, DROP FUNCTION, ALTER FUNCTION, CREATE PROCEDURE, DROP PROCEDURE, ALTER PROCEDURE, CREATE TRIGGER, DROP TRIGGER, CREATE EVENT, DROP EVENT, ALTER EVENT, FLUSH DES_KEY_FILE, FLUSH QUERY CACHE The statements of causing the issue (No.2)
it is possible to do other approaches) 1. Execute “FLUSH BINARY LOGS” command on the master for rotating the binlog, then execute “START SLAVE (IO_THREAD)” on the slaves. If you get same error on the slave, re-execute “FLUSH BINARY LOGS” and “START SLAVE”. 2. Keep executing “START SLAVE (IO_THREAD)” until there are no more errors. Approach of recovering or avoiding from the issue
it is possible to do other approaches) 3. Change “SYNC_BINLOG” from 1. Requires to take care of data inconsistency in case of master failure. 4. If you know in advance that problems do not occur during normal operation, execute “FLUSH BINARY LOGS” before executing DDL etc in maintenance for reducing the possibility of binary log rotation during DDL execution. Approach of recovering or avoiding from the issue