MDEV-9501: rpl.rpl_binlog_index, rpl.rpl_gtid_crash, rpl.rpl_stm_multi_query...
MDEV-9501: rpl.rpl_binlog_index, rpl.rpl_gtid_crash, rpl.rpl_stm_multi_query fail sporadically in buildbot with Master command COM_REGISTER_SLAVE failed Analysis: ======== Slave server will send COM_REGISTER_SLAVE command at the time of establishing a connection to master. If master is down, then the command will fail and COM_REGISTER_SLAVE failed warning is reported. 'rpl_binlog_index.test' shutsdown the master and it relocates binary logs to a new location and attempts to start master by pointing 'log-bin' to new location. During this process the slave threads are active. IO thread actively checks for the presence of master when it finds that the connection is lost it attempts a reconnect, as master is down COM_REGISTER_SLAVE command fails. As part of fix, stop the slave threads and then shutdown the master and do the binlog relocation. Once master is restarted start the slave threads and sync them with the master. In test binary logs and index files on master are relocated to /tmpdir but during master restart only --log-bin option is provided, this is incorrect. Even --log-bin-index also should be pointed to /tmpdir otherwise upon master server restart two index files will be created. One master-bin.index in /tmpdir and a new master-bin.index as per log_basename in datadir. Due to this slave will fail to connect to master. 'rpl_gtid_crash.test' tests following scenario "crashing master, causing slave IO thread to reconnect while SQL thread is running". When IO thread tries to connect to crashed master on slow platforms COM_REGISTER_SLAVE command fails. This is expected hence the warning should be added to suppression list.
Showing
Please register or sign in to comment