    MDEV-33500 (part 2): rpl.rpl_parallel_sbm can still fail · 68938d2b
    Brandon Nesterenko authored
    The failing test case validates Seconds_Behind_Master for a delayed
    slave when STOP SLAVE is executed during a delay. The initial fix
    (commit b04c8575) added a table lock to ensure a transaction could
    not finish before the Seconds_Behind_Master field is validated after
    START SLAVE. It did not address the possibility that the transaction
    finishes before the STOP SLAVE command runs, which invalidates the
    validations for the rest of the test case. Specifically, this would
    result in 1) a timeout waiting for the “Waiting for table metadata
    lock” state on the replica, because the test expects the transaction
    to retry after slave restart and hit a lock conflict on the locked
    tables (added in b04c8575), and 2) a failed check that
    Seconds_Behind_Master increased, because it did not.
    
    The failure can be reproduced by synchronizing the slave to the master
    before the MDEV-32265 echo statement (i.e. before the STOP SLAVE).
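
    A minimal sketch of that reproduction, assuming the test uses the
    conventional "master" and "slave" connection names; the exact
    placement within the test file is not shown here:

    ```
    # Force the delayed transaction to be fully applied on the slave
    # before the test reaches STOP SLAVE.
    --connection master
    --sync_slave_with_master
    # The existing MDEV-32265 --echo line and STOP SLAVE follow here.
    ```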
    
    This patch fixes the test by adding a mechanism that uses DEBUG_SYNC
    to synchronize a MASTER_DELAY, rather than continually increasing the
    duration of the delay each time the test fails on buildbot. This
    ensures that on slow machines, a delay does not pass before the
    test gets a chance to validate results. Additionally, it decreases
    overall test time because the test can continue immediately after
    validation, thereby bypassing the remainder of a full delay for each
    transaction.
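
    The general pattern can be sketched as follows. The debug keyword
    and the signal names are hypothetical placeholders, not the names
    introduced by this patch; only the SET DEBUG_SYNC and debug_dbug
    syntax and the include files are standard:

    ```
    --source include/have_debug.inc
    --source include/have_debug_sync.inc

    --connection slave
    --source include/stop_slave.inc
    # Hypothetical keyword: makes the SQL thread wait on DEBUG_SYNC
    # signals instead of sleeping for the MASTER_DELAY duration.
    SET @@GLOBAL.debug_dbug= "+d,hypothetical_delay_by_debug_sync";
    --source include/start_slave.inc

    # ... run a transaction on the master so the slave enters a delay ...

    # Block until the SQL thread reports that it has entered the delay.
    SET DEBUG_SYNC= "now WAIT_FOR hypothetical_at_delay";

    # ... validate Seconds_Behind_Master while the delay is held ...

    # Release the SQL thread at once rather than waiting out the delay.
    SET DEBUG_SYNC= "now SIGNAL hypothetical_continue";
    ```

    Because the test, not a timer, decides when the delay ends, the
    validation cannot be outrun on a slow machine, and no time is spent
    sleeping once the checks are done.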
rpl_parallel_sbm.test 9.82 KB