    Yet another attempt at fixing random failures in test case main.myisam-metadata · 8e63a7fe
    Kristian Nielsen authored
    I think I finally found the problem. I managed to reproduce it locally by
    adding a sleep to the test case, simulating the particular race condition
    that causes the test to fail often in Buildbot.
    
    The test starts an ALTER TABLE that does repair by sort in one thread, then
    another thread waits for the sort to be visible in SHOW PROCESSLIST and runs a
    SHOW statement in parallel.
    
    The problem happens when the sort manages to run to completion before the
    other thread has had time to look at SHOW PROCESSLIST. In this case, the wait
    times out because the state it is looking for has already come and gone.
    
    Earlier I added some DEBUG_SYNC points to prevent this race, but it turns out
    that DEBUG_SYNC itself changes the state in the processlist. When the debug
    sync point was hit, the processlist showed the debug sync wait state instead
    of the expected one, so the wait would still time out.
    
    Fixed now by looking for the processlist to contain either the "Repair by
    sorting" state or the debug sync wait stage.
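
    As a rough illustration only (a Python model, not the actual mysqltest
    code), the change to the wait can be sketched as below. The helper names
    and the SYNC_WAIT label are hypothetical placeholders; the real state text
    shown by the server is not reproduced here.

```python
# Minimal model of the polling wait, assuming the worker thread is parked
# at a debug sync point and therefore never shows "Repair by sorting".

REPAIR = "Repair by sorting"
SYNC_WAIT = "debug sync wait"  # placeholder label, not the server's exact text


def make_poller(states):
    """Return a function that yields each state once, then repeats the last."""
    it = iter(states)
    last = [None]

    def poll():
        last[0] = next(it, last[0])
        return last[0]

    return poll


def wait_for_state(poll, accepted, max_polls):
    """Poll up to max_polls times; True if an accepted state is observed."""
    for _ in range(max_polls):
        if poll() in accepted:
            return True
    return False  # corresponds to the wait timing out


# Old wait: only accepts the sort state, so it times out at the sync point.
old_wait_ok = wait_for_state(make_poller([SYNC_WAIT]), {REPAIR}, 10)

# New wait: accepts either state, so it succeeds in both orderings.
new_wait_ok = wait_for_state(make_poller([SYNC_WAIT]), {REPAIR, SYNC_WAIT}, 10)
```

    Accepting either state means the wait no longer depends on which of the
    two the processlist happens to show when it is polled.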
    
    Also clean up previous attempts to fix it. Set the wait timeout back to a
    reasonable 60 seconds, and simplify the DEBUG_SYNC operations to work closer
    to how the original test case was intended.
myisam-metadata.test 1.4 KB