    Bug#56096: STOP SLAVE hangs if executed in parallel with user sleep · 7d64c364
    Davi Arnaut authored
    The root of the problem is that to interrupt a slave SQL thread
    wait, the STOP SLAVE implementation uses thd->awake(THD::NOT_KILLED).
    This appears as a spurious wakeup (e.g. from a sleep on a
    condition variable) to the code that the slave SQL thread is
    executing at the time of the STOP. If the code is not written
    to be spurious-wakeup safe, unexpected behavior can occur. For
    the reported case, this problem led to an infinite loop around
    the interruptible_wait() function in item_func.cc (SLEEP()
    function implementation). The loop was not restarted correctly
    after the wakeup and, consequently, never came to an end. Since
    the SLEEP function sleeps on a timed event in order to be killable
    and to perform periodic checks until the requested time has
    elapsed, each spurious wakeup was causing the requested sleep
    time to be reset every two seconds.
    
    The solution is to calculate the requested absolute time only
    once and to ensure that the thread only sleeps until this
    time has elapsed. In case of a spurious wakeup, the sleep is
    restarted using the previously calculated absolute time. This
    restores the behavior present in previous releases: if a slave
    thread is executing a SLEEP function, a STOP SLAVE statement
    will wait until the time requested in the sleep function
    has elapsed.
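
    The approach can be sketched roughly as follows. This is a minimal
    illustration using the C++ standard library, not the code in
    item_func.cc: the Sleeper struct, sleep_until_elapsed() and the
    50 ms check interval are hypothetical names and values chosen for
    the example. The point it shows is that the deadline is computed
    once, outside the loop, so a wakeup that is not a kill only causes
    another wait on the same deadline.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for per-thread state; this is not MySQL's THD.
struct Sleeper {
  std::mutex mutex;
  std::condition_variable cond;
  bool killed = false;  // set under `mutex` by a real kill request
  int wakeups = 0;      // wakeups observed, kept for illustration only
};

// Sleeps until `timeout` has elapsed or the sleeper is killed.
// Returns true if the full time elapsed, false if the sleep was killed.
bool sleep_until_elapsed(Sleeper &s, std::chrono::milliseconds timeout) {
  assert(timeout.count() >= 0);

  // Compute the absolute deadline exactly once, before the loop.
  const Clock::time_point deadline =
      Clock::now() + std::chrono::duration_cast<Clock::duration>(timeout);
  // Assumed interval for periodic checks (illustrative value).
  const Clock::duration interval =
      std::chrono::duration_cast<Clock::duration>(std::chrono::milliseconds(50));

  std::unique_lock<std::mutex> lock(s.mutex);
  while (!s.killed) {
    const Clock::time_point now = Clock::now();
    if (now >= deadline) return true;  // requested time has elapsed

    // Wait for one check interval, but never past the fixed deadline.
    const Clock::time_point slice_end =
        (deadline - now < interval) ? deadline : now + interval;
    s.cond.wait_until(lock, slice_end);

    // Any wakeup (timeout, notification, or spurious) lands here. The
    // loop re-checks `killed` and the unchanged deadline, so the
    // requested sleep time is never reset.
    ++s.wakeups;
  }
  return false;
}
```

    With this shape, a notification that does not set the killed flag
    simply sends the thread back to wait for the remainder of the
    original interval.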
rpl_stm_start_stop_slave.result 2.04 KB