• Dmitry Lenev authored · 03672b96
    Fix for bug #51093 "Crash (possibly stack overflow) in
    MDL_lock::find_deadlock".
    
    On some platforms the deadlock detector in the metadata locking
    subsystem could, under certain conditions, exhaust stack space,
    causing server crashes.
    
    In particular, this caused failures of the rqg_mdl_stability
    test on Solaris in PushBuild.
    
    During its search the MDL deadlock detector could sometimes
    encounter a loop in the waiters graph in which the MDL_context
    that started the search does not participate. In that case the
    algorithm keeps looping, on the assumption that either the
    deadlock will be resolved by the MDL_context which created it
    (i.e. by one of the loop participants) or the maximum search
    depth will be reached.
    Since the maximum search depth was set to 1000, in the latter
    case we could exhaust stack space on platforms where each
    iteration of the deadlock search algorithm needs more than
    DEFAULT_STACK_SIZE/1000 bytes of stack (around 192 bytes for
    32-bit and around 256 bytes for 64-bit platforms).
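
    The behaviour described above can be illustrated with a minimal
    sketch. It is not the code from mdl.cc: WaitGraph, SearchResult,
    search() and find_deadlock_sketch() are hypothetical names, and the
    graph is a plain adjacency list rather than the real MDL structures.
    It only shows why a loop that does not contain the starting context
    drives a recursive search all the way to the depth limit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified wait-for graph: g[i] lists the contexts that
// context i is waiting for. This is NOT the real MDL data structure.
using WaitGraph = std::vector<std::vector<std::size_t>>;

enum class SearchResult { NoDeadlock, DeadlockFound, DepthExceeded };

// Recursive search with a depth limit and no "visited" set, so a loop that
// does not include `start` is followed until `max_depth` is reached.
// Every edge followed costs one stack frame.
static SearchResult search(const WaitGraph &g, std::size_t start,
                           std::size_t node, std::size_t depth,
                           std::size_t max_depth, std::size_t &deepest)
{
  if (depth > deepest)
    deepest = depth;                      // record recursion depth reached
  if (depth >= max_depth)
    return SearchResult::DepthExceeded;   // give up: depth limit hit

  SearchResult result = SearchResult::NoDeadlock;
  for (std::size_t next : g[node])
  {
    if (next == start)
      return SearchResult::DeadlockFound; // loop closes on the searcher
    SearchResult r = search(g, start, next, depth + 1, max_depth, deepest);
    if (r == SearchResult::DeadlockFound)
      return r;
    if (r == SearchResult::DepthExceeded)
      result = r;                         // remember, but try other edges
  }
  return result;
}

// Entry point of the sketch; `deepest` reports how deep the recursion went.
SearchResult find_deadlock_sketch(const WaitGraph &g, std::size_t start,
                                  std::size_t max_depth, std::size_t &deepest)
{
  assert(start < g.size());
  deepest = 0;
  return search(g, start, start, 0, max_depth, deepest);
}
```

    With the graph {{1}, {2}, {1}} and start context 0, contexts 1 and 2
    form a loop that 0 is not part of, so the sketch recurses until the
    limit is hit: 1000 frames deep with the old limit, 32 with the new one.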
    
    This patch solves the problem by reducing the maximum search
    depth for the MDL deadlock detector to 32. This should be safe
    at the moment, as it is unlikely that each iteration of the
    current deadlock detector algorithm will consume more than
    1K of stack (so the total amount of stack required can't be
    more than 32K), and we require at least 80K of stack in order
    to open any table. This value should also (hopefully) be big
    enough not to cause too many false deadlock errors (there
    is anecdotal evidence that real-life deadlocks are
    typically shorter than that).
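
    The stack budget argument can be checked with a few lines of
    arithmetic. This is only a sketch: the constant and function names
    are hypothetical, K is taken to mean 1024 bytes, and the 192K/256K
    default stack sizes are inferred from the per-iteration figures
    quoted in this message rather than read from the server sources.

```cpp
#include <cassert>
#include <cstddef>

// All names here are illustrative; they are not identifiers from mdl.cc.
constexpr std::size_t KB = 1024;

constexpr std::size_t OLD_MAX_SEARCH_DEPTH = 1000;  // limit before the patch
constexpr std::size_t NEW_MAX_SEARCH_DEPTH = 32;    // limit after the patch
constexpr std::size_t FRAME_UPPER_BOUND = 1 * KB;   // assumed per-iteration cap
constexpr std::size_t STACK_TO_OPEN_TABLE = 80 * KB;

// Worst-case stack used by a search that reaches the depth limit.
constexpr std::size_t worst_case_stack(std::size_t depth, std::size_t frame_bytes)
{
  return depth * frame_bytes;
}

// Largest per-iteration frame that still fits in `stack_bytes` at `depth`.
constexpr std::size_t max_frame_bytes(std::size_t stack_bytes, std::size_t depth)
{
  return stack_bytes / depth;
}

// New limit: 32 * 1K = 32K, well under the 80K needed to open a table.
static_assert(worst_case_stack(NEW_MAX_SEARCH_DEPTH, FRAME_UPPER_BOUND) == 32 * KB,
              "32 iterations of at most 1K each need at most 32K");
static_assert(worst_case_stack(NEW_MAX_SEARCH_DEPTH, FRAME_UPPER_BOUND) <
                  STACK_TO_OPEN_TABLE,
              "deadlock search fits inside the stack already required");

// Old limit: with an assumed 192K (32-bit) or 256K (64-bit) stack, a frame
// larger than roughly 196 or 262 bytes overflows at depth 1000.
static_assert(max_frame_bytes(192 * KB, OLD_MAX_SEARCH_DEPTH) == 196, "32-bit");
static_assert(max_frame_bytes(256 * KB, OLD_MAX_SEARCH_DEPTH) == 262, "64-bit");
```

    The static_asserts fail at compile time if any of the figures stop
    agreeing, for example if the assumed per-iteration bound is raised
    above 2.5K.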
    
    Additional research should be conducted in the future to
    determine a better value for the maximum search depth.
    
    This patch does not include a test case, as the existing
    rqg_mdl_stability test can serve as one.
mdl.cc 63.8 KB