    MDEV-34690 lock_rec_unlock_unmodified() causes deadlock · 8d3d85e5
    Vlad Lesin authored
    lock_rec_unlock_unmodified() is executed either under lock_sys.wr_lock()
    or under a combination of lock_sys.rd_lock() and the record locks hash
    table cell latch. It also requests a page latch to check whether the
    locked records were changed by the current transaction.
    
    Usually InnoDB requests a page latch to find a certain record on the
    page, and then requests lock_sys and/or the record lock hash cell latch
    to request a record lock. lock_rec_unlock_unmodified() requests the
    latches in the opposite order, which causes deadlocks. One possible
    deadlock scenario is the following:
    
    thread 1 - lock_rec_unlock_unmodified() is invoked under locks hash table
               cell latch, the latch is acquired;
    thread 2 - purge thread acquires page latch and tries to remove
               delete-marked record, it invokes lock_update_delete(), which
               requests locks hash table cell latch, held by thread 1;
    thread 1 - requests page latch, held by thread 2.
    
    To fix it we need to release lock_sys.latch and/or lock hash cell latch,
    acquire page latch and re-acquire lock_sys related latches.
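
    The latch re-ordering described above can be sketched as follows. This
    is a minimal illustration only, not the InnoDB code: Latches,
    cell_latch, page_latch and acquire_page_then_cell() are hypothetical
    names, and std::mutex stands in for the real latch types.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-ins for the real InnoDB latches (illustration only).
struct Latches {
  std::mutex cell_latch;  // models the lock hash table cell latch
  std::mutex page_latch;  // models the page latch
};

// Precondition: `cell` owns l.cell_latch and `page` owns nothing.
// Postcondition: both latches are held, and the page latch was taken
// while the cell latch was NOT held, i.e. in the usual page -> cell order.
inline void acquire_page_then_cell(Latches& l,
                                   std::unique_lock<std::mutex>& cell,
                                   std::unique_lock<std::mutex>& page) {
  assert(cell.owns_lock() && cell.mutex() == &l.cell_latch);
  assert(!page.owns_lock());
  cell.unlock();                                      // 1. release the cell latch
  page = std::unique_lock<std::mutex>(l.page_latch);  // 2. acquire the page latch
  cell.lock();                                        // 3. re-acquire the cell latch
  // Anything protected by the cell latch may have changed between steps 1
  // and 3, so the caller has to revalidate it.
}
```

    Because the cell latch is dropped before the page latch is requested,
    a thread that holds the page latch and waits for the cell latch can
    always make progress. The price is that state read under the cell latch
    before step 1 is no longer trustworthy after step 3.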
    
    When lock_sys.latch and/or the lock hash cell latch are released in
    lock_release_on_prepare() and lock_release_on_prepare_try(), the page
    on which the current lock is held can be merged. In this case the
    bitmap of the current lock must be cleared, and either a new lock must
    be added to the end of the trx->lock.trx_locks list or the bitmap of an
    already existing lock must be changed.
    
    The new field trx_lock_t::set_nth_bit_calls indicates whether new locks
    (bits in existing lock bitmaps or new lock objects) were created while
    lock_sys was released inside the trx->lock.trx_locks iteration loop in
    lock_release_on_prepare() or lock_release_on_prepare_try(). If so, the
    list is traversed again.
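
    The re-traversal idea can be sketched with a change counter. This is a
    toy model under stated assumptions, not the actual implementation:
    TrxLocks, add_lock() and release_with_rescan() are hypothetical names,
    and the real counter is assumed to be incremented whenever a lock bit
    or lock object is added.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>

// Hypothetical model of a transaction's lock list (illustration only).
struct TrxLocks {
  std::list<int> locks;               // stands in for trx->lock.trx_locks
  std::size_t set_nth_bit_calls = 0;  // bumped each time a lock is added

  void add_lock(int id) {
    locks.push_back(id);
    ++set_nth_bit_calls;
  }
};

// Visits every lock. `visit` may call add_lock(), which models a lock being
// created while the latches were released. If the counter changed during a
// pass, the whole list is traversed again. Returns the number of passes.
inline std::size_t release_with_rescan(
    TrxLocks& t, const std::function<void(TrxLocks&, int)>& visit) {
  std::size_t passes = 0;
  std::size_t seen;
  do {
    seen = t.set_nth_bit_calls;  // snapshot before the pass
    ++passes;
    // std::list iterators stay valid when elements are appended.
    for (auto it = t.locks.begin(); it != t.locks.end(); ++it) {
      visit(t, *it);
    }
  } while (seen != t.set_nth_bit_calls);  // something was added: rescan
  return passes;
}
```

    Comparing a counter snapshot is cheaper than comparing list contents,
    and it also notices changes that only set a bit in an existing bitmap
    without changing the list length.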
    
    The block can be freed during a page merge, which causes an assertion
    failure in buf_page_get_gen(), because btr_block_get() passes BUF_GET
    as the page get mode. For this reason a page_get_mode parameter was
    added to btr_block_get(), so that lock_release_on_prepare() and
    lock_release_on_prepare_try() can pass BUF_GET_POSSIBLY_FREED through
    to buf_page_get_gen().
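
    The shape of that change, a wrapper gaining a defaulted mode parameter
    that it forwards to the lower-level getter, can be sketched as below.
    All names here (GetMode, Page, page_get_gen(), block_get()) are
    hypothetical and do not reproduce the real signatures. The behaviour of
    the tolerant mode (returning nullptr for a freed page) is an assumption
    of this sketch, and an exception stands in for the assertion failure.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical page get modes (illustration only).
enum class GetMode { Get, GetPossiblyFreed };

struct Page {
  bool freed = false;
};

// Low-level getter: the strict mode rejects a freed page, the tolerant
// mode reports it to the caller as nullptr (assumed behaviour).
inline const Page* page_get_gen(const Page& p, GetMode mode) {
  if (p.freed) {
    if (mode == GetMode::GetPossiblyFreed) return nullptr;
    throw std::logic_error("freed page requested with GetMode::Get");
  }
  return &p;
}

// Wrapper: the new parameter defaults to the strict mode, so existing
// callers are unchanged; only callers that can race with a page merge
// opt in to the tolerant mode.
inline const Page* block_get(const Page& p, GetMode mode = GetMode::Get) {
  return page_get_gen(p, mode);
}
```

    Defaulting the parameter keeps every existing call site strict, so the
    relaxed check applies only where a freed page is an expected outcome.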
    
    Because searching for the id of the trx that modified a secondary index
    record is quite an expensive operation, its usage is restricted for
    master. A system variable was added to remove the restriction, to
    simplify testing. The variable exists only in debug builds or in builds
    configured with -DINNODB_ENABLE_XAP_UNLOCK_UNMODIFIED_FOR_PRIMARY, to
    increase the probability of catching bugs in release builds with RQG.
    
    Reviewed by Marko Mäkelä, Debarun Banerjee.