    MDEV-24738 Improve the InnoDB deadlock checker · c68007d9
    Marko Mäkelä authored
    A new configuration parameter innodb_deadlock_report is introduced:
    * innodb_deadlock_report=off: Do not report any details of deadlocks.
    * innodb_deadlock_report=basic: Report transactions and waiting locks.
    * innodb_deadlock_report=full (default): Report also the blocking locks.
    
    The improved deadlock checker will consider all involved transactions
    in one loop, even if the deadlock loop includes several transactions.
    The theoretical maximum number of transactions that can be involved in
    a deadlock is `innodb_page_size` * 8, limited by the persistent data
    structures.
    
    Note: Similar to
    mysql/mysql-server@3859219875b62154b921e8c6078c751198071b9c
    our deadlock checker will consider at most one blocking transaction
    for each waiting transaction. The new field trx->lock.wait_trx will be
    nullptr if and only if trx->lock.wait_lock is nullptr. Note that
    trx->lock.wait_lock->trx == trx (the waiting transaction), while
    trx->lock.wait_trx points to one of the transactions whose lock is
    conflicting with trx->lock.wait_lock.
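
    The relationship between the two fields can be sketched as a small
    predicate. This is only an illustration: the structs below are
    hypothetical, heavily simplified stand-ins for trx_t and lock_t, and
    wait_invariants_hold() is not a function in the actual code.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for InnoDB's trx_t and lock_t.
struct trx_t;

struct lock_t
{
  trx_t *trx;                    // the transaction that owns this lock request
};

struct trx_lock_t
{
  lock_t *wait_lock = nullptr;   // the lock request this transaction waits for
  trx_t *wait_trx = nullptr;     // one transaction that conflicts with wait_lock
};

struct trx_t
{
  trx_lock_t lock;
};

// Returns true if the invariants described in the text hold for trx:
// wait_trx is set if and only if wait_lock is set, and a waiting lock
// request always belongs to the waiting transaction itself.
bool wait_invariants_hold(const trx_t &trx)
{
  const bool has_lock = trx.lock.wait_lock != nullptr;
  const bool has_trx = trx.lock.wait_trx != nullptr;
  if (has_lock != has_trx)
    return false;
  return !has_lock || trx.lock.wait_lock->trx == &trx;
}
```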
    
    Considering only one blocking transaction will greatly simplify
    our deadlock checker, but it may also make the deadlock checker
    blind to some deadlocks where the deadlock cycle is 'hidden' by
    the fact that the registered trx->lock.wait_trx is not actually
    waiting for any InnoDB lock, but something else. So, instead of
    deadlocks, sometimes lock wait timeout may be reported.
    
    To improve on this, whenever trx->lock.wait_trx is changed, we
    will register further 'candidate' transactions in Deadlock::to_check(),
    and check for 'revealed' deadlocks as soon as possible, in lock_release()
    and innobase_kill_query().
    
    The old DeadlockChecker was holding lock_sys.latch, even though
    lock_sys.wait_mutex should be less contended (and thus preferable)
    in the likely case that no deadlock is present.
    
    lock_wait(): Defer the deadlock check to this function, instead of
    executing it in lock_rec_enqueue_waiting(), lock_table_enqueue_waiting().
    
    DeadlockChecker: Complete rewrite:
    (1) Explicitly keep track of transactions that are being waited for,
    in trx->lock.wait_trx, protected by lock_sys.wait_mutex. Previously,
    we were painstakingly traversing the lock heaps while blocking
    concurrent registration or removal of any locks (even uncontended ones).
    (2) Use Brent's cycle-detection algorithm for deadlock detection,
    traversing each trx->lock.wait_trx edge at most 2 times.
    (3) If a deadlock is detected, release lock_sys.wait_mutex,
    acquire LockMutexGuard, re-acquire lock_sys.wait_mutex and re-invoke
    find_cycle() to find out whether the deadlock is still present.
    (4) Display information on all transactions that are involved in the
    deadlock, and choose a victim to be rolled back.
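
    Step (3) amounts to an optimistic check followed by a confirmation
    under the stronger latch. A minimal sketch, assuming two plain
    std::mutex objects as stand-ins for lock_sys.latch and
    lock_sys.wait_mutex; lock_sys_sketch, check_twice() and the has_cycle
    callback are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <functional>
#include <mutex>

// Hypothetical stand-ins for lock_sys.latch and lock_sys.wait_mutex.
struct lock_sys_sketch
{
  std::mutex latch;
  std::mutex wait_mutex;
};

// Returns true only if has_cycle() reports a cycle both in the cheap
// first pass and again after the stronger latch has been acquired.
bool check_twice(lock_sys_sketch &sys, const std::function<bool()> &has_cycle)
{
  {
    // First pass: only the less contended mutex is held.
    std::lock_guard<std::mutex> wait(sys.wait_mutex);
    if (!has_cycle())
      return false;              // likely case: the latch is never taken
  }
  // The mutex was released above so that the latch can be acquired
  // first; the waits-for graph may have changed in between.
  std::lock_guard<std::mutex> latch(sys.latch);
  std::lock_guard<std::mutex> wait(sys.wait_mutex);
  return has_cycle();            // second pass: is the cycle still present?
}
```

    The second call matters because the first result becomes stale the
    moment the mutex is released.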
    
    lock_sys.deadlocks: Replaces lock_deadlock_found. Protected by wait_mutex.
    
    Deadlock::find_cycle(): Quickly find a cycle of trx->lock.wait_trx...
    using Brent's cycle detection algorithm.
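
    Because each transaction has at most one outgoing wait_trx edge, the
    waits-for graph is a functional graph, which is what makes Brent's
    algorithm applicable. The following is a sketch of the technique
    only, not the code of Deadlock::find_cycle(); trx_node and
    find_cycle_sketch() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal transaction: at most one outgoing waits-for edge.
struct trx_node
{
  trx_node *wait_trx = nullptr;
};

// Brent's cycle detection: follow wait_trx edges from start and return
// some node that lies on a cycle, or nullptr if the chain ends.
trx_node *find_cycle_sketch(trx_node *start)
{
  if (!start)
    return nullptr;
  trx_node *tortoise = start;    // stationary reference point
  trx_node *hare = start;        // advances one edge per iteration
  for (std::size_t power = 1, steps = 1;; steps++)
  {
    hare = hare->wait_trx;
    if (!hare)
      return nullptr;            // reached a transaction that is not waiting
    if (hare == tortoise)
      return hare;               // the hare came back around: cycle found
    if (steps == power)
    {
      // Move the reference point to the hare and double the search window.
      tortoise = hare;
      power <<= 1;
      steps = 0;
    }
  }
}
```

    The starting transaction need not be part of the cycle itself; the
    returned node is guaranteed to be on it.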
    
    Deadlock::report(): Report a deadlock cycle that was found by
    Deadlock::find_cycle(), and choose a victim with the least weight.
    Altogether, we may traverse each trx->lock.wait_trx edge up to 5
    times (2 invocations of find_cycle() at up to 2 traversals each,
    plus 1 traversal for reporting and choosing the victim).
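
    Choosing the victim takes a single pass around the cycle. A minimal
    sketch, assuming each transaction carries a precomputed numeric
    weight; weighted_trx and choose_victim_sketch() are hypothetical, and
    how the real weight is derived is not shown here:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical transaction with a precomputed weight.
struct weighted_trx
{
  std::uint64_t weight = 0;
  weighted_trx *wait_trx = nullptr;
};

// Precondition: on_cycle lies on a cycle of wait_trx edges.
// Walks the cycle exactly once and returns the transaction with the
// least weight; on a tie, the one encountered first is kept.
weighted_trx *choose_victim_sketch(weighted_trx *on_cycle)
{
  weighted_trx *victim = on_cycle;
  for (weighted_trx *t = on_cycle->wait_trx; t != on_cycle; t = t->wait_trx)
    if (t->weight < victim->weight)
      victim = t;
  return victim;
}
```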
    
    Deadlock::check_and_resolve(): Find and resolve a deadlock.
    
    lock_wait_rpl_report(): Report the waits-for information to
    replication. This used to be executed as part of DeadlockChecker.
    Replication must know the waits-for relations even if no deadlocks
    are present in InnoDB.
    
    Reviewed by: Vladislav Vaintroub