    MDEV-22666 galera.MW-328A hang · 1af6e92f
    sjaakola authored
    The hang can happen between a lock connection issuing KILL CONNECTION for a victim
    that is in the committing phase.
    A two-resource deadlock occurs, where the killer holds the victim's
    LOCK_thd_data and requires the victim's trx mutex.
    The victim, on the other hand, holds its own trx mutex but requires LOCK_thd_data
    in wsrep_commit_ordered(). Hence a classic two-thread deadlock happens.
    
    The fix in this commit changes the InnoDB commit so that wsrep_commit_ordered()
    is not called while holding the trx mutex. With this, the wsrep patch's commit-time
    mutex locking no longer violates the locking protocol of the KILL command
    (i.e. LOCK_thd_data -> trx mutex).
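
    The locking order described above can be illustrated with a minimal, standalone
    sketch. It is not the actual server code: the Victim struct, its fields, and the
    functions kill_connection(), commit() and run_kill_during_commit() are hypothetical
    stand-ins, and the exact position of the wsrep_commit_ordered() call relative to
    the rest of the commit is an assumption. It only shows that once the committing
    thread no longer holds the trx mutex while it waits for LOCK_thd_data, both
    threads can finish.

```cpp
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical model of the two resources named in the commit message.
struct Victim {
    std::mutex lock_thd_data;     // stands in for LOCK_thd_data
    std::mutex trx_mutex;         // stands in for the InnoDB trx mutex
    bool killed = false;          // written under both mutexes by the killer
    bool committed = false;       // written under trx_mutex by the victim
    bool commit_ordered = false;  // written under lock_thd_data by the victim
};

// Killer side: follows the KILL protocol, LOCK_thd_data -> trx mutex.
void kill_connection(Victim& v) {
    std::lock_guard<std::mutex> thd(v.lock_thd_data);
    std::lock_guard<std::mutex> trx(v.trx_mutex);
    v.killed = true;
}

// Victim side, modelling the behaviour after the fix: the trx mutex is
// released before the step that needs LOCK_thd_data (assumed ordering).
void commit(Victim& v) {
    {
        std::lock_guard<std::mutex> trx(v.trx_mutex);
        v.committed = true;
    }  // trx mutex released here

    // Stands in for wsrep_commit_ordered(), which needs LOCK_thd_data.
    std::lock_guard<std::mutex> thd(v.lock_thd_data);
    v.commit_ordered = true;
}

// Runs the killer and the victim concurrently; returns true when both
// threads ran to completion and performed their work.
bool run_kill_during_commit() {
    Victim v;
    std::thread killer(kill_connection, std::ref(v));
    std::thread victim(commit, std::ref(v));
    killer.join();
    victim.join();
    return v.killed && v.committed && v.commit_ordered;
}
```

    In this model, moving the LOCK_thd_data acquisition back inside the trx mutex
    scope in commit() recreates the trx mutex -> LOCK_thd_data order that conflicts
    with the killer's LOCK_thd_data -> trx mutex order.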
    
    Also, a new test case has been added in galera.galera_bf_kill.test for the scenario
    where a client connection is killed in the committing phase.
service_wsrep.cc 8.36 KB