Dmitry Lenev authored · c070e5a1
    Fix for bug #51263 "Deadlock between transactional
    SELECT and ALTER TABLE ... REBUILD PARTITION".
    
    ALTER TABLE on an InnoDB table (including partitioned tables)
    acquired exclusive locks on rows of the table being altered.
    When a concurrent transaction did locking reads from this
    table, this sometimes led to a deadlock which was detected
    neither by the MDL subsystem nor by the InnoDB engine (and
    was reported only after innodb_lock_wait_timeout was
    exceeded).
    
    This problem stemmed from the fact that ALTER TABLE acquired
    a TL_WRITE_ALLOW_READ lock on the table being altered. This
    lock was interpreted as a write lock, and thus for the table
    being altered the handler::external_lock() method was called
    with F_WRLCK as an argument. As a result, the InnoDB engine
    treated ALTER TABLE as an operation which is going to change
    data and acquired LOCK_X locks on rows being read from the
    old version of the table.
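
    The chain from table lock type to row lock mode can be
    sketched as below. This is a simplified model, not the
    server's code: the enums and the helpers external_lock_for()
    and row_lock_for() are hypothetical names introduced only for
    illustration, and only the two table lock types named in this
    message are modelled.

```cpp
#include <cassert>

// Simplified stand-ins for the two table lock types named in this message
// (TL_READ_NO_INSERT and TL_WRITE_ALLOW_READ).
enum class TableLock { ReadNoInsert, WriteAllowRead };

// Stand-ins for the F_RDLCK / F_WRLCK argument of handler::external_lock().
enum class ExternalLock { Read, Write };

// Stand-ins for the InnoDB row lock modes LOCK_S and LOCK_X.
enum class RowLock { S, X };

// Hypothetical helper: a table lock from the "write" family reaches the
// engine as a write lock, anything else as a read lock.
constexpr ExternalLock external_lock_for(TableLock t) {
  return t == TableLock::WriteAllowRead ? ExternalLock::Write
                                        : ExternalLock::Read;
}

// Hypothetical helper: the row lock mode the engine takes on rows it reads,
// assuming it locks exclusively whenever it was told the statement writes.
constexpr RowLock row_lock_for(ExternalLock e) {
  return e == ExternalLock::Write ? RowLock::X : RowLock::S;
}

// With TL_WRITE_ALLOW_READ, rows read by ALTER TABLE end up X-locked.
static_assert(row_lock_for(external_lock_for(TableLock::WriteAllowRead)) ==
                  RowLock::X, "write-family table lock leads to LOCK_X");
// With TL_READ_NO_INSERT, the same reads take only shared row locks.
static_assert(row_lock_for(external_lock_for(TableLock::ReadNoInsert)) ==
                  RowLock::S, "read-family table lock leads to LOCK_S");
```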
    
    When a transaction had already acquired an SR metadata lock
    on the table and some LOCK_S locks on its rows (e.g. by using
    it in a subquery of a DML statement), a concurrent ALTER TABLE
    was blocked at the moment when it tried to acquire a LOCK_X
    lock before reading one of these rows. The transaction's
    subsequent attempt to acquire an SW metadata lock on the
    table being altered led to a deadlock, since it had to wait
    for ALTER TABLE to release its SNW lock. This deadlock was
    not detected and was resolved only when the timeout expired,
    because the waits were happening in two different subsystems.
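
    Why neither subsystem noticed the deadlock can be illustrated
    with a toy wait-for graph. The sketch below is not how MDL or
    InnoDB implement deadlock detection; has_cycle(), the node
    names "ALTER" and "TRX" and the edge lists are assumptions
    made for illustration. Each subsystem sees one edge and
    therefore no cycle; only the combined graph contains one.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// One wait-for edge: `first` is blocked until `second` releases a lock.
using Edge = std::pair<std::string, std::string>;
using Graph = std::map<std::string, std::vector<std::string>>;

// Depth-first walk; returns true if a node already on the current path is
// reached again, i.e. the wait-for graph contains a cycle.
bool reaches_cycle(const Graph& g, const std::string& node,
                   std::set<std::string>& on_path,
                   std::set<std::string>& done) {
  if (on_path.count(node)) return true;
  if (done.count(node)) return false;
  on_path.insert(node);
  auto it = g.find(node);
  if (it != g.end())
    for (const auto& next : it->second)
      if (reaches_cycle(g, next, on_path, done)) return true;
  on_path.erase(node);
  done.insert(node);
  return false;
}

bool has_cycle(const std::vector<Edge>& edges) {
  Graph g;
  for (const auto& e : edges) g[e.first].push_back(e.second);
  std::set<std::string> on_path, done;
  for (const auto& kv : g)
    if (reaches_cycle(g, kv.first, on_path, done)) return true;
  return false;
}

// Visible to InnoDB only: ALTER TABLE wants LOCK_X on a row that the
// transaction holds LOCK_S on.
const std::vector<Edge> innodb_waits = {{"ALTER", "TRX"}};
// Visible to MDL only: the transaction wants SW while ALTER holds SNW.
const std::vector<Edge> mdl_waits = {{"TRX", "ALTER"}};

// The view no single subsystem has: both sets of edges together.
std::vector<Edge> combined_waits() {
  std::vector<Edge> all = innodb_waits;
  all.insert(all.end(), mdl_waits.begin(), mdl_waits.end());
  return all;
}
```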
    
    Similar deadlocks could have occurred in other situations.
    This patch tries to solve the problem by changing the ALTER
    TABLE implementation to use a TL_READ_NO_INSERT lock instead
    of TL_WRITE_ALLOW_READ. After this change
    handler::external_lock() is called with F_RDLCK as an
    argument and the InnoDB engine correctly interprets ALTER
    TABLE as an operation which only reads data from the original
    version of the table. Thanks to this, ALTER TABLE acquires
    only LOCK_S locks on the rows it reads. This, in turn, makes
    inter-subsystem deadlocks go away, as all potential lock
    conflicts, and thus deadlocks, are limited to the metadata
    locking subsystem:
    
    - When ALTER TABLE reads rows from the table being altered it
      can't encounter any locks which conflict with LOCK_S row
      locks. There should be no concurrent transactions holding
      LOCK_X row locks: such a transaction would have had to
      acquire an SW metadata lock on the table first, which would
      have conflicted with ALTER's SNW lock.
    - Vice versa, when DML which runs concurrently with ALTER
      TABLE tries to lock a row, it should be requesting only a
      LOCK_S lock, which is compatible with the locks acquired by
      ALTER, as otherwise such DML would have to own an SW
      metadata lock on the table, which would be incompatible
      with ALTER's SNW lock.
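
    The two points above can be checked against a small
    compatibility model. It covers only the lock types named in
    this message; the helper names are hypothetical, and treating
    two SNW locks as conflicting is an assumption not stated
    above. It is a sketch of the argument, not of the server's
    lock tables.

```cpp
#include <cassert>

// Row lock modes and the metadata lock types named in this message.
enum class RowLock { S, X };
enum class MdlLock { SR, SW, SNW };

// Row locks: only shared with shared is compatible.
constexpr bool row_compatible(RowLock a, RowLock b) {
  return a == RowLock::S && b == RowLock::S;
}

// Metadata locks: SNW tolerates only SR next to it. SW conflicting with SNW
// comes from the message; SNW conflicting with SNW is assumed here.
constexpr bool mdl_compatible(MdlLock a, MdlLock b) {
  return !((a == MdlLock::SNW && b != MdlLock::SR) ||
           (b == MdlLock::SNW && a != MdlLock::SR));
}

// Hypothetical helper: the metadata lock a statement must own before it may
// take row locks of the given mode (SW for LOCK_X, SR for LOCK_S).
constexpr MdlLock mdl_needed_for(RowLock r) {
  return r == RowLock::X ? MdlLock::SW : MdlLock::SR;
}

// Can a statement that takes row locks of mode `r` run at the same time as
// ALTER TABLE, which holds SNW on the table?
constexpr bool can_run_with_alter(RowLock r) {
  return mdl_compatible(mdl_needed_for(r), MdlLock::SNW);
}

// Anything that gets past the metadata lock takes only LOCK_S row locks,
// which never conflict with the LOCK_S locks ALTER TABLE takes.
static_assert(can_run_with_alter(RowLock::S), "readers pass MDL");
static_assert(!can_run_with_alter(RowLock::X), "writers stop at MDL");
static_assert(row_compatible(RowLock::S, RowLock::S), "S/S is compatible");
```

    Under these assumptions every conflict involving ALTER TABLE
    surfaces as a metadata lock wait, so a single subsystem sees
    the whole wait-for cycle.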
ha_partition.cc 202 KB