    MDEV-26200 buf_pool.flush_list corrupted by buffer pool resizing or ROW_FORMAT=COMPRESSED · bf435a3f
    Marko Mäkelä authored
    The lazy deletion of clean blocks from buf_pool.flush_list, which was
    added in commit 6441bc61 (MDEV-25113), introduced a race condition
    around the function buf_flush_relocate_on_flush_list().
    
    The test innodb_zip.wl5522_debug_zip as well as the buffer pool
    resizing tests would occasionally fail in debug builds due to
    buf_pool.flush_list.count disagreeing with the actual length of the
    doubly-linked list.
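
The kind of consistency check that catches such a mismatch can be sketched as follows. This is a minimal illustration, not the InnoDB code: `Node`, `CountedList`, `push_back()` and `validate()` are hypothetical names, and the list is a simplified stand-in for a counted doubly-linked list such as buf_pool.flush_list.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified node of an intrusive doubly-linked list.
struct Node {
  Node *prev = nullptr;
  Node *next = nullptr;
};

// Hypothetical list header that caches the element count.
struct CountedList {
  Node *first = nullptr;
  Node *last = nullptr;
  std::size_t count = 0;
};

// Append a node and keep the cached count in sync.
inline void push_back(CountedList &list, Node &node) {
  node.prev = list.last;
  node.next = nullptr;
  if (list.last)
    list.last->next = &node;
  else
    list.first = &node;
  list.last = &node;
  ++list.count;
}

// Walk the list and compare the real length with the cached count.
// Also checks that the back pointers and the tail pointer are consistent.
inline bool validate(const CountedList &list) {
  std::size_t length = 0;
  const Node *prev = nullptr;
  for (const Node *p = list.first; p; prev = p, p = p->next) {
    if (p->prev != prev)
      return false;
    ++length;
  }
  return length == list.count && list.last == prev;
}
```

A debug build could run such a walk while holding the mutex that protects the list. If a concurrent relocation modifies the list without that protection, the cached count and the walked length can diverge.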
    
    The safe procedure for relocating a block in buf_pool.flush_list should be
    as follows, now that we implement lazy deletion from buf_pool.flush_list:
    
    1. Acquire buf_pool.mutex.
    2. Acquire the exclusive buf_pool.page_hash.latch.
    3. Acquire buf_pool.flush_list_mutex.
    4. Copy the block descriptor.
    5. Invoke buf_flush_relocate_on_flush_list().
    6. Release buf_pool.flush_list_mutex.
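
The six steps above can be sketched with standard C++ locks. This is only an illustration under stated assumptions: `Pool`, `Page`, `relocate()` and `relocate_on_flush_list()` are hypothetical stand-ins, plain `std::mutex` objects replace the real InnoDB mutex and latch types, and the flush list is modelled as a `std::list` of pointers.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <mutex>

// Hypothetical stand-in for a block descriptor.
struct Page {
  unsigned id;
  unsigned long oldest_modification;
};

// Hypothetical stand-in for the buffer pool and its three locks.
struct Pool {
  std::mutex mutex;             // models buf_pool.mutex
  std::mutex page_hash_latch;   // models buf_pool.page_hash.latch (exclusive mode only)
  std::mutex flush_list_mutex;  // models buf_pool.flush_list_mutex
  std::list<Page*> flush_list;  // models buf_pool.flush_list
};

// Replace src with dst in the flush list.
// The caller must hold pool.flush_list_mutex.
inline void relocate_on_flush_list(Pool &pool, Page *src, Page *dst) {
  auto it = std::find(pool.flush_list.begin(), pool.flush_list.end(), src);
  if (it != pool.flush_list.end())
    *it = dst;
}

// Relocate a block descriptor, following the lock order from the text.
inline void relocate(Pool &pool, Page *src, Page *dst) {
  std::lock_guard<std::mutex> pool_guard(pool.mutex);            // step 1
  std::lock_guard<std::mutex> hash_guard(pool.page_hash_latch);  // step 2
  {
    std::lock_guard<std::mutex> flush_guard(pool.flush_list_mutex);  // step 3
    *dst = *src;                                                     // step 4
    relocate_on_flush_list(pool, src, dst);                          // step 5
  }                                                                  // step 6
  // pool.page_hash_latch and pool.mutex are released when the guards
  // go out of scope, in reverse order of acquisition.
}
```

The point of the sketch is that the descriptor copy and the list update happen inside the same flush_list_mutex critical section, so no other thread can observe or lazily remove the block between the two.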
    
    buf_flush_relocate_on_flush_list(): Assert that
    buf_pool.flush_list_mutex is being held. Invoke
    buf_page_t::oldest_modification() only once, using
    std::memory_order_relaxed, now that the mutex protects us.
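
The pattern described above (assert mutex ownership, then read the atomic field exactly once with relaxed ordering) might look like the following sketch. `OwnedMutex`, `Page` and `read_oldest_modification()` are hypothetical; `std::mutex` offers no ownership query, so the wrapper records its owner itself. This is not how InnoDB implements its assertions.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical mutex wrapper that records the owning thread,
// so that callers can assert that they hold the mutex.
class OwnedMutex {
  std::mutex m_;
  std::atomic<std::thread::id> owner_{std::thread::id()};
public:
  void lock() {
    m_.lock();
    owner_.store(std::this_thread::get_id(), std::memory_order_relaxed);
  }
  void unlock() {
    owner_.store(std::thread::id(), std::memory_order_relaxed);
    m_.unlock();
  }
  bool is_owner() const {
    return owner_.load(std::memory_order_relaxed) ==
           std::this_thread::get_id();
  }
};

// Hypothetical page with an atomic modification LSN.
struct Page {
  std::atomic<std::uint64_t> oldest_modification{0};
};

// Read the field once and reuse the local copy for every later decision.
// Relaxed ordering is sufficient here only on the assumption that all
// writers of the field hold the same mutex.
inline std::uint64_t read_oldest_modification(const OwnedMutex &flush_list_mutex,
                                              const Page &page) {
  assert(flush_list_mutex.is_owner());
  const std::uint64_t lsn =
      page.oldest_modification.load(std::memory_order_relaxed);
  return lsn;
}
```

Reading the value once into a local variable avoids making two decisions from two different snapshots of the same field.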
    
    buf_LRU_free_page(), buf_LRU_block_remove_hashed(): Avoid
    an unlock-lock cycle on hash_lock. (We must not acquire hash_lock
    while already holding buf_pool.flush_list_mutex, because that
    could lead to a deadlock due to latching order violation.)
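
The latching order constraint can be illustrated with a small rank-checking mutex. This is a sketch under assumptions: `RankedMutex`, the `Rank` values and `free_page_keeping_hash_lock()` are hypothetical, and the ranks simply encode the acquisition order listed in the procedure above. It assumes that locks are released in reverse order of acquisition.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical ranks: a lock may only be acquired while every lock
// already held by the thread has a lower rank.
enum Rank { POOL_MUTEX = 1, HASH_LOCK = 2, FLUSH_LIST_MUTEX = 3 };

class RankedMutex {
  std::mutex m_;
  const Rank rank_;
  // Ranks currently held by this thread, in acquisition order.
  static std::vector<Rank> &held() {
    thread_local std::vector<Rank> ranks;
    return ranks;
  }
public:
  explicit RankedMutex(Rank rank) : rank_(rank) {}
  // Would acquiring this mutex now respect the latching order?
  bool order_ok() const {
    return held().empty() || held().back() < rank_;
  }
  void lock() {
    assert(order_ok());
    m_.lock();
    held().push_back(rank_);
  }
  void unlock() {
    held().pop_back();  // assumes release in reverse order of acquisition
    m_.unlock();
  }
};

// Keep hash_lock held for the whole operation instead of releasing it
// and re-acquiring it after flush_list_mutex has been taken.
// Returns true if re-acquiring hash_lock at that point would have
// violated the latching order.
inline bool free_page_keeping_hash_lock(RankedMutex &hash_lock,
                                        RankedMutex &flush_list_mutex) {
  std::lock_guard<RankedMutex> hash_guard(hash_lock);
  std::lock_guard<RankedMutex> flush_guard(flush_list_mutex);
  return !hash_lock.order_ok();
}
```

If one thread takes hash_lock and then flush_list_mutex while another takes them in the opposite order, each can end up waiting for the other. Holding hash_lock throughout removes the need for the out-of-order acquisition.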
buf0buf.cc 133 KB