    MDEV-34520 purge_sys_t::wait_FTS sleeps 10ms, even if it does not have to · d58734d7
    Marko Mäkelä authored
    There were two separate Atomic_counter<uint32_t>, purge_sys.m_SYS_paused
    and purge_sys.m_FTS_paused. In purge_sys.wait_FTS() we have to read both
    atomically. The previous solution was overkill: it acquired
    purge_sys.latch and waited 10 milliseconds between samples. To make
    matters worse, the 10-millisecond wait was unconditional, which would
    unnecessarily suspend the purge_coordinator_task every now and then.
    
    It turns out that we can fold both "reference counts" into a single
    Atomic_relaxed<uint32_t> and avoid the purge_sys.latch.
    To assess whether std::memory_order_relaxed is acceptable, we should
    consider the operations that read these "reference counts", that is,
    purge_sys_t::wait_FTS(bool) and purge_sys_t::must_wait_FTS().
    
    Outside debug assertions, purge_sys.must_wait_FTS() is only invoked in
    trx_purge_table_acquire(), which is covered by a shared dict_sys.latch.
    We would increment the counter as part of a DDL operation, but before
    acquiring an exclusive dict_sys.latch. So, a
    purge_sys_t::close_and_reopen() loop could be triggered slightly
    prematurely, before a problematic DDL operation is actually executed.
    Decrementing the counter is less of an issue; purge_sys.resume_FTS()
    or purge_sys.resume_SYS() would mostly be invoked while holding an
    exclusive dict_sys.latch; ha_innobase::delete_table() does it outside
    that critical section. Still, this would only cause some extra wait in
    the purge_coordinator_task, just like at the start of a DDL operation.
    
    There are two calls to purge_sys_t::wait_FTS(bool): in the above-mentioned
    purge_sys_t::close_and_reopen() and in purge_sys_t::clone_oldest_view(),
    both invoked by the purge_coordinator_task. There is also a
    purge_sys.clone_oldest_view<true>() call at startup when no DDL operation
    can be in progress.
    
    purge_sys_t::m_SYS_paused: Merged into m_FTS_paused, using a new
    multiplier PAUSED_SYS = 65536.
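
    The merged counter can be pictured with the following minimal sketch.
    It is not the actual InnoDB code: the class name paused_counter, the
    method must_wait() and its also_sys parameter are hypothetical, and
    the layout (FTS count in the low 16 bits, SYS count scaled by
    PAUSED_SYS in the high bits) is an assumption inferred from the
    multiplier 65536.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the counter in purge_sys_t (illustration only).
class paused_counter
{
  // Assumed layout: low 16 bits count FTS pauses,
  // the bits above them count SYS pauses.
  static constexpr uint32_t PAUSED_SYS= 65536;
  std::atomic<uint32_t> m_paused{0};

public:
  void stop_FTS() { m_paused.fetch_add(1, std::memory_order_relaxed); }
  void stop_SYS()
  { m_paused.fetch_add(PAUSED_SYS, std::memory_order_relaxed); }

  void resume_FTS()
  {
    uint32_t old= m_paused.fetch_sub(1, std::memory_order_relaxed);
    assert(old & (PAUSED_SYS - 1)); // there must have been an FTS pause
    (void) old;
  }

  void resume_SYS()
  {
    uint32_t old= m_paused.fetch_sub(PAUSED_SYS, std::memory_order_relaxed);
    assert(old >= PAUSED_SYS); // there must have been a SYS pause
    (void) old;
  }

  // One load observes both counts at once, so no latch is needed.
  // also_sys (hypothetical) selects whether SYS pauses count as well.
  bool must_wait(bool also_sys) const
  {
    const uint32_t mask= also_sys ? ~0U : PAUSED_SYS - 1;
    return (m_paused.load(std::memory_order_relaxed) & mask) != 0;
  }
};
```

    Because both counts live in one 32-bit word, a reader never sees one
    count updated and the other stale, which is what the latch used to
    guarantee.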
    
    purge_sys_t::wait_FTS(): Remove an unnecessary sleep as well as the
    access to purge_sys.latch. It suffices to poll purge_sys.m_FTS_paused.
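
    A latch-free poll of this kind might look like the sketch below. The
    function name wait_until_unpaused, the mask parameter and the
    1-millisecond back-off are assumptions for illustration; this commit
    message does not say how the real loop paces its samples. The point
    is only that the wait is conditional: if no pause is in effect, the
    function returns without sleeping.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical helper: poll a packed pause counter until the bits
// selected by mask are all zero. Returns the number of sleeps taken.
inline unsigned wait_until_unpaused(const std::atomic<uint32_t> &paused,
                                    uint32_t mask)
{
  unsigned sleeps= 0;
  // The condition is checked before any sleep, so the common case
  // (nothing paused) costs a single relaxed load.
  while (paused.load(std::memory_order_relaxed) & mask)
  {
    std::this_thread::sleep_for(std::chrono::milliseconds(1)); // assumed
    sleeps++;
  }
  return sleeps;
}
```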
    
    purge_sys_t::stop_FTS(): Do not acquire purge_sys.latch.
    
    Reviewed by: Debarun Banerjee