Commit 8c68b549 authored by Marko Mäkelä

MDEV-21452 fixup: Fix fake server hang reports

srv_monitor_task(): Make the innodb_fatal_semaphore_wait_threshold
watchdog tolerate a non-monotonic clock. On NUMA systems, the values
returned by my_hrtime_coarse() on different NUMA nodes are not in sync,
so the clock can appear to run backwards. Because the timestamps are
unsigned, a negative duration wraps around to a huge value that exceeds
any threshold. We must treat negative time durations as zero, just as
we did in commit ff5d306e in dict_sys_t::mutex_lock_wait().
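
The clamping described above can be sketched in isolation. This is a minimal illustration, not the code from the commit: the helper names elapsed_or_zero() and waited_seconds() are hypothetical, and the 600-second threshold is assumed purely for the example.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the MariaDB source): microseconds elapsed
// between two coarse timestamps, treating a clock that appears to have run
// backwards (now < start) as zero elapsed time.
constexpr std::uint64_t elapsed_or_zero(std::uint64_t now, std::uint64_t start)
{
  return now >= start ? now - start : 0;
}

// Hypothetical helper: whole seconds waited, given microsecond timestamps.
constexpr std::uint64_t waited_seconds(std::uint64_t now, std::uint64_t start)
{
  return elapsed_or_zero(now, start) / 1000000;
}

// Threshold in seconds, assumed for illustration only.
constexpr std::uint64_t assumed_threshold = 600;

// The wait started at t = 5 s, but "now" was read from a clock that is
// 1 ms behind. The unguarded unsigned subtraction wraps around and looks
// like a wait far longer than the threshold...
static_assert((std::uint64_t{4999000} - std::uint64_t{5000000}) / 1000000
              >= assumed_threshold,
              "unguarded subtraction wraps to a huge duration");

// ...whereas the clamped version reports no wait at all.
static_assert(waited_seconds(4999000, 5000000) == 0,
              "a backwards clock is treated as zero elapsed time");

// A normal, forward-running clock is unaffected by the clamp.
static_assert(waited_seconds(7000000, 5000000) == 2,
              "2 seconds elapsed");
```

The static_assert lines make the sketch self-checking at compile time: without the guard, a 1 ms skew between nodes is enough to exceed the assumed threshold.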

The wrong logic caused occasional crashes of the test
mariabackup.apply-log-only-incr when a large number of instances of it
were run concurrently.
parent 30dc4287
...@@ -1346,20 +1346,23 @@ void srv_monitor_task(void*)
   eviction policy. */
   buf_LRU_stat_update();
-  const ulonglong now = my_hrtime_coarse().val;
+  ulonglong now = my_hrtime_coarse().val;
   const ulong threshold = srv_fatal_semaphore_wait_threshold;
   if (ulonglong start = dict_sys.oldest_wait()) {
-    ulong waited = static_cast<ulong>((now - start) / 1000000);
-    if (waited >= threshold) {
-      ib::fatal() << dict_sys.fatal_msg;
-    }
-    if (waited == threshold / 4
-        || waited == threshold / 2
-        || waited == threshold / 4 * 3) {
-      ib::warn() << "Long wait (" << waited
-                 << " seconds) for dict_sys.mutex";
-    }
+    if (now >= start) {
+      now -= start;
+      ulong waited = static_cast<ulong>(now / 1000000);
+      if (waited >= threshold) {
+        ib::fatal() << dict_sys.fatal_msg;
+      }
+      if (waited == threshold / 4
+          || waited == threshold / 2
+          || waited == threshold / 4 * 3) {
+        ib::warn() << "Long wait (" << waited
+                   << " seconds) for dict_sys.mutex";
+      }
+    }
   }
......