Commits · 657fcdf430f39a3103dff51a6a2b2bd3a090a498 · nexedi / MariaDB

25 Nov, 2020 11 commits

MDEV-24280 InnoDB triggers too many independent periodic tasks · 657fcdf4

Marko Mäkelä authored Nov 25, 2020

A side effect of MDEV-16264 is that a large number of threads will
be created at server startup, to be destroyed after a minute or two.

One source of such thread creation is srv_start_periodic_timer().
InnoDB is creating 3 periodic tasks: srv_master_callback (1Hz),
srv_error_monitor_task (1Hz), and srv_monitor_task (0.2Hz).

It appears that we can merge srv_error_monitor_task and srv_monitor_task
and have them invoked 4 times per minute (every 15 seconds). This will
affect our ability to enforce innodb_fatal_semaphore_wait_threshold and
some computations around BUF_LRU_STAT_N_INTERVAL.

We could remove srv_master_callback along with the DROP TABLE queue
at some point in the future. We must keep it independent
of the innodb_fatal_semaphore_wait_threshold detection, because
the background DROP TABLE queue could get stuck due to dict_sys
being locked by another thread. For now, srv_master_callback
must be invoked once per second, so that
innodb_flush_log_at_timeout=1 can work.

BUF_LRU_STAT_N_INTERVAL: Reduce the precision and extend the time
from 50*1 second to 4*15 seconds.

srv_error_monitor_timer: Remove.

MAX_MUTEX_NOWAIT: Increase from 20*1 second to 2*15 seconds.

srv_refresh_innodb_monitor_stats(): Avoid a repeated call to time(NULL).
Change the interval to less than 60 seconds.

srv_monitor(): Renamed from srv_monitor_task.

srv_monitor_task(): Renamed from srv_error_monitor_task().
Invoked only once in 15 seconds. Invoke also srv_monitor().
Increase the fatal_cnt threshold from 10*1 second to 1*15 seconds.

sync_array_print_long_waits_low(): Invoke time(NULL) only once.
Remove a bogus message about printouts for 30 seconds. Those
printouts were effectively already disabled in MDEV-16264
(commit 5e62b6a5).
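
The interval changes listed above reduce to simple arithmetic once the merged task runs every 15 seconds. The following is a minimal sketch of that arithmetic only; the names old_period_s, new_period_s and window_s are hypothetical and are not identifiers from the InnoDB source.

```cpp
// Illustrative arithmetic only. The names below are hypothetical and
// are not identifiers from the InnoDB source.
constexpr unsigned old_period_s = 1;   // the 1Hz task before this commit
constexpr unsigned new_period_s = 15;  // the merged task after this commit

// A window is (number of invocations) * (seconds between invocations).
constexpr unsigned window_s(unsigned invocations, unsigned period_s)
{
  return invocations * period_s;
}

// BUF_LRU_STAT_N_INTERVAL: 50*1 second becomes 4*15 seconds.
static_assert(window_s(50, old_period_s) == 50, "old statistics window");
static_assert(window_s(4, new_period_s) == 60, "new statistics window");

// MAX_MUTEX_NOWAIT: 20*1 second becomes 2*15 seconds.
static_assert(window_s(20, old_period_s) == 20, "old no-wait window");
static_assert(window_s(2, new_period_s) == 30, "new no-wait window");

// fatal_cnt threshold: 10*1 second becomes 1*15 seconds.
static_assert(window_s(10, old_period_s) == 10, "old fatal_cnt window");
static_assert(window_s(1, new_period_s) == 15, "new fatal_cnt window");
```

In each case the count of invocations shrinks while the covered time span grows, which is why the message speaks of reduced precision and extended time.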

657fcdf4

MDEV-24278 InnoDB page cleaner keeps waking up on idle server · 7b1252c0

Marko Mäkelä authored Nov 25, 2020

The purpose of the InnoDB page cleaner subsystem is to write out
modified pages from the buffer pool to data files. When the
innodb_max_dirty_pages_pct_lwm is not exceeded or
innodb_adaptive_flushing=ON decides not to write out anything,
the page cleaner should keep sleeping indefinitely until the state
of the system changes: a dirty page is added to the buffer pool such
that the page cleaner would no longer be idle.

buf_flush_page_cleaner(): Explicitly note when the page cleaner is idle.
When that happens, use mysql_cond_wait() instead of mysql_cond_timedwait().

buf_flush_insert_into_flush_list(): Wake up the page cleaner if needed.

innodb_max_dirty_pages_pct_update(),
innodb_max_dirty_pages_pct_lwm_update():
Wake up the page cleaner just in case.

Note: buf_flush_ahead(), buf_flush_wait_flushed() and shutdown are
already waking up the page cleaner thread.
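
The idle handling described above can be sketched with standard C++ primitives. This is not the InnoDB code: the class PageCleanerSketch and its members are hypothetical, std::condition_variable stands in for mysql_cond_wait() and mysql_cond_timedwait(), and a plain counter stands in for the flush list.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for the page cleaner; not the InnoDB implementation.
class PageCleanerSketch
{
  std::mutex mutex_;
  std::condition_variable wake_;
  std::size_t dirty_pages_ = 0;
  bool idle_ = false;

public:
  // Analogous to inserting into the flush list: wake an idle cleaner.
  void add_dirty_page()
  {
    std::lock_guard<std::mutex> guard(mutex_);
    ++dirty_pages_;
    if (idle_)
    {
      idle_ = false;
      wake_.notify_one();
    }
  }

  // One iteration of the cleaner loop; returns the number of pages "written".
  std::size_t run_once(std::chrono::milliseconds period)
  {
    std::unique_lock<std::mutex> lock(mutex_);
    if (dirty_pages_ == 0)
    {
      // Nothing to do: note the idle state and sleep without a timeout.
      idle_ = true;
      wake_.wait(lock, [this] { return dirty_pages_ != 0; });
      idle_ = false;
      assert(dirty_pages_ != 0);
    }
    else
    {
      // Work is pending: pace the loop with a timed wait on an
      // absolute deadline.
      wake_.wait_until(lock, std::chrono::system_clock::now() + period);
    }
    const std::size_t written = dirty_pages_;
    dirty_pages_ = 0;
    return written;
  }
};
```

The idle flag is only read and written under the mutex, so a producer either sees the cleaner idle and signals it, or the cleaner sees the new page before it decides to sleep.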

7b1252c0

MDEV-24270: Clarify some comments · f693b725
Marko Mäkelä authored Nov 25, 2020

f693b725
Fix misspelling. · 2de95f7a
Vladislav Vaintroub authored Nov 25, 2020
```
Kudos to Marko for finding.
```
2de95f7a
Cleanup. Remove obsolete comment · af98fddc
Vladislav Vaintroub authored Nov 25, 2020

af98fddc
Cleanup. Provide accurate comment on my_getevents(). · c130c60b
Vladislav Vaintroub authored Nov 25, 2020

c130c60b
Partially Revert "MDEV-24270: Collect multiple completed events at a time" · 78df9e37
Vladislav Vaintroub authored Nov 25, 2020
```
This partially reverts commit 6479006e.

Remove the constant tpool::aio::N_PENDING, which has no
intrinsic meaning for the tpool.
```
78df9e37
Cleanup: Fix Intel compiler warnings about sign conversions · 3dfeae0e
Marko Mäkelä authored Nov 25, 2020

3dfeae0e
Cleanup: Remove redundant nonnull attributes · 4a22056c
Marko Mäkelä authored Nov 25, 2020

4a22056c

MDEV-24270: Collect multiple completed events at a time · 6479006e

Marko Mäkelä authored Nov 25, 2020

tpool::aio::N_PENDING: Replaces OS_AIO_N_PENDING_IOS_PER_THREAD.
This limits two similar things: the number of outstanding requests
that a thread may io_submit(), and the number of completed requests
collected at a time by io_getevents().

6479006e

MDEV-24270 Misuse of io_getevents() causes wake-ups at least twice per second · 7a9405e3

Marko Mäkelä authored Nov 25, 2020

In the asynchronous I/O interface, InnoDB is invoking io_getevents()
with a timeout value of half a second, and requesting exactly 1 event
at a time.

The reason to have such a short timeout is to facilitate shutdown.

We can do better: Use an infinite timeout, wait for a larger maximum
number of events. On shutdown, we will invoke io_destroy(), which
should lead to the io_getevents system call reporting EINVAL.

my_getevents(): Reimplement the libaio io_getevents() by only invoking
the system call. The library implementation would try to elide the
system call and return 0 immediately if aio_ring_is_empty() holds.
Here, we do want a blocking system call, not 100% CPU usage. Neither
do we want aio_ring_is_empty() to trigger SIGSEGV because it is
dereferencing some memory that was freed by io_destroy().
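
A minimal, Linux-only sketch of the idea behind my_getevents() follows. It is an assumption-laden illustration rather than the committed code: the name getevents_blocking is hypothetical, and the kernel ABI header is used so that the example does not need libaio.

```cpp
#include <cassert>
#include <cerrno>
#include <linux/aio_abi.h>  // aio_context_t, struct io_event
#include <sys/syscall.h>    // SYS_io_getevents
#include <unistd.h>         // syscall()

// Hypothetical helper; invokes the system call directly instead of the
// libaio wrapper, so the user-space ring buffer is never inspected.
// Returns the number of events collected, or a negative errno value.
int getevents_blocking(aio_context_t ctx, long min_nr, long nr,
                       io_event *events)
{
  assert(min_nr <= nr);
  const int saved_errno = errno;
  // A null timeout pointer requests an indefinite wait.
  long ret = syscall(SYS_io_getevents, ctx, min_nr, nr, events,
                     static_cast<void *>(nullptr));
  if (ret < 0)
  {
    ret = -errno;         // report the failure as a negative errno value
    errno = saved_errno;  // leave the caller's errno untouched
  }
  return static_cast<int>(ret);
}
```

A caller would loop on this function and treat a negative EINVAL result as the signal that the context was destroyed during shutdown.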

7a9405e3

24 Nov, 2020 1 commit

MDEV-24271 rw_lock::read_lock_yield() may cause writer starvation · 1b12e251

Marko Mäkelä authored Nov 24, 2020

The greedy fetch_add(1) approach of read_trylock() may cause
starvation of a waiting write lock request. Let us use a
compare-and-swap for the read lock acquisition in order to
guarantee the progress of writers.
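
The difference can be sketched with std::atomic. With a greedy fetch_add(1), a reader increments the lock word first and backs out afterwards, so a steady stream of readers can keep the word nonzero forever. With a compare-and-swap, a reader never modifies the word once it has seen a writer flag. The class below is a simplified, hypothetical lock word, not the actual rw_lock; the flag values and method names are assumptions.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical flag layout; the real rw_lock may differ.
constexpr uint32_t WRITER_WAITING = 1U << 30;
constexpr uint32_t WRITER = 1U << 31;

class RwLockSketch
{
  std::atomic<uint32_t> word_{0};

public:
  // Acquire a read lock unless a writer holds or is waiting for the lock.
  bool read_trylock()
  {
    uint32_t seen = 0;
    while (!word_.compare_exchange_strong(seen, seen + 1,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed))
      if (seen & (WRITER | WRITER_WAITING))
        return false;  // give way to the writer without touching the word
    return true;
  }

  void read_unlock()
  {
    uint32_t previous = word_.fetch_sub(1, std::memory_order_release);
    assert(previous & ~(WRITER | WRITER_WAITING));
    (void) previous;
  }

  // Announce a pending write lock request; new readers will be refused.
  void writer_wait_start()
  {
    word_.fetch_or(WRITER_WAITING, std::memory_order_relaxed);
  }

  // Succeeds only after all readers have released the lock.
  bool write_trylock()
  {
    uint32_t expected = WRITER_WAITING;
    return word_.compare_exchange_strong(expected, WRITER,
                                         std::memory_order_acquire,
                                         std::memory_order_relaxed);
  }
};
```

Once writer_wait_start() has been called, the reader count can only decrease, so the waiting writer is guaranteed to make progress.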

1b12e251

23 Nov, 2020 1 commit

MDEV-21534 fixup: Remove HAVE_IB_LINUX_FUTEX · dcdc8c35

Marko Mäkelä authored Nov 20, 2020

Since commit 30ea63b7
we actually depend on futex on Linux. Also, we have depended on
std::atomic for even longer.

dcdc8c35

20 Nov, 2020 5 commits

MDEV-24167: Remove PFS instrumentation of buf_block_t · 1e5d989d

Marko Mäkelä authored Nov 20, 2020

We always defined PFS_SKIP_BUFFER_MUTEX_RWLOCK, that is,
the latches of the buffer pool blocks were never instrumented
in PERFORMANCE_SCHEMA.

For some reason, the debug_latch (which enforces proper usage of
buffer-fixing in debug builds) was instrumented.

1e5d989d

MDEV-22871 fixup: Relax a debug assertion · 156cb94b

Marko Mäkelä authored Nov 20, 2020

In commit bf3c862f we introduced
an assertion that may dereference a null pointer.

This regression was caught by running the following:
./mtr --parallel=auto --suite=innodb \
--mysqld=--loose-innodb-adaptive-hash-index

The adaptive hash index is disabled by default since
commit 88cdfc5c (MDEV-20487)
and hence the problem was not caught earlier.

156cb94b

Run innodb_wl6326_big only in debug builds · 3c8ecb5b

Marko Mäkelä authored Nov 20, 2020

The test seems to deterministically fail on RelWithDebInfo builds
due to a timeout in wait_condition.inc.

According to Matthias Leich (the original author of the test),
the failure rate would reduce if we disabled the purge of
transaction history by setting innodb_force_recovery=2.

For now, let us run this stress test on debug builds only.

3c8ecb5b

Cleanup: Fix build problems with the Intel compiler · 8ac19be8

Marko Mäkelä authored Nov 19, 2020

fil_space_t::flush_low(): Define and declare without inline.

ut_is_2pow(): Remove UNIV_LIKELY. This is almost exclusively
used in debug assertions. UNIV_LIKELY is not compatible with
static_assert in some compilers.

8ac19be8

MDEV-21534 fixup: Use a compile-time constant · 9c455945
Marko Mäkelä authored Nov 19, 2020

9c455945

19 Nov, 2020 2 commits

Update MCS to resolve libmarias3 compilation for centos74-amd64-debug · a16e3c32

Roman Nozdrin authored Nov 10, 2020

pipeline in community BB

Fix for rebuild from source step

Disable MCS on i386|i686 platforms

This patch puts MCS debian packaging files and part of debian/control
into the engine directory

a16e3c32

MDEV-24125: linux large pages, linux/mman.h needed · 3b486c28
Daniel Black authored Nov 19, 2020
```
Centos/RHEL7 have the MAP_HUGE_SHIFT constant
defined in linux/mman.h which needed to get included.
```
3b486c28

18 Nov, 2020 1 commit

MDEV-24224 Gap lock on delete in 10.5 using READ COMMITTED · 33d41167

Marko Mäkelä authored Nov 18, 2020

When MDEV-19544 (commit 1a6f4704)
simplified the initialization of the local variable
set_also_gap_locks, an inadvertent change was included.
Essentially, all code branches that are executed when
set_also_gap_locks holds must also ensure that
trx->isolation_level > TRX_ISO_READ_COMMITTED holds.
This was being violated in a few code paths.

It turns out that there is an even simpler fix: Remove the test
of thd_is_select() completely. In that way, the first part of
UPDATE or DELETE should work exactly like SELECT...FOR UPDATE.

thd_is_select(): Remove.

33d41167

17 Nov, 2020 7 commits

Merge 10.4 into 10.5 · fabdad68
Marko Mäkelä authored Nov 17, 2020

fabdad68
Work around MDEV-24232: Skip perfschema.nesting if WITH_WSREP=OFF · bbf0b55c
Marko Mäkelä authored Nov 17, 2020

bbf0b55c

MDEV-24188 fixup: Simplify the wait loop · 83a55670

Marko Mäkelä authored Nov 17, 2020

Starting with commit 7cffb5f6 (MDEV-23399)
the function buf_flush_page() will first acquire block->lock and only
after that invoke set_io_fix(). Before that, it was possible to reach
a livelock between buf_page_create() and buf_flush_page().

buf_page_create(): Directly try acquiring the exclusive page latch
without checking whether the page is io-fixed or buffer-fixed.
(As a matter of fact, the have_x_latch() check is not strictly necessary,
because we still support recursive X-latches.)
In case of a latch conflict, wait while allowing buf_page_write_complete()
to acquire buf_pool.mutex and release the block->lock.

An attempt to wait for exclusive block->lock while holding buf_pool.mutex
would lead to a hang in the tests parts.part_supported_sql_func_innodb
and stress.ddl_innodb, due to a deadlock between buf_page_write_complete()
and buf_page_create().

Similarly, in case of an I/O fixed compressed-only
ROW_FORMAT=COMPRESSED page, we will sleep before retrying.

In both cases, we will sleep for 1ms or until a flush batch is completed.

83a55670

MDEV-24115 Fix -Wconversion in Timeval::Timeval() on Mac OS X · 796f708f

Dmitry Shulga authored Nov 17, 2020

The data member tv_usec of struct timeval is declared as suseconds_t
on MacOS. The size of suseconds_t is 4 bytes. On the other hand, the size
of ulong is 8 bytes on 64-bit MacOS, so an attempt to assign a value of
the wider type (usec) to a member (tv_usec) of the narrower type leads to
an error.
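
One way to silence such a -Wconversion diagnostic is an explicit cast to the type of the destination member. The sketch below illustrates the technique only; the helper make_timeval is hypothetical, and the cast actually used in Timeval::Timeval() may differ.

```cpp
#include <cassert>
#include <ctime>       // time_t
#include <sys/time.h>  // struct timeval

// Hypothetical helper; not the actual Timeval constructor.
timeval make_timeval(time_t sec, unsigned long usec)
{
  // Microseconds must fit in one second, so the narrowing is lossless.
  assert(usec < 1000000);
  timeval tv;
  tv.tv_sec = sec;
  // Cast to whatever type the platform declares for tv_usec
  // (suseconds_t on MacOS according to the message above).
  tv.tv_usec = static_cast<decltype(tv.tv_usec)>(usec);
  return tv;
}
```

Using decltype keeps the cast correct on every platform without naming the member type explicitly.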

796f708f

Merge 10.3 into 10.4 · f0c99037
Marko Mäkelä authored Nov 17, 2020

f0c99037
Fix suppression in MTR test galera_3nodes.inconsistency_shutdown · 694926a4
Daniele Sciascia authored Nov 17, 2020

694926a4

MDEV-23610: Slave user can't run "SHOW SLAVE STATUS" anymore after upgrade to... · c815ffb9

Sujatha authored Nov 16, 2020

MDEV-23610: Slave user can't run "SHOW SLAVE STATUS" anymore after upgrade to 10.5, mysql_upgrade should take care of that

Post push fix. Update version to 10.5.8.

c815ffb9

16 Nov, 2020 7 commits

MDEV-24125: allow compile on Linux headers < 3.8 · 7f30a5c4

Daniel Black authored Nov 16, 2020

This allows MariaDB to compile on old (limited to >2.6.32)
Linux kernel versions.

This warns that attempts to use large pages will rely on
implicit kernel determination.

7f30a5c4

MDEV-24125: linux large pages - Revert "Fixed centos 6 build failure" · 8cc5d284

Daniel Black authored Nov 16, 2020

This reverts commit 6cf8f05f.

The original patch assumed that MAP_HUGETLB was consistent across
architectures, which isn't the case. Defining it unconditionally
broke large pages on every architecture where the value differed
from x86_64.

With the EOL for Centos/RHEL6 announced in 10.5.7, <3.8 linux
kernels are no longer supported.

8cc5d284

MDEV-24124: main.drop test - multiarch/os error messages · 0fc0eb1e
Daniel Black authored Nov 16, 2020
```
Account for variety of mips, hppa, solaris and other messages.

Copied from rpl.rpl_drop_db test.
```
0fc0eb1e
Do not run maria.repair with --embedded as memory usage is different · eae9311f
Monty authored Nov 16, 2020

eae9311f
Restore autoincrement offset in MTR test MDEV-24063 · 1ae7809a
Daniele Sciascia authored Nov 13, 2020

1ae7809a

MDEV-23610: Slave user can't run "SHOW SLAVE STATUS" anymore after upgrade to... · 2b347e9f

Sujatha authored Nov 16, 2020

MDEV-23610: Slave user can't run "SHOW SLAVE STATUS" anymore after upgrade to 10.5, mysql_upgrade should take care of that

Fixing a post push test issue.

2b347e9f

MDEV-23610: Slave user can't run "SHOW SLAVE STATUS" anymore after upgrade to... · 6da68049

Sujatha authored Nov 16, 2020

MDEV-23610: Slave user can't run "SHOW SLAVE STATUS" anymore after upgrade to 10.5, mysql_upgrade should take care of that

Add a new privilege "SLAVE MONITOR" which will grant user the permission
to execute "SHOW SLAVE STATUS" and "SHOW RELAYLOG EVENTS" commands.

SHOW SLAVE STATUS requires either SLAVE MONITOR/SUPER
SHOW RELAYLOG EVENTS requires SLAVE MONITOR privilege.

6da68049

14 Nov, 2020 5 commits
- This patch puts MCS debian packaging files and part of debian/control · 1edd2243
  Roman Nozdrin authored Nov 14, 2020
```
into the engine directory
```
  1edd2243
- MDEV-24098: 10.5 followup · 81b9c785
  Oleksandr Byelkin authored Nov 14, 2020
```
remove version data from the test output
```
  81b9c785
- Merge branch '10.4' into 10.5 · 6daf6bbc
  Oleksandr Byelkin authored Nov 14, 2020
  
  6daf6bbc
- Merge branch '10.3' into 10.4 · 1bebc8de
  Oleksandr Byelkin authored Nov 14, 2020
  
  1bebc8de
- Merge branch '10.2' into 10.3 · a00e21c0
  Oleksandr Byelkin authored Nov 14, 2020
  
  a00e21c0