Commits · 208233be5af55072d7ef80c37ddbc664bc51f342 · nexedi / MariaDB

22 Feb, 2021 1 commit

MDEV-24830 : Write a warning to error log if Galera replicates InnoDB table with no primary key · 208233be

Jan Lindström authored Feb 10, 2021

Two new features for Galera
* Write a warning to error log if Galera replicates table with storage engine not supported by Galera (at the moment only InnoDB is supported)
** Warning is pushed to client also
** MyISAM is allowed if wsrep_replicate_myisam=ON
* Write a warning to error log if Galera replicates table with no primary key
** Warning is pushed to client also
** MyISAM is allowed if wsrep_replicate_myisam=ON
* In both cases apply flood control if the same warning is written to error log more than 10 times
(requires log_warnings > 1), flood control will suppress warnings for 300 seconds
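
The flood control described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the type WarningFloodControl and its members are hypothetical, and it assumes that the 11th identical warning starts the 300-second suppression window and that the counter restarts once the window ends.

```cpp
#include <cassert>
#include <ctime>

// Values taken from the commit message.
constexpr unsigned flood_threshold = 10;
constexpr std::time_t flood_suppress_seconds = 300;

// Hypothetical per-warning flood-control state (illustrative names only).
struct WarningFloodControl
{
  unsigned count = 0;                // warnings logged in the current burst
  std::time_t suppressed_until = 0;  // 0 means that nothing is suppressed

  // Returns true if the warning should be written to the error log at `now`.
  bool should_log(std::time_t now)
  {
    assert(count <= flood_threshold);
    if (suppressed_until != 0)
    {
      if (now < suppressed_until)
        return false;                // still inside the suppression window
      suppressed_until = 0;          // window is over: start a new burst
      count = 0;
    }
    if (++count > flood_threshold)
    {
      // Assumption: the warning that exceeds the threshold is itself dropped.
      suppressed_until = now + flood_suppress_seconds;
      count = 0;
      return false;
    }
    return true;
  }
};
```

With this sketch, ten identical warnings in a row are logged, the eleventh is dropped, and logging resumes 300 seconds later.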

208233be

20 Feb, 2021 1 commit

MDEV-24854: Change innodb_flush_method=O_DIRECT by default · 420f8e24

Marko Mäkelä authored Feb 20, 2021

We have innodb_use_native_aio=ON by default since the introduction of
that parameter in commit 2f9fb41b
(MySQL 5.5 and MariaDB 5.5).

However, to really benefit from the setting, the files should be
opened in O_DIRECT mode, to bypass the file system cache.
In this way, the reads and writes can be submitted with DMA, using
the InnoDB buffer pool directly, and no processor cycles need to be
used for copying data. The use of O_DIRECT benefits not only the
current libaio implementation, but also liburing.

os_file_set_nocache(): Test innodb_flush_method in the function,
not in the callers.

420f8e24

18 Feb, 2021 2 commits

MDEV-24915 Galera conflict resolution is unnecessarily complex · 43b239a0

Marko Mäkelä authored Feb 18, 2021

The fix of MDEV-23328 introduced a background thread for
killing conflicting transactions.
Thanks to the refactoring that was conducted in MDEV-24671,
the high-priority ("brute-force") applier thread can kill the
conflicting transactions itself, before waiting for the
locks to be finally released (after the conflicting transactions
have been rolled back).

This also allows us to remove the hack LockGGuard that had to
be added in MDEV-20612, and remove Galera-related function
parameters from lock creation.

43b239a0

MDEV-20612 fixup: Remove a redundant check · 18dc5b01

Marko Mäkelä authored Feb 18, 2021

lock_wait_rpl_report(): Only reload trx->lock.wait_lock
if lock_sys.wait_mutex had to be released and reacquired.

18dc5b01

17 Feb, 2021 11 commits

MDEV-24887 Tests fail on macos because mysqltest can't use nonblock API · 9a907868
Robert Bindar authored Feb 17, 2021

9a907868
Merge 10.5 into 10.6 · 94b45787
Marko Mäkelä authored Feb 17, 2021

94b45787

MDEV-19168: Add ssl-flush command. (#1749) · 66b8edf8

Kartik Soneji authored Feb 17, 2021

* MDEV-19168: Add ssl-flush command.
Improve flush error messages and move error printing into the `flush` function.

66b8edf8

MDEV-24738 fixup: heap-use-after-poison in lock_sys_t::deadlock_check() · 9f136700

Marko Mäkelä authored Feb 17, 2021

Deadlock::report(): Require the caller to acquire lock_sys.latch
if invoking on a transaction that is not owned by the current thread.

9f136700

Keep old GCC quiet. · e92c34ce
Vladislav Vaintroub authored Feb 17, 2021

e92c34ce
Merge mariadb-10.5.9 · 16388f39
Marko Mäkelä authored Feb 17, 2021

16388f39

MDEV-24848 Assertion rlen<llen failed when applying MEMSET · d82386b6

Marko Mäkelä authored Feb 17, 2021

btr_cur_upd_rec_in_place(): Prefer WRITE to MEMSET for a single-byte
operation.

log_phys_t::apply(): Relax the assertion to allow a single-byte MEMSET,
even though it is 1 byte longer than a WRITE record.

d82386b6

MDEV-24738 Improve the InnoDB deadlock checker · c68007d9

Marko Mäkelä authored Feb 17, 2021

A new configuration parameter innodb_deadlock_report is introduced:
* innodb_deadlock_report=off: Do not report any details of deadlocks.
* innodb_deadlock_report=basic: Report transactions and waiting locks.
* innodb_deadlock_report=full (default): Report also the blocking locks.

The improved deadlock checker will consider all involved transactions
in one loop, even if the deadlock loop includes several transactions.
The theoretical maximum number of transactions that can be involved in
a deadlock is `innodb_page_size` * 8, limited by the persistent data
structures.

Note: Similar to
mysql/mysql-server@3859219875b62154b921e8c6078c751198071b9c
our deadlock checker will consider at most one blocking transaction
for each waiting transaction. The new field trx->lock.wait_trx will be
nullptr if and only if trx->lock.wait_lock is nullptr. Note that
trx->lock.wait_lock->trx == trx (the waiting transaction), while
trx->lock.wait_trx points to one of the transactions whose lock is
conflicting with trx->lock.wait_lock.

Considering only one blocking transaction will greatly simplify
our deadlock checker, but it may also make the deadlock checker
blind to some deadlocks where the deadlock cycle is 'hidden' by
the fact that the registered trx->lock.wait_trx is not actually
waiting for any InnoDB lock, but something else. So, instead of
deadlocks, sometimes lock wait timeout may be reported.

To improve on this, whenever trx->lock.wait_trx is changed, we
will register further 'candidate' transactions in Deadlock::to_check(),
and check for 'revealed' deadlocks as soon as possible, in lock_release()
and innobase_kill_query().

The old DeadlockChecker was holding lock_sys.latch, even though using
lock_sys.wait_mutex should be less contended (and thus preferred)
in the likely case that no deadlock is present.

lock_wait(): Defer the deadlock check to this function, instead of
executing it in lock_rec_enqueue_waiting(), lock_table_enqueue_waiting().

DeadlockChecker: Complete rewrite:
(1) Explicitly keep track of transactions that are being waited for,
in trx->lock.wait_trx, protected by lock_sys.wait_mutex. Previously,
we were painstakingly traversing the lock heaps while blocking
concurrent registration or removal of any locks (even uncontended ones).
(2) Use Brent's cycle-detection algorithm for deadlock detection,
traversing each trx->lock.wait_trx edge at most 2 times.
(3) If a deadlock is detected, release lock_sys.wait_mutex,
acquire LockMutexGuard, re-acquire lock_sys.wait_mutex and re-invoke
find_cycle() to find out whether the deadlock is still present.
(4) Display information on all transactions that are involved in the
deadlock, and choose a victim to be rolled back.

lock_sys.deadlocks: Replaces lock_deadlock_found. Protected by wait_mutex.

Deadlock::find_cycle(): Quickly find a cycle of trx->lock.wait_trx...
using Brent's cycle detection algorithm.
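
As an illustration of the technique, Brent's cycle detection over a chain of single waits-for edges can be sketched as below. The Trx struct is a simplified stand-in that models only wait_trx; latching, victim selection and reporting are left out, so this is not the InnoDB implementation itself.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for a transaction; only the waits-for edge is modeled.
struct Trx
{
  Trx *wait_trx = nullptr;  // the one transaction this one is waiting for
};

// Brent's cycle detection along the wait_trx edges, starting from `start`.
// Returns a transaction that lies on a cycle, or nullptr if the chain ends.
Trx *find_cycle(Trx *start)
{
  assert(start != nullptr);
  Trx *tortoise = start;
  Trx *hare = start;
  std::size_t power = 1;  // size of the current search window
  std::size_t steps = 1;  // steps taken by the hare inside this window
  while ((hare = hare->wait_trx) != nullptr)
  {
    if (hare == tortoise)
      return hare;        // the hare came back to the tortoise: a cycle
    if (steps == power)
    {
      tortoise = hare;    // move the tortoise up and double the window
      power *= 2;
      steps = 0;
    }
    ++steps;
  }
  return nullptr;         // reached a transaction that is not waiting
}
```

Because every transaction has at most one outgoing edge, the walk either ends at a transaction that is not waiting or eventually enters a cycle, which the doubling window is guaranteed to catch.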

Deadlock::report(): Report a deadlock cycle that was found by
Deadlock::find_cycle(), and choose a victim with the least weight.
Altogether, we may traverse each trx->lock.wait_trx edge up to 5
times (2 invocations of find_cycle() at up to 2 traversals each,
plus 1 time for reporting and choosing the victim).

Deadlock::check_and_resolve(): Find and resolve a deadlock.

lock_wait_rpl_report(): Report the waits-for information to
replication. This used to be executed as part of DeadlockChecker.
Replication must know the waits-for relations even if no deadlocks
are present in InnoDB.

Reviewed by: Vladislav Vaintroub

c68007d9

MDEV-24738: Extend the test innodb.deadlock_detect · 3ddb4fdd
Marko Mäkelä authored Feb 15, 2021

3ddb4fdd

MDEV-24884 Hang in ssux_lock_low::write_lock() · 272a1289

Marko Mäkelä authored Feb 17, 2021

ssux_lock_low::write_lock(): Before invoking writer_wait(), keep
attempting write_lock_wait_try() as long as no conflict exists.

rw_lock::upgrade_trylock(): Relax a bogus assertion and correct
the acquisition operation. Another thread may be executing in
ssux_lock_low::write_lock() on the same latch. Because we are the
only thread that can make progress on that latch, we must become
the writer. Any waiting thread will be eventually woken up by
ssux_lock_low::u_unlock() or ssux_lock_low::wr_unlock(), but not
by wr_u_downgrade() because the upgrade is a very rare operation.

272a1289

MDEV-20612 fixup: Make comments refer to lock_sys.latch · 584e5211
Marko Mäkelä authored Feb 17, 2021

584e5211

16 Feb, 2021 4 commits

List of unstable tests for 10.5.9 release · 9d7dc1f6

Elena Stepanova authored Feb 16, 2021

Test code modifications and new failures from buildbot were registered
only for the main suite. The rest was updated partially,
based on the status of existing JIRA items

9d7dc1f6

MDEV-24341 - followup remove assert. · 1146e98b
Vladislav Vaintroub authored Feb 16, 2021

1146e98b
MDEV-20612 fixup: Fix a memory leak in buffer pool resize · e5d83ad4
Marko Mäkelä authored Feb 16, 2021

e5d83ad4

galera.galera_gra_log crashes · ae7989ca

Sergei Golubchik authored Feb 16, 2021

reset thd->lex->query_tables_own_last,
because open_table() uses it and will try to dereference
whatever garbage it might have

ae7989ca

15 Feb, 2021 7 commits

Ignore reporting in thd_progress_report() if we cannot lock LOCK_thd_data · fc5e03f0

Monty authored Feb 15, 2021

The reason for this is that Galera can lock LOCK_thd_data for a long time.

Instead of stalling any long running process, like alter or repair table,
because of progress reporting, ignore the progress reporting for this
call. Progress reporting will continue on the next call after the lock has
been released.
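
The "skip instead of stall" idea can be sketched with a try-lock. Everything here is illustrative: Progress, BusyMutex and report_progress() are hypothetical names, and the real thd_progress_report() works on THD state that is not modeled.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical progress state that would normally live in the THD.
struct Progress
{
  unsigned stage = 0;
  unsigned reports = 0;
};

// Stand-in for a mutex that another thread holds for a long time.
struct BusyMutex
{
  void lock() {}
  bool try_lock() { return false; }
  void unlock() {}
};

// Update the progress only if the mutex can be taken without waiting.
// Returns false if this report was skipped.
template <class Mutex>
bool report_progress(Mutex &lock_thd_data, Progress &progress, unsigned stage)
{
  std::unique_lock<Mutex> guard(lock_thd_data, std::try_to_lock);
  if (!guard.owns_lock())
    return false;  // do not stall the caller; the next call will try again
  progress.stage = stage;
  progress.reports++;
  return true;
}
```

A skipped report loses nothing permanently, because the next successful call overwrites the stage with a more recent value anyway.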

fc5e03f0

MDEV-24864 Fatal error in buf_page_get_low() / fseg_page_is_free() · 638ede5b

Marko Mäkelä authored Feb 15, 2021

The fix of MDEV-24569 and MDEV-24695 introduced a race condition when
a table is being rebuilt or dropped during the fseg_page_is_free() check.
The server would occasionally crash during the execution of the test
encryption.create_or_replace.

The fil_space_t::STOPPING flag can be set by DDL operations. Normally,
such concurrent operations are prevented by a metadata lock (MDL).
However, neither the change buffer merge nor the fil_crypt_thread() are
protected by MDL.

fil_crypt_read_crypt_data(), xdes_get_descriptor_const(): Pass the
BUF_GET_POSSIBLY_FREED flag to avoid the fatal error in buf_page_get_low()
if a DDL operation was just initiated.

638ede5b

update PCRE2 ref · f13f9663
Sergei Golubchik authored Feb 15, 2021

f13f9663
columnstore 5.5.1-2 · a89b7c1d
Sergei Golubchik authored Feb 15, 2021

a89b7c1d
Merge branch 'bb-10.4-release' into bb-10.5-release · 25d9d2e3
Sergei Golubchik authored Feb 15, 2021

25d9d2e3
Comment on assertion in row_rename_table_for_mysql() · 8eaf4bc5
Aleksey Midenkov authored Feb 15, 2021

Related to commit f0baa864 and MDEV-23632.

8eaf4bc5

MDEV-24861 Assertion `trx->rsegs.m_redo.rseg' failed in innodb_prepare_commit_versioned · 2e84846e

Marko Mäkelä authored Feb 15, 2021

trx_t::commit_tables(): Ensure that mod_tables will be empty.
This was broken in commit b08448de
where the query cache invalidation was moved from lock_release().

2e84846e

14 Feb, 2021 3 commits

MDEV-24855 ER_CRASHED_ON_USAGE or Assertion `length <= column->length' · 34c65402

Monty authored Feb 15, 2021

When creating a summary temporary table with bit fields used in the sum
expression with several parameters, like GROUP_CONCAT(), the counting of
bits needed in the record was wrong.

The reason we got an assert in Aria was because the bug caused a memory
overwrite in the record and Aria noticed that the data was 'impossible'.

34c65402

updating @@wsrep_cluster_address deadlocks · 26965387

Sergei Golubchik authored Feb 12, 2021

wsrep_cluster_address_update() causes LOCK_wsrep_slave_threads
to be locked under LOCK_wsrep_cluster_config, while normally
the order should be the opposite.

Fix: don't protect @@wsrep_cluster_address value with the
LOCK_wsrep_cluster_config, LOCK_global_system_variables is enough.

Only protect wsrep reinitialization with the LOCK_wsrep_cluster_config.
And make it use a local copy of the global @@wsrep_cluster_address.

Also, introduce a helper function that checks whether
wsrep_cluster_address is set and also asserts that it can be safely
read by the caller.
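
The shape of the fix can be sketched as follows: the value is read into a local copy under the lock that protects system variables, and only the reinitialization runs under the configuration lock, so the two mutexes are never held at the same time. ClusterConfig and its members are hypothetical stand-ins, not the server's actual data structures.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical stand-in for the global state involved in the fix.
struct ClusterConfig
{
  std::mutex lock_global_system_variables;  // protects cluster_address
  std::mutex lock_cluster_config;           // protects reinitialization only
  std::string cluster_address;              // the @@wsrep_cluster_address value
  std::string active_address;               // what was last initialized

  void set_address(const std::string &value)
  {
    std::lock_guard<std::mutex> guard(lock_global_system_variables);
    cluster_address = value;
  }

  void reinitialize()
  {
    std::string local_copy;
    {
      // Copy the global value; this lock is released before the next one
      // is taken, so no lock-order dependency is created.
      std::lock_guard<std::mutex> guard(lock_global_system_variables);
      local_copy = cluster_address;
    }
    std::lock_guard<std::mutex> guard(lock_cluster_config);
    active_address = local_copy;  // reinitialize from the local copy
  }
};
```

The cost of this design is that a concurrent SET may change the global value after the copy was taken; the reinitialization then simply uses the value it saw.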

26965387

MDEV-24341 Innodb - do not block in foreground thread in log_write_up_to( · 4df0249b
Vladislav Vaintroub authored Feb 14, 2021

4df0249b

12 Feb, 2021 11 commits

MDEV-24833 : Signal 11 on wsrep_can_run_in_toi at wsrep_mysqld.cc:1994 · b3df194e

Jan Lindström authored Feb 12, 2021

The problem was that when engine substitution is allowed (e.g. sql_mode='')
we must also check db_type. Additionally, we did not resolve the
default storage engine in that case and use it to check whether
TOI is possible.

b3df194e

fix a 3-way deadlock in galera_sr.galera-features#56 · b91e77cf

Sergei Golubchik authored Feb 12, 2021

rarely (try --repeat 1000), the following happens:

* from wsrep_bf_abort (when a thread is being killed), wsrep-lib
starts streaming_rollback that wants to
convert_streaming_client_to_applier. wsrep_create_streaming_applier
creates a new THD(). All while the other THD is being killed,
so under LOCK_thd_kill and LOCK_thd_data. In particular, THD::init()
takes LOCK_global_system_variables under LOCK_thd_kill.

* updating @@wsrep_slave_threads takes LOCK_global_system_variables
and LOCK_wsrep_cluster_config (in that order) and invokes
wsrep_slave_threads_update() that takes LOCK_wsrep_slave_threads

* wsrep_replication_process() takes LOCK_wsrep_slave_threads and
invokes wsrep_close_applier(), that does thd->set_killed() which
takes LOCK_thd_kill.

et voilà.

As a fix I copied a workaround from wsrep_cluster_address_update()
to wsrep_slave_threads_update(). It seems to be safe: without mutexes
a race condition is possible and a concurrent SET might change
wsrep_slave_threads, but wsrep_slave_threads_update() always verifies
if there's a need to do something, so it will not run twice in this case,
it'll be a no-op.

b91e77cf

remove find_thread_with_thd_data_lock_callback · 259b9452
Sergei Golubchik authored Feb 12, 2021

let the caller take the lock if needed

259b9452
MDEV-23328 Server hang due to Galera lock conflict resolution · eac8341d
Sergei Golubchik authored Feb 07, 2021

adaptation of 29bbcac0 for 10.4

eac8341d
don't take mutexes conditionally · 9703cffa
Sergei Golubchik authored Feb 05, 2021

9703cffa

cleanup: THD::abort_current_cond_wait() · 259a1902

Sergei Golubchik authored Feb 05, 2021

* reuse the loop in THD::abort_current_cond_wait, don't duplicate it
* find_thread_by_id should return whatever it has found, it's the
  caller's task not to kill COM_DAEMON (if the caller's a killer)

and other minor changes

259a1902

List of unstable tests for 10.4.18 release · cbbcc8fa

Elena Stepanova authored Feb 08, 2021

Test code modifications and new failures from buildbot were registered
only for the main suite. The rest was updated partially,
based on the status of existing JIRA items

cbbcc8fa

Merge branch 'bb-10.3-release' into bb-10.4-release · 00a313ec

Sergei Golubchik authored Feb 12, 2021

Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution"
was null-merged. 10.4 version of the fix is coming up separately

00a313ec

MDEV-24643: Assertion failed in rw_lock::update_unlock() · a1542f8a

Marko Mäkelä authored Feb 12, 2021

mtr_defer_drop_ahi(): Upgrade the U lock to X lock and downgrade
it back to U lock in case the adaptive hash index needs to be dropped.

This regression was introduced in
commit 03ca6495 (MDEV-24142).

a1542f8a

MDEV-20612: Enable concurrent lock_release() · 26d6224d

Marko Mäkelä authored Feb 12, 2021

lock_release_try(): Try to release locks while only holding
shared lock_sys.latch.

lock_release(): If 5 attempts of lock_release_try() fail,
proceed to acquire exclusive lock_sys.latch.
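
The "optimistic attempts, then exclusive fallback" pattern might look like the sketch below. The function release_locks() and its callback parameters are hypothetical; only the limit of 5 attempts and the shared/exclusive distinction come from the commit message.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <shared_mutex>

// Try to release locks while holding `latch` only in shared mode.
// After `max_attempts` failures, take the latch exclusively and release
// the locks unconditionally. Returns true if a shared attempt succeeded.
bool release_locks(std::shared_mutex &latch,
                   const std::function<bool()> &try_release_shared,
                   const std::function<void()> &release_exclusive,
                   int max_attempts = 5)
{
  assert(max_attempts > 0);
  for (int attempt = 0; attempt < max_attempts; attempt++)
  {
    std::shared_lock<std::shared_mutex> shared(latch);
    if (try_release_shared())
      return true;   // released without blocking other shared holders
  }
  std::unique_lock<std::shared_mutex> exclusive(latch);
  release_exclusive();  // slow path: nobody else can hold the latch now
  return false;
}
```

The shared latch is dropped between attempts, which gives a conflicting thread a chance to finish before the next try.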

26d6224d

MDEV-20612: Partition lock_sys.latch · b08448de

Marko Mäkelä authored Feb 12, 2021

We replace the old lock_sys.mutex (which was renamed to lock_sys.latch)
with a combination of a global lock_sys.latch and table or page hash lock
mutexes.

The global lock_sys.latch can be acquired in exclusive mode, or
it can be acquired in shared mode and another mutex will be acquired
to protect the locks for a particular page or a table.
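
A rough sketch of this two-level scheme, using standard library primitives in place of InnoDB's own latches: ShardedLatch, ShardGuard, ExclusiveGuard and LockTable are hypothetical names, and the shard count of 16 is arbitrary.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <shared_mutex>

class ShardedLatch
{
public:
  static constexpr std::size_t n_shards = 16;  // arbitrary for this sketch

  // RAII: global latch in shared mode first, then the mutex of one shard.
  // Members are destroyed in reverse order, so the shard mutex is always
  // released before the global latch.
  class ShardGuard
  {
  public:
    ShardGuard(ShardedLatch &latch, std::size_t key)
      : global_(latch.global_), shard_(latch.shards_[key % n_shards]) {}
  private:
    std::shared_lock<std::shared_mutex> global_;
    std::lock_guard<std::mutex> shard_;
  };

  // RAII: global latch in exclusive mode; no shard mutex is needed.
  class ExclusiveGuard
  {
  public:
    explicit ExclusiveGuard(ShardedLatch &latch) : global_(latch.global_) {}
  private:
    std::unique_lock<std::shared_mutex> global_;
  };

private:
  std::shared_mutex global_;
  std::array<std::mutex, n_shards> shards_;
};

// Toy lock table: one counter per shard instead of real lock objects.
struct LockTable
{
  ShardedLatch latch;
  std::array<int, ShardedLatch::n_shards> locks_per_shard{};

  void add_lock(std::size_t page_id)
  {
    ShardedLatch::ShardGuard guard(latch, page_id);
    ++locks_per_shard[page_id % ShardedLatch::n_shards];
  }

  int total()
  {
    ShardedLatch::ExclusiveGuard guard(latch);  // sees all shards at once
    int sum = 0;
    for (int n : locks_per_shard)
      sum += n;
    return sum;
  }
};
```

Operations on different shards can proceed concurrently under the shared global latch, while an operation that needs a consistent view of everything takes the global latch exclusively.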

This is inspired by
mysql/mysql-server@1d259b87a63defa814e19a7534380cb43ee23c48
but the optimization of lock_release() will be done in the next commit.
Also, we will interleave mutexes with the hash table elements, similar
to how buf_pool.page_hash was optimized
in commit 5155a300 (MDEV-22871).

dict_table_t::autoinc_trx: Use Atomic_relaxed.

dict_table_t::autoinc_mutex: Use srw_mutex in order to reduce the
memory footprint. On 64-bit Linux or OpenBSD, both this and the new
dict_table_t::lock_mutex should be 32 bits and be stored in the same
64-bit word. On Microsoft Windows, the underlying SRWLOCK is 32 or 64
bits, and on other systems, sizeof(pthread_mutex_t) can be much larger.

ib_lock_t::trx_locks, trx_lock_t::trx_locks: Document the new rules.
Writers must assert lock_sys.is_writer() || trx->mutex_is_owner().

LockGuard: A RAII wrapper for acquiring a page hash table lock.

LockGGuard: Like LockGuard, but when Galera Write-Set Replication
is enabled, we must acquire all shards, for updating arbitrary trx_locks.

LockMultiGuard: A RAII wrapper for acquiring two page hash table locks.

lock_rec_create_wsrep(), lock_table_create_wsrep(): Special
Galera conflict resolution in non-inlined functions in order
to keep the common code paths shorter.

lock_sys_t::prdt_page_free_from_discard(): Refactored from
lock_prdt_page_free_from_discard() and
lock_rec_free_all_from_discard_page().

trx_t::commit_tables(): Replaces trx_update_mod_tables_timestamp().

lock_release(): Let trx_t::commit_tables() invalidate the query cache
for those tables that were actually modified by the transaction.
Merge lock_check_dict_lock() to lock_release().

We must never release lock_sys.latch while holding any
lock_sys_t::hash_latch. Violating this rule could lead to
memory corruption if the buffer pool is resized between
the time lock_sys.latch is released and the hash_latch is released.

b08448de