Commits · 28e166d6435741c46e0ea789657b739f78eb0425 · nexedi / MariaDB

21 Jan, 2022 1 commit

MDEV-26784 [Warning] InnoDB: Difficult to find free blocks in the buffer pool · 28e166d6

Thirunarayanan Balathandayuthapani authored Jan 20, 2022

Problem:
=======
  InnoDB ran out of memory during recovery and failed to
flush the dirty LRU blocks. The reason is that the buffer pool
can run out before the LRU list length reaches the
BUF_LRU_OLD_MIN_LEN(256) threshold.

Fix:
====
During recovery, InnoDB should write out and evict all
dirty blocks.
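
The effect of the fix can be pictured with a toy model. This is only a sketch under stated assumptions, not InnoDB code: the function name, the list-of-tuples representation of the LRU list, and the constant name `LRU_MIN_LEN` are hypothetical; only the value 256 comes from the message above.

```python
LRU_MIN_LEN = 256  # threshold value quoted in the commit message


def lru_blocks_to_flush(lru, recovery):
    """Return the page ids that a flush batch would write out.

    lru is a list of (page_id, dirty) tuples, oldest block first.
    """
    if recovery:
        # During recovery, ignore the length threshold: every dirty
        # block is a candidate for write-out and eviction.
        return [page_id for page_id, dirty in lru if dirty]
    # Outside recovery, only the blocks beyond the minimum LRU length
    # are considered, so a short list yields nothing to flush.
    excess = max(0, len(lru) - LRU_MIN_LEN)
    return [page_id for page_id, dirty in lru[:excess] if dirty]
```

With a 100-block LRU list, the non-recovery branch selects nothing, which mirrors the situation described under "Problem"; the recovery branch selects every dirty block.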

28e166d6

20 Jan, 2022 2 commits

MDEV-26223 Galera cluster node consider old server_id value even after... · a0f711e9

Jan Lindström authored Jan 20, 2022

MDEV-26223 Galera cluster node consider old server_id value even after modification of server_id [wsrep_gtid_mode=ON]

For a non-bootstrap node, server_id should be ignored, because using a
custom value can lead to inconsistency problems with replicated GTIDs in
the cluster. A warning message is issued when this happens.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>

a0f711e9

MDEV-27550: Disable galera.MW-328D · 66465914
Marko Mäkelä authored Jan 20, 2022

66465914

19 Jan, 2022 1 commit

MDEV-27382: OFFSET is ignored when combined with DISTINCT · 7259b299

Sergei Petrunia authored Jan 13, 2022

A query of the form

  SELECT DISTINCT expr_that_is_inferred_to_be_const LIMIT 0 OFFSET n

produces one row when it should produce none. The issue was in
JOIN_TAB::remove_duplicates() in the piece of logic that tried to
avoid duplicate removal for such cases but didn't account for possible
"LIMIT 0".

Fixed by making Select_limit_counters::set_limit() change OFFSET to 0
when LIMIT is 0.
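
A minimal sketch of the normalization described above, assuming a simplified stand-in for Select_limit_counters. The class `LimitCounters`, its field names, and the helper `rows_returned()` are hypothetical and only illustrate the idea; they are not the server's code.

```python
from dataclasses import dataclass


@dataclass
class LimitCounters:
    """Toy stand-in for Select_limit_counters (field names are assumed)."""
    select_limit: int = 0
    offset: int = 0

    def set_limit(self, limit, offset):
        # With LIMIT 0 no row can ever be returned, so the OFFSET is
        # irrelevant; resetting it to 0 keeps later checks simple.
        self.select_limit = limit
        self.offset = 0 if limit == 0 else offset


def rows_returned(rows, counters):
    """Apply OFFSET, then LIMIT, to an already de-duplicated row list."""
    start = counters.offset
    return rows[start:start + counters.select_limit]
```

After `set_limit(0, n)` the stored offset is 0 for any `n`, so code that special-cases "offset is zero" and code that special-cases "limit is zero" see a consistent state.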

7259b299

18 Jan, 2022 2 commits

MDEV-27025 insert-intention lock conflicts with waiting ORDINARY lock · be811386

Vlad Lesin authored Jan 11, 2022

The code was backported from the 10.6 commit bd03c0e5.
See that commit message for details.

Apart from the above commit trx_lock_t::wait_trx was also backported from
MDEV-24738. trx_lock_t::wait_trx is protected with lock_sys.wait_mutex
in 10.6, but that mutex was implemented only in MDEV-24789. As there is no
need to backport MDEV-24789 for MDEV-27025,
trx_lock_t::wait_trx is protected with the same mutexes as
trx_lock_t::wait_lock.

This fix should not break innodb-lock-schedule-algorithm=VATS. This
algorithm uses an Eldest-Transaction-First (ETF) heuristic, which prefers
older transactions over new ones. In this fix we just insert the granted
lock just before the last granted lock of the same transaction, which does
not change the transaction execution order.

The changes in lock_rec_create_low() should not break Galera Cluster,
because there is a big "if" branch for WSREP. This branch is necessary to
provide the correct transaction execution order, and should not be changed
for the current bug fix.

be811386

MDEV-27499 Performance regression in log_checkpoint_margin() · e44439ab

Marko Mäkelä authored Jan 17, 2022

In commit 4c3ad244 (MDEV-27416)
an unnecessarily strict wait condition was introduced in the
function buf_flush_wait(). Most callers actually only care that
the pages have been flushed, not that a checkpoint has completed.

Only in the buf_flush_sync() call for log resizing, we might care
about the log checkpoint. But, in fact,
srv_prepare_to_delete_redo_log_file() is explicitly disabling
checkpoints. So, we can simply remove the unnecessary wait loop.

Thanks to Krunal Bauskar for reporting this performance regression
that we failed to repeat in our testing.

e44439ab

17 Jan, 2022 3 commits

MDEV-26230 mysql_upgrade fails to load type_mysql_json due to insufficient maturity level · 745aa8be

Sergei Golubchik authored Dec 29, 2021

bump maturity to beta

745aa8be

MDEV-25373 DROP TABLE doesn't raise error while dropping non-existing table in... · 5af6a137

Sergei Golubchik authored Dec 29, 2021

MDEV-25373 DROP TABLE doesn't raise error while dropping non-existing table in MariaDB 10.5.9 when OQGraph SE is loaded to the server

don't auto-succeed every DROP TABLE

5af6a137

MDEV-27461: Buffer pool resize fails to wake up the page cleaner · f18e2564

Marko Mäkelä authored Jan 17, 2022

buf_pool_t::realloc(): Invoke page_cleaner_wakeup()
if buf_LRU_get_free_only() returns a null pointer.

Ever since commit 7b1252c0 (MDEV-24278)
the page cleaner would remain in untimed sleep, expecting explicit
calls to buf_pool_t::page_cleaner_wakeup() when the ratio of dirty pages
could change.

Failure to wake up the page cleaner will cause all page writes to be
initiated by buf_flush_LRU_list_batch(). That might work too,
provided that the buffer pool size is at least BUF_LRU_MIN_LEN (256)
pages, but it would not advance the log checkpoint.

f18e2564

15 Jan, 2022 3 commits

MDEV-27240 fixup: remove dead code · b7e4dc12
Nayuta Yanagisawa authored Jan 15, 2022

b7e4dc12

MDEV-27240 fixup: remove #ifdef in macro call · 64f844b6

Nayuta Yanagisawa authored Jan 15, 2022

Windows builds failed due to the following error:
'#': invalid character: possibly the result of a macro expansion

64f844b6

MDEV-27240 SIGSEGV in ha_spider::store_lock on LOCK TABLE · 2ecd39c9

Nayuta Yanagisawa authored Jan 11, 2022

The commit e954d9de gave different lifetimes to wide_share and
partition_handler_share. This introduced the possibility that
partition_handler_share could be accessed even after it was freed.

We stop sharing partition_handler_share and make it belong to
a single wide_handler to fix the problem.

2ecd39c9

14 Jan, 2022 2 commits

Remove FIXME comments that refer to an early MDEV-14425 plan · 8535c260

Marko Mäkelä authored Jan 14, 2022

In MDEV-14425, an early plan was to introduce a separate log file
for file-level records and checkpoint information. The reasoning was
that fil_system.mutex contention would be reduced by not having to
maintain fil_system.named_spaces. The mutex contention was actually
fixed in MDEV-23855 by making some data fields in fil_space_t and
fil_node_t use std::atomic.

Using a single circular log file simplifies recovery and backup.

8535c260

MDEV-27500 buf_page_free() fails to drop the adaptive hash index · c104a01b

Marko Mäkelä authored Jan 14, 2022

The function buf_page_free() that was introduced
in commit a35b4ae8 (MDEV-15528)
failed to remove any adaptive hash index entries for the page
before freeing the page.

This caused an assertion failure on shutdown of the 10.6 server
in the function buf_pool_t::clear_hash_index() with the expression:
(s >= buf_page_t::UNFIXED || s == buf_page_t::REMOVE_HASH).
The assertion would fail for a block that is in the freed state.

The failing assertion was added in
commit aaef2e1d
in the 10.6 branch.

Thanks to Matthias Leich for finding the bug and testing the fix.

c104a01b

12 Jan, 2022 2 commits

MDEV-26824 Can't add foreign key with empty referenced columns list · 6831b3f2

Aleksey Midenkov authored Jan 12, 2022

create_table_info_t::create_foreign_keys() expects an equal number of
iterations through fk->columns and fk->ref_columns. If fk->ref_columns
is empty, copy it from fk->columns.
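
The defaulting rule can be sketched as follows. This is an illustration only: the function name `resolve_ref_columns()` and the use of plain lists of column names are assumptions, not the server's data structures.

```python
def resolve_ref_columns(columns, ref_columns):
    """Return the referenced column list that pairs up with `columns`.

    An empty referenced list means "same names as the referencing
    columns", so it is filled in from `columns`.
    """
    if not ref_columns:
        return list(columns)
    if len(ref_columns) != len(columns):
        # The two lists are walked in lockstep, so their lengths must match.
        raise ValueError("column count mismatch in foreign key definition")
    return list(ref_columns)
```

Filling in the empty list up front means the loop that pairs each referencing column with a referenced column always performs the same number of iterations on both sides.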

6831b3f2

MDEV-27476 heap-use-after-free in buf_pool_t::is_block_field() · 017d1b86

Marko Mäkelä authored Jan 12, 2022

mtr_t::modify(): Remove a debug assertion that had been added
in commit 05fa4558 (MDEV-22110).
The function buf_pool_t::is_uncompressed() is only safe to invoke
while holding a buf_pool.page_hash latch so that buf_pool_t::resize()
cannot concurrently invoke free() on any chunks.

017d1b86

11 Jan, 2022 1 commit

MDEV-27022 Buffer pool is being flushed during recovery · f443cd11

Eugene Kosov authored Nov 11, 2021

The problem was introduced by the removal of buf_pool.flush_rbt
in commit 46b1f500 (MDEV-23399).

recv_sys_t::apply(): don't write to disk and fsync() the last batch.
Instead, sort it by oldest_modification for MariaDB server and some
mariabackup operations.

log_sort_flush_list(): a thread-safe function which sorts buf_pool::flush_list
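
As a rough illustration of sorting a shared list under a lock, here is a toy model. It assumes a single mutex guarding the list; the `Page` and `FlushList` classes are hypothetical, and the sort direction chosen here (largest LSN first) is an assumption rather than something stated in the message.

```python
import threading
from dataclasses import dataclass


@dataclass
class Page:
    page_id: int
    oldest_modification: int  # LSN of the earliest unflushed change


class FlushList:
    """Toy flush list guarded by one mutex (hypothetical, not InnoDB code)."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._pages = []

    def add(self, page):
        with self._mutex:
            self._pages.append(page)

    def sort(self):
        # Hold the mutex for the whole sort so that no other thread
        # can observe or modify a half-sorted list.
        with self._mutex:
            self._pages.sort(key=lambda p: p.oldest_modification, reverse=True)

    def page_ids(self):
        with self._mutex:
            return [p.page_id for p in self._pages]
```

Sorting once after the last recovery batch replaces the incremental ordering that a separate search structure would otherwise have to maintain on every insert.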

f443cd11

10 Jan, 2022 1 commit

MDEV-23836: Assertion `! is_set() || m_can_overwrite_status' in · 81e00485

Rucha Deodhar authored Oct 16, 2020

Diagnostics_area::set_error_status (interrupted ALTER TABLE under LOCK)

Analysis: KILL_QUERY is not ignored when the local memory used exceeds the
maximum session memory. Hence the query proceeds, OK is sent, and we end up
reopening tables that are marked for reopen. During this, the kill status is
eventually checked, and an assertion failure happens while trying to send the
error message, because OK has already been sent.
Fix: OK has already been sent, so the statement has already executed. It is
too late to give an error, so ignore the kill.

81e00485

09 Jan, 2022 1 commit

Silence CMake warning from external cmake project (pcre2) · c62bb9c3

Vladislav Vaintroub authored Jan 09, 2022


The warning reads:

CMake Deprecation Warning at CMakeLists.txt:101 (CMAKE_MINIMUM_REQUIRED):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.

c62bb9c3

04 Jan, 2022 1 commit

MDEV-27416 InnoDB hang in buf_flush_wait_flushed(), on log checkpoint · 4c3ad244

Marko Mäkelä authored Jan 04, 2022

InnoDB could sometimes hang when triggering a log checkpoint. This is
due to commit 7b1252c0 (MDEV-24278),
which introduced an untimed wait to buf_flush_page_cleaner().

The hang was noticed by occasional failures of IMPORT TABLESPACE tests,
such as innodb.innodb-wl5522, which would (unnecessarily) invoke
log_make_checkpoint() from row_import_cleanup().

The reason for the hang was that buf_flush_page_cleaner() would enter
untimed sleep despite buf_flush_sync_lsn being set. The exact failure
scenario is unclear, because buf_flush_sync_lsn should actually be
protected by buf_pool.flush_list_mutex. We prevent the hang by
invoking buf_pool.page_cleaner_set_idle(false) whenever we are
setting buf_flush_sync_lsn and signaling buf_pool.do_flush_list.
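
The wake-up rule can be illustrated with a small condition-variable model. This is a sketch under stated assumptions, not the InnoDB implementation: the class `PageCleaner`, its attributes, and the method names are hypothetical, and the real code uses its own mutex and condition variable.

```python
import threading


class PageCleaner:
    """Toy page cleaner that sleeps without a timeout while idle."""

    def __init__(self):
        self._cond = threading.Condition()
        self._idle = True
        self._sync_lsn = 0
        self.flushed_lsn = 0

    def request_sync(self, lsn):
        with self._cond:
            self._sync_lsn = max(self._sync_lsn, lsn)
            # Clear the idle flag together with setting the target;
            # otherwise the untimed wait below could sleep forever.
            self._idle = False
            self._cond.notify()

    def run_once(self):
        with self._cond:
            while self._idle:
                self._cond.wait()  # untimed: relies on request_sync()
            self.flushed_lsn = self._sync_lsn
            self._sync_lsn = 0
            self._idle = True
```

Because the flag is changed and the signal is sent while holding the same lock that the sleeper checks the flag under, the request cannot be lost regardless of which thread runs first.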

The bulk of these changes was originally developed as a preparation
for MDEV-26827, to invoke buf_flush_list() from fewer threads,
and tested on 10.6 by Matthias Leich.

This fix was tested by running 100 repetitions of 100 concurrent instances
of the test innodb.innodb-wl5522 on a RelWithDebInfo build, using ext4fs
and innodb_flush_method=O_DIRECT on a SATA SSD with 4096-byte block size.
During the test, the call to log_make_checkpoint() in row_import_cleanup()
was present.

buf_flush_list(): Make static.

buf_flush_wait(): Wait for buf_pool.get_oldest_modification()
to reach a target, by work done in the buf_flush_page_cleaner.
If buf_flush_sync_lsn is going to be set, we will invoke
buf_pool.page_cleaner_set_idle(false).

buf_flush_ahead(): If buf_flush_sync_lsn or buf_flush_async_lsn
is going to be set and the page cleaner woken up, we will invoke
buf_pool.page_cleaner_set_idle(false).

buf_flush_wait_flushed(): Invoke buf_flush_wait().

buf_flush_sync(): Invoke recv_sys.apply() at the start in case
crash recovery is active. Invoke buf_flush_wait().

buf_flush_sync_batch(): A lower-level variant of buf_flush_sync()
that is only called by recv_sys_t::apply().

buf_flush_sync_for_checkpoint(): Do not trigger log apply
or checkpoint during recovery.

buf_dblwr_t::create(): Only initiate a buffer pool flush, not
a checkpoint.

row_import_cleanup(): Do not unnecessarily invoke log_make_checkpoint().
Invoking buf_flush_list_space() before starting to generate redo log
for the imported tablespace should suffice.

srv_prepare_to_delete_redo_log_file():
Set recv_sys.recovery_on in order to prevent
buf_flush_sync_for_checkpoint() from initiating a checkpoint
while the log is inaccessible. Remove a wait loop that is already
part of buf_flush_sync().
Do not invoke fil_names_clear() if the log is being upgraded,
because the FILE_MODIFY record is specific to the latest format.

create_log_file(): Clear recv_sys.recovery_on only after calling
log_make_checkpoint(), to prevent buf_flush_page_cleaner from
invoking a checkpoint.

innodb_shutdown(): Simplify the logic in mariadb-backup --prepare.

os_aio_wait_until_no_pending_writes(): Update the function comment.
Apart from row_quiesce_table_start() during FLUSH TABLES...FOR EXPORT,
this is being called by buf_flush_list_space(), which is invoked
by ALTER TABLE...IMPORT TABLESPACE as well as some encryption operations.

4c3ad244

03 Jan, 2022 4 commits

Deb: Adapt custom build steps to be compatible with latest Salsa-CI · eab89f14

Otto Kekäläinen authored Dec 31, 2021

Upstream Salsa-CI refactored the build process in
https://salsa.debian.org/salsa-ci-team/pipeline/-/commit/58880fcef5b742cb9c661121a8c8707bf392b3b5

This broke our custom direct invocation of install-build-deps.sh, as the
Salsa-CI images no longer contain it. Adapt the .build-script
equivalent to follow the new Salsa-CI method so builds work again.

eab89f14

Merge 10.4 into 10.5 · c9db50b5
Marko Mäkelä authored Jan 03, 2022

c9db50b5

Correct some copyright messages · 1df05a08

Marko Mäkelä authored Jan 03, 2022

Most of the Facebook contribution
mysql/mysql-server@72d656acdf082d5ead1cc1be84f2fd68ab6a65a9
was removed in
commit 5bea43f5 (MDEV-12353).
Mainly the configuration parameter innodb_compression_level remains.
It had been renamed to page_zip_level in
mysql/mysql-server@5b38f2a712a7077c994c00787b891a7d4ee328df.

1df05a08

Cleanup: Remove RECV_READ_AHEAD_AREA · c14dd0d1

Marko Mäkelä authored Jan 03, 2022

Let us directly use the constant 32 in recv_read_in_area().

c14dd0d1

28 Dec, 2021 1 commit

Add --valgrind to VERSION() string for valgrind builds · a48d2ec8

Monty authored Dec 28, 2021

Fixes main.sp-no-valgrind for valgrind builds not done with BUILD scripts

a48d2ec8

27 Dec, 2021 2 commits

MDEV-27304 SHOW ... result columns are right-aligned · 89a0364f

Sergei Golubchik authored Dec 24, 2021

--version=value was setting sys_var::CONFIG (meaning, the value
came from the config file), but the filename was left as NULL.

89a0364f

MDEV-27184 Assertion `(old_top == initial_top (av) && old_size == 0) ||... · 5045509b

Nayuta Yanagisawa authored Dec 23, 2021

MDEV-27184 Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed, Assertion `str.alloced_length() >= str.length() + data_len' failed

Spider crashes on a query that inserts some rows including float values.
This is because Spider allocates a string of insufficient length.

5045509b

26 Dec, 2021 1 commit

Merge branch 10.4 into 10.5 · 55bb933a
Julius Goryavsky authored Dec 26, 2021

55bb933a

25 Dec, 2021 1 commit

Merge branch 10.3 into 10.4 · 681b7784
Julius Goryavsky authored Dec 25, 2021

681b7784

24 Dec, 2021 1 commit

Merge branch 10.2 into 10.3 · 97695675
Julius Goryavsky authored Dec 24, 2021

97695675

23 Dec, 2021 6 commits

MDEV-24097: galera[_3nodes] suite tests in MTR sporadically fails · b5cbe506

Julius Goryavsky authored Dec 23, 2021

This is the first part of the fixes for MDEV-24097. This commit
contains the fixes for instability when testing Galera and when
restarting nodes quickly:

1) Protection against a "stuck" old SST process during the execution
   of the new SST (after restarting the node) is now implemented for
   mariabackup / xtrabackup, which should help to avoid almost all
   conflicts due to the use of the same ports, both during testing
   with mtr and when restarting nodes quickly in a production
   environment.
2) Added more protection to scripts against unexpected non-zero
   return codes (in the commands for deleting temporary files, etc).
3) Added protection against unexpected crashes during binlog transfer
   (in SST scripts for rsync).
4) Spaces and some special characters in binlog filenames shouldn't
   be a problem now (at the script level).
5) Daemon process termination tracking has been made more robust
   against crashes due to unexpected termination of the previous SST
   process while new scripts are running.
6) Reading ssl encryption parameters has been moved from specific
   SST scripts to a common wsrep_sst_common.sh script, which allows
   unified error handling, unified diagnostics and simplifies script
   revisions in the future.
7) Improved diagnostics of errors related to the use of openssl.
8) Corrections have been made for xtrabackup-v2 (both in tests and in
   the script code) that restore the operation of xtrabackup with
   updated versions of InnoDB.
9) Fixed some tests for galera_3nodes, although the complete solution
   for the problem of starting three nodes at the same time on fast
   machines will be done in a separate commit.

No additional tests are required as this commit fixes problems with
existing tests.

b5cbe506

Merge branch 10.2 into 10.3 · 3376668c
Julius Goryavsky authored Dec 23, 2021

3376668c

Fix typos in optimizer trace output · 4b020bfd
Sergei Petrunia authored Dec 23, 2021

4b020bfd

MDEV-27238: Assertion `got_name == named_item_expected()' failed in Json_writer · 397f5cf7

Sergei Petrunia authored Dec 23, 2021

make_join_select() calls const_cond->val_int(). There are edge cases
where const_cond may have a not-yet optimized subquery.

(The subquery will have used_tables() covered by join->const_tables. It
will still have const_item()==false, so other parts of the optimizer
will not try to evaluate it.  We should probably mark such subqueries
as constant but that is outside the scope of this MDEV)

397f5cf7

result of wsrep logic in queue_for_group_commit was being ignored · 0165a063

Leandro Pacheco authored Sep 13, 2021

This could cause out-of-order wsrep checkpoints due to wsrep-specific leader
code not being executed in `MYSQL_BIN_LOG::write_transaction_to_binlog_events`.
Move the original result assignment to before the wsrep logic to prevent that.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>

0165a063

Only apply wsrep_trx_fragment_size to InnoDB tables · ca2ea4ff

Monty authored May 19, 2020

MDEV-22617 Galera node crashes when trying to log to slow_log table in
streaming replication mode

Other things:
- Changed name of wsrep_after_row(two arguments) to
  wsrep_after_row_internal(one argument) to not depend on the
  function signature with unused arguments.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
	     Added test case

ca2ea4ff

22 Dec, 2021 4 commits

MDEV-26919: binlog.binlog_truncate_active_log fails in bb with valgrind,... · be20b3b0

Brandon Nesterenko authored Dec 21, 2021

MDEV-26919: binlog.binlog_truncate_active_log fails in bb with valgrind, Conditional jump or move depends on uninitialised value

Problem:
========
When writing an XA based event to the binary log, an assert was
always referencing thd->lex->xa_opt. This variable, however, is
only set when using XA START, XA END, and XA COMMIT. When an
XA PREPARE statement is being processed, it is not guaranteed that
the xa_opt variable will be set (e.g. if existing within a stored
procedure). This caused valgrind to complain about accessing an
uninitialized variable.

Solution:
========
Before referencing xa_opt, ensure the context is valid such that
it is set.

Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>

be20b3b0

MDEV-23175: my_timer_milliseconds clock_gettime for multiple platforms · 4eec6b99

Daniel Black authored Jul 15, 2020

Small postfix to MDEV-23175 to ensure faster option on FreeBSD
and compatibility to Solaris that isn't high resolution.

ftime is left as a backup in case an implementation doesn't
contain any of these clocks.

FreeBSD
    $ ./unittest/mysys/my_rdtsc-t
    1..11
    # ----- Routine ---------------
    # myt.cycles.routine          :             5
    # myt.nanoseconds.routine     :            11
    # myt.microseconds.routine    :            13
    # myt.milliseconds.routine    :            11
    # myt.ticks.routine           :            17
    # ----- Frequency -------------
    # myt.cycles.frequency        :    3610295566
    # myt.nanoseconds.frequency   :    1000000000
    # myt.microseconds.frequency  :       1000000
    # myt.milliseconds.frequency  :           899
    # myt.ticks.frequency         :           136
    # ----- Resolution ------------
    # myt.cycles.resolution       :             1
    # myt.nanoseconds.resolution  :             1
    # myt.microseconds.resolution :             1
    # myt.milliseconds.resolution :             7
    # myt.ticks.resolution        :             1
    # ----- Overhead --------------
    # myt.cycles.overhead         :            26
    # myt.nanoseconds.overhead    :         19140
    # myt.microseconds.overhead   :         19036
    # myt.milliseconds.overhead   :           578
    # myt.ticks.overhead          :         21544
    ok 1 - my_timer_init() did not crash
    ok 2 - The cycle timer is strictly increasing
    ok 3 - The cycle timer is implemented
    ok 4 - The nanosecond timer is increasing
    ok 5 - The nanosecond timer is implemented
    ok 6 - The microsecond timer is increasing
    ok 7 - The microsecond timer is implemented
    ok 8 - The millisecond timer is increasing
    ok 9 - The millisecond timer is implemented
    ok 10 - The tick timer is increasing
    ok 11 - The tick timer is implemented

4eec6b99

MDEV-23175: my_timer_milliseconds clock_gettime for multiple platforms · 12087d67

Daniel Black authored Dec 22, 2021

Small postfix to MDEV-23175 to ensure faster option on FreeBSD
and compatibility to Solaris that isn't high resolution.

ftime is left as a backup in case an implementation doesn't
contain any of these clocks.

FreeBSD
    $ ./unittest/mysys/my_rdtsc-t
    1..11
    # ----- Routine ---------------
    # myt.cycles.routine          :             5
    # myt.nanoseconds.routine     :            11
    # myt.microseconds.routine    :            13
    # myt.milliseconds.routine    :            11
    # myt.ticks.routine           :            17
    # ----- Frequency -------------
    # myt.cycles.frequency        :    3610295566
    # myt.nanoseconds.frequency   :    1000000000
    # myt.microseconds.frequency  :       1000000
    # myt.milliseconds.frequency  :           899
    # myt.ticks.frequency         :           136
    # ----- Resolution ------------
    # myt.cycles.resolution       :             1
    # myt.nanoseconds.resolution  :             1
    # myt.microseconds.resolution :             1
    # myt.milliseconds.resolution :             7
    # myt.ticks.resolution        :             1
    # ----- Overhead --------------
    # myt.cycles.overhead         :            26
    # myt.nanoseconds.overhead    :         19140
    # myt.microseconds.overhead   :         19036
    # myt.milliseconds.overhead   :           578
    # myt.ticks.overhead          :         21544
    ok 1 - my_timer_init() did not crash
    ok 2 - The cycle timer is strictly increasing
    ok 3 - The cycle timer is implemented
    ok 4 - The nanosecond timer is increasing
    ok 5 - The nanosecond timer is implemented
    ok 6 - The microsecond timer is increasing
    ok 7 - The microsecond timer is implemented
    ok 8 - The millisecond timer is increasing
    ok 9 - The millisecond timer is implemented
    ok 10 - The tick timer is increasing
    ok 11 - The tick timer is implemented

12087d67

MDEV-27195 SIGSEGV in Table_scope_and_contents_source_st::vers_check_system_fields · a5ef74e7

Alexander Barkov authored Dec 22, 2021

The old code erroneously used default_charset_info to compare field names.
default_charset_info can point to an arbitrary collation,
such as ucs2*, utf16*, or utf32*, including collations that do not
support strcasecmp().

my_charset_utf8mb4_unicode_ci, which is used in this scenario:

CREATE TABLE t1 ENGINE=InnoDB WITH SYSTEM VERSIONING AS SELECT 0;

does not support strcasecmp().

Fixing the code to use Lex_ident::streq(), which uses
system_charset_info instead of default_charset_info.

a5ef74e7