Commits · 617dee34886987c53b14e09a9ee73abe923a18e4 · nexedi / MariaDB

29 Jun, 2021 2 commits

MDEV-26042 Atomic write capability is not detected correctly · 617dee34

Marko Mäkelä authored Jun 29, 2021

my_init_atomic_write(): Detect all forms of SSD, in case multiple
types of devices are installed in the same machine.
This was broken in commit ed008a74
and further in commit 70684afe.

SAME_DEV(): Match block devices, ignoring partition numbers.

Let us use stat() instead of lstat(), in case someone has a symbolic
link in /dev.

Instead of reporting errors with perror(), let us use fprintf(stderr)
with the file name, the impact of the error, and the strerror(errno).
Because this code is specific to Linux, we may depend on the
GNU libc/uClibc/musl extension %m for strerror(errno).
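
The idea of matching block devices while ignoring partition numbers can be sketched as follows. This is a minimal illustration, not the actual SAME_DEV() macro: the helper name `same_block_device` is hypothetical, and the sketch assumes that the partitions of one disk differ from the whole-disk device number only in the low 4 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the actual SAME_DEV() macro.
// Assumption: the low 4 bits of a device number hold the partition
// index, and 0 in those bits denotes the whole disk.
inline bool same_block_device(std::uint64_t fs_dev, std::uint64_t blk_dev)
{
  const std::uint64_t partition_mask= 15;
  // Either the numbers are identical, or fs_dev is a partition of blk_dev.
  return fs_dev == blk_dev || (fs_dev & ~partition_mask) == blk_dev;
}
```

Under that assumption, a file system on partition 3 of a disk would match the disk itself, while a partition of a different disk would not.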

617dee34

MDEV-22640 fixup: clang -Winconsistent-missing-override · 3d15e3c0
Marko Mäkelä authored Jun 29, 2021

3d15e3c0

28 Jun, 2021 1 commit
MDEV-23004 When using GROUP BY with JSON_ARRAYAGG with joint table, the square brackets are not included. · 98c7916f

Alexey Botchkov authored Jun 24, 2021

Item_func_json_arrayagg::copy_or_same() should be implemented.

98c7916f

26 Jun, 2021 3 commits

MDEV-26017: Assertion stat.flush_list_bytes <= curr_pool_size · fc2ff464

Marko Mäkelä authored Jun 26, 2021

buf_flush_relocate_on_flush_list(): If we are removing the block from
buf_pool.flush_list, subtract its size from buf_pool.stat.flush_list_bytes.
This fixes a regression that was introduced in
commit 22b62eda (MDEV-25113).

fc2ff464

Cleanup: Remove unused mtr_block_dirtied · aa95c423
Marko Mäkelä authored Jun 26, 2021

aa95c423

MDEV-26010 fixup: Use acquire/release memory order · 759deaa0

Marko Mäkelä authored Jun 26, 2021

In commit 5f22511e we depend on
Total Store Ordering. For correct operation on ISAs that implement
weaker memory ordering, we must explicitly use release/acquire stores
and loads on buf_page_t::oldest_modification_ to prevent a race condition
when buf_page_t::list does not happen to be on the same cache line.

buf_page_t::clear_oldest_modification(): Assert that the block is
not in buf_pool.flush_list, and use std::memory_order_release.

buf_page_t::oldest_modification_acquire(): Read oldest_modification_
with std::memory_order_acquire. In this way, if the return value is 0,
the caller may safely assume that it will not observe the buf_page_t
as being in buf_pool.flush_list, even if it is not holding
buf_pool.flush_list_mutex.

buf_flush_relocate_on_flush_list(), buf_LRU_free_page():
Invoke buf_page_t::oldest_modification_acquire().
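
The release/acquire pairing described above might look roughly like this. It is a sketch on a toy type, not the real buf_page_t: `page_sketch`, `in_flush_list`, `set_oldest_modification()` and `remove_from_flush_list()` are hypothetical stand-ins for the real list membership and its maintenance.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Toy stand-in for buf_page_t; only what the sketch needs.
struct page_sketch
{
  std::atomic<std::uint64_t> oldest_modification_{0};
  // Hypothetical flag standing in for membership in buf_pool.flush_list.
  bool in_flush_list= false;

  void set_oldest_modification(std::uint64_t lsn)
  {
    in_flush_list= true;
    oldest_modification_.store(lsn, std::memory_order_relaxed);
  }

  void remove_from_flush_list() { in_flush_list= false; }

  // The release store publishes the preceding list removal.
  void clear_oldest_modification()
  {
    assert(!in_flush_list);
    oldest_modification_.store(0, std::memory_order_release);
  }

  // An acquire load that returns 0 also observes the list removal.
  std::uint64_t oldest_modification_acquire() const
  {
    return oldest_modification_.load(std::memory_order_acquire);
  }
};
```

A reader that sees 0 from oldest_modification_acquire() is then ordered after everything the writer did before the release store, which is the guarantee that Total Store Ordering had been providing implicitly.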

759deaa0

24 Jun, 2021 3 commits

MDEV-26010: Assertion lsn > 2 failed in buf_pool_t::get_oldest_modification · 5f22511e

Marko Mäkelä authored Jun 24, 2021

In commit 22b62eda (MDEV-25113)
we introduced a race condition. buf_LRU_free_page() would read
buf_page_t::oldest_modification() as 0 and assume that
buf_page_t::list can be used (for attaching the block to the
buf_pool.free list). In the observed race condition,
buf_pool_t::delete_from_flush_list() had cleared the field,
and buf_pool_t::delete_from_flush_list_low() was executing
concurrently with buf_LRU_block_free_non_file_page(),
which resulted in buf_pool.flush_list.end becoming corrupted.

buf_pool_t::delete_from_flush_list(), buf_flush_relocate_on_flush_list():
First remove the block from buf_pool.flush_list, and only then
invoke buf_page_t::clear_oldest_modification(), to ensure that
reading oldest_modification()==0 really implies that the block
no longer is in buf_pool.flush_list.

5f22511e

MDEV-25948 fixup: Demote a warning to a note · e329dc8d

Marko Mäkelä authored Jun 24, 2021

buf_dblwr_t::recover(): Issue a note, not a warning, about
pages whose FIL_PAGE_LSN is in the future. This was supposed to be
part of commit 762bcb81 (MDEV-25948)
but had been accidentally omitted.

e329dc8d

MDEV-26004 Excessive wait times in buf_LRU_get_free_block() · 60ed4797

Marko Mäkelä authored Jun 24, 2021

buf_LRU_get_free_block(): Initially wait for a single block to be
freed, signaled by buf_pool.done_free. Only if that fails and no
LRU eviction flushing batch is already running do we initiate a
flushing batch that should serve all threads that are currently
waiting in buf_LRU_get_free_block().

Note: In an extreme case, this may introduce a performance regression
at larger numbers of connections. We observed this in sysbench
oltp_update_index with 512MiB buffer pool, 4GiB of data on fast NVMe,
and 1000 concurrent connections, on a 20-thread CPU. The contention point
appears to be buf_pool.mutex, and the improvement would turn into a
regression somewhere beyond 32 concurrent connections.

On slower storage, no such regression was observed; instead, the
throughput was improving and maximum latency was reduced.

The excessive waits were pointed out by Vladislav Vaintroub.

60ed4797

23 Jun, 2021 10 commits

MDEV-25113: Introduce a page cleaner mode before 'furious flush' · 6441bc61

Marko Mäkelä authored Jun 23, 2021

MDEV-23855 changed how the page cleaner is signaled by
user threads. If a threshold is exceeded, a mini-transaction commit
would invoke buf_flush_ahead() in order to initiate page flushing
before all writers would eventually grind to a halt in
log_free_check(), waiting for the checkpoint age to reduce.

However, buf_flush_ahead() would always initiate 'furious flushing',
making the buf_flush_page_cleaner thread write innodb_io_capacity_max
pages per batch, without sleeping between batches, until the
limit LSN is reached. Because this could saturate the I/O subsystem,
system throughput could drop significantly during these
'furious flushing' spikes.

With this change, we introduce a gentler version of flush-ahead,
which would write innodb_io_capacity_max pages per second until
the 'soft limit' is reached.

buf_flush_ahead(): Add a parameter to specify whether furious flushing
is requested.

buf_flush_async_lsn: Similar to buf_flush_sync_lsn, a limit for
the less intrusive flushing.

buf_flush_page_cleaner(): Keep working until buf_flush_async_lsn
has been reached.

log_close(): Suppress a warning message in the event that a new log
is being created during startup, when old logs did not exist.
Return what type of page cleaning will be needed.

mtr_t::finish_write(): Also when m_log.is_small(), invoke log_close().
Return what type of page cleaning will be needed.

mtr_t::commit(): Invoke buf_flush_ahead() based on the return value of
mtr_t::finish_write().
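
The two-level flush-ahead could be sketched as below. All names here (`flush_ahead`, `classify_checkpoint_age`, `flush_targets`, `soft_limit`, `hard_limit`) are hypothetical, and the sketch is single-threaded; the real code keeps the limits in buf_flush_sync_lsn and buf_flush_async_lsn and has to coordinate with the page cleaner thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical result type for "what type of page cleaning is needed".
enum class flush_ahead { none, async, sync };

// Assumption: two thresholds on the checkpoint age, a soft one that
// requests gentle flushing and a hard one that requests furious flushing.
inline flush_ahead classify_checkpoint_age(std::uint64_t age,
                                           std::uint64_t soft_limit,
                                           std::uint64_t hard_limit)
{
  if (age > hard_limit)
    return flush_ahead::sync;
  if (age > soft_limit)
    return flush_ahead::async;
  return flush_ahead::none;
}

// Toy stand-in for the pair buf_flush_sync_lsn / buf_flush_async_lsn.
struct flush_targets
{
  std::uint64_t sync_lsn= 0;
  std::uint64_t async_lsn= 0;

  // Raise the relevant target; targets never move backwards.
  void request(std::uint64_t lsn, bool furious)
  {
    std::uint64_t &limit= furious ? sync_lsn : async_lsn;
    limit= std::max(limit, lsn);
  }

  // The cleaner keeps working until both targets have been reached.
  bool work_pending(std::uint64_t flushed_lsn) const
  {
    return flushed_lsn < std::max(sync_lsn, async_lsn);
  }
};
```

A commit would map the classification to a call such as `request(lsn, kind == flush_ahead::sync)` and skip the call entirely for `flush_ahead::none`.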

6441bc61

MDEV-25113: Make page flushing faster · 22b62eda

Marko Mäkelä authored Jun 23, 2021

buf_page_write_complete(): Reduce the buf_pool.mutex hold time,
and do not acquire buf_pool.flush_list_mutex at all.
Instead, mark blocks clean by setting oldest_modification to 1.
Dirty pages of temporary tables will be identified by the special
value 2 instead of the previous special value 1.
(By design of the ib_logfile0 format, actual LSN values smaller
than 2048 are not possible.)

buf_LRU_free_page(), buf_pool_t::get_oldest_modification()
and many other functions will remove the garbage (clean blocks)
from buf_pool.flush_list while holding buf_pool.flush_list_mutex.
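
The encoding of oldest_modification and the lazy removal of clean blocks can be illustrated with a small sketch. The names `page_state`, `classify` and `remove_garbage` are hypothetical, the flush list is reduced to a list of raw values, and the mutex that the real code holds is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>

// Hypothetical names for the states encoded in oldest_modification.
enum class page_state
{
  not_in_flush_list, // 0
  clean_garbage,     // 1: written out, but still linked in the flush list
  dirty_temporary,   // 2: dirty page of a temporary table
  dirty_persistent   // anything larger: a real LSN
};

inline page_state classify(std::uint64_t oldest_modification)
{
  switch (oldest_modification) {
  case 0: return page_state::not_in_flush_list;
  case 1: return page_state::clean_garbage;
  case 2: return page_state::dirty_temporary;
  default: return page_state::dirty_persistent;
  }
}

// Drop entries that were marked clean; returns how many were removed.
// In the real code this would run while holding buf_pool.flush_list_mutex.
inline std::size_t remove_garbage(std::list<std::uint64_t> &flush_list)
{
  const std::size_t before= flush_list.size();
  flush_list.remove(1);
  return before - flush_list.size();
}
```

The point of the design is that write completion only needs an atomic store of 1, and the list manipulation is deferred to whichever thread next holds the mutex anyway.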

buf_pool_t::n_flush_LRU, buf_pool_t::n_flush_list:
Replaced with non-atomic variables, protected by buf_pool.mutex,
to avoid unnecessary synchronization when modifying the counts.

export_vars: Remove unnecessary indirection for
innodb_pages_created, innodb_pages_read, innodb_pages_written.

22b62eda

MDEV-25801: buf_flush_dirty_pages() is very slow · 8af53897

Marko Mäkelä authored Jun 23, 2021

In commit 7cffb5f6 (MDEV-23399)
the implementation of buf_flush_dirty_pages() was replaced with
a slow one, which would perform excessive scans of the
buf_pool.flush_list and make little progress.

buf_flush_list(), buf_flush_LRU(): Split from buf_flush_lists().
Vladislav Vaintroub noticed that we will not need to invoke
log_flush_task.wait() for the LRU eviction flushing.

buf_flush_list_space(): Replaces buf_flush_dirty_pages().
This is like buf_flush_list(), but operating on a single
tablespace at a time. Writes at most innodb_io_capacity
pages. Returns whether some of the tablespace might remain
in the buffer pool.
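
A bounded per-tablespace pass of this kind might be sketched as follows. `dirty_page` and `flush_list_space_sketch` are hypothetical names, and erasing a list entry stands in for submitting a page write; in the real code the writes are asynchronous, so "might remain" also covers writes that are still in flight.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>

// Toy flush list entry.
struct dirty_page
{
  std::uint32_t space_id;
  std::uint32_t page_no;
};

// Write at most io_capacity pages of one tablespace.
// Returns true if some pages of that tablespace may remain.
inline bool flush_list_space_sketch(std::list<dirty_page> &flush_list,
                                    std::uint32_t space_id,
                                    std::size_t io_capacity)
{
  std::size_t written= 0;
  for (auto it= flush_list.begin(); it != flush_list.end(); )
  {
    if (it->space_id != space_id)
    {
      ++it;
      continue;
    }
    if (written == io_capacity)
      return true; // budget exhausted while matching pages remain
    it= flush_list.erase(it); // stand-in for submitting the page write
    ++written;
  }
  return false;
}
```

A caller would simply repeat the call until it returns false, so each pass makes bounded progress instead of rescanning the list for little gain.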

8af53897

MDEV-25948 Remove log_flush_task · 762bcb81

Marko Mäkelä authored Jun 23, 2021

Vladislav Vaintroub suggested that invoking log_flush_up_to()
for every page could perform better than invoking a log write
between buf_pool.flush_list batches, like we started doing in
commit 3a9a3be1 (MDEV-23855).
This could depend on the sequence in which pages are being
modified. The buf_pool.flush_list is ordered by
oldest_modification, while the FIL_PAGE_LSN of the pages is
theoretically independent of that. In the pathological case,
we will wait for a log write before writing each individual page.

It turns out that we can defer the call to log_flush_up_to()
until just before submitting the page write. If the doublewrite
buffer is being used, we can submit a write batch of "future" pages
to the doublewrite buffer, and only wait for the log write right
before we are writing an already doublewritten page.
The next doublewrite batch will not be initiated before the last
page write from the current batch has completed.

When a future version introduces asynchronous writes of the log,
we could initiate a write at the start of a flushing batch, to
reduce waiting further.
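
The write-ahead rule behind this change can be shown with a toy model. `log_sketch` and `write_pages` are hypothetical, and the doublewrite buffer is left out; the sketch only shows that a log write is needed right before a page write, and only when the page's LSN is ahead of what is already durable.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy redo log that only tracks how far it has been made durable.
struct log_sketch
{
  std::uint64_t flushed_lsn= 0;
  unsigned flush_calls= 0; // counts calls that really had to write

  void flush_up_to(std::uint64_t lsn)
  {
    if (lsn <= flushed_lsn)
      return; // already durable, nothing to wait for
    flushed_lsn= lsn;
    ++flush_calls;
  }
};

// Write pages in the given order; each element is the page's FIL_PAGE_LSN.
inline std::vector<std::uint64_t>
write_pages(log_sketch &log, const std::vector<std::uint64_t> &page_lsns)
{
  std::vector<std::uint64_t> written;
  for (std::uint64_t lsn : page_lsns)
  {
    log.flush_up_to(lsn);           // deferred until just before the write
    assert(log.flushed_lsn >= lsn); // write-ahead logging rule
    written.push_back(lsn);
  }
  return written;
}
```

With page LSNs in strictly ascending order, every page would trigger a log write, which corresponds to the pathological case mentioned above.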

762bcb81

MDEV-25954: Trim os_aio_wait_until_no_pending_writes() · 6dfd44c8

Marko Mäkelä authored Jun 23, 2021

It turns out that we had some unnecessary waits for all outstanding
write requests to complete. They were basically working around a
bug that was fixed in MDEV-25953.

On write completion callback, blocks will be marked clean.
So, it is sufficient to consult buf_pool.flush_list to determine
which writes have not been completed yet.

On FLUSH TABLES...FOR EXPORT we must still wait for all pending
asynchronous writes to complete, because buf_flush_file_space()
would merely guarantee that writes will have been initiated.

6dfd44c8

Merge 10.4 into 10.5 · 344e5990
Marko Mäkelä authored Jun 23, 2021

344e5990
Merge 10.3 into 10.4 · 09b03ff3
Marko Mäkelä authored Jun 23, 2021

09b03ff3
bump the VERSION · 55b3a3f4
Daniel Bartholomew authored Jun 22, 2021

55b3a3f4
bump the VERSION · bf2680ea
Daniel Bartholomew authored Jun 22, 2021

bf2680ea
bump the VERSION · 1deb6304
Daniel Bartholomew authored Jun 22, 2021

1deb6304

22 Jun, 2021 5 commits

MDEV-25679 Wrong result selecting from simple view with LIMIT and ORDER BY · 7f24e37f
Igor Babaev authored Jun 22, 2021
```
Cherry-picking only test case.
```
7f24e37f

MDEV-25981 InnoDB upgrade fails · 35a9aaeb

Marko Mäkelä authored Jun 22, 2021

trx_undo_mem_create_at_db_start(): Relax too strict upgrade checks
that were introduced in
commit e46f76c9 (MDEV-15912).
On commit, pages will typically be set to TRX_UNDO_CACHED state.
Having the type TRX_UNDO_INSERT in such pages is common and
unproblematic; the type would be reset in trx_undo_reuse_cached().

trx_rseg_array_init(): On failure, clean up the rollback segments
that were initialized so far, to avoid an assertion failure later
during shutdown.

35a9aaeb

Merge 10.2 into 10.3 · e07f0a2d
Marko Mäkelä authored Jun 22, 2021

e07f0a2d

MDEV-25982 Upgrade of MariaDB 10.1 log crashes due to missing encryption key · 19716ad5

Marko Mäkelä authored Jun 22, 2021

init_crypt_key(): On failure, set info->key_version to
ENCRYPTION_KEY_VERSION_INVALID.

log_crypt_101_read_block(): Refuse to attempt decryption if
info->key_version is ENCRYPTION_KEY_VERSION_INVALID.
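
The sentinel pattern used here can be sketched in a few lines. The types and functions below are hypothetical simplifications, and the sentinel value (all bits set) is an assumption made for this example rather than something stated in the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Assumed sentinel value for this sketch only.
constexpr std::uint32_t KEY_VERSION_INVALID_SKETCH= 0xFFFFFFFFU;

// Toy stand-in for the log encryption info.
struct crypt_info_sketch
{
  std::uint32_t key_version;
};

// On failure, poison key_version so that later readers can tell.
inline bool init_crypt_key_sketch(crypt_info_sketch &info, bool key_available)
{
  if (!key_available)
  {
    info.key_version= KEY_VERSION_INVALID_SKETCH;
    return false;
  }
  return true;
}

// Refuse to decrypt with a key that was never successfully initialized.
inline bool read_block_sketch(const crypt_info_sketch &info)
{
  if (info.key_version == KEY_VERSION_INVALID_SKETCH)
    return false;
  return true; // the actual decryption would happen here
}
```

Recording the failure in the structure itself means the reader does not depend on every caller having checked the return value of the initialization.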

19716ad5

MDEV-25679 Wrong result selecting from simple view with LIMIT and ORDER BY · 6e94ef41

Igor Babaev authored Jun 21, 2021

This bug affected queries with views / derived_tables / CTEs whose
specifications were of the form
  (SELECT ... LIMIT <n>) ORDER BY ...
Units representing such specifications contain one SELECT_LEX structure
for (SELECT ... LIMIT <n>) and additionally a SELECT_LEX structure for
fake_select_lex. This fact should have been taken into account in the
function mysql_derived_fill().

This patch has to be applied to 10.2 and 10.3 only.

6e94ef41

21 Jun, 2021 13 commits

MDEV-25679 Wrong result selecting from simple view with LIMIT and ORDER BY · cc0bd843

Igor Babaev authored Jun 21, 2021

This bug affected queries with views / derived_tables / CTEs whose
specifications were of the form
  (SELECT ... LIMIT <n>) ORDER BY ...
Units representing such specifications contain one SELECT_LEX structure
for (SELECT ... LIMIT <n>) and additionally a SELECT_LEX structure for
fake_select_lex. This fact should have been taken into account in the
function mysql_derived_fill().

This patch has to be applied to 10.2 and 10.3 only.

cc0bd843

Merge 10.4 into 10.5 · a1907fed
Marko Mäkelä authored Jun 21, 2021

a1907fed
Merge 10.3 into 10.4 · ce868cd8
Marko Mäkelä authored Jun 21, 2021

ce868cd8

MDEV-25979 Invalid page number written to DB_ROLL_PTR · 9dc50ea2

Marko Mäkelä authored Jun 21, 2021

trx_undo_report_row_operation(): Fix a race condition that was introduced
in commit f74023b9 (MDEV-15090).
We must not access undo_block after the page latch has been released
in mtr_t::commit(), because the block could be evicted or replaced.

9dc50ea2

Merge 10.4 into 10.5 · a42c80bd
Marko Mäkelä authored Jun 21, 2021

a42c80bd

After-merge fix: Remove duplicated code · baf0ef9a

Marko Mäkelä authored Jun 21, 2021

In the merge commit d3e4fae7
a message about innodb_force_recovery was accidentally duplicated.

baf0ef9a

MDEV-25878: mytop bugs: check for mysql driver and sockets · bcedb420
Anel Husakovic authored Oct 13, 2020
```
- Adding socket check for MariaDB/Mysql driver

Reviewed by: serg@mariadb.com
```
bcedb420
MDEV-25878: mytop bugs: check for mysql driver and sockets · 59e3ac2e
Jean Weisbuch authored May 18, 2020
```
mytop falls back to DBD::mysql if DBD::MariaDB is not available

Apply #1546
```
59e3ac2e
Merge 10.3 into 10.4 · d3e4fae7
Marko Mäkelä authored Jun 21, 2021

d3e4fae7

MDEV-15912: Remove traces of insert_undo · e46f76c9

Marko Mäkelä authored Jun 21, 2021

Let us simply refuse an upgrade from earlier versions if the
upgrade procedure was not followed. This simplifies the purge,
commit, and rollback of transactions.

Before upgrading to MariaDB 10.3 or later, a clean shutdown
of the server (with innodb_fast_shutdown=1 or 0) is necessary,
to ensure that any incomplete transactions are rolled back.
The undo log format was changed in MDEV-12288. There is only
one persistent undo log for each transaction.

e46f76c9

After-merge fixes for MDEV-14180 · 241d30d3
Marko Mäkelä authored Jun 21, 2021

241d30d3
Merge 10.2 into 10.3 · c9a85fb1
Marko Mäkelä authored Jun 21, 2021

c9a85fb1
Remove Travis CI status · 773a07b6
Marko Mäkelä authored Jun 21, 2021
```
Builds on travis-ci.org ceased on 2021-06-15.
```
773a07b6

19 Jun, 2021 3 commits
- fix spider tests for --ps in 10.5 · c3a1ba0f
  Sergei Golubchik authored Jun 19, 2021
```
see also 068246c0 and 690ae1de
```
  c3a1ba0f
- fix spider tests for --ps in 10.4 · 690ae1de
  Sergei Golubchik authored Jun 19, 2021
```
see also 068246c0
```
  690ae1de
- spider tests aren't big in 10.4 · a4f48591
  Sergei Golubchik authored Jun 19, 2021
```
see also a5f6eca5
```
  a4f48591