Commits · 8e6e5acef1517ebf0c645a7e704bede8f7366b2d · nexedi / MariaDB

05 Jun, 2020 5 commits

MDEV-22753 Server crashes upon INSERT into versioned partitioned table with WITHOUT OVERLAPS · 8e6e5ace
Nikita Malyavin authored Jun 05, 2020
```
Add `append_system_key_parts` call inside `fast_alter_partition_table` during new partition creation.
```
MDEV-22599 WITHOUT OVERLAPS does not work with prefix indexes · 35d327fd
Nikita Malyavin authored Jun 04, 2020
```
cmp_max is used instead of cmp to compare key_parts
```

MDEV-22434 UPDATE on RocksDB table with WITHOUT OVERLAPS fails · 0c595bde

Nikita Malyavin authored Jun 02, 2020

INSERT worked incorrectly as well. RocksDB used table->record[0] internally to store some
intermediate results for key conversion, during index searching among other operations.
As a result, table->record[0] was clobbered during ha_rnd_index_map in ha_check_overlaps,
and the broken record data was then inserted.

The fix is to store RocksDB intermediate result in its own buffer instead of table->record[0].
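
The failure mode can be illustrated with a minimal sketch. The names `table_sketch`, `engine_sketch`, `lookup_clobbering()` and `lookup()` are hypothetical, and the byte-reversing "key conversion" is only a stand-in; this is not the RocksDB handler code.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the server's row buffer, table->record[0].
struct table_sketch
{
  unsigned char record0[8];
};

class engine_sketch
{
  std::vector<unsigned char> key_buf;  // engine-owned scratch space
public:
  // Problematic pattern: the row buffer doubles as scratch space for
  // key conversion, so the caller's pending row is overwritten.
  // Assumes len <= sizeof(table_sketch::record0).
  void lookup_clobbering(table_sketch &t, const unsigned char *key,
                         std::size_t len)
  {
    std::reverse_copy(key, key + len, t.record0);  // "convert" the key
  }

  // Fixed pattern: convert into a private buffer and leave the row alone.
  void lookup(const table_sketch &, const unsigned char *key,
              std::size_t len)
  {
    key_buf.resize(len);
    std::reverse_copy(key, key + len, key_buf.begin());
  }
};
```

With the first variant, a row that was prepared in record0 before the overlap check no longer holds the same bytes afterwards; with the second, it is untouched.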

The `rocksdb` MTR suite was checked and runs fine.
No additional tests are needed: the existing overlaps.test covers the case completely.
However, I am not going to add anything related to RocksDB to the suite, to keep it free
of additional dependencies.

To run the tests with the RocksDB engine, one can add the following to engines.combinations:

    [rocksdb]
    plugin-load=$HA_ROCKSDB_SO
    default-storage-engine=rocksdb
    rocksdb

MDEV-22439 Add FOR PORTION OF statements to the test for WITHOUT OVERLAPS · c3e09a2d
Nikita Malyavin authored Jun 01, 2020

MDEV-15053 Reduce buf_pool_t::mutex contention · b1ab211d

Marko Mäkelä authored Jun 05, 2020

User-visible changes: The INFORMATION_SCHEMA views INNODB_BUFFER_PAGE
and INNODB_BUFFER_PAGE_LRU will report a dummy value FLUSH_TYPE=0
and will no longer report the PAGE_STATE value READY_FOR_USE.

We will remove some fields from buf_page_t and move much code to
member functions of buf_pool_t and buf_page_t, so that the access
rules of data members can be enforced consistently.

Evicting or adding pages in buf_pool.LRU will remain covered by
buf_pool.mutex.

Evicting or adding pages in buf_pool.page_hash will remain
covered by both buf_pool.mutex and the buf_pool.page_hash X-latch.

After this fix, buf_pool.page_hash lookups can entirely
avoid acquiring buf_pool.mutex, only relying on
buf_pool.hash_lock_get() S-latch.

Similarly, buf_flush_check_neighbors() will rely solely on
buf_pool.mutex, no buf_pool.page_hash latch at all.

The buf_pool.mutex is rather contended in I/O heavy benchmarks,
especially when the workload does not fit in the buffer pool.

The first attempt to alleviate the contention was the
buf_pool_t::mutex split in
commit 4ed7082e
which introduced buf_block_t::mutex, which we are now removing.

Later, multiple instances of buf_pool_t were introduced
in commit c18084f7
and recently removed by us in
commit 1a6f708e (MDEV-15058).

UNIV_BUF_DEBUG: Remove. This option to enable some buffer pool
related debugging in otherwise non-debug builds has not been used
for years. Instead, we have been using UNIV_DEBUG, which is enabled
in CMAKE_BUILD_TYPE=Debug.

buf_block_t::mutex, buf_pool_t::zip_mutex: Remove. We can mainly rely on
std::atomic and the buf_pool.page_hash latches, and in some cases
depend on buf_pool.mutex or buf_pool.flush_list_mutex just like before.
We must always release buf_block_t::lock before invoking
unfix() or io_unfix(), to prevent a glitch where a block that was
added to the buf_pool.free list would appear X-latched. See
commit c5883deb for how this glitch
was finally caught in a debug environment.

We move some buf_pool_t::page_hash specific code from the
ha and hash modules to buf_pool, for improved readability.

buf_pool_t::close(): Assert that all blocks are clean, except
on aborted startup or crash-like shutdown.

buf_pool_t::validate(): No longer attempt to validate
n_flush[] against the number of BUF_IO_WRITE fixed blocks,
because buf_page_t::flush_type no longer exists.

buf_pool_t::watch_set(): Replaces buf_pool_watch_set().
Reduce mutex contention by separating the buf_pool.watch[]
allocation and the insert into buf_pool.page_hash.

buf_pool_t::page_hash_lock<bool exclusive>(): Acquire a
buf_pool.page_hash latch.
Replaces and extends buf_page_hash_lock_s_confirm()
and buf_page_hash_lock_x_confirm().
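
As a rough illustration of selecting the latch mode through a template parameter, here is a minimal sketch. It uses a hypothetical try-lock latch built on std::atomic (`hash_latch_sketch`, `X_LATCHED`), not InnoDB's actual rw-lock, and it omits waiting and the hash-cell lookup entirely.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical flag marking the latch as exclusively held.
constexpr std::uint32_t X_LATCHED= 1U << 31;

class hash_latch_sketch
{
  std::atomic<std::uint32_t> word{0};  // X flag, or the number of S holders
public:
  // exclusive=true requests an X-latch, exclusive=false an S-latch.
  template <bool exclusive> bool try_acquire()
  {
    if (exclusive)
    {
      // An X-latch can only be granted when nobody holds the latch.
      std::uint32_t expected= 0;
      return word.compare_exchange_strong(expected, X_LATCHED,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed);
    }
    // An S-latch can be granted as long as no X-latch is held.
    std::uint32_t old= word.load(std::memory_order_relaxed);
    while (!(old & X_LATCHED))
      if (word.compare_exchange_weak(old, old + 1,
                                     std::memory_order_acquire,
                                     std::memory_order_relaxed))
        return true;
    return false;
  }

  template <bool exclusive> void release()
  {
    word.fetch_sub(exclusive ? X_LATCHED : 1U, std::memory_order_release);
  }
};
```

Because the mode is a compile-time constant, one function body serves both the shared and the exclusive path without a run-time mode argument.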

buf_pool_t::READ_AHEAD_PAGES: Renamed from BUF_READ_AHEAD_PAGES.

buf_pool_t::curr_size, old_size, read_ahead_area, n_pend_reads:
Use Atomic_counter.
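
For readers unfamiliar with this kind of counter, a minimal sketch of a relaxed atomic counter follows. The class name `relaxed_counter_sketch` is hypothetical, and the choice of relaxed memory ordering is an assumption made for the illustration; it is not a copy of the Atomic_counter interface.

```cpp
#include <atomic>

// Hypothetical counter: atomic updates, but no ordering guarantees
// with respect to other memory accesses (relaxed ordering).
template <typename T> class relaxed_counter_sketch
{
  std::atomic<T> value;
public:
  explicit relaxed_counter_sketch(T n= 0) : value(n) {}

  // Post-increment and post-decrement return the previous value.
  T operator++(int) { return value.fetch_add(1, std::memory_order_relaxed); }
  T operator--(int) { return value.fetch_sub(1, std::memory_order_relaxed); }

  T operator=(T n)
  {
    value.store(n, std::memory_order_relaxed);
    return n;
  }

  operator T() const { return value.load(std::memory_order_relaxed); }
};
```

Such a counter can be read and updated without holding a mutex, which is the property that matters for fields like n_pend_reads.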

buf_pool_t::running_out(): Replaces buf_LRU_buf_pool_running_out().

buf_pool_t::LRU_remove(): Remove a block from the LRU list
and return its predecessor. Incorporates buf_LRU_adjust_hp(),
which was removed.

buf_page_get_gen(): Remove a redundant call of fsp_is_system_temporary(),
for mode == BUF_GET_IF_IN_POOL_OR_WATCH, which is only used by
BTR_DELETE_OP (purge), which is never invoked on temporary tables.

buf_free_from_unzip_LRU_list_batch(): Avoid redundant assignments.

buf_LRU_free_from_unzip_LRU_list(): Simplify the loop condition.

buf_LRU_free_page(): Clarify the function comment.

buf_flush_check_neighbor(), buf_flush_check_neighbors():
Rewrite the construction of the page hash range. We will hold
the buf_pool.mutex for up to buf_pool.read_ahead_area (at most 64)
consecutive lookups of buf_pool.page_hash.

buf_flush_page_and_try_neighbors(): Remove.
Merge to its only callers, and remove redundant operations in
buf_flush_LRU_list_batch().

buf_read_ahead_random(), buf_read_ahead_linear(): Rewrite.
Do not acquire buf_pool.mutex, and iterate directly with page_id_t.

ut_2_power_up(): Remove. my_round_up_to_next_power() is inlined
and avoids any loops.
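
For context, rounding up to a power of two without a loop is commonly done by "smearing" the highest set bit into all lower bits. The following is a minimal sketch of that technique for 32-bit values; the name `round_up_to_next_power_sketch` is hypothetical, and the code is not claimed to be the body of my_round_up_to_next_power().

```cpp
#include <cstdint>

// Hypothetical illustration of loop-free rounding to a power of two.
// Values that already are a power of two are returned unchanged.
// 0 and values above 2^31 wrap around to 0.
inline std::uint32_t round_up_to_next_power_sketch(std::uint32_t v)
{
  v--;           // so that exact powers of two map to themselves
  v|= v >> 1;    // copy the highest set bit into every lower bit
  v|= v >> 2;
  v|= v >> 4;
  v|= v >> 8;
  v|= v >> 16;
  return v + 1;  // all ones below the top bit, plus one
}
```

The number of operations is fixed, whereas a shift-until-large-enough loop takes a number of iterations that depends on the input.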

fil_page_get_prev(), fil_page_get_next(), fil_addr_is_null(): Remove.

buf_flush_page(): Add a fil_space_t* parameter. Minimize the
buf_pool.mutex hold time. buf_pool.n_flush[] is no longer updated
atomically with the io_fix, and we will protect most buf_block_t
fields with buf_block_t::lock. The function
buf_flush_write_block_low() is removed and merged here.

buf_page_init_for_read(): Use static linkage. Initialize the newly
allocated block and acquire the exclusive buf_block_t::lock while not
holding any mutex.

IORequest::IORequest(): Remove the body. We only need to invoke
set_punch_hole() in buf_flush_page() and nowhere else.

buf_page_t::flush_type: Remove. Replaced by IORequest::flush_type.
This field is only used during a fil_io() call.
That function already takes IORequest as a parameter, so we had
better introduce IORequest::flush_type for the rarely changing field.

buf_block_t::init(): Replaces buf_page_init().

buf_page_t::init(): Replaces buf_page_init_low().

buf_block_t::initialise(): Initialise many fields, but
keep the buf_page_t::state(). Both buf_pool_t::validate() and
buf_page_optimistic_get() require that buf_page_t::in_file()
be protected atomically with buf_page_t::in_page_hash
and buf_page_t::in_LRU_list.

buf_page_optimistic_get(): Now that buf_block_t::mutex
no longer exists, we must check buf_page_t::io_fix()
after acquiring the buf_pool.page_hash lock, to detect
whether buf_page_init_for_read() has been initiated.
We will also check the io_fix() before acquiring hash_lock
in order to avoid unnecessary computation.
The field buf_block_t::modify_clock (protected by buf_block_t::lock)
allows buf_page_optimistic_get() to validate the block.

buf_page_t::real_size: Remove. It was only used while flushing
pages of page_compressed tables.

buf_page_encrypt(): Add an output parameter that allows us to eliminate
buf_page_t::real_size. Replace a condition with a debug assertion.

buf_page_should_punch_hole(): Remove.

buf_dblwr_t::add_to_batch(): Replaces buf_dblwr_add_to_batch().
Add the parameter size (to replace buf_page_t::real_size).

buf_dblwr_t::write_single_page(): Replaces buf_dblwr_write_single_page().
Add the parameter size (to replace buf_page_t::real_size).

fil_system_t::detach(): Replaces fil_space_detach().
Ensure that fil_validate() will not be violated even if
fil_system.mutex is released and reacquired.

fil_node_t::complete_io(): Renamed from fil_node_complete_io().

fil_node_t::close_to_free(): Replaces fil_node_close_to_free().
Avoid invoking fil_node_t::close() because fil_system.n_open
has already been decremented in fil_system_t::detach().

BUF_BLOCK_READY_FOR_USE: Remove. Directly use BUF_BLOCK_MEMORY.

BUF_BLOCK_ZIP_DIRTY: Remove. Directly use BUF_BLOCK_ZIP_PAGE,
and distinguish dirty pages by buf_page_t::oldest_modification().

BUF_BLOCK_POOL_WATCH: Remove. Use BUF_BLOCK_NOT_USED instead.
This state was only being used for buf_page_t that are in
buf_pool.watch.

buf_pool_t::watch[]: Remove pointer indirection.

buf_page_t::in_flush_list: Remove. It was set if and only if
buf_page_t::oldest_modification() is nonzero.

buf_page_decrypt_after_read(), buf_corrupt_page_release(),
buf_page_check_corrupt(): Change the const fil_space_t* parameter
to const fil_node_t& so that we can report the correct file name.

buf_page_monitor(): Declare as an ATTRIBUTE_COLD global function.

buf_page_io_complete(): Split to buf_page_read_complete() and
buf_page_write_complete().

buf_dblwr_t::in_use: Remove.

buf_dblwr_t::buf_block_array: Add IORequest::flush_t.

buf_dblwr_sync_datafiles(): Remove. It was a useless wrapper of
os_aio_wait_until_no_pending_writes().

buf_flush_write_complete(): Declare static, not global.
Add the parameter IORequest::flush_t.

buf_flush_freed_page(): Simplify the code.

recv_sys_t::flush_lru: Renamed from flush_type and changed to bool.

fil_read(), fil_write(): Replaced with direct use of fil_io().

fil_buffering_disabled(): Remove. Check srv_file_flush_method directly.

fil_mutex_enter_and_prepare_for_io(): Return the resolved
fil_space_t* to avoid a duplicated lookup in the caller.

fil_report_invalid_page_access(): Clean up the parameters.

fil_io(): Return fil_io_t, which comprises fil_node_t and error code.
Always invoke fil_space_t::acquire_for_io() and let either the
sync=true caller or fil_aio_callback() invoke
fil_space_t::release_for_io().

fil_aio_callback(): Rewrite to replace buf_page_io_complete().

fil_check_pending_operations(): Remove a parameter, and remove some
redundant lookups.

fil_node_close_to_free(): Wait for n_pending==0. Because we no longer
do an extra lookup of the tablespace between fil_io() and the
completion of the operation, we must give fil_node_t::complete_io() a
chance to decrement the counter.

fil_close_tablespace(): Remove unused parameter trx, and document
that this is only invoked during the error handling of IMPORT TABLESPACE.

row_import_discard_changes(): Merged with the only caller,
row_import_cleanup(). Do not lock up the data dictionary while
invoking fil_close_tablespace().

logs_empty_and_mark_files_at_shutdown(): Do not invoke
fil_close_all_files(), to avoid a !needs_flush assertion failure
on fil_node_t::close().

innodb_shutdown(): Invoke os_aio_free() before fil_close_all_files().

fil_close_all_files(): Invoke fil_flush_file_spaces()
to ensure proper durability.

thread_pool::unbind(): Fix a crash that would occur on Windows
after srv_thread_pool->disable_aio() and os_file_close().
This fix was submitted by Vladislav Vaintroub.

Thanks to Matthias Leich and Axel Schwenke for extensive testing,
Vladislav Vaintroub for helpful comments, and Eugene Kosov for a review.

04 Jun, 2020 8 commits
- Cleanup - remove HAVE_AIOWAIT and associated code from mysys · 9c55f83e
  Vladislav Vaintroub authored Jun 04, 2020
```
HAVE_AIOWAIT has been disabled and unused for at least 10 years.
```
- For better experience in Visual Studio IDE, add header files to Innodb sources · 4af3f848
  Vladislav Vaintroub authored Jun 04, 2020
  
- FreeBSD compilation fixes · 4adc1269
  Sergei Golubchik authored Jun 03, 2020
```
* FreeBSD calls amd64 what Linux calls x86_64
* signal returns void (*)(int)
* struct pam_message has char*, not const char*
* krb5_free_unparsed_name exists, but is deprecated
```
- disable Cassandra engine by default · 70ab43b5
  Sergei Golubchik authored Jun 02, 2020
  
- MDEV-21902 Nested JSON_ARRAYAGG in JSON_OBJECT should not get escaped. · 2fcff310
  Alexey Botchkov authored Jun 04, 2020
  
- MDEV-21914 JSON_ARRAYAGG doesn't reject ORDER BY clause, but doesn't work either. · 74198384
  Alexey Botchkov authored Jun 04, 2020
```
ORDER BY fixed for JSON_ARRAYAGG.
```
- MDEV-22084 Squared brackets missing from JSON_ARRAYAGG when used in a view. · 07daf735
  Alexey Botchkov authored Jun 04, 2020
```
Item_func_groupconcat::print() should be fixed to work for the derived
classes.
```
- MDEV-22640, MDEV-22449, MDEV-21528 JSON_ARRAYAGG crashes with NULL values. · bb47050e
  Alexey Botchkov authored Jun 04, 2020
```
We have to include NULL in the result, which GROUP_CONCAT doesn't
always do. Also, the conversion should be done into another String instance,
as these can be the same.
```
03 Jun, 2020 9 commits

MDEV-22787 postfix · e7bab059

Vladislav Vaintroub authored Jun 03, 2020

Ensure that FTS_MSG_STOP is the very last message, and nothing comes after
it in fts_optimize_shutdown.

Stop the timer to ensure that.

MDEV-21751 postfix · 4c522234
Vladislav Vaintroub authored Jun 03, 2020
```
Use symbolic constant for max purge threads.
```
MDEV-21751 innodb_fast_shutdown=0 can be unnecessarily slow · bee4b044
Vladislav Vaintroub authored Jun 03, 2020
```
Max out parallel purge worker tasks on slow shutdown, to speed it up.
```

MDEV-22758 Assertion `!item->null_value' failed in Type_handler_inet6::make_sort_key_part · 839ad5e1

Alexander Barkov authored Jun 03, 2020

When some expression of an INET6 data type involves conversion to INET6 from
other data types, e.g. in:

- CAST:

    SELECT CAST(non_inet6_expr AS INET6)

- CASE and hybrid functions:

    SELECT CASE WHEN expr THEN inet6_expr ELSE non_inet6_expr END

- UNION:

    SELECT inet6_expr UNION SELECT non_inet6_expr

the result column must be fixed as NULL-able even if the non-inet6 expression itself
is not NULL-able, because at the execution time the conversion can fail.

Details:
- Forcing NULL-ability if conversion from some data type to INET6 is involved
  (for non-constant or for expensive expressions).
- Non-expensive constant expressions are tested for NULL-ability at fix_fields() time,
  so things like `CAST('::' AS INET6)` are still detected as NOT NULL.
- Adding "bool warn" parameter into a few methods, to avoid redundant warnings
  at fix_fields() time when calculating NULL-ability of constant values.

MDEV-22787 fts_optimize_shutdown() deletes timer prematurely · 5b18ade0
Marko Mäkelä authored Jun 03, 2020
```
fts_optimize_shutdown(): Wait for fts_optimize_callback()
to terminate before deleting the timer that it uses.
```

MDEV-22710 Assertion ...status != buf_page_t::FREED in ibuf_remove_free_page() · 58d2d820

Marko Mäkelä authored Jun 03, 2020

The buf_page_free() call that was introduced in MDEV-15528 was
performed too early in fseg_free_page(), tripping a debug check
in ibuf_remove_free_page(). In all other callers, we can (and will)
invoke buf_page_free() right after fseg_free_page(), but in
ibuf_remove_free_page() we will defer that call to the end of the
mini-transaction. (That call was already present.)

MDEV-22641: postfix - crc32{,c} fixups for ppc64 · 463a8fc5
Daniel Black authored Jun 03, 2020

Merge 10.4 into 10.5 · 701efbb2
Marko Mäkelä authored Jun 03, 2020

Merge 10.3 into 10.4 · 80591481
Marko Mäkelä authored Jun 03, 2020

02 Jun, 2020 7 commits

MDEV-22773 Assertion page_get_page_no... in btr_pcur_store_position() · 95ac7902

Marko Mäkelä authored Jun 02, 2020

btr_pcur_store_position(): Replace a too strict debug assertion.
It is possible to have a clustered index B-tree for a logically
empty table, which will consist of a node pointer from the root
page to a leaf page that contains the metadata record.

The too strict debug assertion was added in
commit 0e5a4ac2 (MDEV-15562).

Added larger timeout to backup_stages.test · 457e3128
Monty authored Jun 02, 2020
```
MDEV-21546 main.backup_stages occasionally fails with lock wait timeout
```

MDEV-22509: Server crashes in Field_inet6::store_inet6_null_with_warn / Field::maybe_null · d5e8b4d7

Varun Gupta authored Jun 02, 2020

For a field of type INET6, during EITS collection the min and max values are stored in text
representation in the statistical table.
While retrieving the value from the statistical table, the value was stored back in the original
field using binary form instead of text, and this resulted in the crash.

Introduced 2 functions in the Field structure:
  1) store_to_statistical_minmax_field
  2) store_from_statistical_minmax_field

MDEV-21546 main.backup_stages occasionally reports lock wait timeout · 6df2f2db

Marko Mäkelä authored Jun 02, 2020

With MDEV-16678, InnoDB background tasks (most notably, the purge of
committed transaction history) can acquire metadata locks.
Because of this, the lock_wait_timeout=0 is too strict and must
be relaxed.

The test used to fail easily if an extra sleep was added to
the end of dict_table_close(), before the MDL release. Now,
with lock_wait_timeout=1, the test passes even with an extra
0.1-second sleep added to dict_table_close().

Thanks to Monty for providing this fix.

Merge 10.2 into 10.3 · 8300f639
Marko Mäkelä authored Jun 02, 2020

MDEV-22770 trx_undo_report_rename() fails to release page latches · 804761a8

Marko Mäkelä authored Jun 02, 2020

commit f74023b9 (MDEV-15090)
inadvertently removed a mtr_t::commit() call from
trx_undo_report_rename(), causing an InnoDB hang if
we failed to log a RENAME operation.

It is unclear whether this condition is possible in practice.
The test case involved SET GLOBAL innodb_trx_rseg_n_slots_debug=1
and a failed CREATE TABLE...SELECT, whose error handling would
internally invoke RENAME in InnoDB.

MDEV-22027 Assertion oldest_lsn >= log_sys.last_checkpoint_lsn failed · 0d6d63e1

Marko Mäkelä authored Jun 02, 2020

log_buf_pool_get_oldest_modification(): Acquire
log_sys_t::flush_order_mutex in order to prevent a race condition
that was introduced in
commit 1a6f708e (MDEV-15058).

Before that change, log_buf_pool_get_oldest_modification()
was protected by both log_sys.mutex and log_sys.flush_order_mutex
like it was supposed to be ever since
commit a52c4820 (MySQL 5.5.10).

buf_pool_t::get_oldest_modification(): Replaces
buf_pool_get_oldest_modification(), to emphasize that
log_sys.flush_order_mutex must be acquired by the caller if needed.

log_close(): Invoke log_buf_pool_get_oldest_modification()
in order to ensure a clean shutdown.

The scenario of the race condition is as follows:

1. The buffer pool is clean (no writes are pending).
2. mtr_add_dirtied_pages_to_flush_list() releases log_sys.mutex.
3. log_buf_pool_get_oldest_modification() observes that the
buffer pool is clean and returns log_sys.lsn.
4. log_checkpoint() completes, writing a wrong checkpoint header
according to which everything up to log_sys.lsn was clean.
5. mtr_add_dirtied_pages_to_flush_list() completes the execution
of mtr_memo_note_modifications(), releases the page latches and
the flush_order_mutex.
6. On a subsequent log_checkpoint(), the assertion could fail
if the page modifications had not been flushed yet.
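
The locking discipline behind the fix can be modelled with a minimal sketch. The class `log_sketch` and its members are hypothetical simplifications (one dirty page per commit, a multiset instead of the real flush list), intended only to show why the reader must also take the flush-order mutex.

```cpp
#include <cstdint>
#include <mutex>
#include <set>

class log_sketch
{
  std::mutex log_mutex;          // models log_sys.mutex
  std::mutex flush_order_mutex;  // models log_sys.flush_order_mutex
  std::multiset<std::uint64_t> flush_list;  // oldest-modification LSNs
  std::uint64_t lsn= 1;          // end of the log so far
public:
  // Model of a mini-transaction commit that dirties one page.
  void commit(std::uint64_t len)
  {
    log_mutex.lock();
    const std::uint64_t start= lsn;
    lsn+= len;
    flush_order_mutex.lock();    // taken before log_mutex is released
    log_mutex.unlock();          // step 2 of the scenario
    flush_list.insert(start);    // step 5: the page becomes visibly dirty
    flush_order_mutex.unlock();
  }

  // Model of a completed page write.
  void page_flushed(std::uint64_t start)
  {
    std::lock_guard<std::mutex> f(flush_order_mutex);
    auto it= flush_list.find(start);
    if (it != flush_list.end())
      flush_list.erase(it);
  }

  // Holding both mutexes means that a commit which has advanced lsn but
  // not yet inserted into flush_list cannot be observed half-done.
  std::uint64_t oldest_modification()
  {
    std::lock_guard<std::mutex> m(log_mutex);
    std::lock_guard<std::mutex> f(flush_order_mutex);
    return flush_list.empty() ? lsn : *flush_list.begin();
  }
};
```

If oldest_modification() took only log_mutex, it could run between the two unlock calls in commit(), see an empty list, and report the already advanced lsn, which corresponds to step 3 of the scenario.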

The failing assertion (which is valid) was added in MySQL 5.7
mysql/mysql-server@5c6c6ec69336369487dfc080a6980089b4e1a3c2
and merged to MariaDB Server 10.2.2 in
commit fec844ac.

01 Jun, 2020 11 commits

Fix my_checksum declaration. · 661ebd46
Vladislav Vaintroub authored Jun 01, 2020
```
exporting data from the server needs MYSQL_PLUGIN_IMPORT.
```
Merge branch '10.4' into 10.5 · 6e6d79a5
Vladislav Vaintroub authored Jun 01, 2020

Merge branch '10.3' into 10.4 · f1c35a99
Vladislav Vaintroub authored Jun 01, 2020

fix warning · fd2b46d8
Vladislav Vaintroub authored Jun 01, 2020

fix warning · 50641db2
Vladislav Vaintroub authored Jun 01, 2020

MDEV-22303: Incorrect ordering with REGEXP_REPLACE and OFFSET/LIMIT · ade8253c

Varun Gupta authored May 30, 2020

For character sets and collations where the character-to-weight mapping is greater than 1,
we need to make sure that, while creating a sort key, a temporary buffer is created
to store the value of the item returned by the val_str function, and that this value is
then copied back to the sort buffer.
When a priority queue was used, Sort_param::tmp_buffer was not allocated.

Minor refactoring:
Changed Sort_param::tmp_buffer from char* to String

MDEV-22650 Dirty compressed page checksum validation fails · 02f68552

Thirunarayanan Balathandayuthapani authored Jun 01, 2020

Problem:
=======
  While evicting the uncompressed page from the buffer pool, InnoDB writes
the checksum for the compressed page in buf_LRU_free_page().
As a result, checksum validation fails while flushing the compressed page
when the innodb_checksum_algorithm variable has been changed to strict_none.

Solution:
========
- Calculate the checksum only when flushing the page. Remove the
checksum write in buf_LRU_free_page().

Cleanup: Remove thr_is_recv(), trx_is_recv() · 83d0e72b
Marko Mäkelä authored Jun 01, 2020
```
Compare to trx_roll_crash_recv_trx directly where needed.
```
MDEV-21615 InnoDB: innodb_page_size=x requires... should be logged as error · c50b7bee
Marko Mäkelä authored Jun 01, 2020
```
innobase_init(): On every path to refused startup, log the reason
to refuse startup as an error, instead of a note.
```
Merge 10.1 into 10.2 · d72eebaa
Marko Mäkelä authored Jun 01, 2020

MENT-458 MTR Big test "spider/bugfix.sql_mode_mariadb & myself" are both... · 132d5822

Kentoku authored Nov 01, 2019

MENT-458 MTR Big test "spider/bugfix.sql_mode_mariadb & myself" are both failing on Azure MTR pipeline

Support MariaDB version numbers that contain a dash in Spider's tests
