- 20 Feb, 2021 1 commit
-
-
Marko Mäkelä authored
We have innodb_use_native_aio=ON by default since the introduction of that parameter in commit 2f9fb41b (MySQL 5.5 and MariaDB 5.5). However, to really benefit from the setting, the files should be opened in O_DIRECT mode, to bypass the file system cache. In this way, the reads and writes can be submitted with DMA, using the InnoDB buffer pool directly, and no processor cycles need to be used for copying data. The use of O_DIRECT benefits not only the current libaio implementation, but also liburing. os_file_set_nocache(): Test innodb_flush_method in the function, not in the callers.
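As a rough illustration of what opening a file in O_DIRECT mode involves, here is a minimal Linux-only sketch. It is not the InnoDB code: `alloc_aligned()` and `open_direct()` are hypothetical helper names, and the fallback to buffered I/O is an assumption made for the example. Direct I/O generally requires the buffer address, the transfer size and the file offset to be aligned to the block size of the device.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // O_DIRECT is a Linux extension to open(2)
#endif
#include <fcntl.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: allocate a buffer whose address is a multiple of
// `alignment`, so that it can be handed to direct I/O requests.
void* alloc_aligned(std::size_t alignment, std::size_t size) {
  assert((alignment & (alignment - 1)) == 0);  // must be a power of two
  void* p = nullptr;
  return posix_memalign(&p, alignment, size) == 0 ? p : nullptr;
}

// Hypothetical helper: try to open `path` bypassing the file system cache.
// Some file systems reject O_DIRECT, so fall back to buffered I/O and tell
// the caller which mode was obtained.
int open_direct(const char* path, bool* is_direct) {
#ifdef O_DIRECT
  int fd = open(path, O_RDWR | O_CREAT | O_DIRECT, 0644);
  if (fd >= 0) {
    *is_direct = true;
    return fd;
  }
#endif
  *is_direct = false;
  return open(path, O_RDWR | O_CREAT, 0644);
}
```

A caller would allocate its page buffers with `alloc_aligned()` and submit reads and writes in multiples of the block size.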
-
- 18 Feb, 2021 2 commits
-
-
Marko Mäkelä authored
The fix of MDEV-23328 introduced a background thread for killing conflicting transactions. Thanks to the refactoring that was conducted in MDEV-24671, the high-priority ("brute-force") applier thread can kill the conflicting transactions itself, before waiting for the locks to be finally released (after the conflicting transactions have been rolled back). This also allows us to remove the hack LockGGuard that had to be added in MDEV-20612, and remove Galera-related function parameters from lock creation.
-
Marko Mäkelä authored
lock_wait_rpl_report(): Only reload trx->lock.wait_lock if lock_sys.wait_mutex had to be released and reacquired.
-
- 17 Feb, 2021 11 commits
-
-
Robert Bindar authored
-
Marko Mäkelä authored
-
Kartik Soneji authored
* MDEV-19168: Add ssl-flush command. Improve flush error messages and move error printing into the `flush` function.
-
Marko Mäkelä authored
Deadlock::report(): Require the caller to acquire lock_sys.latch if invoking on a transaction that is not owned by the current thread.
-
Vladislav Vaintroub authored
-
Marko Mäkelä authored
-
Marko Mäkelä authored
btr_cur_upd_rec_in_place(): Prefer WRITE to MEMSET for a single-byte operation. log_phys_t::apply(): Relax the assertion to allow a single-byte MEMSET, even though it is 1 byte longer than a WRITE record.
-
Marko Mäkelä authored
A new configuration parameter innodb_deadlock_report is introduced:
* innodb_deadlock_report=off: Do not report any details of deadlocks.
* innodb_deadlock_report=basic: Report transactions and waiting locks.
* innodb_deadlock_report=full (default): Report also the blocking locks.
The improved deadlock checker will consider all involved transactions in one loop, even if the deadlock loop includes several transactions. The theoretical maximum number of transactions that can be involved in a deadlock is `innodb_page_size` * 8, limited by the persistent data structures.
Note: Similar to mysql/mysql-server@3859219875b62154b921e8c6078c751198071b9c our deadlock checker will consider at most one blocking transaction for each waiting transaction. The new field trx->lock.wait_trx will be nullptr if and only if trx->lock.wait_lock is nullptr. Note that trx->lock.wait_lock->trx == trx (the waiting transaction), while trx->lock.wait_trx points to one of the transactions whose lock is conflicting with trx->lock.wait_lock.
Considering only one blocking transaction will greatly simplify our deadlock checker, but it may also make the deadlock checker blind to some deadlocks where the deadlock cycle is 'hidden' by the fact that the registered trx->lock.wait_trx is not actually waiting for any InnoDB lock, but something else. So, instead of deadlocks, sometimes lock wait timeout may be reported. To improve on this, whenever trx->lock.wait_trx is changed, we will register further 'candidate' transactions in Deadlock::to_check(), and check for 'revealed' deadlocks as soon as possible, in lock_release() and innobase_kill_query().
The old DeadlockChecker was holding lock_sys.latch, even though using lock_sys.wait_mutex should be less contended (and thus preferred) in the likely case that no deadlock is present.
lock_wait(): Defer the deadlock check to this function, instead of executing it in lock_rec_enqueue_waiting(), lock_table_enqueue_waiting().
DeadlockChecker: Complete rewrite:
(1) Explicitly keep track of transactions that are being waited for, in trx->lock.wait_trx, protected by lock_sys.wait_mutex. Previously, we were painstakingly traversing the lock heaps while blocking concurrent registration or removal of any locks (even uncontended ones).
(2) Use Brent's cycle-detection algorithm for deadlock detection, traversing each trx->lock.wait_trx edge at most 2 times.
(3) If a deadlock is detected, release lock_sys.wait_mutex, acquire LockMutexGuard, re-acquire lock_sys.wait_mutex and re-invoke find_cycle() to find out whether the deadlock is still present.
(4) Display information on all transactions that are involved in the deadlock, and choose a victim to be rolled back.
lock_sys.deadlocks: Replaces lock_deadlock_found. Protected by wait_mutex.
Deadlock::find_cycle(): Quickly find a cycle of trx->lock.wait_trx... using Brent's cycle detection algorithm.
Deadlock::report(): Report a deadlock cycle that was found by Deadlock::find_cycle(), and choose a victim with the least weight. Altogether, we may traverse each trx->lock.wait_trx edge up to 5 times (2*find_cycle()+1 time for reporting and choosing the victim).
Deadlock::check_and_resolve(): Find and resolve a deadlock.
lock_wait_rpl_report(): Report the waits-for information to replication. This used to be executed as part of DeadlockChecker. Replication must know the waits-for relations even if no deadlocks are present in InnoDB.
Reviewed by: Vladislav Vaintroub
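The cycle search over single wait_trx edges can be sketched with Brent's algorithm as follows. This is a simplified illustration, not the actual Deadlock::find_cycle(): the `Trx` struct is a hypothetical stand-in that carries only the one edge, and no mutex handling, victim selection or reporting is shown.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a transaction: each waiting transaction records
// at most one transaction that it is waiting for (nullptr if not waiting).
struct Trx {
  Trx* wait_trx = nullptr;
};

// Brent's cycle detection over the wait_trx edges.
// Returns a transaction that lies on a waits-for cycle reachable from
// `start`, or nullptr if the chain ends at a transaction that is not waiting.
// Note that `start` itself need not be part of the cycle that is found.
Trx* find_cycle(Trx* start) {
  if (!start) return nullptr;
  Trx* tortoise = start;  // checkpoint that is moved at powers of two
  Trx* hare = start;      // advances one edge per iteration
  std::size_t power = 1, steps = 0;
  for (;;) {
    hare = hare->wait_trx;
    if (!hare) return nullptr;          // the chain ended: no deadlock
    if (hare == tortoise) return hare;  // came back to the checkpoint: cycle
    if (++steps == power) {             // move the checkpoint, double the span
      tortoise = hare;
      power *= 2;
      steps = 0;
    }
  }
}
```

Because every transaction has at most one outgoing edge, the search needs no visited set and only constant extra memory.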
-
Marko Mäkelä authored
-
Marko Mäkelä authored
ssux_lock_low::write_lock(): Before invoking writer_wait(), keep attempting write_lock_wait_try() as long as no conflict exists. rw_lock::upgrade_trylock(): Relax a bogus assertion and correct the acquisition operation. Another thread may be executing in ssux_lock_low::write_lock() on the same latch. Because we are the only thread that can make progress on that latch, we must become the writer. Any waiting thread will be eventually woken up by ssux_lock_low::u_unlock() or ssux_lock_low::wr_unlock(), but not by wr_u_downgrade() because the upgrade is a very rare operation.
-
Marko Mäkelä authored
-
- 16 Feb, 2021 4 commits
-
-
Elena Stepanova authored
Test code modifications and new failures from buildbot were registered only for the main suite. The rest was updated partially, based on the status of existing JIRA items.
-
Vladislav Vaintroub authored
-
Marko Mäkelä authored
-
Sergei Golubchik authored
reset thd->lex->query_tables_own_last, because open_table() uses it and will try to dereference whatever garbage it might have
-
- 15 Feb, 2021 7 commits
-
-
Monty authored
The reason for this is that Galera can lock LOCK_thd_data for a long time. Instead of stalling any long running process, like alter or repair table, because of progress reporting, ignore the progress reporting for this call. Progress reporting will continue on the next call after the lock has been released.
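The idea of skipping a progress report when the lock is busy can be sketched with a try-lock. This is only an illustration under assumptions: `Session`, `data_mutex` and `report_progress()` are hypothetical names standing in for the connection object, LOCK_thd_data and the progress-reporting call, and the commit message does not state which primitive the fix actually uses.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the per-connection state.
struct Session {
  std::mutex data_mutex;  // stand-in for LOCK_thd_data
  unsigned progress = 0;  // last reported progress value
};

// Update the progress counter only if the mutex is free right now.
// Returns false when the report was skipped; the caller simply carries on
// and the next call will report the newer value.
bool report_progress(Session& s, unsigned value) {
  std::unique_lock<std::mutex> lock(s.data_mutex, std::try_to_lock);
  if (!lock.owns_lock()) return false;  // busy: skip instead of stalling
  s.progress = value;
  return true;
}
```

The trade-off is that an individual report may be lost, which is acceptable for a value that is overwritten by the next report anyway.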
-
Marko Mäkelä authored
The fix of MDEV-24569 and MDEV-24695 introduced a race condition when a table is being rebuilt or dropped during the fseg_page_is_free() check. The server would occasionally crash during the execution of the test encryption.create_or_replace. The fil_space_t::STOPPING flag can be set by DDL operations. Normally, such concurrent operations are prevented by a metadata lock (MDL). However, neither the change buffer merge nor the fil_crypt_thread() are protected by MDL. fil_crypt_read_crypt_data(), xdes_get_descriptor_const(): Pass the BUF_GET_POSSIBLY_FREED flag to avoid the fatal error in buf_page_get_low() if a DDL operation was just initiated.
-
Sergei Golubchik authored
-
Sergei Golubchik authored
-
Sergei Golubchik authored
-
Aleksey Midenkov authored
Related to commit f0baa864 and MDEV-23632.
-
Marko Mäkelä authored
trx_t::commit_tables(): Ensure that mod_tables will be empty. This was broken in commit b08448de where the query cache invalidation was moved from lock_release().
-
- 14 Feb, 2021 3 commits
-
-
Monty authored
When creating a summary temporary table with bit fields used in the sum expression with several parameters, like GROUP_CONCAT(), the counting of bits needed in the record was wrong. The reason we got an assert in Aria was because the bug caused a memory overwrite in the record and Aria noticed that the data was 'impossible'.
-
Sergei Golubchik authored
wsrep_cluster_address_update() causes LOCK_wsrep_slave_threads to be locked under LOCK_wsrep_cluster_config, while normally the order should be the opposite. Fix: don't protect @@wsrep_cluster_address value with the LOCK_wsrep_cluster_config, LOCK_global_system_variables is enough. Only protect wsrep reinitialization with the LOCK_wsrep_cluster_config. And make it use a local copy of the global @@wsrep_cluster_address. Also, introduce a helper function that checks whether wsrep_cluster_address is set and also asserts that it can be safely read by the caller.
-
Vladislav Vaintroub authored
-
- 12 Feb, 2021 11 commits
-
-
Jan Lindström authored
The problem was that when engine substitution is allowed (e.g. sql_mode='') we must also check db_type. Additionally, we did not resolve the default storage engine in that case and use it to check whether TOI is possible.
-
Sergei Golubchik authored
rarely (try --repeat 1000), the following happens:
* from wsrep_bf_abort (when a thread is being killed), wsrep-lib starts streaming_rollback that wants to convert_streaming_client_to_applier. wsrep_create_streaming_applier creates a new THD(). All while the other THD is being killed, so under LOCK_thd_kill and LOCK_thd_data. In particular, THD::init() takes LOCK_global_system_variables under LOCK_thd_kill.
* updating @@wsrep_slave_threads takes LOCK_global_system_variables and LOCK_wsrep_cluster_config (in that order) and invokes wsrep_slave_threads_update() that takes LOCK_wsrep_slave_threads
* wsrep_replication_process() takes LOCK_wsrep_slave_threads and invokes wsrep_close_applier(), that does thd->set_killed() which takes LOCK_thd_kill.
et voilà.
As a fix I copied a workaround from wsrep_cluster_address_update() to wsrep_slave_threads_update(). It seems to be safe: without mutexes a race condition is possible and a concurrent SET might change wsrep_slave_threads, but wsrep_slave_threads_update() always verifies if there's a need to do something, so it will not run twice in this case, it'll be a no-op.
-
Sergei Golubchik authored
let the caller take the lock if needed
-
Sergei Golubchik authored
adaptation of 29bbcac0 for 10.4
-
Sergei Golubchik authored
-
Sergei Golubchik authored
* reuse the loop in THD::abort_current_cond_wait, don't duplicate it
* find_thread_by_id should return whatever it has found, it's the caller's task not to kill COM_DAEMON (if the caller's a killer)
and other minor changes
-
Elena Stepanova authored
Test code modifications and new failures from buildbot were registered only for the main suite. The rest was updated partially, based on the status of existing JIRA items.
-
Sergei Golubchik authored
Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution" was null-merged. 10.4 version of the fix is coming up separately
-
Marko Mäkelä authored
mtr_defer_drop_ahi(): Upgrade the U lock to X lock and downgrade it back to U lock in case the adaptive hash index needs to be dropped. This regression was introduced in commit 03ca6495 (MDEV-24142).
-
Marko Mäkelä authored
lock_release_try(): Try to release locks while only holding shared lock_sys.latch. lock_release(): If 5 attempts of lock_release_try() fail, proceed to acquire exclusive lock_sys.latch.
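The "optimistic attempts first, exclusive latch as a fallback" pattern can be sketched as below. This is an assumption-laden illustration rather than the InnoDB implementation: `release_with_fallback()` and its two callbacks are hypothetical, and std::shared_mutex stands in for lock_sys.latch.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <shared_mutex>

// Hypothetical sketch: try the cheap path a bounded number of times while
// holding the latch in shared mode, then fall back to exclusive mode.
// Returns true if one of the shared-mode attempts succeeded.
bool release_with_fallback(std::shared_mutex& latch,
                           const std::function<bool()>& try_release,
                           const std::function<void()>& release_exclusive,
                           int max_attempts = 5) {
  assert(max_attempts > 0);
  for (int i = 0; i < max_attempts; i++) {
    // Shared mode lets other threads work on unrelated locks concurrently.
    std::shared_lock<std::shared_mutex> shared(latch);
    if (try_release()) return true;
  }
  // Exclusive mode blocks everyone else, so this path cannot be interfered
  // with and is guaranteed to complete.
  std::unique_lock<std::shared_mutex> exclusive(latch);
  release_exclusive();
  return false;
}
```

Bounding the number of attempts keeps the common case cheap while still guaranteeing that the release eventually completes.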
-
Marko Mäkelä authored
We replace the old lock_sys.mutex (which was renamed to lock_sys.latch) with a combination of a global lock_sys.latch and table or page hash lock mutexes.
The global lock_sys.latch can be acquired in exclusive mode, or it can be acquired in shared mode and another mutex will be acquired to protect the locks for a particular page or a table.
This is inspired by mysql/mysql-server@1d259b87a63defa814e19a7534380cb43ee23c48 but the optimization of lock_release() will be done in the next commit. Also, we will interleave mutexes with the hash table elements, similar to how buf_pool.page_hash was optimized in commit 5155a300 (MDEV-22871).
dict_table_t::autoinc_trx: Use Atomic_relaxed.
dict_table_t::autoinc_mutex: Use srw_mutex in order to reduce the memory footprint. On 64-bit Linux or OpenBSD, both this and the new dict_table_t::lock_mutex should be 32 bits and be stored in the same 64-bit word. On Microsoft Windows, the underlying SRWLOCK is 32 or 64 bits, and on other systems, sizeof(pthread_mutex_t) can be much larger.
ib_lock_t::trx_locks, trx_lock_t::trx_locks: Document the new rules. Writers must assert lock_sys.is_writer() || trx->mutex_is_owner().
LockGuard: A RAII wrapper for acquiring a page hash table lock.
LockGGuard: Like LockGuard, but when Galera Write-Set Replication is enabled, we must acquire all shards, for updating arbitrary trx_locks.
LockMultiGuard: A RAII wrapper for acquiring two page hash table locks.
lock_rec_create_wsrep(), lock_table_create_wsrep(): Special Galera conflict resolution in non-inlined functions in order to keep the common code paths shorter.
lock_sys_t::prdt_page_free_from_discard(): Refactored from lock_prdt_page_free_from_discard() and lock_rec_free_all_from_discard_page().
trx_t::commit_tables(): Replaces trx_update_mod_tables_timestamp().
lock_release(): Let trx_t::commit_tables() invalidate the query cache for those tables that were actually modified by the transaction. Merge lock_check_dict_lock() to lock_release().
We must never release lock_sys.latch while holding any lock_sys_t::hash_latch. Violating this rule could lead to memory corruption if the buffer pool is resized between the time lock_sys.latch is released and the hash_latch is released.
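A minimal sketch of the "shared global latch plus per-shard mutex" scheme follows. All names (`ShardedLockSys`, `ShardGuard`, `ExclusiveGuard`, the shard count of 16) are hypothetical, and the standard library types stand in for the InnoDB latches. The member order in `ShardGuard` is deliberate: members are destroyed in reverse order of declaration, so the shard mutex is always released before the global latch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <shared_mutex>

// Hypothetical, simplified model of a lock system with one global latch
// and an array of per-shard mutexes.
struct ShardedLockSys {
  static constexpr std::size_t kShards = 16;  // arbitrary for the example
  std::shared_mutex latch;                    // the global latch
  std::array<std::mutex, kShards> shards;     // one mutex per hash shard

  static std::size_t shard_of(std::uint64_t page_id) {
    return static_cast<std::size_t>(page_id % kShards);
  }

  // RAII guard for the common path: global latch in shared mode plus the
  // mutex of the one shard that covers `page_id`.
  class ShardGuard {
   public:
    ShardGuard(ShardedLockSys& sys, std::uint64_t page_id)
        : global_(sys.latch), shard_(sys.shards[shard_of(page_id)]) {}
    bool owns_both() const { return global_.owns_lock() && shard_.owns_lock(); }

   private:
    std::shared_lock<std::shared_mutex> global_;  // acquired first, released last
    std::unique_lock<std::mutex> shard_;          // acquired last, released first
  };

  // RAII guard for rare operations that need to exclude every other thread.
  class ExclusiveGuard {
   public:
    explicit ExclusiveGuard(ShardedLockSys& sys) : global_(sys.latch) {}
    bool owns() const { return global_.owns_lock(); }

   private:
    std::unique_lock<std::shared_mutex> global_;
  };
};
```

Threads that work on different shards then contend only on the shared-mode acquisition of the global latch, while anything that restructures the whole table takes the `ExclusiveGuard`.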
-
- 11 Feb, 2021 1 commit
-
-
Otto Kekäläinen authored
The Readline library is no longer available in Debian Sid. See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=980504 Add the dependency on-the-fly in autobake-deb.sh for older distro versions and keep the native build in a state that works on Debian Sid as-is.
-