An error occurred fetching the project authors.
- 03 Sep, 2020 1 commit
-
-
Marko Mäkelä authored
-
- 19 Aug, 2020 1 commit
-
-
Marko Mäkelä authored
In commit fe39d02f (MDEV-20638) we removed some wake-up signaling of the master thread that should have been there, to ensure a steady log checkpointing workload. Common sense suggests that the commit omitted some necessary calls to srv_inc_activity_count(). But, an attempt to add the call to trx_flush_log_if_needed_low() as well as to reinstate the function innobase_active_small() did not restore the performance for the case where sync_binlog=1 is set. Therefore, we will revert the entire commit in MariaDB Server 10.2. In MariaDB Server 10.5, adding a srv_inc_activity_count() call to trx_flush_log_if_needed_low() did restore the performance, so we will not revert MDEV-20638 across all versions.
-
- 04 Aug, 2020 1 commit
-
-
Marko Mäkelä authored
The parameters innodb_thread_concurrency and innodb_commit_concurrency were useful years ago when both computing resources and the implementation of some shared data structures were limited. MySQL 5.0 or 5.1 had trouble scaling beyond 8 concurrent connections. Most of the scalability bottlenecks have been removed since then, and the transactions per second delivered by MariaDB Server 10.5 should not dramatically drop upon exceeding the 'optimal' number of connections. Hence, enabling any concurrency throttling for InnoDB actually makes things worse. We have seen many customers mistakenly setting this to a small value like 16 or 64 and then complaining the server was slow. Ignoring the parameters allows us to remove some normally unused code and data structures, which could slightly improve performance. innodb_thread_concurrency, innodb_commit_concurrency, innodb_replication_delay, innodb_concurrency_tickets, innodb_thread_sleep_delay, innodb_adaptive_max_sleep_delay: Deprecate and ignore; hard-wire to 0. The column INFORMATION_SCHEMA.INNODB_TRX.trx_concurrency_tickets will always report 0.
-
- 23 Jul, 2020 1 commit
-
-
Thirunarayanan Balathandayuthapani authored
- Due to commit fe95cb2e (MDEV-16125), InnoDB master thread does not need to call srv_resume_thread() and therefore there is no need to wake up the thread. Due to the above patch, InnoDB should remove the following dead code. srv_check_activity(): Makes the parameter as in,out and returns the recent activity value innobase_active_small(): Removed srv_active_wake_master_thread(): Removed srv_wake_master_thread(): Removed srv_active_wake_master_thread_low(): Removed Simplify srv_master_thread() and remove switch cases, added the assert. Replace srv_wake_master_thread() with srv_inc_activity_count() INNOBASE_WAKE_INTERVAL: Removed
-
- 20 Jul, 2020 1 commit
-
-
Marko Mäkelä authored
When InnoDB is extending a data file, it is updating the FSP_SIZE field in the first page of the data file. In commit 8451e090 (MDEV-11556) we removed a work-around for this bug and made recovery stricter, by making it track changes to FSP_SIZE via redo log records, and extend the data files before any changes are being applied to them. It turns out that the function fsp_fill_free_list() is not crash-safe with respect to this when it is initializing the change buffer bitmap page (page 1, or generally, N*innodb_page_size+1). It uses a separate mini-transaction that is committed (and will be written to the redo log file) before the mini-transaction that actually extended the data file. Hence, recovery can observe a reference to a page that is beyond the current end of the data file. fsp_fill_free_list(): Initialize the change buffer bitmap page in the same mini-transaction. The rest of the changes are fixing a bug that the use of the separate mini-transaction was attempting to work around. Namely, we must ensure that no other thread will access the change buffer bitmap page before our mini-transaction has been committed and all page latches have been released. That is, for read-ahead as well as neighbour flushing, we must avoid accessing pages that might not yet be durably part of the tablespace. fil_space_t::committed_size: The size of the tablespace as persisted by mtr_commit(). fil_space_t::max_page_number_for_io(): Limit the highest page number for I/O batches to committed_size. MTR_MEMO_SPACE_X_LOCK: Replaces MTR_MEMO_X_LOCK for fil_space_t::latch. mtr_x_space_lock(): Replaces mtr_x_lock() for fil_space_t::latch. mtr_memo_slot_release_func(): When releasing MTR_MEMO_SPACE_X_LOCK, copy space->size to space->committed_size. In this way, read-ahead or flushing will never be invoked on pages that do not yet exist according to FSP_SIZE.
-
- 18 Jun, 2020 1 commit
-
-
Marko Mäkelä authored
btr_search_sys::parts[]: A single structure for the partitions of the adaptive hash index. Replaces the 3 separate arrays: btr_search_latches[], btr_search_sys->hash_tables, btr_search_sys->hash_tables[i]->heap. hash_table_t::heap, hash_table_t::adaptive: Remove. ha0ha.cc: Remove. Move all code to btr0sea.cc.
-
- 08 Jun, 2020 2 commits
-
-
Marko Mäkelä authored
buf_LRU_make_block_young(): Merge with buf_page_make_young(). buf_pool_check_no_pending_io(): Remove. Replaced with buf_pool.any_io_pending() and buf_pool.io_pending(), which do not unnecessarily acquire buf_pool.mutex. buf_pool_t::init_flush[]: Use atomic access, so that buf_flush_wait_LRU_batch_end() can avoid acquiring buf_pool.mutex. buf_pool_t::try_LRU_scan: Declare as bool.
-
Marko Mäkelä authored
When MDEV-22769 introduced srv_shutdown_state=SRV_SHUTDOWN_INITIATED in commit efc70da5 we forgot to adjust a few checks for SRV_SHUTDOWN_NONE. In the initial shutdown step, we are waiting for the background DROP TABLE queue to be processed or discarded. At that time, some background tasks (such as buffer pool resizing or dumping or encryption key rotation) may be terminated, but others must remain running normally. srv_purge_coordinator_suspend(), srv_purge_coordinator_thread(), srv_start_wait_for_purge_to_start(): Treat SRV_SHUTDOWN_NONE and SRV_SHUTDOWN_INITIATED equally.
-
- 05 Jun, 2020 3 commits
-
-
Marko Mäkelä authored
-
Marko Mäkelä authored
The background drop table queue in InnoDB is a work-around for cases where the SQL layer is requesting DDL on tables on which transactional locks exist. One such case are XA transactions. Our test case exploits the fact that the recovery of XA PREPARE transactions will only resurrect InnoDB table locks, but not MDL that should block any concurrent DDL. srv_shutdown_t: Introduce the srv_shutdown_state=SRV_SHUTDOWN_INITIATED for the initial part of shutdown, to wait for the background drop table queue to be emptied. srv_shutdown_bg_undo_sources(): Assign srv_shutdown_state=SRV_SHUTDOWN_INITIATED before waiting for the background drop table queue to be emptied. row_drop_tables_for_mysql_in_background(): On slow shutdown, if no active transactions exist (excluding ones that are in XA PREPARE state), skip any tables on which locks exist. row_drop_table_for_mysql(): Do not unnecessarily attempt to drop InnoDB persistent statistics for tables that have already been added to the background drop table queue. row_mysql_close(): Relax an assertion, and free all memory even if innodb_force_recovery=2 would prevent the background drop table queue from being emptied.
-
Marko Mäkelä authored
User-visible changes: The INFORMATION_SCHEMA views INNODB_BUFFER_PAGE and INNODB_BUFFER_PAGE_LRU will report a dummy value FLUSH_TYPE=0 and will no longer report the PAGE_STATE value READY_FOR_USE. We will remove some fields from buf_page_t and move much code to member functions of buf_pool_t and buf_page_t, so that the access rules of data members can be enforced consistently. Evicting or adding pages in buf_pool.LRU will remain covered by buf_pool.mutex. Evicting or adding pages in buf_pool.page_hash will remain covered by both buf_pool.mutex and the buf_pool.page_hash X-latch. After this fix, buf_pool.page_hash lookups can entirely avoid acquiring buf_pool.mutex, only relying on buf_pool.hash_lock_get() S-latch. Similarly, buf_flush_check_neighbors() can will rely solely on buf_pool.mutex, no buf_pool.page_hash latch at all. The buf_pool.mutex is rather contended in I/O heavy benchmarks, especially when the workload does not fit in the buffer pool. The first attempt to alleviate the contention was the buf_pool_t::mutex split in commit 4ed7082e which introduced buf_block_t::mutex, which we are now removing. Later, multiple instances of buf_pool_t were introduced in commit c18084f7 and recently removed by us in commit 1a6f708e (MDEV-15058). UNIV_BUF_DEBUG: Remove. This option to enable some buffer pool related debugging in otherwise non-debug builds has not been used for years. Instead, we have been using UNIV_DEBUG, which is enabled in CMAKE_BUILD_TYPE=Debug. buf_block_t::mutex, buf_pool_t::zip_mutex: Remove. We can mainly rely on std::atomic and the buf_pool.page_hash latches, and in some cases depend on buf_pool.mutex or buf_pool.flush_list_mutex just like before. We must always release buf_block_t::lock before invoking unfix() or io_unfix(), to prevent a glitch where a block that was added to the buf_pool.free list would apper X-latched. See commit c5883deb how this glitch was finally caught in a debug environment. We move some buf_pool_t::page_hash specific code from the ha and hash modules to buf_pool, for improved readability. buf_pool_t::close(): Assert that all blocks are clean, except on aborted startup or crash-like shutdown. buf_pool_t::validate(): No longer attempt to validate n_flush[] against the number of BUF_IO_WRITE fixed blocks, because buf_page_t::flush_type no longer exists. buf_pool_t::watch_set(): Replaces buf_pool_watch_set(). Reduce mutex contention by separating the buf_pool.watch[] allocation and the insert into buf_pool.page_hash. buf_pool_t::page_hash_lock<bool exclusive>(): Acquire a buf_pool.page_hash latch. Replaces and extends buf_page_hash_lock_s_confirm() and buf_page_hash_lock_x_confirm(). buf_pool_t::READ_AHEAD_PAGES: Renamed from BUF_READ_AHEAD_PAGES. buf_pool_t::curr_size, old_size, read_ahead_area, n_pend_reads: Use Atomic_counter. buf_pool_t::running_out(): Replaces buf_LRU_buf_pool_running_out(). buf_pool_t::LRU_remove(): Remove a block from the LRU list and return its predecessor. Incorporates buf_LRU_adjust_hp(), which was removed. buf_page_get_gen(): Remove a redundant call of fsp_is_system_temporary(), for mode == BUF_GET_IF_IN_POOL_OR_WATCH, which is only used by BTR_DELETE_OP (purge), which is never invoked on temporary tables. buf_free_from_unzip_LRU_list_batch(): Avoid redundant assignments. buf_LRU_free_from_unzip_LRU_list(): Simplify the loop condition. buf_LRU_free_page(): Clarify the function comment. buf_flush_check_neighbor(), buf_flush_check_neighbors(): Rewrite the construction of the page hash range. We will hold the buf_pool.mutex for up to buf_pool.read_ahead_area (at most 64) consecutive lookups of buf_pool.page_hash. buf_flush_page_and_try_neighbors(): Remove. Merge to its only callers, and remove redundant operations in buf_flush_LRU_list_batch(). buf_read_ahead_random(), buf_read_ahead_linear(): Rewrite. Do not acquire buf_pool.mutex, and iterate directly with page_id_t. ut_2_power_up(): Remove. my_round_up_to_next_power() is inlined and avoids any loops. fil_page_get_prev(), fil_page_get_next(), fil_addr_is_null(): Remove. buf_flush_page(): Add a fil_space_t* parameter. Minimize the buf_pool.mutex hold time. buf_pool.n_flush[] is no longer updated atomically with the io_fix, and we will protect most buf_block_t fields with buf_block_t::lock. The function buf_flush_write_block_low() is removed and merged here. buf_page_init_for_read(): Use static linkage. Initialize the newly allocated block and acquire the exclusive buf_block_t::lock while not holding any mutex. IORequest::IORequest(): Remove the body. We only need to invoke set_punch_hole() in buf_flush_page() and nowhere else. buf_page_t::flush_type: Remove. Replaced by IORequest::flush_type. This field is only used during a fil_io() call. That function already takes IORequest as a parameter, so we had better introduce for the rarely changing field. buf_block_t::init(): Replaces buf_page_init(). buf_page_t::init(): Replaces buf_page_init_low(). buf_block_t::initialise(): Initialise many fields, but keep the buf_page_t::state(). Both buf_pool_t::validate() and buf_page_optimistic_get() requires that buf_page_t::in_file() be protected atomically with buf_page_t::in_page_hash and buf_page_t::in_LRU_list. buf_page_optimistic_get(): Now that buf_block_t::mutex no longer exists, we must check buf_page_t::io_fix() after acquiring the buf_pool.page_hash lock, to detect whether buf_page_init_for_read() has been initiated. We will also check the io_fix() before acquiring hash_lock in order to avoid unnecessary computation. The field buf_block_t::modify_clock (protected by buf_block_t::lock) allows buf_page_optimistic_get() to validate the block. buf_page_t::real_size: Remove. It was only used while flushing pages of page_compressed tables. buf_page_encrypt(): Add an output parameter that allows us ot eliminate buf_page_t::real_size. Replace a condition with debug assertion. buf_page_should_punch_hole(): Remove. buf_dblwr_t::add_to_batch(): Replaces buf_dblwr_add_to_batch(). Add the parameter size (to replace buf_page_t::real_size). buf_dblwr_t::write_single_page(): Replaces buf_dblwr_write_single_page(). Add the parameter size (to replace buf_page_t::real_size). fil_system_t::detach(): Replaces fil_space_detach(). Ensure that fil_validate() will not be violated even if fil_system.mutex is released and reacquired. fil_node_t::complete_io(): Renamed from fil_node_complete_io(). fil_node_t::close_to_free(): Replaces fil_node_close_to_free(). Avoid invoking fil_node_t::close() because fil_system.n_open has already been decremented in fil_space_t::detach(). BUF_BLOCK_READY_FOR_USE: Remove. Directly use BUF_BLOCK_MEMORY. BUF_BLOCK_ZIP_DIRTY: Remove. Directly use BUF_BLOCK_ZIP_PAGE, and distinguish dirty pages by buf_page_t::oldest_modification(). BUF_BLOCK_POOL_WATCH: Remove. Use BUF_BLOCK_NOT_USED instead. This state was only being used for buf_page_t that are in buf_pool.watch. buf_pool_t::watch[]: Remove pointer indirection. buf_page_t::in_flush_list: Remove. It was set if and only if buf_page_t::oldest_modification() is nonzero. buf_page_decrypt_after_read(), buf_corrupt_page_release(), buf_page_check_corrupt(): Change the const fil_space_t* parameter to const fil_node_t& so that we can report the correct file name. buf_page_monitor(): Declare as an ATTRIBUTE_COLD global function. buf_page_io_complete(): Split to buf_page_read_complete() and buf_page_write_complete(). buf_dblwr_t::in_use: Remove. buf_dblwr_t::buf_block_array: Add IORequest::flush_t. buf_dblwr_sync_datafiles(): Remove. It was a useless wrapper of os_aio_wait_until_no_pending_writes(). buf_flush_write_complete(): Declare static, not global. Add the parameter IORequest::flush_t. buf_flush_freed_page(): Simplify the code. recv_sys_t::flush_lru: Renamed from flush_type and changed to bool. fil_read(), fil_write(): Replaced with direct use of fil_io(). fil_buffering_disabled(): Remove. Check srv_file_flush_method directly. fil_mutex_enter_and_prepare_for_io(): Return the resolved fil_space_t* to avoid a duplicated lookup in the caller. fil_report_invalid_page_access(): Clean up the parameters. fil_io(): Return fil_io_t, which comprises fil_node_t and error code. Always invoke fil_space_t::acquire_for_io() and let either the sync=true caller or fil_aio_callback() invoke fil_space_t::release_for_io(). fil_aio_callback(): Rewrite to replace buf_page_io_complete(). fil_check_pending_operations(): Remove a parameter, and remove some redundant lookups. fil_node_close_to_free(): Wait for n_pending==0. Because we no longer do an extra lookup of the tablespace between fil_io() and the completion of the operation, we must give fil_node_t::complete_io() a chance to decrement the counter. fil_close_tablespace(): Remove unused parameter trx, and document that this is only invoked during the error handling of IMPORT TABLESPACE. row_import_discard_changes(): Merged with the only caller, row_import_cleanup(). Do not lock up the data dictionary while invoking fil_close_tablespace(). logs_empty_and_mark_files_at_shutdown(): Do not invoke fil_close_all_files(), to avoid a !needs_flush assertion failure on fil_node_t::close(). innodb_shutdown(): Invoke os_aio_free() before fil_close_all_files(). fil_close_all_files(): Invoke fil_flush_file_spaces() to ensure proper durability. thread_pool::unbind(): Fix a crash that would occur on Windows after srv_thread_pool->disable_aio() and os_file_close(). This fix was submitted by Vladislav Vaintroub. Thanks to Matthias Leich and Axel Schwenke for extensive testing, Vladislav Vaintroub for helpful comments, and Eugene Kosov for a review.
-
- 04 Jun, 2020 1 commit
-
-
Marko Mäkelä authored
Introduce a new ATTRIBUTE_NOINLINE to ib::logger member functions, and add UNIV_UNLIKELY hints to callers. Also, remove some crash reporting output. If needed, the information will be available using debugging tools. Furthermore, remove some fts_enable_diag_print output that included indexed words in raw form. The code seemed to assume that words are NUL-terminated byte strings. It is not clear whether a NUL terminator is always guaranteed to be present. Also, UCS2 or UTF-16 strings would typically contain many NUL bytes.
-
- 03 Jun, 2020 2 commits
-
-
Vladislav Vaintroub authored
max out parallel purge worker tasks, on slow shutdown, to speedup
-
Thirunarayanan Balathandayuthapani authored
Problem: ======== During buffer pool resizing, InnoDB recreates the dictionary hash tables. Dictionary hash table reuses the heap of AHI hash tables. It leads to memory corruption. Fix: ==== - While disabling AHI, free the heap and AHI hash tables. Recreate the AHI hash tables and assign new heap when AHI is enabled. - btr_blob_free() access invalid page if page was reallocated during buffer poolresizing. So btr_blob_free() should get the page from buf_pool instead of using existing block. - btr_search_enabled and block->index should be checked after acquiring the btr_search_sys latch - Moved the buffer_pool_scan debug sync to earlier before accessing the btr_search_sys latches to avoid the hang of truncate_purge_debug test case - srv_printf_innodb_monitor() should acquire btr_search_sys latches before AHI hash tables.
-
- 22 May, 2020 1 commit
-
-
Marko Mäkelä authored
The GCC __atomic_ functions were removed already in commit 2b47f8ff, and starting with MariaDB Server 10.4, InnoDB relies mostly on C++11 std::atomic.
-
- 12 May, 2020 1 commit
-
-
Vlad Lesin authored
Flush LSN to system tablespace on innodb shutdown if XA is rolled back by mariabackup.
-
- 30 Apr, 2020 1 commit
-
-
Eugene Kosov authored
Maybe this patch will help catch problems like buffer overflow. log_t::first_in_use: removed log_t::buf: this is where mtr_t are supposed to append data log_t::flush_buf: this is from server writes to a file Those two buffers are std::swap()ped when some thread is gonna write to a file
-
- 28 Apr, 2020 1 commit
-
-
Eugene Kosov authored
Replace all fsync() with fdatasync() when possible (e.g. On Linux) InnoDB doesn't care about file timestamps. So, to achieve a better performance it makes sense to use fdatasync() everywhere. file_io::flush(): renamed from flush_data_only() os_file_flush_data(): removed os_file_sync_posix(): renamed from os_file_fsync_posix(). Now it uses fdatasync() when it's available.
-
- 07 Apr, 2020 1 commit
-
-
Vlad Lesin authored
was restored. Optionally rollback prepared XA's on "mariabackup --prepare". The fix MUST NOT be ported on 10.5+, as MDEV-742 fix solves the issue for slaves.
-
- 31 Mar, 2020 1 commit
-
-
Marko Mäkelä authored
There is no background change buffer merge any more. Change buffer merge will only take place during a slow shutdown (a shutdown initiated after SET GLOBAL innodb_fast_shutdown=0).
-
- 30 Mar, 2020 1 commit
-
-
Marko Mäkelä authored
recv_sys.recovery_on: Replaces recv_recovery_on. recv_sys_t::apply(): Replaces recv_apply_hashed_log_recs(). recv_sys_var_init(): Remove. recv_sys_t::recover_low(): Attempt to initialize a page based on buffered redo log records.
-
- 18 Mar, 2020 1 commit
-
-
Marko Mäkelä authored
Thanks to MDEV-15058, there is only one InnoDB buffer pool. Allocating buf_pool statically removes one level of pointer indirection and makes code more readable, and removes the awkward initialization of some buf_pool members. While doing this, we will also declare some buf_pool_t data members private and replace some functions with member functions. This is mostly affecting buffer pool resizing. This is not aiming to be a complete rewrite of buf_pool_t to a proper class. Most of the buffer pool interface, such as buf_page_get_gen(), will remain in the C programming style for now. buf_pool_t::withdrawing: Replaces buf_pool_withdrawing. buf_pool_t::withdraw_clock_: Replaces buf_withdraw_clock. buf_pool_t::create(): Repalces buf_pool_init(). buf_pool_t::close(): Replaces buf_pool_free(). buf_bool_t::will_be_withdrawn(): Replaces buf_block_will_be_withdrawn(), buf_frame_will_be_withdrawn(). buf_pool_t::clear_hash_index(): Replaces buf_pool_clear_hash_index(). buf_pool_t::get_n_pages(): Replaces buf_pool_get_n_pages(). buf_pool_t::validate(): Replaces buf_validate(). buf_pool_t::print(): Replaces buf_print(). buf_pool_t::block_from_ahi(): Replaces buf_block_from_ahi(). buf_pool_t::is_block_field(): Replaces buf_pointer_is_block_field(). buf_pool_t::is_block_mutex(): Replaces buf_pool_is_block_mutex(). buf_pool_t::is_block_lock(): Replaces buf_pool_is_block_lock(). buf_pool_t::is_obsolete(): Replaces buf_pool_is_obsolete(). buf_pool_t::io_buf: Make default-constructible. buf_pool_t::io_buf::create(): Delayed 'constructor' buf_pool_t::io_buf::close(): Early 'destructor' HazardPointer: Make default-constructible. Define all member functions inline, also for derived classes.
-
- 16 Mar, 2020 1 commit
-
-
Eugene Kosov authored
Write log header just ones when file is created, instead of writing to it on every log file wrap around. log_t::file::write_header_durable(): this one writes to log header log_write_buf(): this one stops writing to log header
-
- 12 Mar, 2020 1 commit
-
-
Marko Mäkelä authored
The -Wconversion in GCC seems to be stricter than in clang. GCC at least since version 4.4.7 issues truncation warnings for assignments to bitfields, while clang 10 appears to only issue warnings when the sizes in bytes rounded to the nearest integer powers of 2 are different. Before GCC 10.0.0, -Wconversion required more casts and would not allow some operations, such as x<<=1 or x+=1 on a data type that is narrower than int. GCC 5 (but not GCC 4, GCC 6, or any later version) is complaining about x|=y even when x and y are compatible types that are narrower than int. Hence, we must rewrite some x|=y as x=static_cast<byte>(x|y) or similar, or we must disable -Wconversion. In GCC 6 and later, the warning for assigning wider to bitfields that are narrower than 8, 16, or 32 bits can be suppressed by applying a bitwise & with the exact bitmask of the bitfield. For older GCC, we must disable -Wconversion for GCC 4 or 5 in such cases. The bitwise negation operator appears to promote short integers to a wider type, and hence we must add explicit truncation casts around them. Microsoft Visual C does not allow a static_cast to truncate a constant, such as static_cast<byte>(1) truncating int. Hence, we will use the constructor-style cast byte(~1) for such cases. This has been tested at least with GCC 4.8.5, 5.4.0, 7.4.0, 9.2.1, 10.0.0, clang 9.0.1, 10.0.0, and MSVC 14.22.27905 (Microsoft Visual Studio 2019) on 64-bit and 32-bit targets (IA-32, AMD64, POWER 8, POWER 9, ARMv8).
-
- 11 Mar, 2020 1 commit
-
-
Marko Mäkelä authored
Declare innodb_purge_threads as 4-byte integer (UINT) instead of 4-or-8-byte (ULONG) and adjust the documentation string.
-
- 10 Mar, 2020 1 commit
-
-
Thirunarayanan Balathandayuthapani authored
The following parameters are deprecated: innodb-background-scrub-data-uncompressed innodb-background-scrub-data-compressed innodb-background-scrub-data-interval innodb-background-scrub-data-check-interval Removed scrubbing code completely(btr0scrub.h, btr0scrub.cc) Removed information_schema.innodb_tablespaces_scrubbing tables Removed the scrubbing logic from fil_crypt_thread()
-
- 07 Mar, 2020 1 commit
-
-
Marko Mäkelä authored
create_log_file(): Delete all old redo log files where they used to be deleted, after the crash injection point innodb_log_abort_6, before commit 9ef2d29f deprecated and ignored the setting innodb_log_files_in_group.
-
- 05 Mar, 2020 1 commit
-
-
Marko Mäkelä authored
Some fields were protected by log_sys.mutex, which adds quite some overhead for readers. Some readers were submitting dirty reads. log_t::lsn: Declare private and atomic. Add wrappers get_lsn() and set_lsn() that will use relaxed memory access. Many accesses to log_sys.lsn are still protected by log_sys.mutex; we avoid the mutex for some readers. log_t::flushed_to_disk_lsn: Declare private and atomic, and move to the same cache line with log_t::lsn. log_t::buf_free: Declare as size_t, and move to the same cache line with log_t::lsn. log_t::check_flush_or_checkpoint_: Declare private and atomic, and move to the same cache line with log_t::lsn. log_get_lsn(): Define as an alias of log_sys.get_lsn(). log_get_lsn_nowait(), log_peek_lsn(): Remove. log_get_flush_lsn(): Define as an alias of log_sys.get_flush_lsn(). log_t::initiate_write(): Replaces log_buffer_sync_in_background().
-
- 04 Mar, 2020 2 commits
-
-
Marko Mäkelä authored
The configuration parameter innodb_scrub_log never really worked, as reported in MDEV-13019 and MDEV-18370. Because MDEV-14425 is changing the redo log format, the innodb_scrub_log feature would have to be adjusted for it. Due to the known problems, it is easier to remove the feature for now, and to ignore and deprecate the parameters. If old log contents should be kept secret, then enabling innodb_encrypt_log or setting a smaller innodb_log_file_size could help.
-
Marko Mäkelä authored
Compute MONITOR_LSN_CHECKPOINT_AGE on demand in srv_mon_process_existing_counter(). This allows us to remove the overhead of MONITOR_SET calls for the counter.
-
- 02 Mar, 2020 1 commit
-
-
Marko Mäkelä authored
Most of the time, we can refer to recv_sys.recovered_lsn.
-
- 01 Mar, 2020 1 commit
-
-
Vladislav Vaintroub authored
Introduce special synchronization primitive group_commit_lock for more efficient synchronization of redo log writing and flushing. The goal is to reduce CPU consumption on log_write_up_to, to reduce the spurious wakeups, and improve the throughput in write-intensive benchmarks.
-
- 20 Feb, 2020 1 commit
-
-
Marko Mäkelä authored
There is no reason for the dummy index object dict_ind_redundant to exist any more. It was only being passed to btr_create(). btr_create(): If !index, assume that a ROW_FORMAT=REDUNDANT table is being created. We could pass ibuf.index, dict_sys.sys_tables->indexes.start and so on, if those objects had been initialized before the function btr_create() is called.
-
- 19 Feb, 2020 1 commit
-
-
Eugene Kosov authored
Now there can be only one log file instead of several which logically work as a single file. Possible names of redo log files: ib_logfile0, ib_logfile101 (for just created one) innodb_log_fiels_in_group: value of this variable is not used by InnoDB. Possible values are still 1..100, to not break upgrade LOG_FILE_NAME: add constant of value "ib_logfile0" LOG_FILE_NAME_PREFIX: add constant of value "ib_logfile" get_log_file_path(): convenience function that returns full path of a redo log file SRV_N_LOG_FILES_MAX: removed srv_n_log_files: we can't remove this for compatibility reasons, but now server doesn't use this variable log_sys_t::file::fd: now just one, not std::vector log_sys_t::log_capacity: removed word 'group' find_and_check_log_file(): part of logic from huge srv_start() moved here recv_sys_t::files: file descriptors of redo log files. There can be several of those in case we're upgrading from older MariaDB version. recv_sys_t::remove_extra_log_files: whether to remove ib_logfile{1,2,3...} after successfull upgrade. recv_sys_t::read(): open if needed and read from one of several log files recv_sys_t::files_size(): open if needed and return files count redo_file_sizes_are_correct(): check that redo log files sizes are equal. Just to log an error for a user. Corresponding check was moved from srv0start.cc namespace deprecated: put all deprecated variables here to prevent usage of it by us, developers
-
- 13 Feb, 2020 2 commits
-
-
Marko Mäkelä authored
log_t::FORMAT_10_5: physical redo log format tag log_phys_t: Buffered records in the physical format. The log record bytes will follow the last data field, making use of alignment padding that would otherwise be wasted. If there are multiple records for the same page, also those may be appended to an existing log_phys_t object if the memory is available. In the physical format, the first byte of a record identifies the record and its length (up to 15 bytes). For longer records, the immediately following bytes will encode the remaining length in a variable-length encoding. Usually, a variable-length-encoded page identifier will follow, followed by optional payload, whose length is included in the initially encoded total record length. When a mini-transaction is updating multiple fields in a page, it can avoid repeating the tablespace identifier and page number by setting the same_page flag (most significant bit) in the first byte of the log record. The byte offset of the record will be relative to where the previous record for that page ended. Until MDEV-14425 introduces a separate file-level log for redo log checkpoints and file operations, we will write the file-level records in the page-level redo log file. The record FILE_CHECKPOINT (which replaces MLOG_CHECKPOINT) will be removed in MDEV-14425, and one sequential scan of the page recovery log will suffice. Compared to MLOG_FILE_CREATE2, FILE_CREATE will not include any flags. If the information is needed, it can be parsed from WRITE records that modify FSP_SPACE_FLAGS. MLOG_ZIP_WRITE_STRING: Remove. The record was only introduced temporarily as part of this work, before being replaced with WRITE (along with MLOG_WRITE_STRING, MLOG_1BYTE, MLOG_nBYTES). mtr_buf_t::empty(): Check if the buffer is empty. mtr_t::m_n_log_recs: Remove. It suffices to check if m_log is empty. mtr_t::m_last, mtr_t::m_last_offset: End of the latest m_log record, for the same_page encoding. page_recv_t::last_offset: Reflects mtr_t::m_last_offset. Valid values for last_offset during recovery should be 0 or above 8. (The first 8 bytes of a page are the checksum and the page number, and neither are ever updated directly by log records.) Internally, the special value 1 indicates that the same_page form will not be allowed for the subsequent record. mtr_t::page_create(): Take the block descriptor as parameter, so that it can be compared to mtr_t::m_last. The INIT_INDEX_PAGE record will always followed by a subtype byte, because same_page records must be longer than 1 byte. trx_undo_page_init(): Combine the writes in WRITE record. trx_undo_header_create(): Write 4 bytes using a special MEMSET record that includes 1 bytes of length and 2 bytes of payload. flst_write_addr(): Define as a static function. Combine the writes. flst_zero_both(): Replaces two flst_zero_addr() calls. flst_init(): Do not inline the function. fsp_free_seg_inode(): Zerofill the whole inode. fsp_apply_init_file_page(): Initialize FIL_PAGE_PREV,FIL_PAGE_NEXT to FIL_NULL when using the physical format. btr_create(): Assert !page_has_siblings() because fsp_apply_init_file_page() must have been invoked. fil_ibd_create(): Do not write FILE_MODIFY after FILE_CREATE. fil_names_dirty_and_write(): Remove the parameter mtr. Write the records using a separate mini-transaction object, because any FILE_ records must be at the start of a mini-transaction log. recv_recover_page(): Add a fil_space_t* parameter. After applying log to the a ROW_FORMAT=COMPRESSED page, invoke buf_zip_decompress() to restore the uncompressed page. buf_page_io_complete(): Remove the temporary hack to discard the uncompressed page of a ROW_FORMAT=COMPRESSED page. page_zip_write_header(): Remove. Use mtr_t::write() or mtr_t::memset() instead, and update the compressed page frame separately. trx_undo_header_add_space_for_xid(): Remove. trx_undo_seg_create(): Perform the changes that were previously made by trx_undo_header_add_space_for_xid(). btr_reset_instant(): New function: Reset the table to MariaDB 10.2 or 10.3 format when rolling back an instant ALTER TABLE operation. page_rec_find_owner_rec(): Merge with the only callers. page_cur_insert_rec_low(): Combine writes by using a local buffer. MEMMOVE data from the preceding record whenever feasible (copying at least 3 bytes). page_cur_insert_rec_zip(): Combine writes to page header fields. PageBulk::insertPage(): Issue MEMMOVE records to copy a matching part from the preceding record. PageBulk::finishPage(): Combine the writes to the page header and to the sparse page directory slots. mtr_t::write(): Only log the least significant (last) bytes of multi-byte fields that actually differ. For updating FSP_SIZE, we must always write all 4 bytes to the redo log, so that the fil_space_set_recv_size() logic in recv_sys_t::parse() will work. mtr_t::memcpy(), mtr_t::zmemcpy(): Take a pointer argument instead of a numeric offset to the page frame. Only log the last bytes of multi-byte fields that actually differ. In fil_space_crypt_t::write_page0(), we must log also any unchanged bytes, so that recovery will recognize the record and invoke fil_crypt_parse(). Future work: MDEV-21724 Optimize page_cur_insert_rec_low() redo logging MDEV-21725 Optimize btr_page_reorganize_low() redo logging MDEV-21727 Optimize redo logging for ROW_FORMAT=COMPRESSED
-
Marko Mäkelä authored
mtr_t::log_write_low(): Replaces mlog_write_initial_log_record_low(). mtr_t::log_file_op(): Replaces fil_op_write_log(). mtr_t::free(): Write MLOG_INIT_FREE_PAGE. mtr_t::init(): Write MLOG_INIT_FILE_PAGE2. mtr_t::page_create(): Write record about the partial initialization of an index page. mlog_catenate_ulint(), mlog_catenate_string(), mlog_open(), mlog_close(): Remove.
-
- 12 Feb, 2020 1 commit
-
-
Marko Mäkelä authored
Our benchmarking efforts indicate that the reasons for splitting the buf_pool in commit c18084f7 have mostly gone away, possibly as a result of mysql/mysql-server@ce6109ebfdedfdf185e391a0c97dc6d33867ed78 or similar work. Only in one write-heavy benchmark where the working set size is ten times the buffer pool size, the buf_pool->mutex would be less contended with 4 buffer pool instances than with 1 instance, in buf_page_io_complete(). That contention could be alleviated further by making more use of std::atomic and by splitting buf_pool_t::mutex further (MDEV-15053). We will deprecate and ignore the following parameters: innodb_buffer_pool_instances innodb_page_cleaners There will be only one buffer pool and one page cleaner task. In a number of INFORMATION_SCHEMA views, columns that indicated the buffer pool instance will be removed: information_schema.innodb_buffer_page.pool_id information_schema.innodb_buffer_page_lru.pool_id information_schema.innodb_buffer_pool_stats.pool_id information_schema.innodb_cmpmem.buffer_pool_instance information_schema.innodb_cmpmem_reset.buffer_pool_instance
-
- 11 Feb, 2020 1 commit
-
-
Marko Mäkelä authored
During native table rebuild or index creation, InnoDB used to skip redo logging and write MLOG_INDEX_LOAD records to inform crash recovery and Mariabackup of the gaps in redo log. This is fragile and prohibits some optimizations, such as skipping the doublewrite buffer for newly (re)initialized pages (MDEV-19738). row_merge_write_redo(): Remove. We do not write MLOG_INDEX_LOAD records any more. Instead, we write full redo log. FlushObserver: Remove. fseg_free_page_func(): Remove the parameter log. Redo logging cannot be disabled. fil_space_t::redo_skipped_count: Remove. We cannot remove buf_block_t::skip_flush_check, because PageBulk will temporarily generate invalid B-tree pages in the buffer pool.
-
- 01 Feb, 2020 1 commit
-
-
Eugene Kosov authored
main change: rename first redo log without file close second change: use os_offset_t to represent offset in a file third change: fix log texts
-
- 24 Jan, 2020 1 commit
-
-
Eugene Kosov authored
class log_file_t: more or less sane RAII wrapper around redo log file descriptor and its path. This change is motivated by the need of using that log_file_t somewhere else.
-