- 11 Sep, 2018 4 commits
-
-
Marko Mäkelä authored
Implement a 10.4 redo log format, which extends the 10.3 format by introducing the MLOG_MEMSET record.

MLOG_MEMSET: A new redo log record type for filling an area with a byte.
mlog_memset(): Write the MLOG_MEMSET record.
mlog_parse_nbytes(): Handle MLOG_MEMSET as well.
trx_rseg_header_create(): Reduce the redo log volume by making use of mlog_memset() and the zero-initialization that happens inside page allocation.
fil_addr_null: Remove.
flst_init(): Create a variant that takes a zero-initialized buf_block_t* as a parameter, and only writes the FIL_NULL using mlog_memset().
flst_zero_addr(): A variant of flst_write_addr() that writes a null address using mlog_memset() for the FIL_NULL.

The following fixes are replacing some use of MLOG_WRITE_STRING with the more compact MLOG_MEMSET record, or eliminating redundant redo log writes:

btr_store_big_rec_extern_fields(): Invoke mlog_memset() for zero-initializing the tail of the ROW_FORMAT=COMPRESSED BLOB page.
trx_sysf_create(), trx_rseg_format_upgrade(): Invoke mlog_memset() for zero-initializing the page trailer.
fsp_header_init(), trx_rseg_header_create(): Remove redundant zero-initializations.
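The saving described above can be illustrated with a toy model. The sketch below is not InnoDB's redo log encoding: the record layouts, the field widths, and the names WriteStringRec, MemsetRec and apply() are all hypothetical. It only shows that a fill record carries a single byte plus a length, while a string-write record must carry every byte, and that both leave the page in the same state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical toy records; the real InnoDB redo log encoding differs.

// Models an MLOG_WRITE_STRING-style record: the payload holds every byte.
struct WriteStringRec {
  uint16_t offset;
  std::vector<uint8_t> data;
  // 1 type byte + 2 offset bytes + 2 length bytes + payload (assumed layout)
  std::size_t encoded_size() const { return 1 + 2 + 2 + data.size(); }
};

// Models an MLOG_MEMSET-style record: only the fill byte is stored.
struct MemsetRec {
  uint16_t offset;
  uint16_t length;
  uint8_t fill;
  // 1 type byte + 2 offset bytes + 2 length bytes + 1 fill byte (assumed layout)
  std::size_t encoded_size() const { return 1 + 2 + 2 + 1; }
};

// Apply a string-write record to an in-memory page image.
void apply(const WriteStringRec& r, std::vector<uint8_t>& page) {
  assert(r.offset + r.data.size() <= page.size());
  std::memcpy(page.data() + r.offset, r.data.data(), r.data.size());
}

// Apply a fill record to an in-memory page image.
void apply(const MemsetRec& r, std::vector<uint8_t>& page) {
  assert(std::size_t(r.offset) + r.length <= page.size());
  std::memset(page.data() + r.offset, r.fill, r.length);
}
```

In this model, filling 32 bytes costs 6 encoded bytes with the fill record and 37 with the string-write record; the gap grows with the length of the filled area.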
-
Marko Mäkelä authored
Stop supporting the additional *trunc.log files that were introduced via MySQL 5.7 to MariaDB Server 10.2 and 10.3.

DB_TABLESPACE_TRUNCATED: Remove.
purge_sys.truncate: A new structure to track undo tablespace file truncation.
srv_start(): Remove the call to buf_pool_invalidate(). It is no longer necessary, given that we no longer access things in ways that violate the ARIES protocol. This call was originally added for innodb_file_format, and it may later have been necessary for the proper function of the MySQL 5.7 TRUNCATE recovery, which we are now removing.
trx_purge_cleanse_purge_queue(): Take the undo tablespace as a parameter.
trx_purge_truncate_history(): Rewrite everything mostly in a single function, replacing references to undo::Truncate.
recv_apply_hashed_log_recs(): If any redo log is to be applied, and if the log_sys.log.subformat indicates that separately logged truncate may have been used, refuse to proceed except if innodb_force_recovery is set. We will still refuse crash-upgrade if TRUNCATE TABLE was logged. Undo tablespace truncation would only be logged in undo*trunc.log files, which we are no longer checking for.
-
Marko Mäkelä authored
-
Marko Mäkelä authored
-
- 10 Sep, 2018 6 commits
-
-
Teodor Mircea Ionita authored
-
Teodor Mircea Ionita authored
-
Marko Mäkelä authored
-
Marko Mäkelä authored
With the TRUNCATE by rename, create, drop (MDEV-13564), old tables with invalid ROW_FORMAT attribute could not be truncated. Introduce a sloppy mode for allowing the TRUNCATE. create_table_info_t::prepare_create_table(): Add the parameter strict=true. ha_innobase::create(): Pass strict=false if trx!=NULL (the create is part of TRUNCATE).
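The strict/sloppy split can be pictured with a defaulted parameter. This is a minimal sketch under stated assumptions: Trx, TableDef, Result and the free functions are hypothetical stand-ins for the members named above, and the fallback to DYNAMIC in sloppy mode is chosen only for illustration; the commit message does not say what the sloppy mode substitutes.

```cpp
#include <cassert>
#include <string>

enum class Result { Ok, InvalidRowFormat };

// Hypothetical stand-ins for the structures named in the commit message.
struct Trx {};
struct TableDef { std::string row_format; };

bool row_format_is_valid(const std::string& f) {
  return f == "REDUNDANT" || f == "COMPACT" || f == "DYNAMIC" ||
         f == "COMPRESSED";
}

// strict=true (the default) rejects an invalid attribute;
// strict=false tolerates it by substituting a usable value.
Result prepare_create_table(TableDef& def, bool strict = true) {
  if (row_format_is_valid(def.row_format)) return Result::Ok;
  if (strict) return Result::InvalidRowFormat;
  def.row_format = "DYNAMIC";  // assumed fallback, for illustration only
  return Result::Ok;
}

// A non-null transaction means the create is part of TRUNCATE,
// so the sloppy mode is requested.
Result create(TableDef& def, const Trx* trx = nullptr) {
  return prepare_create_table(def, /*strict=*/trx == nullptr);
}
```

A plain CREATE (no transaction passed) still fails on the invalid attribute, while the same definition passes when the call is made on behalf of TRUNCATE.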
-
Marko Mäkelä authored
-
Marko Mäkelä authored
It turned out that ha_innobase::truncate() would prematurely commit the transaction already before the completion of the ha_innobase::create(). All of this must be atomic. innodb.truncate_crash: Use the correct DEBUG_SYNC point, and tolerate non-truncation of the table, because the redo log for the TRUNCATE transaction commit might be flushed due to some InnoDB background activity. dict_build_tablespace_for_table(): Merge to the function dict_build_table_def_step(). dict_build_table_def_step(): If a table is being created during an already started data dictionary transaction (such as TRUNCATE), persistently write the table_id to the undo log header before creating any file. In this way, the recovery of TRUNCATE will be able to delete the new file before rolling back the rename of the original table. dict_table_rename_in_cache(): Add the parameter replace_new_file, used as part of rolling back a TRUNCATE operation. fil_rename_tablespace_check(): Add the parameter replace_new. If the parameter is set and a file identified by new_path exists, remove a possible tablespace and also the file. create_table_info_t::create_table_def(): Remove some debug assertions that no longer hold. During TRUNCATE, the transaction will already have been started (and performed a rename operation) before the table is created. Also, remove a call to dict_build_tablespace_for_table(). create_table_info_t::create_table(): Add the parameter create_fk=true. During TRUNCATE TABLE, do not add FOREIGN KEY constraints to the InnoDB data dictionary, because they will also not be removed. row_table_add_foreign_constraints(): If trx=NULL, do not modify the InnoDB data dictionary, but only load the FOREIGN KEY constraints from the data dictionary. ha_innobase::create(): Lock the InnoDB data dictionary cache only if no transaction was passed by the caller. Unlock it in any case. innobase_rename_table(): Add the parameter commit = true. If !commit, do not lock or unlock the data dictionary cache. 
ha_innobase::truncate(): Lock the data dictionary before invoking rename or create, and let ha_innobase::create() unlock it and also commit or roll back the transaction. trx_undo_mark_as_dict(): Renamed from trx_undo_mark_as_dict_operation() and declared global instead of static. row_undo_ins_parse_undo_rec(): If table_id is set, this must be rolling back the rename operation in TRUNCATE TABLE, and therefore replace_new_file=true.
-
- 09 Sep, 2018 5 commits
-
-
Marko Mäkelä authored
-
Marko Mäkelä authored
-
Sergey Vojtovich authored
sql/table.cc:8561:42: error: non-constant-expression cannot be narrowed from type 'uint' (aka 'unsigned int') to '__darwin_suseconds_t' (aka 'int') in initializer list [-Wc++11-narrowing]
  timeval end_time= {thd->query_start(), uint(thd->query_start_sec_part())};
                                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sql/table.cc:8561:42: note: insert an explicit cast to silence this issue
  timeval end_time= {thd->query_start(), uint(thd->query_start_sec_part())};
                                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                         static_cast<__darwin_suseconds_t>( )
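One portable way to follow the compiler's note is to cast to the member types of timeval rather than to a platform-specific typedef. The sketch below is an assumption about a possible fix, not the change that was actually committed; make_timeval() is a hypothetical helper.

```cpp
#include <cassert>
#include <sys/time.h>

// Hypothetical helper: build a timeval from a seconds value and an
// unsigned microseconds value without triggering -Wc++11-narrowing.
timeval make_timeval(long long sec, unsigned int usec) {
  // Brace initialization rejects implicit narrowing of non-constant
  // expressions, so each field is cast explicitly to its declared type.
  // decltype keeps the code independent of the platform typedefs
  // (for example __darwin_suseconds_t on macOS).
  timeval tv = {static_cast<decltype(timeval::tv_sec)>(sec),
                static_cast<decltype(timeval::tv_usec)>(usec)};
  return tv;
}
```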
-
Marko Mäkelä authored
Tables whose reference count is not zero will be crash-safely dropped in the background when the count reaches zero. Therefore, it is no longer necessary to wait for all references to be released before possibly adding the table to the background queue.
-
Sergey Vojtovich authored
-
- 08 Sep, 2018 1 commit
-
-
Vladislav Vaintroub authored
-
- 07 Sep, 2018 17 commits
-
-
Marko Mäkelä authored
This is a merge from 10.2, but the 10.2 version of this will not be pushed into 10.2 yet, because the 10.2 version would include backports of MDEV-14717 and MDEV-14585, which would introduce a crash recovery regression: Tables could be lost on table-rebuilding DDL operations, such as ALTER TABLE, OPTIMIZE TABLE or this new backup-friendly TRUNCATE TABLE. The test innodb.truncate_crash occasionally loses the table due to the following bug: MDEV-17158 log_write_up_to() sometimes fails
-
Marko Mäkelä authored
A crash-downgrade of a RENAME (or TRUNCATE or table-rebuilding ALTER TABLE or OPTIMIZE TABLE) operation to an earlier 10.2 version would trigger a debug assertion failure during rollback, in trx_roll_pop_top_rec_of_trx(). In a non-debug build, the TRX_UNDO_RENAME_TABLE record would be misinterpreted as an update_undo log record, and typically the file name would be interpreted as DB_TRX_ID,DB_ROLL_PTR,PRIMARY KEY. If a matching record would be found, row_undo_mod() would hit ut_error in switch (node->rec_type). Typically, ut_a(table2 == NULL) would fail when opening the table from SQL. Because of this, we prevent a crash-downgrade to earlier MariaDB 10.2 versions by changing the InnoDB redo log format identifier to the 10.3 identifier, and by introducing a subformat identifier so that 10.2 can continue to refuse crash-downgrade from 10.3 or later. After a clean shutdown, a downgrade to MariaDB 10.2.13 or later would still be possible thanks to MDEV-14909. A downgrade to older 10.2 versions is only possible after removing the log files (not recommended). LOG_HEADER_FORMAT_CURRENT: Change to 103 (originally the 10.3 format). log_group_t: Add subformat. For 10.2, we will use subformat 1, and will refuse crash recovery from any other subformat of the 10.3 format, that is, a genuine 10.3 redo log. recv_find_max_checkpoint(): Allow startup after clean shutdown from a future LOG_HEADER_FORMAT_10_4 (unencrypted only). We cannot handle the encrypted 10.4 redo log block format, which was introduced in MDEV-12041. Allow crash recovery from the original 10.2 format as well as the new format. In Mariabackup --backup, do not allow any startup from 10.3 or 10.4 redo logs. recv_recovery_from_checkpoint_start(): Skip redo log apply for clean 10.3 redo log, but not for the new 10.2 redo log (10.3 format, subformat 1). srv_prepare_to_delete_redo_log_files(): On format or subformat mismatch, set srv_log_file_size = 0, so that we will display the correct message. 
innobase_start_or_create_for_mysql(): Check for format or subformat mismatch. xtrabackup_backup_func(): Remove debug assertions that were made redundant by the code changes in recv_find_max_checkpoint().
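The startup rules for a 10.2 server described above can be summarized as a small decision function. This is a simplified sketch: only the value 103 and subformat 1 come from the commit message; the other constants, the LogHeader struct, and the decide() function are hypothetical, and details such as Mariabackup's stricter behavior are left out.

```cpp
#include <cassert>
#include <cstdint>

enum class Action { Recover, SkipCleanLog, Refuse };

constexpr uint32_t FORMAT_10_2_ORIGINAL = 1;  // hypothetical value
constexpr uint32_t FORMAT_10_3 = 103;         // value from the commit message
constexpr uint32_t FORMAT_10_4 = 104;         // hypothetical value

// Hypothetical summary of what the redo log header tells the server.
struct LogHeader {
  uint32_t format;
  uint32_t subformat;
  bool encrypted;       // only consulted for the 10.4 format in this sketch
  bool clean_shutdown;  // true if no redo log needs to be applied
};

// Decision as seen by a 10.2 server that writes 10.3 format, subformat 1.
Action decide(const LogHeader& h) {
  if (h.format == FORMAT_10_2_ORIGINAL) return Action::Recover;
  if (h.format == FORMAT_10_3) {
    if (h.subformat == 1) return Action::Recover;  // the new 10.2 log
    // A genuine 10.3 log: usable only after a clean shutdown.
    return h.clean_shutdown ? Action::SkipCleanLog : Action::Refuse;
  }
  if (h.format == FORMAT_10_4 && !h.encrypted && h.clean_shutdown)
    return Action::SkipCleanLog;
  return Action::Refuse;
}
```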
-
Marko Mäkelä authored
This is a backport of the following commits: commit b4165985 commit 69e88de0 commit 40f4525f commit 656f66de Now that MDEV-14717 made RENAME TABLE crash-safe within InnoDB, it should be safe to drop the #sql- tables within InnoDB during crash recovery. These tables can be one of two things: (1) #sql-ib related to deferred DROP TABLE (follow-up to MDEV-13407) or to table-rebuilding ALTER TABLE...ALGORITHM=INPLACE (since MDEV-14378, only related to the intermediate copy of a table), (2) #sql- related to the intermediate copy of a table during ALTER TABLE...ALGORITHM=COPY We will not drop tables whose name starts with #sql2, because the server can be killed during an ALGORITHM=COPY operation at a point where the original table was renamed to #sql2 but the finished intermediate copy was not yet renamed from #sql- to the original table name. If an old version of MariaDB Server before 10.2.13 (MDEV-11415) was killed while ALTER TABLE...ALGORITHM=COPY was in progress, after recovery there could be undo log records for some records that were inserted into an intermediate copy of the table. Due to these undo log records, InnoDB would resurrect locks at recovery, and the intermediate table would be locked while we are trying to drop it. This would cause a call to row_rename_table_for_mysql(), either from row_mysql_drop_garbage_tables() or from the rollback of a RENAME operation that was part of the ALTER TABLE. row_rename_table_for_mysql(): Do not attempt to parse FOREIGN KEY constraints when renaming from #sql-something to #sql-something-else, because it does not make any sense. row_drop_table_for_mysql(): When deferring DROP TABLE due to locks, do not rename the table if its name already starts with the #sql- prefix, which is what row_mysql_drop_garbage_tables() uses. Previously, the too strict prefix #sql-ib was used, and some tables were renamed unnecessarily.
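The naming rule above (drop #sql- tables, keep #sql2 tables) reduces to a prefix test on the table name without its schema part. The sketch assumes names of the form "schema/table"; is_garbage_table() is a hypothetical helper, not an InnoDB function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: decide whether a table may be dropped on startup.
// full_name is assumed to look like "schema/table".
bool is_garbage_table(const std::string& full_name) {
  const std::size_t slash = full_name.find('/');
  const std::string base = slash == std::string::npos
                               ? full_name
                               : full_name.substr(slash + 1);
  // "#sql-" covers both "#sql-ib..." and the ALGORITHM=COPY intermediate
  // copies. "#sql2..." does not match, so a renamed original table that
  // is awaiting the final rename is never dropped.
  return base.compare(0, 5, "#sql-") == 0;
}
```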
-
Marko Mäkelä authored
Follow-up to MDEV-13407: innodb.drop_table_background failed in buildbot with "Tablespace for table exists". This is a backport of commit 88aff5f4. The InnoDB background DROP TABLE queue is something that we should really remove, but are unable to until we remove dict_operation_lock so that DDL and DML operations can be combined in a single transaction. Because the queue is not persistent, it is not crash-safe. We should in some way ensure that the deferred-dropped tables will be dropped after server restart. The existence of two separate transactions complicates the error handling of CREATE TABLE...SELECT. We should really not break locks in DROP TABLE. Our solution to these problems is to rename the table to a temporary name, and to drop such-named tables on InnoDB startup. Also, the queue will use table IDs instead of names from now on. check-testcase.test: Ignore #sql-ib*.ibd files, because tables may enter the background DROP TABLE queue shortly before the test finishes. innodb.drop_table_background: Test CREATE...SELECT and the creation of tables whose file name starts with #sql-ib. innodb.alter_crash: Adjust the recovery, now that the #sql-ib tables will be dropped on InnoDB startup. row_mysql_drop_garbage_tables(): New function, to drop all #sql-ib tables on InnoDB startup. row_drop_table_for_mysql_in_background(): Remove an unnecessary and misplaced call to log_buffer_flush_to_disk(). (The call should have been after the transaction commit. We do not care about flushing the redo log here, because the table would be dropped again at server startup.) Remove the entry from the list after the table no longer exists. If server shutdown has been initiated, empty the list without actually dropping any tables. They will be dropped again on startup. row_drop_table_for_mysql(): Do not call lock_remove_all_on_table(). Instead, if locks exist, defer the DROP TABLE until they do not exist.
If the table name does not start with #sql-ib, rename it to that prefix before adding it to the background DROP TABLE queue.
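The deferred DROP TABLE described above can be modeled in memory. This is a sketch under stated assumptions: the Dictionary class, its members, and the "#sql-ib" + table ID naming scheme are hypothetical, schema prefixes are omitted, and nothing here is persisted or crash-safe. It only shows the control flow: defer while locks exist, rename to the temporary prefix, and queue by table ID.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct Table { std::string name; unsigned n_locks; };

class Dictionary {
 public:
  std::map<uint64_t, Table> tables;  // table ID -> table
  std::vector<uint64_t> drop_queue;  // queue of table IDs, not names

  // Returns true if the table was dropped at once, false if deferred.
  bool drop_table(uint64_t id) {
    auto it = tables.find(id);
    assert(it != tables.end());
    if (it->second.n_locks == 0) {
      tables.erase(it);
      return true;
    }
    // Locks exist: do not break them. Rename to the temporary prefix
    // (unless the name already has it) and defer the drop.
    const std::string prefix = "#sql-ib";
    if (it->second.name.compare(0, prefix.size(), prefix) != 0)
      it->second.name = prefix + std::to_string(id);  // hypothetical naming
    drop_queue.push_back(id);
    return false;
  }

  // One pass of the background task: drop what has become lock-free and
  // remove an entry only once its table no longer exists.
  void drop_in_background() {
    std::vector<uint64_t> remaining;
    for (uint64_t id : drop_queue) {
      auto it = tables.find(id);
      if (it == tables.end()) continue;  // already gone
      if (it->second.n_locks == 0) tables.erase(it);
      else remaining.push_back(id);
    }
    drop_queue.swap(remaining);
  }
};
```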
-
Marko Mäkelä authored
This is a backport of commit 07e9ff1f. Allow DROP TABLE `#mysql50##sql-...._.` to drop tables that were being rebuilt by ALGORITHM=INPLACE.

NOTE: If the server is killed after the table-rebuilding ALGORITHM=INPLACE commits inside InnoDB but before the .frm file has been replaced, then the recovery will involve something other than DROP TABLE.

NOTE: If the server is killed after a true inplace ALTER TABLE commits inside InnoDB but before the .frm file has been replaced, then we are really out of luck. To properly handle that situation, we would need a transactional mysql.ddl_fixup table that directs recovery to rename or remove files.

prepare_inplace_alter_table_dict(): Use the altered_table->s->table_name for generating the new_table_name.
table_name_t::part_suffix: The start of the partition name suffix.
table_name_t::dbend(): Return the end of the schema name.
table_name_t::dblen(): Return the length of the schema name, in bytes.
table_name_t::basename(): Return the name without the schema name.
table_name_t::part(): Return the partition name, or NULL if none.
row_drop_table_for_mysql(): Assert for #sql, not #sql-ib.
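The accessors listed above can be sketched for names of the form "schema/table#P#partition". Several points are assumptions: the struct name table_name_sketch, the "#P#" marker value, and the choice that part() returns a pointer to the start of the marker; the commit message does not state whether the real accessor skips it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed partition marker; platform-specific variants are ignored here.
static const char kPartSuffix[] = "#P#";

// Hypothetical model of the table_name_t accessors.
struct table_name_sketch {
  const char* m_name;  // "schema/table", optionally followed by "#P#part"

  // End of the schema name: points at the '/' separator.
  const char* dbend() const {
    const char* sep = std::strchr(m_name, '/');
    assert(sep != nullptr);
    return sep;
  }
  // Length of the schema name, in bytes.
  std::size_t dblen() const {
    return static_cast<std::size_t>(dbend() - m_name);
  }
  // The name without the schema name.
  const char* basename() const { return dbend() + 1; }
  // Start of the partition suffix, or nullptr if the table is not
  // partitioned.
  const char* part() const { return std::strstr(basename(), kPartSuffix); }
};
```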
-
Marko Mäkelä authored
This is a backport of commit 0bc36758 and commit 9eb3fcc9. InnoDB in MariaDB 10.2 appears to only write MLOG_FILE_RENAME2 redo log records during table-rebuilding ALGORITHM=INPLACE operations. We must write the records for any .ibd file renames, so that the operations are crash-safe. If InnoDB is killed during a RENAME TABLE operation, it can happen that the transaction for updating the data dictionary will be rolled back. But, nothing will roll back the renaming of the .ibd file (the MLOG_FILE_RENAME2 only guarantees roll-forward), or for that matter, the renaming of the dict_table_t::name in the dict_sys cache. We introduce the undo log record TRX_UNDO_RENAME_TABLE to fix this. fil_space_for_table_exists_in_mem(): Remove the parameters adjust_space, table_id and some code that was trying to work around these deficiencies. fil_name_write_rename(): Write a MLOG_FILE_RENAME2 record. dict_table_rename_in_cache(): Invoke fil_name_write_rename(). trx_undo_rec_copy(): Set the first 2 bytes to the length of the copied undo log record. trx_undo_page_report_rename(), trx_undo_report_rename(): Write a TRX_UNDO_RENAME_TABLE record with the old table name. row_rename_table_for_mysql(): Invoke trx_undo_report_rename() before modifying any data dictionary tables. row_undo_ins_parse_undo_rec(): Roll back TRX_UNDO_RENAME_TABLE by invoking dict_table_rename_in_cache(), which will take care of both renaming the table and the file. ha_innobase::truncate(): Remove a work-around.
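The role of the TRX_UNDO_RENAME_TABLE record can be shown with an in-memory model: the old name is logged before anything is renamed, and rollback replays the log in reverse through the same routine that renames both the cached name and the file. The Engine struct and its members are hypothetical; redo logging, the data dictionary tables, and persistence are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Models a TRX_UNDO_RENAME_TABLE record: table ID plus the old name.
struct UndoRenameRec { uint64_t table_id; std::string old_name; };

struct Engine {
  std::map<uint64_t, std::string> cache;  // table ID -> current name
  std::set<std::string> files;            // names of .ibd files
  std::vector<UndoRenameRec> undo;        // undo log of the open transaction

  // Renames both the cached table name and the file, in one place.
  void rename_in_cache(uint64_t id, const std::string& new_name) {
    std::string& cur = cache.at(id);
    files.erase(cur + ".ibd");
    files.insert(new_name + ".ibd");
    cur = new_name;
  }

  // Write the undo record with the old name before modifying anything.
  void rename_table(uint64_t id, const std::string& new_name) {
    undo.push_back({id, cache.at(id)});
    rename_in_cache(id, new_name);
  }

  // Roll back by applying the undo records in reverse order.
  void rollback() {
    while (!undo.empty()) {
      UndoRenameRec rec = undo.back();
      undo.pop_back();
      rename_in_cache(rec.table_id, rec.old_name);
    }
  }

  void commit() { undo.clear(); }
};
```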
-
Marko Mäkelä authored
Remove the innodb_undo suite, and move and adapt the tests. Remove unnecessary restarts, and add innodb_page_size_small.inc for combinations. innodb.undo_truncate is the merge of innodb_undo.truncate and innodb_undo.truncate_multi_client. Add the global status variable innodb_undo_truncations. Without this, the test innodb.undo_truncate would occasionally report that truncation did not happen. The test was only waiting for the history list length to reach 0, but the undo tablespace truncation would only take place some time after that. Undo tablespace truncation will only occasionally occur with innodb_page_size=32k, and typically never occur (with this amount of undo log operations) with innodb_page_size=64k. We disable these combinations. innodb.undo_truncate_recover was formerly called innodb_undo.truncate_recover.
-
Marko Mäkelä authored
Implement undo tablespace truncation via normal redo logging. Implement TRUNCATE TABLE as a combination of RENAME to #sql-ib name, CREATE, and DROP. Note: Orphan #sql-ib*.ibd may be left behind if MariaDB Server 10.2 is killed before the DROP operation is committed. If MariaDB Server 10.2 is killed during TRUNCATE, it is also possible that the old table was renamed to #sql-ib*.ibd but the data dictionary will refer to the table using the original name. In MariaDB Server 10.3, RENAME inside InnoDB is transactional, and #sql-* tables will be dropped on startup. So, this new TRUNCATE will be fully crash-safe in 10.3. ha_mroonga::wrapper_truncate(): Pass table options to the underlying storage engine, now that ha_innobase::truncate() will need them. rpl_slave_state::truncate_state_table(): Before truncating mysql.gtid_slave_pos, evict any cached table handles from the table definition cache, so that there will be no stale references to the old table after truncating. == TRUNCATE TABLE == WL#6501 in MySQL 5.7 introduced separate log files for implementing atomic and crash-safe TRUNCATE TABLE, instead of using the InnoDB undo and redo log. Some convoluted logic was added to the InnoDB crash recovery, and some extra synchronization (including a redo log checkpoint) was introduced to make this work. This synchronization has caused performance problems and race conditions, and the extra log files cannot be copied or applied by external backup programs. In order to support crash-upgrade from MariaDB 10.2, we will keep the logic for parsing and applying the extra log files, but we will no longer generate those files in TRUNCATE TABLE. A prerequisite for crash-safe TRUNCATE is a crash-safe RENAME TABLE (with full redo and undo logging and proper rollback). This will be implemented in MDEV-14717. ha_innobase::truncate(): Invoke RENAME, create(), delete_table(). 
Because RENAME cannot be fully rolled back before MariaDB 10.3 due to missing undo logging, add some explicit rename-back in case the operation fails. ha_innobase::delete(): Introduce a variant that takes sqlcom as a parameter. In TRUNCATE TABLE, we do not want to touch any FOREIGN KEY constraints. ha_innobase::create(): Add the parameters file_per_table, trx. In TRUNCATE, the new table must be created in the same transaction that renames the old table. create_table_info_t::create_table_info_t(): Add the parameters file_per_table, trx. row_drop_table_for_mysql(): Replace a bool parameter with sqlcom. row_drop_table_after_create_fail(): New function, wrapping row_drop_table_for_mysql(). dict_truncate_index_tree_in_mem(), fil_truncate_tablespace(), fil_prepare_for_truncate(), fil_reinit_space_header_for_table(), row_truncate_table_for_mysql(), TruncateLogger, row_truncate_prepare(), row_truncate_rollback(), row_truncate_complete(), row_truncate_fts(), row_truncate_update_system_tables(), row_truncate_foreign_key_checks(), row_truncate_sanity_checks(): Remove. row_upd_check_references_constraints(): Remove a check for TRUNCATE, now that the table is no longer truncated in place. The new test innodb.truncate_foreign uses DEBUG_SYNC to cover some race-condition like scenarios. The test innodb-innodb.truncate does not use any synchronization. We add a redo log subformat to indicate backup-friendly format. MariaDB 10.4 will remove support for the old TRUNCATE logging, so crash-upgrade from old 10.2 or 10.3 to 10.4 will involve limitations. == Undo tablespace truncation == MySQL 5.7 implements undo tablespace truncation. It is only possible when innodb_undo_tablespaces is set to at least 2. The logging is implemented similar to the WL#6501 TRUNCATE, that is, using separate log files and a redo log checkpoint. We can simply implement undo tablespace truncation within a single mini-transaction that reinitializes the undo log tablespace file. 
Unfortunately, due to the redo log format of some operations, currently, the total redo log written by undo tablespace truncation will be more than the combined size of the truncated undo tablespace. It should be acceptable to have a little more than 1 megabyte of log in a single mini-transaction. This will be fixed in MDEV-17138 in MariaDB Server 10.4. recv_sys_t: Add truncated_undo_spaces[] to remember for which undo tablespaces a MLOG_FILE_CREATE2 record was seen. namespace undo: Remove some unnecessary declarations. fil_space_t::is_being_truncated: Document that this flag now only applies to undo tablespaces. Remove some references. fil_space_t::is_stopping(): Do not refer to is_being_truncated. This check is for tablespaces of tables. Potentially used tablespaces are never truncated any more. buf_dblwr_process(): Suppress the out-of-bounds warning for undo tablespaces. fil_truncate_log(): Write a MLOG_FILE_CREATE2 with a nonzero page number (new size of the tablespace in pages) to inform crash recovery that the undo tablespace size has been reduced. fil_op_write_log(): Relax assertions, so that MLOG_FILE_CREATE2 can be written for undo tablespaces (without .ibd file suffix) for a nonzero page number. os_file_truncate(): Add the parameter allow_shrink=false so that undo tablespaces can actually be shrunk using this function. fil_name_parse(): For undo tablespace truncation, buffer MLOG_FILE_CREATE2 in truncated_undo_spaces[]. recv_read_in_area(): Avoid reading pages for which no redo log records remain buffered, after recv_addr_trim() removed them. trx_rseg_header_create(): Add a FIXME comment that we could write much less redo log. trx_undo_truncate_tablespace(): Reinitialize the undo tablespace in a single mini-transaction, which will be flushed to the redo log before the file size is trimmed. recv_addr_trim(): Discard any redo logs for pages that were logged after the new end of a file, before the truncation LSN. 
If the rec_list becomes empty, reduce n_addrs. After removing any affected records, actually truncate the file. recv_apply_hashed_log_recs(): Invoke recv_addr_trim() right before applying any log records. The undo tablespace files must be open at this point. buf_flush_or_remove_pages(), buf_flush_dirty_pages(), buf_LRU_flush_or_remove_pages(): Add a parameter for specifying the number of the first page to flush or remove (default 0). trx_purge_initiate_truncate(): Remove the log checkpoints, the extra logging, and some unnecessary crash points. Merge the code from trx_undo_truncate_tablespace(). First, flush all to-be-discarded pages (beyond the new end of the file), then trim the space->size to make the page allocation deterministic. At the only remaining crash injection point, flush the redo log, so that the recovery can be tested.
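The trimming step attributed to recv_addr_trim() above can be sketched as a filter over buffered records. The types LogRec and PageRecs and the function addr_trim() are hypothetical; the real function operates on InnoDB's recovery hash table and also truncates the file, which this model omits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

struct LogRec { uint64_t lsn; };

// Page number -> redo records buffered for that page of one tablespace.
using PageRecs = std::map<uint32_t, std::vector<LogRec>>;

// Discard records for pages at or beyond the new end of the file that
// were logged before the truncation LSN. Returns how many page entries
// became empty and were removed (this models reducing n_addrs).
std::size_t addr_trim(PageRecs& recs, uint32_t new_size_pages,
                      uint64_t truncate_lsn) {
  std::size_t removed = 0;
  for (auto it = recs.lower_bound(new_size_pages); it != recs.end();) {
    std::vector<LogRec>& list = it->second;
    list.erase(std::remove_if(list.begin(), list.end(),
                              [&](const LogRec& r) {
                                return r.lsn < truncate_lsn;
                              }),
               list.end());
    if (list.empty()) {
      it = recs.erase(it);
      ++removed;
    } else {
      ++it;
    }
  }
  return removed;
}
```

Records logged after the truncation LSN are kept even for pages beyond the new end, since the file may have grown again after it was truncated.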
-
Marko Mäkelä authored
-
Marko Mäkelä authored
-
Marko Mäkelä authored
recv_addr_state, recv_addr_t: Define in log0recv.cc only.
-
Marko Mäkelä authored
-
Marko Mäkelä authored
-
Marko Mäkelä authored
Because innodb_file_per_table can be enabled at runtime after it was disabled at startup, it is better to always register the same innobase_hton->tablefile_extensions. Besides, innodb_file_per_table=OFF does not prevent loading tables that may have been created earlier with the .ibd file extension.
-
Vladislav Vaintroub authored
Remove the plugin-load option from mariabackup. It does not need to be an option: we only need to store the plugin-load value during the backup phase and reuse the same value during --prepare. The fix is to read plugin-load from backup-my.cnf during prepare.
-
Marko Mäkelä authored
-
Igor Babaev authored
Field_iterator_table_ref::set_field_iterator Several functions that processed different prepare statements omitted the DT_INIT flag in the last parameter of their open_normal_and_derived_tables() calls. This made the context analysis of derived tables dependent on the order in which the derived tables were processed by mysql_handle_derived(). This order was induced by the order of SELECTs in all_select_list. In 10.4 the order of SELECTs in all_select_list changed, and the lack of the DT_INIT flag in some open_normal_and_derived_tables() calls became critical, because some derived tables were not identified as such.
-
- 06 Sep, 2018 6 commits
-
-
Sergei Golubchik authored
-
Sergei Golubchik authored
-
Marko Mäkelä authored
-
Marko Mäkelä authored
-
Marko Mäkelä authored
-
Vladislav Vaintroub authored
Fix exit condition for the log copying thread.
-
- 05 Sep, 2018 1 commit
-
-
Vladislav Vaintroub authored
(https://github.com/facebook/rocksdb/issues/4344) Also, disable the /permissive- flag if it is set, because it breaks rocksdb compilation in 10.3 on older versions of the Windows 8.1 SDK.
-