  1. 22 Feb, 2018 1 commit
    • MDEV-11455: create status variable innodb_buffer_pool_load_incomplete · 8440e8fa
      Daniel Black authored
      This status variable indicates that an innodb buffer pool load never
      completed and dumping at shutdown would result in an incomplete dump file.
      
      This status variable is set to 1 once a buffer pool load starts. Upon a
      successful load, it returns to 0.
      
      With this status variable set, the system variable
      innodb_buffer_pool_dump_at_shutdown==1 will have no effect as dumping after
      an incomplete load will generate a less complete dump file than the current
      one.
      
      If a user aborts a buffer pool load by setting the system variable
      innodb_buffer_pool_load_abort=1, the status variable
      innodb_buffer_pool_load_incomplete will remain set to 1.
      
      If a shutdown occurs while InnoDB is loading the buffer pool, the
      buffer pool will not be dumped at shutdown.
      
      A user may indirectly set innodb_buffer_pool_load_incomplete
      to 0 by:
      * Forcing a load, by setting innodb_buffer_pool_load_now=ON, or
      * Forcing a dump, by setting innodb_buffer_pool_dump_now=ON
      
      This will enable the next dump on shutdown to complete.
      Signed-off-by: Daniel Black <daniel.black@au.ibm.com>
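
      The flag transitions described in this message can be modelled roughly as
      follows. This is a minimal sketch, not the server code: the struct and its
      member names are hypothetical and only mirror the behaviour stated above.

```cpp
#include <cassert>

// Hypothetical model; the names below are illustrative and are not InnoDB identifiers.
struct BufferPoolState {
    bool load_incomplete = false;   // models innodb_buffer_pool_load_incomplete
    bool dump_at_shutdown = true;   // models innodb_buffer_pool_dump_at_shutdown

    // Starting a load sets the flag until the load completes successfully.
    void begin_load() { load_incomplete = true; }

    // An aborted load leaves the flag set; a completed load clears it.
    void finish_load(bool aborted) {
        if (!aborted) load_incomplete = false;
    }

    // A forced dump (innodb_buffer_pool_dump_now=ON) clears the flag.
    void dump_now() { load_incomplete = false; }

    // The dump at shutdown is skipped while the flag is set.
    bool should_dump_at_shutdown() const {
        return dump_at_shutdown && !load_incomplete;
    }
};
```

      In this model, an aborted load followed by shutdown keeps the previous dump
      file, while a forced dump or a completed load re-enables the shutdown dump.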
  2. 20 Jan, 2018 1 commit
  3. 09 Jan, 2018 1 commit
  4. 10 Oct, 2017 1 commit
    • MDEV-13311 Presence of old logs in 10.2.7 will corrupt restored instance (change in behavior) · 1b478a7a
      Marko Mäkelä authored
      Mariabackup 10.2.7 would delete the redo log files after a successful
      --prepare operation. If the user is manually copying the prepared files
      instead of using the --copy-back option, it could happen that some old
      redo log file would be preserved in the restored location. These old
      redo log files could cause corruption of the restored data files when
      the server is started up.
      
      We prevent this scenario by creating a "poisoned" redo log file
      ib_logfile0 at the end of the --prepare step. The poisoning consists
      of simply truncating the file to an empty file. InnoDB will refuse
      to start up on an empty redo log file.
      
      copy_back(): Delete all redo log files in the target if the source
      file ib_logfile0 is empty. (Previously we did this if the source
      file is missing.)
      
      SRV_OPERATION_RESTORE_EXPORT: A new variant of SRV_OPERATION_RESTORE
      when the --export option is specified. In this mode, we will keep
      deleting all redo log files, instead of truncating the first one.
      
      delete_log_files(): Add a parameter for the first file to delete,
      to be passed as 0 or 1.
      
      innobase_start_or_create_for_mysql(): In mariabackup --prepare,
      tolerate an empty ib_logfile0 file. Otherwise, require the first
      redo log file to be longer than 4 blocks (2048 bytes). Unless
      --export was specified, truncate the first log file at the
      end of --prepare.
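
      The size rules for the first redo log file can be illustrated with a small
      sketch. The enum and function names are hypothetical; only the thresholds
      (empty file, 4 blocks of 512 bytes) come from the description above.

```cpp
#include <cassert>
#include <cstdint>

// Sizes quoted in the commit message: 4 blocks of 512 bytes each.
constexpr std::uint64_t kLogBlockSize = 512;
constexpr std::uint64_t kMinLogSize = 4 * kLogBlockSize;  // 2048 bytes

// Hypothetical classification of ib_logfile0 by its size.
enum class FirstLogCheck { kUsable, kPoisoned, kTooShort };

inline FirstLogCheck check_first_log(std::uint64_t size_bytes) {
    if (size_bytes == 0) return FirstLogCheck::kPoisoned;   // truncated by --prepare
    if (size_bytes <= kMinLogSize) return FirstLogCheck::kTooShort;
    return FirstLogCheck::kUsable;                          // longer than 4 blocks
}

// prepare_mode models running mariabackup --prepare; a normal server start
// refuses the empty ("poisoned") file.
inline bool may_start(std::uint64_t size_bytes, bool prepare_mode) {
    switch (check_first_log(size_bytes)) {
    case FirstLogCheck::kUsable:   return true;
    case FirstLogCheck::kPoisoned: return prepare_mode;
    case FirstLogCheck::kTooShort: return false;
    }
    return false;
}
```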
  5. 06 Oct, 2017 1 commit
    • MDEV-11369 Instant ADD COLUMN for InnoDB · a4948daf
      Marko Mäkelä authored
      For InnoDB tables, adding, dropping and reordering columns has
      required a rebuild of the table and all its indexes. Since MySQL 5.6
      (and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing
      concurrent modification of the tables.
      
      This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT
      and ROW_FORMAT=DYNAMIC so that columns can be appended instantaneously,
      with only minor changes performed to the table structure. The counter
      innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS
      is incremented whenever a table rebuild operation is converted into
      an instant ADD COLUMN operation.
      
      ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN.
      
      Some usability limitations will be addressed in subsequent work:
      
      MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY
      and ALGORITHM=INSTANT
      MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE
      
      The format of the clustered index (PRIMARY KEY) is changed as follows:
      
      (1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT,
      and a new field PAGE_INSTANT will contain the original number of fields
      in the clustered index ('core' fields).
      If instant ADD COLUMN has not been used or the table becomes empty,
      or the very first instant ADD COLUMN operation is rolled back,
      the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset
      to 0 and FIL_PAGE_INDEX, respectively.
      
      (2) A special 'default row' record is inserted into the leftmost leaf,
      between the page infimum and the first user record. This record is
      distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the
      same format as records that contain values for the instantly added
      columns. This 'default row' always has the same number of fields as
      the clustered index according to the table definition. The values of
      'core' fields are to be ignored. For other fields, the 'default row'
      will contain the default values as they were during the ALTER TABLE
      statement. (If the column default values are changed later, those
      values will only be stored in the .frm file. The 'default row' will
      contain the original evaluated values, which must be the same for
      every row.) The 'default row' must be completely hidden from
      higher-level access routines. Assertions have been added to ensure
      that no 'default row' is ever present in the adaptive hash index
      or in locked records. The 'default row' is never delete-marked.
      
      (3) In clustered index leaf page records, the number of fields must
      reside between the number of 'core' fields (dict_index_t::n_core_fields
      introduced in this work) and dict_index_t::n_fields. If the number
      of fields is less than dict_index_t::n_fields, the missing fields
      are replaced with the column value of the 'default row'.
      Note: The number of fields in the record may shrink if some of the
      last instantly added columns are updated to the value that is
      in the 'default row'. The function btr_cur_trim() implements this
      'compression' on update and rollback; dtuple::trim() implements it
      on insert.
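
      The lookup rule in (3) can be sketched as below. This is an illustration
      under simplifying assumptions: records are modelled as vectors of strings,
      and the type and function names are hypothetical, not InnoDB's.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical simplified record: one string per stored field.
using Fields = std::vector<std::string>;

// Return field i of a clustered index leaf record. A field that is not
// stored in the record is taken from the 'default row'.
inline const std::string& field_value(const Fields& rec,
                                      const Fields& default_row,
                                      std::size_t n_core_fields,
                                      std::size_t i) {
    // The stored field count lies between n_core_fields and n_fields.
    assert(rec.size() >= n_core_fields);
    assert(rec.size() <= default_row.size());
    assert(i < default_row.size());
    return i < rec.size() ? rec[i] : default_row[i];
}
```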
      
      (4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new
      status value REC_STATUS_COLUMNS_ADDED will indicate the presence of
      a new record header that will encode n_fields-n_core_fields-1 in
      1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header
      always explicitly encodes the number of fields.)
      
      We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for
      covering the insert of the 'default row' record when instant ADD COLUMN
      is used for the first time. Subsequent instant ADD COLUMN can use
      TRX_UNDO_UPD_EXIST_REC.
      
      This is joint work with Vin Chen (陈福荣) from Tencent. The design
      that was discussed in April 2017 would not have allowed import or
      export of data files, because instead of the 'default row' it would
      have introduced a data dictionary table. The test
      rpl.rpl_alter_instant is exactly as contributed in pull request #408.
      The test innodb.instant_alter is based on a contributed test.
      
      The redo log record format changes for ROW_FORMAT=DYNAMIC and
      ROW_FORMAT=COMPACT are as contributed. (With this change present,
      crash recovery from MariaDB 10.3.1 will fail in spectacular ways!)
      Also the semantics of higher-level redo log records that modify the
      PAGE_INSTANT field is changed. The redo log format version identifier
      was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1.
      
      Everything else has been rewritten by me. Thanks to Elena Stepanova,
      the code has been tested extensively.
      
      When rolling back an instant ADD COLUMN operation, we must empty the
      PAGE_FREE list after deleting or shortening the 'default row' record,
      by calling either btr_page_empty() or btr_page_reorganize(). We must
      know the size of each entry in the PAGE_FREE list. If rollback left a
      freed copy of the 'default row' in the PAGE_FREE list, we would be
      unable to determine its size (if it is in ROW_FORMAT=COMPACT or
      ROW_FORMAT=DYNAMIC) because it would contain more fields than the
      rolled-back definition of the clustered index.
      
      UNIV_SQL_DEFAULT: A new special constant that designates an instantly
      added column that is not present in the clustered index record.
      
      len_is_stored(): Check if a length is an actual length. There are
      two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL.
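
      A check of this kind might look like the following sketch. The sentinel
      values are assumptions chosen for illustration; the real constants are
      defined in the InnoDB headers and may differ.

```cpp
#include <cassert>
#include <cstdint>

// Assumed sentinel values, for illustration only.
constexpr std::uint32_t kSqlNull = 0xFFFFFFFFu;        // stands in for UNIV_SQL_NULL
constexpr std::uint32_t kSqlDefault = kSqlNull - 1;    // stands in for UNIV_SQL_DEFAULT

// True if len is an actual byte length rather than one of the two magic values.
inline bool len_is_stored_sketch(std::uint32_t len) {
    return len != kSqlNull && len != kSqlDefault;
}
```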
      
      dict_col_t::def_val: The 'default row' value of the column.  If the
      column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT.
      
      dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(),
      instant_value().
      
      dict_col_t::remove_instant(): Remove the 'instant ADD' status of
      a column.
      
      dict_col_t::name(const dict_table_t& table): Replaces
      dict_table_get_col_name().
      
      dict_index_t::n_core_fields: The original number of fields.
      For secondary indexes and if instant ADD COLUMN has not been used,
      this will be equal to dict_index_t::n_fields.
      
      dict_index_t::n_core_null_bytes: Number of bytes needed to
      represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable).
      
      dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that
      n_core_null_bytes was not initialized yet from the clustered index
      root page.
      
      dict_index_t: Add the accessors is_instant(), is_clust(),
      get_n_nullable(), instant_field_value().
      
      dict_index_t::instant_add_field(): Adjust clustered index metadata
      for instant ADD COLUMN.
      
      dict_index_t::remove_instant(): Remove the 'instant ADD' status
      of a clustered index when the table becomes empty, or the very first
      instant ADD COLUMN operation is rolled back.
      
      dict_table_t: Add the accessors is_instant(), is_temporary(),
      supports_instant().
      
      dict_table_t::instant_add_column(): Adjust metadata for
      instant ADD COLUMN.
      
      dict_table_t::rollback_instant(): Adjust metadata on the rollback
      of instant ADD COLUMN.
      
      prepare_inplace_alter_table_dict(): First create the ctx->new_table,
      and only then decide if the table really needs to be rebuilt.
      We must split the creation of table or index metadata from the
      creation of the dictionary table records and the creation of
      the data. In this way, we can transform a table-rebuilding operation
      into an instant ADD COLUMN operation. Dictionary objects will only
      be added to cache when table rebuilding or index creation is needed.
      The ctx->instant_table will never be added to cache.
      
      dict_table_t::add_to_cache(): Modified and renamed from
      dict_table_add_to_cache(). Do not modify the table metadata.
      Let the callers invoke dict_table_add_system_columns() and if needed,
      set can_be_evicted.
      
      dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the
      system columns (which will now exist in the dict_table_t object
      already at this point).
      
      dict_create_table_step(): Expect the callers to invoke
      dict_table_add_system_columns().
      
      pars_create_table(): Before creating the table creation execution
      graph, invoke dict_table_add_system_columns().
      
      row_create_table_for_mysql(): Expect all callers to invoke
      dict_table_add_system_columns().
      
      create_index_dict(): Replaces row_merge_create_index_graph().
      
      innodb_update_n_cols(): Renamed from innobase_update_n_virtual().
      Call my_error() if an error occurs.
      
      btr_cur_instant_init(), btr_cur_instant_init_low(),
      btr_cur_instant_root_init():
      Load additional metadata from the clustered index and set
      dict_index_t::n_core_null_bytes. This is invoked
      when table metadata is first loaded into the data dictionary.
      
      dict_boot(): Initialize n_core_null_bytes for the four hard-coded
      dictionary tables.
      
      dict_create_index_step(): Initialize n_core_null_bytes. This is
      executed as part of CREATE TABLE.
      
      dict_index_build_internal_clust(): Initialize n_core_null_bytes to
      NO_CORE_NULL_BYTES if table->supports_instant().
      
      row_create_index_for_mysql(): Initialize n_core_null_bytes for
      CREATE TEMPORARY TABLE.
      
      commit_cache_norebuild(): Call the code to rename or enlarge columns
      in the cache only if instant ADD COLUMN is not being used.
      (Instant ADD COLUMN would copy all column metadata from
      instant_table to old_table, including the names and lengths.)
      
      PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields.
      This is repurposing the 16-bit field PAGE_DIRECTION, of which only the
      least significant 3 bits were used. The original byte containing
      PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B.
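
      Sharing one 16-bit field between a 13-bit count and a 3-bit direction can
      be sketched as follows. The helper names are hypothetical, and the sketch
      works on a plain integer rather than on page bytes.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kDirectionBits = 3;  // least significant bits: direction
constexpr std::uint16_t kDirectionMask = (1u << kDirectionBits) - 1;  // 0x0007

// Pack n_core_fields into the upper 13 bits and the direction into the lower 3.
inline std::uint16_t pack_instant_direction(std::uint16_t n_core_fields,
                                            std::uint16_t direction) {
    assert(n_core_fields < (1u << 13));
    assert(direction <= kDirectionMask);
    return static_cast<std::uint16_t>((n_core_fields << kDirectionBits) | direction);
}

inline std::uint16_t get_instant(std::uint16_t word) {
    return static_cast<std::uint16_t>(word >> kDirectionBits);
}

inline std::uint16_t get_direction(std::uint16_t word) {
    return static_cast<std::uint16_t>(word & kDirectionMask);
}
```

      Because the direction occupies only the low byte, code that touches only
      the direction can keep using single-byte access.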
      
      page_get_instant(), page_set_instant(): Accessors for the PAGE_INSTANT.
      
      page_ptr_get_direction(), page_get_direction(),
      page_ptr_set_direction(): Accessors for PAGE_DIRECTION.
      
      page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION.
      
      page_direction_increment(): Increment PAGE_N_DIRECTION
      and set PAGE_DIRECTION.
      
      rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes,
      and assume that heap_no is always set.
      Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records,
      even if the record contains fewer fields.
      
      rec_offs_make_valid(): Add the parameter 'leaf'.
      
      rec_copy_prefix_to_dtuple(): Assert that the tuple is only built
      on the core fields. Instant ADD COLUMN only applies to the
      clustered index, and we should never build a search key that has
      more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR.
      All these columns are always present.
      
      dict_index_build_data_tuple(): Remove assertions that would be
      duplicated in rec_copy_prefix_to_dtuple().
      
      rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose
      number of fields is between n_core_fields and n_fields.
      
      cmp_rec_rec_with_match(): Implement the comparison between two
      MIN_REC_FLAG records.
      
      trx_t::in_rollback: Make the field available in non-debug builds.
      
      trx_start_for_ddl_low(): Remove dangerous error-tolerance.
      A dictionary transaction must be flagged as such before it has generated
      any undo log records. This is because trx_undo_assign_undo() will mark
      the transaction as a dictionary transaction in the undo log header
      right before the very first undo log record is being written.
      
      btr_index_rec_validate(): Account for instant ADD COLUMN.
      
      row_undo_ins_remove_clust_rec(): On the rollback of an insert into
      SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the
      last column from the table and the clustered index.
      
      row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(),
      trx_undo_update_rec_get_update(): Handle the 'default row'
      as a special case.
      
      dtuple_t::trim(index): Omit a redundant suffix of an index tuple right
      before insert or update. After instant ADD COLUMN, if the last fields
      of a clustered index tuple match the 'default row', there is no
      need to store them. While trimming the entry, we must hold a page latch,
      so that the table cannot be emptied and the 'default row' be deleted.
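
      The trimming rule can be sketched as below, again with tuples modelled as
      vectors of strings. The names are hypothetical, and latching is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical simplified tuple: one string per field.
using Tuple = std::vector<std::string>;

// Return how many leading fields must be stored: trailing fields equal to
// the 'default row' are dropped, but never below n_core_fields.
inline std::size_t trimmed_field_count(const Tuple& tuple,
                                       const Tuple& default_row,
                                       std::size_t n_core_fields) {
    assert(tuple.size() == default_row.size());
    assert(n_core_fields <= tuple.size());
    std::size_t n = tuple.size();
    while (n > n_core_fields && tuple[n - 1] == default_row[n - 1]) {
        --n;
    }
    return n;
}
```

      Only a suffix is trimmed: a field that matches its default but precedes a
      non-default field must still be stored.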
      
      btr_cur_optimistic_update(), btr_cur_pessimistic_update(),
      row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low():
      Invoke dtuple_t::trim() if needed.
      
      row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling
      row_ins_clust_index_entry_low().
      
      rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number
      of fields to be between n_core_fields and n_fields. Do not support
      infimum,supremum. They are never supposed to be stored in dtuple_t,
      because page creation nowadays uses a lower-level method for initializing
      them.
      
      rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the
      number of fields.
      
      btr_cur_trim(): In an update, trim the index entry as needed. For the
      'default row', handle rollback specially. For user records, omit
      fields that match the 'default row'.
      
      btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete():
      Skip locking and adaptive hash index for the 'default row'.
      
      row_log_table_apply_convert_mrec(): Replace 'default row' values if needed.
      In the temporary file that is applied by row_log_table_apply(),
      we must identify whether the records contain the extra header for
      instantly added columns. For now, we will allocate an additional byte
      for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table
      has been subject to instant ADD COLUMN. The ROW_T_DELETE records are
      fine, as they will be converted and will only contain 'core' columns
      (PRIMARY KEY and some system columns) that are converted from dtuple_t.
      
      rec_get_converted_size_temp(), rec_init_offsets_temp(),
      rec_convert_dtuple_to_temp(): Add the parameter 'status'.
      
      REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG | REC_STATUS_COLUMNS_ADDED:
      An info_bits constant for distinguishing the 'default row' record.
      
      rec_comp_status_t: An enum of the status bit values.
      
      rec_leaf_format: An enum that replaces the bool parameter of
      rec_init_offsets_comp_ordinary().
  6. 14 Sep, 2017 1 commit
    • MDEV-12634: Uninitialised ROW_MERGE_RESERVE_SIZE bytes written to temporary file · fa2701c6
      Jan Lindström authored
      
      Fixed by no longer writing the key version to the start of every
      encrypted block. Instead, we will use a single key version from the
      log_sys crypt info.

      After this MDEV, blocks written to the row log are also encrypted, and
      blocks read from the row log are decrypted, if encryption is configured
      for the table.
      
      innodb_status_variables[], struct srv_stats_t
      	Added status variables for merge block and row log block
      	encryption and decryption amounts.
      
      Removed ROW_MERGE_RESERVE_SIZE define.
      
      row_merge_fts_doc_tokenize
      	Remove ROW_MERGE_RESERVE_SIZE
      
      row_log_t
      	Add index, crypt_tail, crypt_head to be used in case of
      	encryption.
      
      row_log_online_op, row_log_table_close_func
      	Before writing a block encrypt it if encryption is enabled
      
      row_log_table_apply_ops, row_log_apply_ops
      	After reading a block decrypt it if encryption is enabled
      
      row_log_allocate
      	Allocate temporary buffers crypt_head and crypt_tail
      	if needed.
      
      row_log_free
      	Free temporary buffers crypt_head and crypt_tail if they
      	exist.
      
      row_merge_encrypt_buf, row_merge_decrypt_buf
      	Removed.
      
      row_merge_buf_create, row_merge_buf_write
      	Remove ROW_MERGE_RESERVE_SIZE
      
      row_merge_build_indexes
      	Allocate temporary buffer used in decryption and encryption
      	if needed.
      
      log_tmp_blocks_crypt, log_tmp_block_encrypt, log_tmp_block_decrypt
      	New functions used in block encryption and decryption
      
      log_tmp_is_encrypted
      	New function to check whether encryption is enabled.
      
      Added test case innodb-rowlog to force creating a row log and
      verify that operations are done using introduced status
      variables.
  7. 01 Sep, 2017 1 commit
  8. 23 Aug, 2017 1 commit
    • MDEV-13485 MTR tests fail massively with --innodb-sync-debug · 59caf2c3
      Marko Mäkelä authored
      The parameter --innodb-sync-debug, which is disabled by default,
      aims to find potential deadlocks in InnoDB.
      
      When the parameter is enabled, lots of tests failed. Most of these
      failures were due to bogus diagnostics. But, as part of this fix,
      we are also fixing a bug in error handling code and removing dead
      code, and fixing cases where an uninitialized mutex was being
      locked and unlocked.
      
      dict_create_foreign_constraints_low(): Remove an extraneous
      mutex_exit() call that could cause corruption in an error handling
      path. Also, do not unnecessarily acquire dict_foreign_err_mutex.
      Its only purpose is to control concurrent access to
      dict_foreign_err_file.
      
      row_ins_foreign_trx_print(): Replace a redundant condition with a
      debug assertion.
      
      srv_dict_tmpfile, srv_dict_tmpfile_mutex: Remove. The
      temporary file is never being written to or read from.
      
      log_free_check(): Allow SYNC_FTS_CACHE (fts_cache_t::lock)
      to be held.
      
      ha_innobase::inplace_alter_table(), row_merge_insert_index_tuples():
      Assert that no unexpected latches are being held.
      
      sync_latch_meta_init(): Properly initialize dict_operation_lock_key
      at SYNC_DICT_OPERATION. dict_sys->mutex is SYNC_DICT, and
      the now-removed SRV_DICT_TMPFILE was wrongly registered at
      SYNC_DICT_OPERATION.
      
      buf_block_init(): Correctly register buf_block_t::debug_latch.
      It was previously misleadingly reported as LATCH_ID_DICT_FOREIGN_ERR.
      
      latch_level_t: Correct the relative latching order of
      SYNC_IBUF_PESS_INSERT_MUTEX,SYNC_INDEX_TREE and
      SYNC_FILE_FORMAT_TAG,SYNC_DICT_OPERATION to avoid bogus failures.
      
      row_drop_table_for_mysql(): Avoid accessing btr_defragment_mutex
      if the defragmentation thread has not been started. This is the
      case during fts_drop_orphaned_tables() in recv_recovery_rollback_active().
      
      fil_space_destroy_crypt_data(): Avoid acquiring fil_crypt_threads_mutex
      when it is uninitialized. We may have created crypt_data before the
      mutex was created, and the mutex creation would be skipped if
      InnoDB startup failed or --innodb-read-only was specified.
  9. 18 Aug, 2017 1 commit
    • Follow-up fix to MDEV-12988 backup fails if innodb_undo_tablespaces>0 · e9e051d2
      Marko Mäkelä authored
      The fix broke mariabackup --prepare --incremental.
      
      The restore of an incremental backup starts up (parts of) InnoDB twice.
      First, all data files are discovered for applying .delta files. Then,
      after the .delta files have been applied, InnoDB will be restarted
      more completely, so that the redo log records will be applied via the
      buffer pool.
      
      During the first startup, the buffer pool is not initialized, and thus
      trx_rseg_get_n_undo_tablespaces() must not be invoked. The apply of
      the .delta files will currently assume that the --innodb-undo-tablespaces
      option correctly specifies the number of undo tablespace files, just
      like --backup does.
      
      The second InnoDB startup of --prepare for applying the redo log will
      properly invoke trx_rseg_get_n_undo_tablespaces().
      
      enum srv_operation_mode: Add SRV_OPERATION_RESTORE_DELTA for
      distinguishing the apply of .delta files from SRV_OPERATION_RESTORE.
      
      srv_undo_tablespaces_init(): In mariabackup --prepare --incremental,
      in the initial SRV_OPERATION_RESTORE_DELTA phase, do not invoke
      trx_rseg_get_n_undo_tablespaces() because the buffer pool or the
      redo logs are not available. Instead, blindly rely on the parameter
      --innodb-undo-tablespaces.
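
      The decision described above can be sketched like this. The enum and the
      function are hypothetical simplifications; the callback stands in for a
      lookup such as trx_rseg_get_n_undo_tablespaces(), which needs the buffer
      pool.

```cpp
#include <cassert>

// Hypothetical subset of operation modes, for illustration.
enum class Operation { kNormal, kBackup, kRestoreDelta, kRestore };

// Return the number of undo tablespaces to open. While .delta files are
// being applied (and during backup) the configured value is trusted,
// because the pages cannot be read through the buffer pool yet.
template <typename Lookup>
unsigned undo_tablespace_count(Operation op, unsigned configured,
                               Lookup read_from_pages) {
    if (op == Operation::kRestoreDelta || op == Operation::kBackup) {
        return configured;
    }
    return read_from_pages();
}
```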
  10. 05 Jul, 2017 1 commit
    • MDEV-12548 Initial implementation of Mariabackup for MariaDB 10.2 · 8c71c6aa
      Marko Mäkelä authored
      InnoDB I/O and buffer pool interfaces and the redo log format
      have been changed between MariaDB 10.1 and 10.2, and the backup
      code has to be adjusted accordingly.
      
      The code has been simplified, and many memory leaks have been fixed.
      Instead of the file name xtrabackup_logfile, the file name ib_logfile0
      is being used for the copy of the redo log. Unnecessary InnoDB startup and
      shutdown and some unnecessary threads have been removed.
      
      Some help was provided by Vladislav Vaintroub.
      
      Parameters have been cleaned up and aligned with those of MariaDB 10.2.
      
      The --dbug option has been added, so that in debug builds,
      --dbug=d,ib_log can be specified to enable diagnostic messages
      for processing redo log entries.
      
      By default, innodb_doublewrite=OFF, so that --prepare works faster.
      If more crash-safety for --prepare is needed, doublewrite buffering
      can be enabled.
      
      The parameter innodb_log_checksums=OFF can be used to ignore redo log
      checksums in --backup.
      
      Some messages have been cleaned up.
      Unless --export is specified, Mariabackup will not deal with undo log.
      The InnoDB mini-transaction redo log is not only about user-level
      transactions; it is actually about mini-transactions. To avoid confusion,
      call it the redo log, not transaction log.
      
      We disable any undo log processing in --prepare.
      
      Because MariaDB 10.2 supports indexed virtual columns, the
      undo log processing would need to be able to evaluate virtual column
      expressions. To reduce the amount of code dependencies, we will not
      process any undo log in prepare.
      
      This means that the --export option must be disabled for now.
      
      This also means that the following options are redundant
      and have been removed:
      	xtrabackup --apply-log-only
      	innobackupex --redo-only
      
      In addition to disabling any undo log processing, we will disable any
      further changes to data pages during --prepare, including the change
      buffer merge. This means that restoring incremental backups should
      reliably work even when change buffering is being used on the server.
      Because of this, preparing a backup will not generate any further
      redo log, and the redo log file can be safely deleted. (If the
      --export option is enabled in the future, it must generate redo log
      when processing undo logs and buffered changes.)
      
      In --prepare, we cannot easily know if a partial backup was used,
      especially when restoring a series of incremental backups. So, we
      simply warn about any missing files, and ignore the redo log for them.
      
      FIXME: Enable the --export option.
      
      FIXME: Improve the handling of the MLOG_INDEX_LOAD record, and write
      a test that initiates a backup while an ALGORITHM=INPLACE operation
      is creating indexes or rebuilding a table. An error should be detected
      when preparing the backup.
      
      FIXME: In --incremental --prepare, xtrabackup_apply_delta() should
      ensure that if FSP_SIZE is modified, the file size will be adjusted
      accordingly.
  11. 29 Jun, 2017 1 commit
    • Reduce the granularity of innodb_log_file_size · 84e4e450
      Marko Mäkelä authored
      In Mariabackup, we would want the backed-up redo log file size to be
      a multiple of 512 bytes, or OS_FILE_LOG_BLOCK_SIZE. However, at startup,
      InnoDB would be picky, requiring the file size to be a multiple of
      innodb_page_size.
      
      Furthermore, InnoDB would require the parameter to be a multiple of
      one megabyte, while the minimum granularity is 512 bytes. Because
      the data-file-oriented fil_io() API is being used for writing the
      InnoDB redo log, writes will for now require innodb_log_file_size to
      be a multiple of the maximum innodb_page_size (65536 bytes).
      
      To complicate matters, InnoDB startup divided srv_log_file_size by
      UNIV_PAGE_SIZE, so that initially, the unit was bytes, and later it
      was innodb_page_size. We will simplify this and keep srv_log_file_size
      in bytes at all times.
      
      innobase_log_file_size: Remove. Remove some obsolete checks against
      overflow on 32-bit systems. srv_log_file_size is always 64 bits, and
      the maximum size 512GiB in multiples of innodb_page_size always fits
      in ulint (which is 32 or 64 bits). 512GiB would be 8,388,608*64KiB or
      134,217,728*4KiB.
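
      The page counts quoted above can be checked at compile time. The constant
      and function names in this sketch are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kKiB = 1024;
constexpr std::uint64_t kGiB = kKiB * kKiB * kKiB;
constexpr std::uint64_t kMaxLogFileSize = 512 * kGiB;  // maximum quoted above

// Number of pages of the given size that make up `bytes`.
constexpr std::uint64_t pages(std::uint64_t bytes, std::uint64_t page_size) {
    return bytes / page_size;
}

static_assert(pages(kMaxLogFileSize, 64 * kKiB) == 8388608, "512GiB in 64KiB pages");
static_assert(pages(kMaxLogFileSize, 4 * kKiB) == 134217728, "512GiB in 4KiB pages");
// Even with the smallest page size, the count fits in a 32-bit integer.
static_assert(pages(kMaxLogFileSize, 4 * kKiB) <= UINT32_MAX, "fits in 32 bits");
```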
      
      log_init(): Remove the parameter file_size that was always passed as
      srv_log_file_size.
      
      log_set_capacity(): Add a parameter for passing the requested file size.
      
      srv_log_file_size_requested: Declare static in srv0start.cc.
      
      create_log_file(), create_log_files(),
      innobase_start_or_create_for_mysql(): Invoke fil_node_create()
      with srv_log_file_size expressed in multiples of innodb_page_size.
      
      innobase_start_or_create_for_mysql(): Require the redo log file sizes
      to be multiples of 512 bytes.
  12. 28 Jun, 2017 1 commit
    • Avoid InnoDB messages about recovery after creating redo logs · b3171607
      Marko Mäkelä authored
      srv_log_files_created: A debug flag to ensure that InnoDB redo log
      files can only be created once in the server lifetime, and that
      after log files have been created, no crash recovery will take place.
      
      recv_scan_log_recs(): Detect the special case where the log consists
      of a sole MLOG_CHECKPOINT record, such as immediately after creating
      the redo logs.
      
      recv_recovery_from_checkpoint_start(): Skip the recovery message
      if the redo log is logically empty.
  13. 27 Jun, 2017 1 commit
    • Fix a merge error in commit 8f643e20 · 3e1d0ff5
      Marko Mäkelä authored
      A merge error caused InnoDB bootstrap to fail when
      innodb_undo_tablespaces was set to more than 2.
      This was because of a bug that was introduced to
      srv_undo_tablespaces_init() by the merge.
      
      Furthermore, some adjustments for Oracle Bug#25551311 aka
      Bug#23517560 changes were forgotten. We must minimize direct
      references to srv_undo_tablespaces_open and use predicates
      instead.
      
      srv_undo_tablespaces_init(): Increment srv_undo_tablespaces_open
      once, not twice, per loop iteration.
      
      is_system_or_undo_tablespace(): Remove (unused function).
      
      is_predefined_tablespace(): Invoke srv_is_undo_tablespace().
  14. 09 Jun, 2017 1 commit
    • MDEV-13039 innodb_fast_shutdown=0 may fail to purge all undo log · 417434f1
      Marko Mäkelä authored
      When a slow shutdown is performed soon after spawning some work for
      background threads that can create or commit transactions, it is possible
      that new transactions are started or committed after the purge has finished.
      This is violating the specification of innodb_fast_shutdown=0, namely that
      the purge must be completed. (None of the history of the recent transactions
      would be purged.)
      
      Also, it is possible that the purge threads would exit in slow shutdown
      while there exist active transactions, such as recovered incomplete
      transactions that are being rolled back. Thus, the slow shutdown could
      fail to purge some undo log that becomes purgeable after the transaction
      commit or rollback.
      
      srv_undo_sources: A flag that indicates whether undo log can be
      generated, whether by background threads or by user SQL.
      Even when this flag is clear, active transactions that already exist
      in the system may be committed or rolled back.
      
      innodb_shutdown(): Renamed from innobase_shutdown_for_mysql().
      Do not return an error code; the operation never fails.
      Clear the srv_undo_sources flag, and also ensure that the background
      DROP TABLE queue is empty.
      
      srv_purge_should_exit(): Do not allow the purge to exit if
      srv_undo_sources are active or the background DROP TABLE queue is not
      empty, or in slow shutdown, if any active transactions exist
      (and are being rolled back).
      
      srv_purge_coordinator_thread(): Remove some previous workarounds
      for this bug.
      
      innobase_start_or_create_for_mysql(): Set buf_page_cleaner_is_active
      and srv_dict_stats_thread_active directly. Set srv_undo_sources before
      starting the purge subsystem, to prevent immediate shutdown of the purge.
      Create dict_stats_thread and fts_optimize_thread immediately
      after setting srv_undo_sources, so that shutdown can use this flag to
      determine if these subsystems were started.
      
      dict_stats_shutdown(): Shut down dict_stats_thread. Backported from 10.2.
      
      srv_shutdown_table_bg_threads(): Remove (unused).
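      The exit rule described above for srv_purge_should_exit() can be
      sketched as a pure predicate. This is only an illustration: the struct
      purge_exit_state, its field names, and the shutdown_requested input are
      assumptions made for the example, not the actual InnoDB data structures.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical snapshot of the state that the predicate consults.
struct purge_exit_state {
    bool        shutdown_requested;   // assumption: some shutdown is in progress
    bool        undo_sources_active;  // stands in for srv_undo_sources
    std::size_t bg_drop_queue_len;    // background DROP TABLE queue length
    bool        slow_shutdown;        // innodb_fast_shutdown=0
    std::size_t active_trx;           // transactions still active (being rolled back)
};

// Returns true only when the purge may stop without leaving purgeable undo behind.
bool purge_should_exit(const purge_exit_state& s) {
    if (!s.shutdown_requested) {
        return false;
    }
    // New undo log could still be generated, so keep purging.
    if (s.undo_sources_active || s.bg_drop_queue_len != 0) {
        return false;
    }
    // In slow shutdown, wait for active transactions to finish first.
    if (s.slow_shutdown && s.active_trx != 0) {
        return false;
    }
    return true;
}
```

      In this sketch a fast shutdown ignores active transactions, while a
      slow shutdown keeps the purge alive until none remain.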
      417434f1
  15. 02 Jun, 2017 1 commit
    • Marko Mäkelä's avatar
      Remove deprecated InnoDB file format parameters · 0c92794d
      Marko Mäkelä authored
      The following options will be removed:
      
      innodb_file_format
      innodb_file_format_check
      innodb_file_format_max
      innodb_large_prefix
      
      They have been deprecated in MySQL 5.7.7 (and MariaDB 10.2.2) in WL#7703.
      
      The file_format column in two INFORMATION_SCHEMA tables will be removed:
      
      innodb_sys_tablespaces
      innodb_sys_tables
      
      Code to update the file format tag at the end of page 0:5
      (TRX_SYS_PAGE in the InnoDB system tablespace) will be removed.
      When initializing a new database, the bytes will remain 0.
      
      All references to the Barracuda file format will be removed.
      Some references to the Antelope file format (meaning
      ROW_FORMAT=REDUNDANT or ROW_FORMAT=COMPACT) will remain.
      
      This basically ports WL#7704 from MySQL 8.0.0 to MariaDB 10.3.1:
      
      commit 4a69dc2a95995501ed92d59a1de74414a38540c6
      Author: Marko Mäkelä <marko.makela@oracle.com>
      Date:   Wed Mar 11 22:19:49 2015 +0200
      0c92794d
  16. 15 May, 2017 1 commit
  17. 12 May, 2017 2 commits
    • Marko Mäkelä's avatar
      MDEV-12674 Post-merge fix: Include accidentally omitted changes · f9069a3d
      Marko Mäkelä authored
      In 10.2, the definition of simple_counter resides in the file
      sync0types.h, not in the file os0sync.h which has been removed.
      f9069a3d
    • Marko Mäkelä's avatar
      MDEV-12674 Innodb_row_lock_current_waits has overflow · ff166093
      Marko Mäkelä authored
      There is a race condition related to the variable
      srv_stats.n_lock_wait_current_count, which is only
      incremented and decremented by the function lock_wait_suspend_thread().
      
      The incrementing is protected by lock_sys->wait_mutex, but the
      decrementing does not appear to be protected by anything.
      This mismatch could allow the counter to be corrupted when a
      transactional InnoDB table or record lock wait is terminating
      roughly at the same time with the start of a wait on a
      (possibly different) lock.
      
      ib_counter_t: Remove some unused methods. Prevent instantiation for N=1.
      Add an inc() method that takes a slot index as a parameter.
      
      single_indexer_t: Remove.
      
      simple_counter<typename Type, bool atomic=false>: A new counter wrapper.
      Optionally use atomic memory operations for modifying the counter.
      Aligned to the cache line size.
      
      lsn_ctr_1_t, ulint_ctr_1_t, int64_ctr_1_t: Define as simple_counter<Type>.
      These counters are either only incremented (and we do not care about
      losing some increment operations), or the increment/decrement operations
      are protected by some mutex.
      
      srv_stats_t::os_log_pending_writes: Document that the number is protected
      by log_sys->mutex.
      
      srv_stats_t::n_lock_wait_current_count: Use simple_counter<ulint, true>,
      that is, atomic inc() and dec() operations.
      
      lock_wait_suspend_thread(): Release the mutexes before incrementing
      the counters. Avoid acquiring the lock mutex if the lock wait has
      already been resolved. Atomically increment and decrement
      srv_stats.n_lock_wait_current_count.
      
      row_insert_for_mysql(), row_update_for_mysql(),
      row_update_cascade_for_mysql(): Use the inc() method with the trx->id
      as the slot index. This is a non-functional change, just using
      inc() instead of add(1).
      
      buf_LRU_get_free_block(): Replace the method add(index, n) with inc().
      There is no slot index in the simple_counter.
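      A minimal sketch of a counter wrapper in the spirit of simple_counter
      follows. It is not the MariaDB implementation: the CACHE_LINE_SIZE value
      of 64 and the use of std::atomic with relaxed ordering are assumptions
      made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Assumed cache line size; the real value is platform-specific.
constexpr std::size_t CACHE_LINE_SIZE = 64;

// Counter aligned to a cache line. With atomic=true, updates are atomic
// read-modify-write operations; with atomic=false, concurrent updates
// may be lost unless the caller holds a mutex.
template <typename Type, bool atomic = false>
class alignas(CACHE_LINE_SIZE) simple_counter {
public:
    simple_counter() : m_value(0) {}

    Type add(Type n) {
        if (atomic) {
            return m_value.fetch_add(n, std::memory_order_relaxed) + n;
        }
        // Separate load and store: cheap, but not an atomic increment.
        Type v = m_value.load(std::memory_order_relaxed) + n;
        m_value.store(v, std::memory_order_relaxed);
        return v;
    }

    Type sub(Type n) {
        if (atomic) {
            return m_value.fetch_sub(n, std::memory_order_relaxed) - n;
        }
        Type v = m_value.load(std::memory_order_relaxed) - n;
        m_value.store(v, std::memory_order_relaxed);
        return v;
    }

    Type inc() { return add(1); }
    Type dec() { return sub(1); }

    operator Type() const { return m_value.load(std::memory_order_relaxed); }

private:
    std::atomic<Type> m_value;
};
```

      A counter such as n_lock_wait_current_count, which is both incremented
      and decremented without a common mutex, would use the atomic=true
      variant in this sketch.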
      ff166093
  18. 26 Apr, 2017 3 commits
    • Marko Mäkelä's avatar
      MDEV-11802 InnoDB purge does not always run when there is work to do · 13dcdb09
      Marko Mäkelä authored
      srv_sys_t::n_threads_active[]: Protect writes by both the mutex and
      by atomic memory access.
      
      srv_active_wake_master_thread_low(): Reliably wake up the master
      thread if there is work to do. The trick is to atomically read
      srv_sys->n_threads_active[].
      
      srv_wake_purge_thread_if_not_active(): Atomically read
      srv_sys->n_threads_active[] (and trx_sys->rseg_history_len),
      so that the purge should always be triggered when there is work to do.
      
      trx_commit_in_memory(): Invoke srv_wake_purge_thread_if_not_active()
      whenever a transaction is committed. Purge could have been prevented by
      the read view of the currently committing transaction, even if it is
      a read-only transaction.
      
      trx_purge_add_update_undo_to_history(): Do not wake up the purge.
      This is only called by trx_undo_update_cleanup(), as part of
      trx_write_serialisation_history(), which in turn is only called by
      trx_commit_low() which will always call trx_commit_in_memory().
      Thus, the added call in trx_commit_in_memory() will cover also
      this use case where a committing read-write transaction added
      some update_undo log to the purge queue.
      
      trx_rseg_mem_restore(): Atomically modify trx_sys->rseg_history_len.
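      The wake-up check can be sketched as follows, assuming both values are
      atomics that can be read without holding a mutex. The struct
      purge_wake_state and the wakeups counter (a stand-in for signalling the
      purge coordinator) are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical state; the fields stand in for srv_sys->n_threads_active[]
// (purge slot) and trx_sys->rseg_history_len.
struct purge_wake_state {
    std::atomic<unsigned long> n_purge_threads_active{0};
    std::atomic<unsigned long> rseg_history_len{0};
    unsigned long wakeups = 0;  // stand-in for signalling the coordinator
};

// Wake the purge only if it is idle and there is history to purge.
bool wake_purge_if_not_active(purge_wake_state& s) {
    if (s.n_purge_threads_active.load() == 0
        && s.rseg_history_len.load() != 0) {
        ++s.wakeups;
        return true;
    }
    return false;
}
```

      Calling such a check on every commit, including read-only commits, is
      what lets purge resume once a blocking read view goes away.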
      13dcdb09
    • Marko Mäkelä's avatar
      Remove redundant initialization of global InnoDB variables · 6264e892
      Marko Mäkelä authored
      Also, remove the unused global variable srv_priority_boost.
      6264e892
    • Marko Mäkelä's avatar
      Follow-up to MDEV-12289: Support innodb_undo_tablespaces=127 · 206ecb79
      Marko Mäkelä authored
      MySQL 5.7 reduced the maximum number of innodb_undo_tablespaces
      from 126 to 95 when it reserved 32 persistent rollback segments
      for the temporary undo logs. Since MDEV-12289 restored all 128
      persistent rollback segments for persistent undo logs, the
      reasonable maximum value of innodb_undo_tablespaces is 127
      (not 126 or 95). This is because out of the 128 rollback segments,
      the first one will always be created in the system tablespace
      and the remaining ones can be created in dedicated undo tablespaces.
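      The limits quoted above are consistent with simple arithmetic on the
      128 rollback segments. The helper max_undo_tablespaces() is a
      hypothetical name used only for this worked example.

```cpp
#include <cassert>

constexpr unsigned TRX_SYS_N_RSEGS = 128;

// One persistent rollback segment always lives in the system tablespace;
// segments reserved for temporary undo logs cannot be placed in
// dedicated undo tablespaces either.
constexpr unsigned max_undo_tablespaces(unsigned reserved_for_temp) {
    return TRX_SYS_N_RSEGS - reserved_for_temp - 1;
}

static_assert(max_undo_tablespaces(0) == 127, "MDEV-12289: no reservation");
static_assert(max_undo_tablespaces(32) == 95, "MySQL 5.7: 32 reserved");
```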
      206ecb79
  19. 24 Apr, 2017 1 commit
    • Aditya A's avatar
      WL9513 Bug#23333990 PERSISTENT INDEX STATISTICS UPDATE BEFORE TRANSACTION IS COMMITTED · 44b1fb36
      Aditya A authored
      PROBLEM
      
      By design, stats estimation always reads uncommitted data. In this scenario
      an uncommitted transaction has deleted all rows in the table. In InnoDB,
      records deleted by an uncommitted transaction are delete-marked but not
      actually removed from the B-tree until the transaction has committed and
      no read view needs the rows. While calculating persistent stats we were
      ignoring the delete-marked records. Since all the records were
      delete-marked, we estimated the number of rows in the table as zero,
      which leads to bad plans in other transactions operating on the table.
      
      Fix
      
      Introduced a system variable called innodb_stats_include_delete_marked
      which, when enabled, includes delete-marked records in the statistics
      calculations.
      44b1fb36
  20. 31 Mar, 2017 1 commit
    • Marko Mäkelä's avatar
      MDEV-12289 Keep 128 persistent rollback segments for compatibility and performance · 124bae08
      Marko Mäkelä authored
      InnoDB divides the allocation of undo logs into rollback segments.
      The DB_ROLL_PTR system column of clustered indexes can address up to
      128 rollback segments (TRX_SYS_N_RSEGS). Originally, InnoDB only
      created one rollback segment. In MySQL 5.5 or in the InnoDB Plugin
      for MySQL 5.1, all 128 rollback segments were created.
      
      MySQL 5.7 hard-codes the rollback segment IDs 1..32 for temporary undo logs.
      On upgrade, unless a slow shutdown (innodb_fast_shutdown=0)
      was performed on the old server instance, these rollback segments
      could be in use by transactions that are in XA PREPARE state or
      transactions that were left behind by a server kill followed by a
      normal shutdown immediately after restart.
      
      Persistent tables cannot refer to temporary undo logs or vice versa.
      Therefore, we should keep two distinct sets of rollback segments:
      one for persistent tables and another for temporary tables. In this way,
      all 128 rollback segments will be available for both types of tables,
      which could improve performance. Also, MariaDB 10.2 will remain more
      compatible than MySQL 5.7 with data files from earlier versions of
      MySQL or MariaDB.
      
      trx_sys_t::temp_rsegs[TRX_SYS_N_RSEGS]: A new array of temporary
      rollback segments. The trx_sys_t::rseg_array[TRX_SYS_N_RSEGS] will
      be solely for persistent undo logs.
      
      srv_tmp_undo_logs. Remove. Use the constant TRX_SYS_N_RSEGS.
      
      srv_available_undo_logs: Change the type to ulong.
      
      trx_rseg_get_on_id(): Remove. Instead, let the callers refer to
      trx_sys directly.
      
      trx_rseg_create(), trx_sysf_rseg_find_free(): Remove unneeded parameters.
      These functions only deal with persistent undo logs.
      
      trx_temp_rseg_create(): New function, to create all temporary rollback
      segments at server startup.
      
      trx_rseg_t::is_persistent(): Determine if the rollback segment is for
      persistent tables.
      
      trx_sys_is_noredo_rseg_slot(): Remove. The callers must know based on
      context (such as table handle) whether the DB_ROLL_PTR is referring to
      a persistent undo log.
      
      trx_sys_create_rsegs(): Remove all parameters, which were always passed
      as global variables. Instead, modify the global variables directly.
      
      enum trx_rseg_type_t: Remove.
      
      trx_t::get_temp_rseg(): A method to ensure that a temporary
      rollback segment has been assigned for the transaction.
      
      trx_t::assign_temp_rseg(): Replaces trx_assign_rseg().
      
      trx_purge_free_segment(), trx_purge_truncate_rseg_history():
      Remove the redundant variable noredo=false.
      Temporary undo logs are discarded immediately at transaction commit
      or rollback, not lazily by purge.
      
      trx_purge_mark_undo_for_truncate(): Remove references to the
      temporary rollback segments.
      
      trx_purge_mark_undo_for_truncate(): Remove a check for temporary
      rollback segments. Only the dedicated persistent undo log tablespaces
      can be truncated.
      
      trx_undo_get_undo_rec_low(), trx_undo_get_undo_rec(): Add the
      parameter is_temp.
      
      trx_rseg_mem_restore(): Split from trx_rseg_mem_create().
      Initialize the undo log and the rollback segment from the file
      data structures.
      
      trx_sysf_get_n_rseg_slots(): Renamed from
      trx_sysf_used_slots_for_redo_rseg(). Count the persistent
      rollback segment headers that have been initialized.
      
      trx_sys_close(): Also free trx_sys->temp_rsegs[].
      
      get_next_redo_rseg(): Merged to trx_assign_rseg_low().
      
      trx_assign_rseg_low(): Remove the parameters and access the
      global variables directly. Revert to simple round-robin, now that
      the whole trx_sys->rseg_array[] is for persistent undo log again.
      
      get_next_noredo_rseg(): Moved to trx_t::assign_temp_rseg().
      
      srv_undo_tablespaces_init(): Remove some parameters and use the
      global variables directly. Clarify some error messages.
      
      Adjust the test innodb.log_file. Apparently, before these changes,
      InnoDB somehow ignored missing dedicated undo tablespace files that
      are pointed by the TRX_SYS header page, possibly losing part of
      essential transaction system state.
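      The "simple round-robin" assignment over trx_sys->rseg_array[] mentioned
      above could look roughly like this. The types rseg_t and rseg_allocator
      are simplified, hypothetical stand-ins; locking and the separate
      temp_rsegs[] array are omitted.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t TRX_SYS_N_RSEGS = 128;

struct rseg_t { std::size_t id; };  // hypothetical minimal rollback segment

struct rseg_allocator {
    // nullptr means that the slot has not been created.
    std::array<rseg_t*, TRX_SYS_N_RSEGS> rseg_array{};
    std::size_t next_slot = 0;

    // Return the next existing persistent rollback segment, cycling
    // through all slots; nullptr if none exists.
    rseg_t* assign() {
        for (std::size_t i = 0; i < TRX_SYS_N_RSEGS; i++) {
            assert(next_slot < TRX_SYS_N_RSEGS);
            rseg_t* rseg = rseg_array[next_slot];
            next_slot = (next_slot + 1) % TRX_SYS_N_RSEGS;
            if (rseg != nullptr) {
                return rseg;
            }
        }
        return nullptr;
    }
};
```

      Because the whole array is again dedicated to persistent undo logs, the
      sketch needs no special handling for a reserved range of slots.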
      124bae08
  21. 14 Mar, 2017 1 commit
    • Jan Lindström's avatar
      MDEV-11738: Mariadb uses 100% of several of my 8 cpus doing nothing · 50eb40a2
      Jan Lindström authored
      MDEV-11581: Mariadb starts InnoDB encryption threads
      when key has not changed or data scrubbing turned off
      
      Background: Key rotation is based on background threads
      (innodb-encryption-threads) periodically going through
      all tablespaces in fil_system. For each tablespace, the
      key version currently in use is compared to the maximum key age
      (innodb-encryption-rotate-key-age). This process
      naturally takes CPU. At the same time, the need for
      scrubbing is checked. Currently, key rotation
      is fully supported only by the Amazon AWS key management
      plugin, but InnoDB does not know which key
      management plugin is in use.
      
      This patch re-purposes innodb-encryption-rotate-key-age=0
      to disable key rotation and background data scrubbing.
      All new tables are added to a special list for key rotation,
      and key rotation is triggered by sending an event to the
      background encryption threads instead of by periodic
      checking (i.e. timeout).
      
      fil0fil.cc: Added the function fil_space_acquire_low()
      to acquire a tablespace when it could be dropped concurrently.
      This function is used by fil_space_acquire() and by
      fil_space_acquire_silent(), which will not print
      any messages if we try to acquire a space that does not exist.
      Added fil_space_release() to release an acquired tablespace, and
      fil_space_next() to iterate over the tablespaces in fil_system
      using fil_space_acquire() and fil_space_release().
      Similarly, fil_space_keyrotation_next() iterates over the new
      list fil_system->rotation_list, to which new tables
      are added if key rotation is disabled.
      Removed the unnecessary functions fil_get_first_space_safe()
      and fil_get_next_space_safe().
      
      fil_node_open_file(): After page 0 is read, also read
      crypt_info if it has not been read yet.
      
      btr_scrub_lock_dict_func()
      buf_page_check_corrupt()
      buf_page_encrypt_before_write()
      buf_merge_or_delete_for_page()
      lock_print_info_all_transactions()
      row_fts_psort_info_init()
      row_truncate_table_for_mysql()
      row_drop_table_for_mysql()
          Use fil_space_acquire()/release() to access fil_space_t.
      
      buf_page_decrypt_after_read():
          Use fil_space_get_crypt_data() because at this point
          we might not yet have read page 0.
      
      fil0crypt.cc/fil0fil.h: Many changes. Pass fil_space_t* directly
      to functions needing it and store fil_space_t* to rotation state.
      Use fil_space_acquire()/release() when iterating tablespaces
      and removed unnecessary is_closing from fil_crypt_t. Use
      fil_space_t::is_stopping() to detect when access to
      tablespace should be stopped. Removed unnecessary
      fil_space_get_crypt_data().
      
      fil_space_create(): Inform key rotation that there could
      be something to do if key rotation is disabled and a new
      table with encryption enabled is created.
      Remove unnecessary functions fil_get_first_space_safe()
      and fil_get_next_space_safe(). fil_space_acquire()
      and fil_space_release() are used instead. Moved
      fil_space_get_crypt_data() and fil_space_set_crypt_data()
      to fil0crypt.cc.
      
      fsp_header_init(): Acquire fil_space_t*, write crypt_data
      and release space.
      
      check_table_options()
          Renamed FIL_SPACE_ENCRYPTION_* to FIL_ENCRYPTION_*.
      
      i_s.cc: Added ROTATING_OR_FLUSHING field to
      information_schema.innodb_tablespace_encryption
      to show current status of key rotation.
      50eb40a2
  22. 09 Mar, 2017 1 commit
    • Vladislav Vaintroub's avatar
      MDEV-12201 innodb_flush_method are not available on Windows · a98009ab
      Vladislav Vaintroub authored
      Remove srv_win_file_flush_method

      - Rename srv_unix_file_flush_method to srv_file_flush_method, and
        rename constants to remove UNIX from them, i.e. SRV_UNIX_FSYNC => SRV_FSYNC
      
      - Add SRV_ALL_O_DIRECT_FSYNC corresponding to current Windows default
      (no buffering for either log or data, flush on both log and data)
      
      - change os_file_open on Windows to behave identically to Unix wrt
      O_DIRECT and O_DSYNC settings. map O_DIRECT to FILE_FLAG_NO_BUFFERING and
      O_DSYNC to FILE_FLAG_WRITE_THROUGH
      
      - remove various #ifdef _WIN32
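      The flag mapping in the list above can be sketched as a small pure
      function. The struct open_hints and the function win_file_attributes()
      are hypothetical; the two constants carry the values that the Windows
      SDK defines for FILE_FLAG_WRITE_THROUGH and FILE_FLAG_NO_BUFFERING, and
      are redefined here so that the example compiles on any platform.

```cpp
#include <cassert>
#include <cstdint>

// Values as defined by the Windows SDK, repeated for portability.
constexpr std::uint32_t WIN_FILE_FLAG_WRITE_THROUGH = 0x80000000u;
constexpr std::uint32_t WIN_FILE_FLAG_NO_BUFFERING  = 0x20000000u;

// Hypothetical, platform-neutral description of how a file is opened.
struct open_hints {
    bool o_direct;  // bypass the file system cache
    bool o_dsync;   // writes go through to stable storage
};

// Translate Unix-style hints into CreateFile() attribute flags.
std::uint32_t win_file_attributes(open_hints h) {
    std::uint32_t attrs = 0;
    if (h.o_direct) {
        attrs |= WIN_FILE_FLAG_NO_BUFFERING;
    }
    if (h.o_dsync) {
        attrs |= WIN_FILE_FLAG_WRITE_THROUGH;
    }
    return attrs;
}
```

      With a single mapping like this, the same innodb_flush_method values
      can drive file opening on both platforms instead of a separate
      Windows-only setting.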
      a98009ab
  23. 01 Mar, 2017 1 commit
  24. 28 Feb, 2017 1 commit
    • Marko Mäkelä's avatar
      MDEV-12146 Deprecate and remove innodb_instrument_semaphores · 6cf29ab0
      Marko Mäkelä authored
      MDEV-7618 introduced configuration parameter innodb_instrument_semaphores
      in MariaDB Server 10.1. The parameter seems to only affect the rw-lock
      X-latch acquisition. Extra fields are added to rw_lock_t to remember one
      X-latch holder or waiter. These fields are not being consulted or reported
      anywhere. This is basically only adding code bloat.
      
      If the intention is to debug hangs or deadlocks, we have better tools for
      that in the debug server, and for the non-debug server, core dumps can
      reveal a lot. For example, the mini-transaction memo records the
      currently held buffer block or index rw-locks, to be released at
      mtr_t::commit().
      
      The configuration parameter innodb_instrument_semaphores will be
      deprecated in 10.2.5 and removed in 10.3.0.
      
      rw_lock_t: Remove the members lock_name, file_name, line, thread_id
      which did not affect any output.
      6cf29ab0
  25. 20 Feb, 2017 2 commits
    • Marko Mäkelä's avatar
      MDEV-11802 innodb.innodb_bug14676111 fails · a13a636c
      Marko Mäkelä authored
      The function trx_purge_stop() was calling os_event_reset(purge_sys->event)
      before calling rw_lock_x_lock(&purge_sys->latch). The os_event_set()
      call in srv_purge_coordinator_suspend() is protected by that X-latch.
      
      It would seem a good idea to consistently protect both os_event_set()
      and os_event_reset() calls with a common mutex or rw-lock in those
      cases where os_event_set() and os_event_reset() are used
      like condition variables, tied to changes of shared state.
      
      For each os_event_t, we try to document the mutex or rw-lock that is
      being used. For some events, frequent calls to os_event_set() seem to
      try to avoid hangs. Some events are never waited for infinitely, only
      timed waits, and os_event_set() is used for early termination of these
      waits.
      
      os_aio_simulated_put_read_threads_to_sleep(): Define as a null macro
      on other systems than Windows. TODO: remove this altogether and disable
      innodb_use_native_aio on Windows.
      
      os_aio_segment_wait_events[]: Initialize only if innodb_use_native_aio=0.
      
      log_write_flush_to_disk_low(): Invoke log_mutex_enter() at the end, to
      avoid race conditions when changing the system state. (No potential
      race condition existed before MySQL 5.7.)
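      The idea of protecting both the set and the reset of an event with the
      same lock as the shared state can be sketched with standard C++
      primitives. This is an illustration, not InnoDB code: purge_ctl and its
      members are hypothetical, with std::mutex standing in for
      purge_sys->latch and a bool plus condition variable standing in for
      the os_event_t.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

struct purge_ctl {
    std::mutex latch;             // protects running AND event_set
    std::condition_variable cv;
    bool running = true;          // shared state
    bool event_set = false;       // the "event"

    // Requester side: reset the event under the same latch that
    // protects the state change, so a concurrent set cannot be lost.
    void request_stop() {
        std::lock_guard<std::mutex> guard(latch);
        event_set = false;
        running = false;
    }

    // Coordinator side: set the event under the latch, then notify.
    void acknowledge_stop() {
        {
            std::lock_guard<std::mutex> guard(latch);
            assert(!running);
            event_set = true;
        }
        cv.notify_all();
    }

    // Timed wait; returns true if the event was set before the timeout.
    bool wait_stopped(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(latch);
        return cv.wait_for(lock, timeout, [this] { return event_set; });
    }
};
```

      If the reset happened outside the latch, it could land after the
      coordinator's set and erase it, leaving the waiter blocked.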
      a13a636c
    • Marko Mäkelä's avatar
      MDEV-11802 innodb.innodb_bug14676111 fails · 13493078
      Marko Mäkelä authored
      The function trx_purge_stop() was calling os_event_reset(purge_sys->event)
      before calling rw_lock_x_lock(&purge_sys->latch). The os_event_set()
      call in srv_purge_coordinator_suspend() is protected by that X-latch.
      
      It would seem a good idea to consistently protect both os_event_set()
      and os_event_reset() calls with a common mutex or rw-lock in those
      cases where os_event_set() and os_event_reset() are used
      like condition variables, tied to changes of shared state.
      
      For each os_event_t, we try to document the mutex or rw-lock that is
      being used. For some events, frequent calls to os_event_set() seem to
      try to avoid hangs. Some events are never waited for infinitely, only
      timed waits, and os_event_set() is used for early termination of these
      waits.
      
      os_aio_simulated_put_read_threads_to_sleep(): Define as a null macro
      on other systems than Windows. TODO: remove this altogether and disable
      innodb_use_native_aio on Windows.
      
      os_aio_segment_wait_events[]: Initialize only if innodb_use_native_aio=0.
      13493078
  26. 14 Feb, 2017 1 commit
    • Marko Mäkelä's avatar
      MDEV-12057 Embedded server shutdown hangs in InnoDB · 1b4b4f68
      Marko Mäkelä authored
      Ever since MDEV-5800 enabled indexed virtual columns for InnoDB,
      the InnoDB shutdown relied on close_connections() that would set
      thd->killed for the InnoDB purge threads. Alas, the embedded server
      shutdown is not invoking close_connections(), and thus InnoDB purge
      threads fail to initiate shutdown, causing a hang.
      
      innodb_inited: Remove. Use srv_was_started instead.
      
      innobase_fast_shutdown: Remove. Use srv_fast_shutdown instead.
      
      srv_running: Renamed from thd_destructor_myvar, and made global.
      The value NULL means that shutdown was requested or the purge threads
      should not be running because of innodb_read_only_mode=1.
      
      innobase_init(): Set srv_was_started after ensuring that srv_running
      was initialized. (In innodb_read_only mode, the purge threads are not
      started and we do not care if srv_running==NULL.)
      
      innobase_start_or_create_for_mysql(): Do not set srv_was_started.
      Let it be set by the only caller innobase_init().
      
      srv_purge_should_exit(): Check also srv_was_started and srv_running
      when evaluating thd->killed.
      1b4b4f68
  27. 05 Feb, 2017 1 commit
    • Marko Mäkelä's avatar
      Rewrite the innodb.log_file_size test with DBUG_EXECUTE_IF. · f1627045
      Marko Mäkelä authored
      Remove the debug parameter innodb_force_recovery_crash that was
      introduced into MySQL 5.6 by me in WL#6494 which allowed InnoDB
      to resize the redo log on startup.
      
      Let innodb.log_file_size actually start up the server, but ensure
      that the InnoDB storage engine refuses to start up in each of the
      scenarios.
      f1627045
  28. 04 Feb, 2017 1 commit
    • Marko Mäkelä's avatar
      MDEV-11947 InnoDB purge workers fail to shut down · 9f0dbb31
      Marko Mäkelä authored
      srv_release_threads(): Actually wait for the threads to resume
      from suspension. On CentOS 5 and possibly other platforms,
      os_event_set() may be lost.
      
      srv_resume_thread(): A counterpart of srv_suspend_thread().
      Optionally wait for the event to be set, optionally with a timeout,
      and then release the thread from suspension.
      
      srv_free_slot(): Unconditionally suspend the thread. It is always
      in resumed state when this function is entered.
      
      srv_active_wake_master_thread_low(): Only call os_event_set().
      
      srv_purge_coordinator_suspend(): Use srv_resume_thread() instead
      of the complicated logic.
      9f0dbb31
  29. 03 Feb, 2017 1 commit
    • Marko Mäkelä's avatar
      MDEV-11947 InnoDB purge workers fail to shut down · bc12d993
      Marko Mäkelä authored
      srv_release_threads(): Actually wait for the threads to resume
      from suspension. On CentOS 5 and possibly other platforms,
      os_event_set() may be lost.
      
      srv_resume_thread(): A counterpart of srv_suspend_thread().
      Optionally wait for the event to be set, optionally with a timeout,
      and then release the thread from suspension.
      
      srv_free_slot(): Unconditionally suspend the thread. It is always
      in resumed state when this function is entered.
      
      srv_active_wake_master_thread_low(): Only call os_event_set().
      
      srv_purge_coordinator_suspend(): Use srv_resume_thread() instead
      of the complicated logic.
      bc12d993
  30. 01 Feb, 2017 1 commit
    • Marko Mäkelä's avatar
      Shut down InnoDB after aborted startup. · 81b7fe9d
      Marko Mäkelä authored
      This fixes memory leaks in tests that cause InnoDB startup to fail.
      
      buf_pool_free_instance(): Also free buf_pool->flush_rbt, which would
      normally be freed when crash recovery finishes.
      
      fil_node_close_file(), fil_space_free_low(), fil_close_all_files():
      Relax some debug assertions to tolerate !srv_was_started.
      
      innodb_shutdown(): Renamed from innobase_shutdown_for_mysql().
      Changed the return type to void. Do not assume that all subsystems
      were started.
      
      que_init(), que_close(): Remove (empty functions).
      
      srv_init(), srv_general_init(): Remove as global functions.
      
      srv_free(): Allow srv_sys=NULL.
      
      srv_get_active_thread_type(): Only return SRV_PURGE if purge really
      is running.
      
      srv_shutdown_all_bg_threads(): Do not reset srv_start_state. It will
      be needed by innodb_shutdown().
      
      innobase_start_or_create_for_mysql(): Always call srv_boot() so that
      innodb_shutdown() can assume that it was called. Make more subsystems
      dependent on SRV_START_STATE_STAT.
      
      srv_shutdown_bg_undo_sources(): Require SRV_START_STATE_STAT.
      
      trx_sys_close(): Do not assume purge_sys!=NULL. Do not call
      buf_dblwr_free(), because the doublewrite buffer can exist while
      the transaction system does not.
      
      logs_empty_and_mark_files_at_shutdown(): Do a faster shutdown if
      !srv_was_started.
      
      recv_sys_close(): Invoke dblwr.pages.clear() which would normally
      be invoked by buf_dblwr_process().
      
      recv_recovery_from_checkpoint_start(): Always release log_sys->mutex.
      
      row_mysql_close(): Allow the subsystem not to exist.
      81b7fe9d
  31. 31 Jan, 2017 1 commit
    • Marko Mäkelä's avatar
      Rewrite the innodb.log_file_size test with DBUG_EXECUTE_IF. · 1293e5e5
      Marko Mäkelä authored
      Remove the debug parameter innodb_force_recovery_crash that was
      introduced into MySQL 5.6 by me in WL#6494 which allowed InnoDB
      to resize the redo log on startup.
      
      Let innodb.log_file_size actually start up the server, but ensure
      that the InnoDB storage engine refuses to start up in each of the
      scenarios.
      1293e5e5
  32. 24 Jan, 2017 1 commit
    • Jan Lindström's avatar
      MDEV-11254: innodb-use-trim has no effect in 10.2 · 6495806e
      Jan Lindström authored
      Problem was that implementation merged from 10.1 was incompatible
      with InnoDB 5.7.
      
      buf0buf.cc: Add functions that return whether we should punch a
      hole and how big it should be.

      buf0flu.cc: Add the written page to IORequest.

      fil0fil.cc: Remove an unneeded status call and add a test of
      whether sparse files and hole punching are supported by the file
      system when a tablespace is created. Add a call to get the file
      system block size. The file node used is added to IORequest. Added
      functions to check whether hole punching is supported and to set
      punch hole.
      
      ha_innodb.cc: Remove unneeded status variables (trim512-32768)
      and trim_op_saved. Deprecate innodb_use_trim and
      set it ON by default. Add function to set innodb-use-trim
      dynamically.
      
      dberr.h: Add error code DB_IO_NO_PUNCH_HOLE
      if punch hole operation fails.
      
      fil0fil.h: Add punch_hole variable to fil_space_t and
      block size to fil_node_t.
      
      os0api.h: Header for helper functions in buf0buf.cc and
      fil0fil.cc that are used by os0file.h.

      os0file.h: Remove unneeded m_block_size from IORequest,
      and add bpage to IORequest to know the actual size of
      the block, and m_fil_node to know the tablespace file
      system block size and whether it supports punch hole.

      os0file.cc: Add the function punch_hole() to IORequest
      to perform the punch hole operation,
      get the file system block size, and determine
      whether the file system supports sparse files (for punch hole).

      page0size.h: Remove the implicit copy disable and
      use the implicit copy to implement the copy_from()
      function.
      
      buf0dblwr.cc, buf0flu.cc, buf0rea.cc, fil0fil.cc, fil0fil.h,
      os0file.h, os0file.cc, log0log.cc, log0recv.cc:
      Remove unneeded write_size parameter from fil_io
      calls.
      
      srv0mon.h, srv0srv.h, srv0mon.cc: Remove unneeded
      trim512-trim32768 status variables. Removed
      these from monitor tests.
      6495806e
  33. 18 Jan, 2017 1 commit
    • Marko Mäkelä's avatar
      Remove MYSQL_COMPRESSION. · 1eabad5d
      Marko Mäkelä authored
      The MariaDB 10.1 page_compression is incompatible with the Oracle
      implementation that was introduced in MySQL 5.7 later.
      
      Remove the Oracle implementation. Also remove the remaining traces of
      MYSQL_ENCRYPTION.
      
      This will also remove traces of PUNCH_HOLE until it is implemented
      better. The only effective call to os_file_punch_hole() was in
      fil_node_create_low() to test if the operation is supported for the file.
      
      In other words, it looks like page_compression is not working in
      MariaDB 10.2, because no code equivalent to the 10.1 os_file_trim()
      is enabled.
      1eabad5d
  34. 07 Jan, 2017 3 commits