1. 01 Mar, 2020 1 commit
    • MDEV-21534 - Improve innodb redo log group commit performance · 30ea63b7
      Vladislav Vaintroub authored
      Introduce a special synchronization primitive, group_commit_lock,
      for more efficient synchronization of redo log writing and flushing.
      
      The goal is to reduce CPU consumption in log_write_up_to(), reduce
      spurious wakeups, and improve throughput in write-intensive
      benchmarks.
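
      As a rough illustration of the group commit idea, the sketch below
      lets one thread become the leader that writes or flushes on behalf
      of everyone, while requests that are already covered return
      immediately. The class name simple_group_commit_lock and its members
      are assumptions made for this example; this is not the code added by
      the commit. In particular, it wakes all waiters with notify_all(),
      so it does not show how spurious wakeups would be avoided.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical, simplified illustration of a group commit lock.
// It is NOT the group_commit_lock introduced by the commit.
class simple_group_commit_lock {
public:
  enum result { ACQUIRED, EXPIRED };

  // Request that everything up to `lsn` be processed.
  // EXPIRED:  an earlier leader already covered `lsn`; nothing to do.
  // ACQUIRED: the caller is now the leader and must call release().
  result acquire(std::uint64_t lsn) {
    std::unique_lock<std::mutex> lk(mutex_);
    for (;;) {
      if (lsn <= done_) return EXPIRED;
      if (!busy_) {
        busy_ = true;
        return ACQUIRED;
      }
      cond_.wait(lk);  // a leader is active; wait for it to finish
    }
  }

  // Called by the leader: publish how far the work got and wake waiters.
  void release(std::uint64_t lsn) {
    {
      std::lock_guard<std::mutex> lk(mutex_);
      assert(busy_);
      if (lsn > done_) done_ = lsn;
      busy_ = false;
    }
    cond_.notify_all();  // naive: wakes every waiter, covered or not
  }

  // Highest value published so far.
  std::uint64_t value() const {
    std::lock_guard<std::mutex> lk(mutex_);
    return done_;
  }

private:
  mutable std::mutex mutex_;
  std::condition_variable cond_;
  std::uint64_t done_ = 0;
  bool busy_ = false;
};
```

      A caller would check the result of acquire(): on ACQUIRED it
      performs the write or flush and then calls release() with the
      position it reached; on EXPIRED it can return at once.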
      30ea63b7
  2. 28 Feb, 2020 3 commits
  3. 27 Feb, 2020 9 commits
    • Fix GCC -Wsign-compare · 8db62303
      Marko Mäkelä authored
      8db62303
    • Fix GCC -Wparentheses · a263ca26
      Marko Mäkelä authored
      a263ca26
    • MDEV-21724: Optimize page_cur_insert_low() redo logging · 138cbec5
      Marko Mäkelä authored
      Inserting a record into an index page involves updating multiple
      fields in the page header as well as updating the next-record links
      and potentially updating fields related to the sparse page directory.
      
      Let us cover the insert operations by higher-level log records, to avoid
      'redundant' logging about the writes.
      
      The code for applying the high-level log records will check the
      consistency of the page thoroughly, to avoid crashes during recovery.
      We will refuse to replay the inserts if any inconsistency is detected.
      With innodb_force_recovery=1, recovery will continue, but the affected
      pages may be more inconsistent if some changes were omitted.
      
      mrec_ext_t: Introduce the EXTENDED record subtypes
      INSERT_HEAP_REDUNDANT, INSERT_REUSE_REDUNDANT,
      INSERT_HEAP_DYNAMIC, INSERT_REUSE_DYNAMIC.
      The record will explicitly identify the page type and whether
      the space will be allocated from PAGE_HEAP_TOP or reused from
      the PAGE_FREE list. It will also tell how many bytes to copy
      from the preceding record header and payload, and how to
      initialize the rest of the record header and payload.
      
      mtr_t::page_insert(): Write the high-level log records.
      
      log_phys_t::apply(): Parse the high-level log records.
      
      page_apply_insert_redundant(), page_apply_insert_dynamic():
      Apply the high-level log records.
      
      page_dir_split_slot(): Introduce a variant that neither writes log
      nor deals with ROW_FORMAT=COMPRESSED pages.
      
      page_mem_alloc_heap(): Remove the mtr_t parameter.
      
      page_cur_insert_rec_low(): Write log only via mtr_t::page_insert().
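
      The message above says that an insert record tells how many bytes
      to copy from the preceding record and how to initialize the rest.
      The sketch below illustrates that general idea on plain byte
      strings. The names delta_record, encode_delta() and apply_delta()
      are hypothetical, and the real log format (which treats the record
      header and payload separately) is not reproduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical illustration: describe a new record relative to the
// preceding one as "copy n_copy bytes, then append rest".
struct delta_record {
  std::size_t n_copy;  // bytes shared with the preceding record
  std::string rest;    // bytes that must be written explicitly
};

// Compute the shared prefix and keep only the differing tail.
delta_record encode_delta(const std::string& prev, const std::string& rec) {
  std::size_t n = 0;
  const std::size_t limit = std::min(prev.size(), rec.size());
  while (n < limit && prev[n] == rec[n]) ++n;
  return {n, rec.substr(n)};
}

// Rebuild the record from the preceding record and the delta.
std::string apply_delta(const std::string& prev, const delta_record& d) {
  assert(d.n_copy <= prev.size());
  return prev.substr(0, d.n_copy) + d.rest;
}
```

      Only the differing tail has to be logged, which is where the
      reduction in log volume would come from when adjacent records
      share a common prefix.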
      138cbec5
    • MDEV-12353 Cleanup: Remove page_rec_get_base_extra_size() · dee6fb35
      Marko Mäkelä authored
      The function page_rec_get_base_extra_size() became dead code in
      commit 08ba3887.
      dee6fb35
    • MDEV-12353: Improve page_cur_delete_rec() recovery · e15ae1cf
      Marko Mäkelä authored
      This is a follow-up to commit 572d2075
      where we introduced the EXTENDED log record subtypes
      DELETE_ROW_FORMAT_REDUNDANT and DELETE_ROW_FORMAT_DYNAMIC.
      
      log_phys_t::apply(): If corruption was noticed, stop applying the log
      unless innodb_force_recovery is set.
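
      The recovery policy described above (stop on corruption unless
      recovery is forced) can be sketched as follows. The type
      record_applier and the function apply_all() are hypothetical
      stand-ins for this example and do not mirror the real
      log_phys_t::apply() interface.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for applying one parsed log record.
// Returns false if the page was found to be inconsistent.
using record_applier = std::function<bool()>;

// Apply records in order. On the first inconsistency, stop unless
// force_recovery is set; with force_recovery, skip the failed record
// and keep going. Returns the number of records applied successfully.
std::size_t apply_all(const std::vector<record_applier>& records,
                      bool force_recovery, bool& corrupted) {
  std::size_t applied = 0;
  corrupted = false;
  for (const auto& apply : records) {
    if (apply()) {
      ++applied;
      continue;
    }
    corrupted = true;
    if (!force_recovery) break;  // refuse to replay anything further
  }
  return applied;
}
```

      With force_recovery set, later records are still applied, which
      matches the warning elsewhere in this log that affected pages may
      end up more inconsistent if some changes were omitted.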
      e15ae1cf
    • MDEV-12353: Make UNDO_APPEND more robust · 4431144a
      Marko Mäkelä authored
      This is a follow-up to commit 84e3f9ce
      that introduced the EXTENDED log record of UNDO_APPEND subtype.
      
      mtr_t::undo_append(): Accurately enforce the mtr_buf_t::MAX_DATA_SIZE
      limit. Also, replace mtr_buf_t::push() with simpler code, to append 1 byte
      to the log.
      
      log_phys_t::undo_append(): Return whether the page was found to
      be in an inconsistent state.
      
      log_phys_t::apply(): If corruption was noticed, stop applying log
      unless innodb_force_recovery is set.
      4431144a
    • cleanup trailing ws · a346ff35
      Sergey Vojtovich authored
      a346ff35
    • MDEV-10569: Add RELEASE_ALL_LOCKS function. Implementing the SQL · 127fee99
      Daniel-Solo authored
      function to release all named locks
      127fee99
    • Revert "MDEV-17554 Auto-create new partition for system versioned tables with... · 98adcffe
      Sergei Golubchik authored
      Revert "MDEV-17554 Auto-create new partition for system versioned tables with history partitioned by INTERVAL/LIMIT"
      
      This reverts commit 9894751a.
      This reverts commit f707c83f.
      98adcffe
  4. 25 Feb, 2020 2 commits
    • Compilation fix · 9894751a
      Aleksey Midenkov authored
      9894751a
    • MDEV-17554 Auto-create new partition for system versioned tables with history... · f707c83f
      Aleksey Midenkov authored
      MDEV-17554 Auto-create new partition for system versioned tables with history partitioned by INTERVAL/LIMIT
      
      When there are E empty partitions left, auto-create N new empty
      partitions for SYSTEM_TIME partitioning that is rotated by
      INTERVAL/LIMIT and marked with the AUTO_INCREMENT keyword.
      Syntax change: the AUTO_INCREMENT keyword (or the shorter AUTO)
      may follow the LIMIT/INTERVAL clause.
      
      CREATE OR REPLACE TABLE t (x INT) WITH SYSTEM VERSIONING
      PARTITION BY SYSTEM_TIME LIMIT 100000 AUTO_INCREMENT;
      
      CREATE OR REPLACE TABLE t (x INT) WITH SYSTEM VERSIONING
      PARTITION BY SYSTEM_TIME INTERVAL 1 WEEK AUTO_INCREMENT;
      
      The current revision hard-codes the value 1 for both E and N, as
      well as the auto-creation thresholds MinInterval = 1 hour and
      MinLimit = 1000.
      
      The name of a newly added partition is first chosen as "pX", where
      X is the partition number and "p" is a hard-coded name prefix. If
      this name is already taken, X is incremented until the resulting
      name is free.
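
      The naming rule above can be sketched in a few lines. The function
      pick_partition_name() and its use of std::set for the existing
      names are assumptions made for this illustration, not the server's
      actual code.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper: start from "p<number>" and increment the number
// until the name does not collide with an existing partition name.
std::string pick_partition_name(const std::set<std::string>& existing,
                                unsigned number) {
  for (;;) {
    std::string name = "p" + std::to_string(number);
    if (existing.count(name) == 0) return name;
    ++number;
  }
}
```

      For example, with existing partitions p0, p1 and p2, a request that
      starts at partition number 1 would yield p3.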
      
      ALTER TABLE ADD PARTITION is now always fast. If a history
      partition overflows, a manual ALTER TABLE REBUILD PARTITION is
      needed.
      f707c83f
  5. 24 Feb, 2020 5 commits
  6. 22 Feb, 2020 1 commit
    • MDEV-12353: Reduce log volume of page_cur_delete_rec() · 572d2075
      Marko Mäkelä authored
      mrec_ext_t: Introduce DELETE_ROW_FORMAT_REDUNDANT,
      DELETE_ROW_FORMAT_DYNAMIC.
      
      mtr_t::page_delete(): Write DELETE_ROW_FORMAT_REDUNDANT or
      DELETE_ROW_FORMAT_DYNAMIC log records. We log the byte offset
      of the preceding record, so that on recovery we can easily
      find everything to update. For DELETE_ROW_FORMAT_DYNAMIC,
      we must also write the header and data size of the record.
      
      We will retain the physical logging for ROW_FORMAT=COMPRESSED pages.
      
      page_zip_dir_balance_slot(): Renamed from page_dir_balance_slot(),
      and specialized for ROW_FORMAT=COMPRESSED only.
      
      page_rec_set_n_owned(), page_dir_slot_set_n_owned(),
      page_dir_balance_slot(): New variants that do not write any log.
      
      page_mem_free(): Take data_size, extra_size as parameters.
      Always zerofill the record payload.
      
      page_cur_delete_rec(): For other than ROW_FORMAT=COMPRESSED,
      only write log by mtr_t::page_delete().
      572d2075
  7. 21 Feb, 2020 10 commits
  8. 20 Feb, 2020 2 commits
    • Cleanup: Remove dict_ind_redundant · 96901d95
      Marko Mäkelä authored
      There is no reason for the dummy index object dict_ind_redundant
      to exist any more. It was only being passed to btr_create().
      
      btr_create(): If !index, assume that a ROW_FORMAT=REDUNDANT
      table is being created.
      
      We could pass ibuf.index, dict_sys.sys_tables->indexes.start
      and so on, if those objects had been initialized before the
      function btr_create() is called.
      96901d95
    • MDEV-21774 Innodb, Windows : restore file sharing logic in Innodb · 6618fc29
      Eugene Kosov authored
      recv_sys_t opened redo log files along with log_sys_t. That is why I
      removed the file sharing logic from InnoDB in 9ef2d29f.
      However, it was actually used to ensure that only one MariaDB
      instance will touch the same InnoDB files.
      
      os0file.cc: revert some changes done previously
      
      mapped_file_t::map(): now has arguments read_only, nvme
      
      file_io::open(): now has argument read_only
      
      class file_os_io: make final
      
      log_file_t::open(): now has argument read_only
      6618fc29
  9. 19 Feb, 2020 7 commits
    • MDEV-12353: Reduce log volume by an UNDO_APPEND record · 84e3f9ce
      Marko Mäkelä authored
      We introduce an EXTENDED log record for appending an undo log record
      to an undo log page. This is equivalent to the MLOG_UNDO_INSERT record
      that was removed in commit f802c989,
      only using more compact encoding.
      
      mtr_t::log_write(): Fix a bug that affects longer log
      record writes in the !same_page && !have_offset case.
      Similar code is already implemented for the have_offset code path.
      The bug was unobservable before we started to write longer
      EXTENDED records. All !have_offset records (FREE_PAGE, INIT_PAGE,
      EXTENDED) that were written so far are short, and we never write
      RESERVED or OPTION records.
      
      mtr_t::undo_append(): Write an UNDO_APPEND record.
      
      log_phys_t::undo_append(): Apply an UNDO_APPEND record.
      
      trx_undo_page_set_next_prev_and_add(),
      trx_undo_page_report_modify(),
      trx_undo_page_report_rename():
      Invoke mtr_t::undo_append() instead of emitting WRITE records.
      84e3f9ce
    • MDEV-12353: Reduce log volume by an UNDO_INIT record · 86f262f1
      Marko Mäkelä authored
      We introduce an EXTENDED log record for initializing an undo log page.
      The size of the record will be 2 bytes plus the optional page identifier.
      The entire undo page will be initialized, except the space that is
      already reserved for TRX_UNDO_SEG_HDR in trx_undo_seg_create().
      
      mtr_t::undo_create(): Write the UNDO_INIT record.
      
      trx_undo_page_init(): Initialize the undo page corresponding to the
      UNDO_INIT record. Unlike the former MLOG_UNDO_INIT record, we will
      initialize almost the entire page, including initializing the
      TRX_UNDO_PAGE_NODE to an empty list node, so that the subsequent call
      to flst_init() will avoid writing log for the undo page.
      86f262f1
    • revert accidental libmariadb change · 3ee100b0
      Eugene Kosov authored
      3ee100b0
    • fix libpmem InnoDB linking · 29bb3744
      Eugene Kosov authored
      29bb3744
    • remove unused function · e62e285f
      Eugene Kosov authored
      e62e285f
    • MDEV-14425 deprecate and ignore innodb_log_files_in_group · 9ef2d29f
      Eugene Kosov authored
      Now there can be only one log file instead of several that
      logically work as a single file.
      
      Possible names of redo log files: ib_logfile0,
      ib_logfile101 (for a newly created one)
      
      innodb_log_files_in_group: the value of this variable is not used
      by InnoDB. Possible values are still 1..100, so as not to break upgrades
      
      LOG_FILE_NAME: add constant of value "ib_logfile0"
      LOG_FILE_NAME_PREFIX: add constant of value "ib_logfile"
      
      get_log_file_path(): convenience function that returns full
      path of a redo log file
      
      SRV_N_LOG_FILES_MAX: removed
      
      srv_n_log_files: we can't remove this for compatibility reasons,
      but the server no longer uses this variable
      
      log_sys_t::file::fd: now just one, not std::vector
      
      log_sys_t::log_capacity: removed word 'group'
      
      find_and_check_log_file(): part of the logic from the huge
      srv_start() was moved here
      
      recv_sys_t::files: file descriptors of redo log files.
      There can be several of those in case we're upgrading
      from an older MariaDB version.
      
      recv_sys_t::remove_extra_log_files: whether to remove
      ib_logfile{1,2,3...} after a successful upgrade.
      
      recv_sys_t::read(): open if needed and read from one
      of several log files
      
      recv_sys_t::files_size(): open if needed and return files count
      
      redo_file_sizes_are_correct(): check that the redo log file
      sizes are equal, just to log an error for the user.
      The corresponding check was moved from srv0start.cc
      
      namespace deprecated: put all deprecated variables here to
      prevent us developers from using them
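
      To make the file naming and the size check from this message more
      concrete, here is a minimal sketch. The two constants carry the
      values stated above; everything else (log_file_path_sketch(),
      legacy_log_file_name(), sizes_are_equal(), and passing the
      directory as a parameter) is an assumption for illustration and
      does not reproduce the real get_log_file_path() or
      redo_file_sizes_are_correct() signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Values as stated in the commit message.
static const char LOG_FILE_NAME_PREFIX[] = "ib_logfile";
static const char LOG_FILE_NAME[] = "ib_logfile0";

// Hypothetical helper: join a directory and a redo log file name.
// The directory is an explicit parameter here for simplicity.
std::string log_file_path_sketch(const std::string& dir,
                                 const std::string& filename = LOG_FILE_NAME) {
  std::string path = dir;
  if (!path.empty() && path.back() != '/') path += '/';
  path += filename;
  return path;
}

// Hypothetical helper: name of the n-th file of an older multi-file log.
std::string legacy_log_file_name(unsigned n) {
  return LOG_FILE_NAME_PREFIX + std::to_string(n);
}

// Hypothetical check: true if all given file sizes are equal.
// An empty list or a single file trivially passes.
bool sizes_are_equal(const std::vector<std::uint64_t>& sizes) {
  for (std::size_t i = 1; i < sizes.size(); ++i)
    if (sizes[i] != sizes[0]) return false;
  return true;
}
```

      During an upgrade from a multi-file layout, a caller could build
      the paths of ib_logfile1, ib_logfile2 and so on with
      legacy_log_file_name(), collect their sizes, and report an error
      if sizes_are_equal() returns false.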
      9ef2d29f
    • Update wsrep-lib submodule. · 8d7a8e45
      Jan Lindström authored
      8d7a8e45