1. 17 Apr, 2019 2 commits
    • MDEV-12699 Improve crash recovery of corrupted data pages · 169c0099
      Marko Mäkelä authored
      InnoDB crash recovery used to read every data page for which
      redo log exists. This is unnecessary for those pages that are
      initialized by the redo log. If a newly created page is corrupted,
      recovery could unnecessarily fail. It would suffice to reinitialize
      the page based on the redo log records.
      
      To add insult to injury, InnoDB crash recovery could hang if it
      encountered a corrupted page. We will also fix that problem.
      InnoDB would normally refuse to start up if it encounters a
      corrupted page on recovery, but that can be overridden by
      setting innodb_force_recovery=1.
      
      Data pages are completely initialized by the records
      MLOG_INIT_FILE_PAGE2 and MLOG_ZIP_PAGE_COMPRESS.
      MariaDB 10.4 additionally recognizes MLOG_INIT_FREE_PAGE,
      which indicates that a page has been freed and that its contents
      can be discarded (filled with zeroes).
      
      The record MLOG_INDEX_LOAD indicates that redo logging has
      been re-enabled after being disabled. We can avoid loading
      the page if all buffered redo log records predate the
      MLOG_INDEX_LOAD record.
      
      For the internal tables of FULLTEXT INDEX, no MLOG_INDEX_LOAD
      records were written before commit aa3f7a10.
      Hence, we will skip these optimizations for tables whose
      name starts with FTS_.
      
      This is joint work with Thirunarayanan Balathandayuthapani.
      
      fil_space_t::enable_lsn, file_name_t::enable_lsn: The LSN of the
      latest recovered MLOG_INDEX_LOAD record for a tablespace.
      
      mlog_init: Page initialization operations discovered during
      redo log scanning. FIXME: This really belongs in recv_sys->addr_hash,
      and should be removed in MDEV-19176.
      
      recv_addr_state: Add the new state RECV_WILL_NOT_READ to
      indicate that according to mlog_init, the page will be
      initialized based on redo log record contents.
      
      recv_add_to_hash_table(): Set the RECV_WILL_NOT_READ state
      if appropriate. For now, we do not treat MLOG_ZIP_PAGE_COMPRESS
      as page initialization. This works around bugs in the crash
      recovery of ROW_FORMAT=COMPRESSED tables.
      
      recv_mark_log_index_load(): Process a MLOG_INDEX_LOAD record
      by resetting the state to RECV_NOT_PROCESSED and by updating
      the file_name_t::enable_lsn.
      
      recv_init_crash_recovery_spaces(): Copy file_name_t::enable_lsn
      to fil_space_t::enable_lsn.
      
      recv_recover_page(): Add the parameter init_lsn, to ignore
      any log records that precede the page initialization.
      Add DBUG output about skipped operations.
      
      buf_page_create(): Initialize FIL_PAGE_LSN, so that
      recv_recover_page() will not wrongly skip applying
      the page-initialization record due to the field containing
      some newer LSN as a leftover from a different page.
      Do not invoke ibuf_merge_or_delete_for_page() during
      crash recovery.
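
      The stale-LSN hazard described for buf_page_create() can be
      illustrated with a minimal sketch. The types and functions below
      (page_frame, apply_init_record, create_page) are hypothetical
      stand-ins, not InnoDB code, and the skip rule is an assumed
      simplification of how recovery compares a record LSN with the
      LSN stored in the page.

```cpp
#include <cassert>
#include <cstdint>

using lsn_t = std::uint64_t;

// Hypothetical in-memory page frame; only the stored LSN matters here.
struct page_frame {
  lsn_t page_lsn;     // stands in for the FIL_PAGE_LSN field
  bool  initialized;  // set once an initialization record was applied
};

// Assumed rule: a record is skipped if the page already looks newer.
bool apply_init_record(page_frame& page, lsn_t rec_lsn) {
  if (page.page_lsn > rec_lsn) return false;  // skipped
  page.initialized = true;
  page.page_lsn = rec_lsn;
  return true;
}

// Reusing a buffer frame for a newly created page: clear the LSN
// left over from whatever page occupied the frame before.
void create_page(page_frame& page) {
  page.page_lsn = 0;
  page.initialized = false;
}
```

      Without the reset in create_page(), a frame that previously held
      a page with a newer LSN would cause the initialization record to
      be skipped.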
      
      recv_apply_hashed_log_recs(): Remove some unnecessary lookups.
      Note if a corrupted page was found during recovery.
      After invoking buf_page_create(), do invoke
      ibuf_merge_or_delete_for_page() via mlog_init.ibuf_merge()
      in the last recovery batch.
      
      ibuf_merge_or_delete_for_page(): Relax a debug assertion.
      
      innobase_start_or_create_for_mysql(): Abort startup if
      a corrupted page was found during recovery. Corrupted pages
      will not be flagged if innodb_force_recovery is set.
      However, the recv_sys->found_corrupt_fs flag can be set
      regardless of innodb_force_recovery if file names are found
      to be incorrect (for example, multiple files with the same
      tablespace ID).
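
      The idea behind mlog_init and the init_lsn parameter can be
      sketched as follows. This is a simplified illustration under
      stated assumptions, not the actual implementation: init_tracker,
      log_rec, page_id and records_to_apply are hypothetical names, and
      the sketch ignores MLOG_INDEX_LOAD, the FTS_ exception and
      ROW_FORMAT=COMPRESSED handling.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical, simplified types for illustration; not the InnoDB definitions.
using lsn_t = std::uint64_t;
using page_id = std::pair<std::uint32_t, std::uint32_t>;  // (space id, page number)

struct log_rec {
  lsn_t lsn;        // LSN at which the record was written
  bool  init_page;  // true if the record completely initializes the page
};

// Tracks the LSN of the latest page-initializing record seen for each page.
class init_tracker {
  std::map<page_id, lsn_t> inits_;

public:
  void note_init(const page_id& id, lsn_t lsn) {
    lsn_t& latest = inits_[id];  // value-initialized to 0 on first access
    if (lsn > latest) latest = lsn;
  }

  // Returns 0 when no initialization record was seen for the page.
  lsn_t init_lsn(const page_id& id) const {
    auto it = inits_.find(id);
    return it == inits_.end() ? 0 : it->second;
  }

  // A page with a known initialization record need not be read from disk.
  bool will_not_read(const page_id& id) const { return init_lsn(id) != 0; }
};

// Keeps only the records written at or after the page initialization.
std::vector<log_rec> records_to_apply(const std::vector<log_rec>& recs,
                                      lsn_t init_lsn) {
  std::vector<log_rec> out;
  for (const log_rec& r : recs)
    if (r.lsn >= init_lsn) out.push_back(r);
  return out;
}
```

      In this sketch, a corrupted on-disk copy of such a page is never
      consulted, because every applied record is at or after the
      initialization.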
    • MDEV-19241 InnoDB fails to write MLOG_INDEX_LOAD upon completing ALTER TABLE · 376bf4ed
      Marko Mäkelä authored
      Similar to what was done in commit aa3f7a10
      for FULLTEXT INDEX, we must ensure that MLOG_INDEX_LOAD records will always
      be written if redo logging was disabled.
      
      row_merge_build_indexes(): Invoke row_merge_write_redo() also when
      online operation is not being executed or an error occurs.
      In case of an error, invoke flush_observer->interrupted() so that
      the pages will not be flushed but merely evicted from the buffer pool.
      Before resuming redo logging, it is crucial for the correctness of
      mariabackup and InnoDB crash recovery to flush or evict all affected pages
      and to write MLOG_INDEX_LOAD records.
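
      The control flow described above can be sketched as follows. The
      names bulk_load_log and finish_bulk_load are hypothetical, the
      steps are recorded as strings rather than executed, and the
      relative order of the first two steps is an assumption. The point
      is only that the MLOG_INDEX_LOAD step runs on every path before
      redo logging resumes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in that records the steps taken, for illustration only.
struct bulk_load_log {
  std::vector<std::string> events;
};

// Finishes a bulk index build that ran with redo logging disabled.
// 'ok' tells whether the build succeeded.
void finish_bulk_load(bulk_load_log& log, bool ok) {
  if (ok) {
    log.events.push_back("flush pages");
  } else {
    // On error, the pages are discarded rather than written out.
    log.events.push_back("evict pages");
  }
  // Performed on the success path and on the error path alike.
  log.events.push_back("write MLOG_INDEX_LOAD");
  log.events.push_back("resume redo logging");
}
```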
  2. 15 Apr, 2019 1 commit
  3. 10 Apr, 2019 2 commits
  4. 09 Apr, 2019 1 commit
  5. 08 Apr, 2019 6 commits
  6. 07 Apr, 2019 5 commits
  7. 06 Apr, 2019 9 commits
    • MDEV-12699 preparation: Clean up recv_sys · 1d30b7b1
      Marko Mäkelä authored
      The recv_sys data structures are accessed not only from the thread
      that executes InnoDB plugin initialization, but also from the
      InnoDB I/O threads, which can invoke recv_recover_page().
      
      Assert that sufficient concurrency control is in place.
      Some code was accessing recv_sys data structures without
      holding recv_sys->mutex.
      
      recv_recover_page(bpage): Refactor the call from buf_page_io_complete()
      into a separate function that performs necessary steps. The
      main thread was unnecessarily releasing and reacquiring recv_sys->mutex.
      
      recv_recover_page(block,mtr,recv_addr): Pass more parameters from
      the caller. Avoid redundant lookups and computations. Eliminate some
      redundant variables.
      
      recv_get_fil_addr_struct(): Assert that recv_sys->mutex is being held.
      That was not always the case!
      
      recv_scan_log_recs(): Acquire recv_sys->mutex for the whole duration
      of the function. (While we are scanning and buffering redo log records,
      no pages can be read in.)
      
      recv_read_in_area(): Properly protect access with recv_sys->mutex.
      
      recv_apply_hashed_log_recs(): Check recv_addr->state only once,
      and continuously hold recv_sys->mutex. The mutex will be released
      and reacquired inside recv_recover_page() and recv_read_in_area(),
      allowing concurrent processing by buf_page_io_complete() in I/O threads.
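
      Asserting that a mutex is held, as done for
      recv_get_fil_addr_struct(), can be sketched in standard C++ as
      below. std::mutex has no ownership query, so the sketch wraps it
      in a hypothetical owned_mutex that remembers the owning thread.
      InnoDB uses its own mutex types, so none of these names come from
      the source.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical wrapper that remembers which thread holds the mutex.
class owned_mutex {
  std::mutex m_;
  std::atomic<std::thread::id> owner_{std::thread::id()};

public:
  void lock() {
    m_.lock();
    owner_.store(std::this_thread::get_id());
  }
  void unlock() {
    owner_.store(std::thread::id());
    m_.unlock();
  }
  // True if the calling thread currently holds the mutex.
  bool is_owned() const {
    return owner_.load() == std::this_thread::get_id();
  }
};

// Hypothetical shared recovery state protected by the mutex.
struct recovery_state {
  owned_mutex mutex;
  int pending_pages = 0;  // protected by mutex
};

// The caller must hold state.mutex; the assertion documents and checks that.
int& pending_ref(recovery_state& state) {
  assert(state.mutex.is_owned());
  return state.pending_pages;
}
```

      A debug assertion of this kind turns a silent data race into an
      immediate failure in debug builds.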
    • MDEV-12699 preparation: Write MLOG_INDEX_LOAD for FTS_ tables · aa3f7a10
      Marko Mäkelä authored
      The record MLOG_INDEX_LOAD is supposed to be written to indicate that
      some page modifications bypassed redo logging, and that redo logging
      is now re-enabled. It was not written for fulltext indexes during
      ALTER TABLE.
      
      row_merge_write_redo(): Declare globally. Assert that the index
      is neither a spatial nor fulltext index.
      
      recv_mlog_index_load(): Observe a MLOG_INDEX_LOAD operation.
      
      recv_parse_log_recs(): Handle MLOG_INDEX_LOAD also in multi-record
      mini-transactions. Because older versions lack this handling, we
      should keep writing MLOG_INDEX_LOAD in single-record
      mini-transactions; otherwise, older versions of Mariabackup would fail.
      
      row_fts_merge_insert(): Write MLOG_INDEX_LOAD for the auxiliary
      tables of fulltext indexes.
    • MDEV-12699 preparation: Initialize the entire page on MLOG_ZIP_PAGE_COMPRESS · 45d338dc
      Marko Mäkelä authored
      The record MLOG_ZIP_PAGE_COMPRESS is similar to MLOG_INIT_FILE_PAGE2
      in that it contains all the information needed to initialize the page.
      As with the other record, initialize the entire page on recovery.
    • buf_page_get_gen(): Allow BUF_GET_IF_IN_POOL with a dummy page_size · 1b95118c
      Marko Mäkelä authored
      The page_size argument to buf_page_get_gen() only matters when the
      page is going to be loaded into the buffer pool. Allow callers to
      pass a dummy parameter when using BUF_GET_IF_IN_POOL (which would
      return NULL if the block is not in the buffer pool).
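
      The reasoning can be illustrated with a toy cache. The types
      pool, block and get_mode are hypothetical and much simpler than
      the buffer pool interface; the sketch only shows why the size
      argument is irrelevant for a lookup that never loads a page.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical lookup modes, loosely modelled on the text above.
enum class get_mode { GET, GET_IF_IN_POOL };

struct block {
  std::size_t size;  // page size recorded when the block was loaded
};

class pool {
  std::map<unsigned, block> cached_;

public:
  // page_size is consulted only when the page has to be loaded.
  block* get(unsigned page_no, std::size_t page_size, get_mode mode) {
    auto it = cached_.find(page_no);
    if (it != cached_.end()) return &it->second;
    if (mode == get_mode::GET_IF_IN_POOL) return nullptr;  // never loads
    return &cached_.emplace(page_no, block{page_size}).first->second;
  }
};
```

      A caller that only probes the cache can therefore pass any
      placeholder size, such as 0 in this sketch.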
    • Fix a crash in CHECK TABLE for corrupted encrypted root page · 80f29211
      Marko Mäkelä authored
      btr_root_get(): Ignore the root->page.encrypted flag.
      The purpose of this flag is questionable since
      commit 8c43f963.
      
      btr_validate_index(): Avoid crash if btr_root_get() returns NULL.
    • MDEV-15528 preparation: Do not modify a freed page · 1d0380e0
      Marko Mäkelä authored
      btr_free_root(): Add the parameter bool invalidate.
      
      btr_free_root_invalidate(): Remove.
    • Clean up the parsing of MLOG_INIT_FILE_PAGE2 · 56df18be
      Marko Mäkelä authored
      fsp_apply_init_file_page(): Renamed from fsp_init_file_page_low().
      
      fsp_parse_init_file_page(): Remove. The redo log record has no
      parameters.
    • recv_recovery_is_on(): Add UNIV_UNLIKELY · 71f9552f
      Marko Mäkelä authored
      Normally, InnoDB is not in the process of executing crash recovery.
      Provide a hint to the compiler that the recovery-related code paths
      are rarely executed.
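
      A branch hint of this kind is commonly built on the GCC/Clang
      builtin __builtin_expect. The sketch below uses a hypothetical
      macro SKETCH_UNLIKELY and a stand-in flag; it is not the
      definition of UNIV_UNLIKELY or recv_recovery_is_on(), and the
      doubled return value is purely illustrative.

```cpp
#include <cassert>

// Hypothetical macro; falls back to the plain condition on other compilers.
#if defined(__GNUC__) || defined(__clang__)
# define SKETCH_UNLIKELY(cond) __builtin_expect(!!(cond), 0)
#else
# define SKETCH_UNLIKELY(cond) (cond)
#endif

static bool recovery_on = false;  // stand-in for the real recovery state

// The hint tells the compiler that this normally evaluates to false.
inline bool recovery_is_on() { return SKETCH_UNLIKELY(recovery_on); }

int pages_to_check(int n) {
  if (recovery_is_on()) {
    return n * 2;  // rarely executed branch
  }
  return n;  // common path
}
```

      The hint affects only code layout and branch prediction
      heuristics; the result of the condition is unchanged.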
    • Re-record plugins.feedback_plugin_load · c56ae2df
      Marko Mäkelä authored
  8. 05 Apr, 2019 1 commit
  9. 04 Apr, 2019 3 commits
  10. 03 Apr, 2019 10 commits