1. 28 Apr, 2017 1 commit
    • MDEV-12602 InnoDB: Failing assertion: space->n_pending_ops == 0 · b82c602d
      Marko Mäkelä authored
      This fixes a regression caused by MDEV-12428.
      When we introduced a variant of fil_space_acquire() that could
      increment space->n_pending_ops after space->stop_new_ops was set,
      the logic of fil_check_pending_operations() was broken.
      
      fil_space_t::n_pending_ios: A new field to track read or write
      access from the buffer pool routines immediately before a block
      write or after a block read in the file system.
      
      fil_space_acquire_for_io(), fil_space_release_for_io(): Similar
      to fil_space_acquire_silent() and fil_space_release(), but
      modify fil_space_t::n_pending_ios instead of fil_space_t::n_pending_ops.
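
      The split between the two counters can be pictured with a small toy
      model. This is only a sketch: ToySpace and the toy_* functions are
      hypothetical stand-ins, not InnoDB code, and the assumption that an
      ordinary acquire is refused once stop_new_ops is set is a reading of
      the description above. The real code also protects these fields
      against concurrent access, which the toy omits.

```cpp
#include <cassert>

// Toy stand-in for fil_space_t, limited to the fields named above.
struct ToySpace {
    bool stop_new_ops = false;   // set once the tablespace is going away
    unsigned n_pending_ops = 0;  // ordinary pending operations
    unsigned n_pending_ios = 0;  // page reads/writes issued by the buffer pool
};

// Assumption: an ordinary acquire is refused after stop_new_ops is set.
inline bool toy_acquire(ToySpace& s) {
    if (s.stop_new_ops) return false;
    ++s.n_pending_ops;
    return true;
}

inline void toy_release(ToySpace& s) {
    assert(s.n_pending_ops > 0);
    --s.n_pending_ops;
}

// I/O references are counted separately, so they never disturb
// n_pending_ops, even when stop_new_ops is already set.
inline void toy_acquire_for_io(ToySpace& s) { ++s.n_pending_ios; }

inline void toy_release_for_io(ToySpace& s) {
    assert(s.n_pending_ios > 0);
    --s.n_pending_ios;
}
```

      With this separation, a check that waits for n_pending_ops to reach 0
      after setting stop_new_ops can no longer be invalidated by late page
      I/O, because that I/O only touches n_pending_ios.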
      
      Adjust a number of places accordingly, and remove some redundant
      tablespace lookups.
      
      The following parts of this fix differ from the 10.2 version of this fix:
      
      buf_page_get_corrupt(): Add a tablespace parameter.
      
      In 10.2, we already had a two-phase process of freeing fil_space objects
      (first, fil_space_detach(), then release fil_system->mutex, and finally
      free the fil_space and fil_node objects).
      
      fil_space_free_and_mutex_exit(): Renamed from fil_space_free().
      Detach the tablespace from the fil_system cache, release the
      fil_system->mutex, and then wait for space->n_pending_ios to reach 0,
      to avoid accessing freed data in a concurrent thread.
      During the wait, future calls to fil_space_acquire_for_io() will
      not find this tablespace, and the count can only be decremented to 0,
      at which point it is safe to free the objects.
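
      The detach, unlock, wait, free sequence described above can be
      sketched as follows. ToyRegistry and DetachableSpace are hypothetical
      names used only for illustration; the real code works on fil_system
      and fil_space_t and differs in many details.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <unordered_map>

struct DetachableSpace {
    std::atomic<unsigned> n_pending_ios{0};
};

class ToyRegistry {
public:
    void add(unsigned id) {
        std::lock_guard<std::mutex> guard(mutex_);
        spaces_[id] = std::make_unique<DetachableSpace>();
    }

    // Lookup and increment happen under the same mutex, so a space that
    // has already been detached can never gain a new I/O reference.
    DetachableSpace* acquire_for_io(unsigned id) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = spaces_.find(id);
        if (it == spaces_.end()) return nullptr;
        ++it->second->n_pending_ios;
        return it->second.get();
    }

    // The caller must not touch *space after this call returns.
    static void release_for_io(DetachableSpace* space) {
        assert(space->n_pending_ios > 0);
        --space->n_pending_ios;
    }

    // Detach under the mutex, release the mutex, wait for pending I/O to
    // drain, and only then free the object. Returns false if id is unknown.
    bool free_space(unsigned id) {
        std::unique_ptr<DetachableSpace> detached;
        {
            std::lock_guard<std::mutex> guard(mutex_);
            auto it = spaces_.find(id);
            if (it == spaces_.end()) return false;
            detached = std::move(it->second);
            spaces_.erase(it);
        }
        // From here on the count can only go down.
        while (detached->n_pending_ios.load() != 0) {
            std::this_thread::yield();
        }
        return true;  // detached is destroyed when it goes out of scope
    }

private:
    std::mutex mutex_;
    std::unordered_map<unsigned, std::unique_ptr<DetachableSpace>> spaces_;
};
```

      The busy-wait with yield() is a simplification chosen for brevity;
      it says nothing about how the actual wait is implemented.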
      
      fil_node_free_part1(), fil_node_free_part2(): Refactored from
      fil_node_free().
  2. 27 Apr, 2017 27 commits
  3. 26 Apr, 2017 1 commit
    • MDEV-12253: Buffer pool blocks are accessed after they have been freed · 765a4360
      Jan Lindström authored
      The problem was that bpage was referenced after it had already been
      freed from the LRU list. Fixed by adding a new variable, encrypted,
      that is passed down to buf_page_check_corrupt() and used in
      buf_page_get_gen() to stop processing the page read.
      
      This patch should also address the following test failures and
      bugs:
      
      MDEV-12419: IMPORT should not look up tablespace in
      PageConverter::validate(). This is now removed.
      
      MDEV-10099: encryption.innodb_onlinealter_encryption fails
      sporadically in buildbot
      
      MDEV-11420: encryption.innodb_encryption-page-compression
      failed in buildbot
      
      MDEV-11222: encryption.encrypt_and_grep failed in buildbot on P8
      
      Removed dict_table_t::is_encrypted and dict_table_t::ibd_file_missing
      and replaced them with dict_table_t::file_unreadable. The table's
      ibd file is missing if fil_get_space(space_id) returns NULL,
      and encrypted otherwise. Removed the dict_table_t::is_corrupted field.
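
      A minimal sketch of how a single file_unreadable flag can still
      distinguish the two former cases. ToyTable, why_unreadable() and the
      set of known space ids (standing in for fil_get_space()) are
      hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <unordered_set>

enum class Unreadable { READABLE, IBD_MISSING, ENCRYPTED };

// Toy stand-in for dict_table_t, limited to what the rule needs.
struct ToyTable {
    unsigned space_id;
    bool file_unreadable;
};

// known_spaces plays the role of the tablespace cache: an id that is
// absent corresponds to fil_get_space(space_id) returning NULL.
inline Unreadable why_unreadable(const ToyTable& table,
                                 const std::unordered_set<unsigned>& known_spaces) {
    if (!table.file_unreadable) return Unreadable::READABLE;
    return known_spaces.count(table.space_id) != 0 ? Unreadable::ENCRYPTED
                                                   : Unreadable::IBD_MISSING;
}

// Usage check covering the three outcomes.
inline void why_unreadable_demo() {
    const std::unordered_set<unsigned> known = {7};
    assert(why_unreadable({7, false}, known) == Unreadable::READABLE);
    assert(why_unreadable({7, true}, known) == Unreadable::ENCRYPTED);
    assert(why_unreadable({8, true}, known) == Unreadable::IBD_MISSING);
}
```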
      
      Ported the FilSpace class from 10.2 and used it in buf_page_check_corrupt(),
      buf_page_decrypt_after_read(), buf_page_encrypt_before_write(),
      buf_dblwr_process(), buf_read_page(), dict_stats_save_defrag_stats().
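
      The commit message does not show the FilSpace interface. Assuming it
      is a scope guard that acquires a tablespace reference on construction
      and releases it on destruction, the idea can be sketched like this.
      GuardedSpace and ToyFilSpace are hypothetical names, and the toy
      ignores locking.

```cpp
#include <cassert>

struct GuardedSpace {
    unsigned n_pending_ops = 0;
    bool stop_new_ops = false;
};

// Scope guard: holds a reference for exactly as long as it is alive.
class ToyFilSpace {
public:
    explicit ToyFilSpace(GuardedSpace* space) : space_(nullptr) {
        // Assumption: acquisition fails once stop_new_ops is set.
        if (space != nullptr && !space->stop_new_ops) {
            ++space->n_pending_ops;
            space_ = space;
        }
    }

    ~ToyFilSpace() {
        if (space_ != nullptr) {
            assert(space_->n_pending_ops > 0);
            --space_->n_pending_ops;
        }
    }

    // Copying would release the same reference twice.
    ToyFilSpace(const ToyFilSpace&) = delete;
    ToyFilSpace& operator=(const ToyFilSpace&) = delete;

    // nullptr means the reference could not be acquired.
    GuardedSpace* get() const { return space_; }

private:
    GuardedSpace* space_;
};
```

      A guard of this kind makes early returns on error paths safe, since
      the reference is dropped whenever the guard goes out of scope.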
      
      Added test cases for when an encrypted page could be read while doing
      redo log crash recovery. Also added a test case for row-compressed
      blobs.
      
      btr_cur_open_at_index_side_func(),
      btr_cur_open_at_rnd_pos_func(): Avoid referencing block that is
      NULL.
      
      buf_page_get_zip(): Issue error if page read fails.
      
      buf_page_get_gen(): Use dberr_t for error detection and
      do not reference bpage after we have freed it.
      
      buf_mark_space_corrupt(): remove bpage from LRU also when
      it is encrypted.
      
      buf_page_check_corrupt(): @return DB_SUCCESS if the page has
      been read and is not corrupted,
      DB_PAGE_CORRUPTED if the page is corrupted based on the checksum check,
      DB_DECRYPTION_FAILED if the post-encryption checksum matches but
      the normal page checksum does not match after decryption. In the read
      case only DB_SUCCESS is possible.
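
      The three return values can be read as a small decision function.
      The sketch below is a toy: ToyErr reuses the result names from the
      text but is not InnoDB's dberr_t, ToyPageChecks is a hypothetical
      input struct, and mapping a failed post-encryption checksum to
      DB_PAGE_CORRUPTED is an assumption.

```cpp
#include <cassert>

// Result names follow the commit message; the enum itself is a toy.
enum class ToyErr { DB_SUCCESS, DB_PAGE_CORRUPTED, DB_DECRYPTION_FAILED };

// Hypothetical inputs: outcomes of checks the real function performs itself.
struct ToyPageChecks {
    bool encrypted;                    // page was stored encrypted
    bool post_encryption_checksum_ok;  // only meaningful if encrypted
    bool page_checksum_ok;             // normal checksum, after any decryption
};

inline ToyErr toy_check_corrupt(const ToyPageChecks& c) {
    // Assumption: a bad post-encryption checksum counts as corruption.
    if (c.encrypted && !c.post_encryption_checksum_ok) {
        return ToyErr::DB_PAGE_CORRUPTED;
    }
    if (!c.page_checksum_ok) {
        // The encrypted image was intact, so a bad checksum after
        // decryption points at decryption rather than at the stored page.
        return c.encrypted ? ToyErr::DB_DECRYPTION_FAILED
                           : ToyErr::DB_PAGE_CORRUPTED;
    }
    return ToyErr::DB_SUCCESS;
}

// Usage check for the case that distinguishes the two error codes.
inline void toy_check_corrupt_demo() {
    assert(toy_check_corrupt({true, true, false}) == ToyErr::DB_DECRYPTION_FAILED);
    assert(toy_check_corrupt({false, false, false}) == ToyErr::DB_PAGE_CORRUPTED);
}
```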
      
      buf_page_io_complete(): use dberr_t for error handling.
      
      buf_flush_write_block_low(),
      buf_read_ahead_random(),
      buf_read_page_async(),
      buf_read_ahead_linear(),
      buf_read_ibuf_merge_pages(),
      buf_read_recv_pages(),
      fil_aio_wait():
              Issue error if page read fails.
      
      btr_pcur_move_to_next_page(): Do not reference page if it is
      NULL.
      
      Introduced dict_table_t::is_readable() and dict_index_t::is_readable()
      that return true if the tablespace exists and the pages read from
      it are neither corrupted nor failed decryption.
      Removed buf_page_t::key_version. After page decryption the
      key version is not removed from the page frame. For unencrypted
      pages, the old key_version is removed in buf_page_encrypt_before_write().
      
      dict_stats_update_transient_for_index(),
      dict_stats_update_transient()
              Do not continue if table decryption failed or table
              is corrupted.
      
      dict0stats.cc: Introduced a dict_stats_report_error function
      to avoid code duplication.
      
      fil_parse_write_crypt_data():
              Check that key read from redo log entry is found from
              encryption plugin and if it is not, refuse to start.
      
      PageConverter::validate(): Removed access to fil_space_t as
      tablespace is not available during import.
      
      Fixed error code on innodb.innodb test.
      
      Merged test cases innodb-bad-key-change5 and innodb-bad-key-shutdown
      into innodb-bad-key-change2.  Removed the innodb-bad-key-change5 test.
      Reduced unnecessary complexity in some long-running tests.
      
      Removed fil_inc_pending_ops(), fil_decr_pending_ops(),
      fil_get_first_space(), fil_get_next_space(),
      fil_get_first_space_safe(), fil_get_next_space_safe()
      functions.
      
      fil_space_verify_crypt_checksum(): Fixed bug found using ASAN
      where FIL_PAGE_END_LSN_OLD_CHECKSUM field was incorrectly
      accessed from row compressed tables. Fixed out of page frame
      bug for row compressed tables in
      fil_space_verify_crypt_checksum() found using ASAN. Incorrect
      function was called for compressed table.
      
      Added new tests for discard, rename table and drop (we should allow them
      even when page decryption fails). Alter table rename is not allowed.
      Added a test for restart with innodb-force-recovery=1 when a page read
      during redo recovery cannot be decrypted. Added a test for a corrupted
      table where both the page data and FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION
      are corrupted.
      
      Adjusted the test case innodb_bug14147491 so that it no longer
      expects a crash. Instead, the table is just mostly not usable.
      
      fil0fil.h: fil_space_acquire_low is not visible function
      and fil_space_acquire and fil_space_acquire_silent are
      inline functions. FilSpace class uses fil_space_acquire_low
      directly.
      
      recv_apply_hashed_log_recs() does not return anything.
  4. 25 Apr, 2017 1 commit
  5. 24 Apr, 2017 2 commits
  6. 21 Apr, 2017 8 commits