1. 20 Mar, 2019 5 commits
    • Merge 10.3 into 10.4 · 96f8793a
      Marko Mäkelä authored
      96f8793a
    • Merge 10.2 into 10.3 · f4116613
      Marko Mäkelä authored
      f4116613
    • MDEV-18981 Possible corruption when using FOREIGN KEY with virtual columns · 630199e7
      Marko Mäkelä authored
      row_ins_foreign_fill_virtual(): Construct update->old_vrow
      with ROW_COPY_DATA instead of ROW_COPY_POINTERS. With the latter,
      the object would be pointing to a buffer pool page frame. That page
      frame can become stale and invalid as soon as
      row_ins_foreign_check_on_constraint() invokes mtr_t::commit().
      
      Most of the time, the pointer target is not going to be overwritten
      by anything, and everything appears to work correctly.
      Buffer pool page replacement is highly unlikely, and any pessimistic
      operation that would overwrite the old location of the record is only
      slightly more likely. It is not known whether there is an actual bug.
      This came up while diagnosing MDEV-18879 in MariaDB 10.3.
      630199e7
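The difference between the two copy modes can be illustrated outside InnoDB with a minimal C++ sketch. Everything here (`PageFrame`, `FieldRef`, `copy_pointers`, `copy_data`) is a hypothetical stand-in rather than InnoDB's API, and the "page frame" is just a byte buffer that gets reused.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a buffer pool page frame: a reusable byte buffer.
struct PageFrame {
    char bytes[16];
};

// A field of a row, represented as pointer + length.
struct FieldRef {
    const char* data;
    std::size_t len;
};

// Analogue of ROW_COPY_POINTERS: the field keeps pointing into the frame.
inline FieldRef copy_pointers(const PageFrame& frame, std::size_t len) {
    assert(len <= sizeof frame.bytes);
    return FieldRef{frame.bytes, len};
}

// Analogue of ROW_COPY_DATA: the bytes are copied into caller-owned storage.
inline FieldRef copy_data(const PageFrame& frame, std::size_t len,
                          std::vector<char>& heap) {
    assert(len <= sizeof frame.bytes);
    heap.assign(frame.bytes, frame.bytes + len);
    return FieldRef{heap.data(), len};
}

inline std::string to_string(const FieldRef& f) {
    return std::string(f.data, f.len);
}
```

If the frame is overwritten after both copies are taken, the pointer-based field silently reflects the new bytes, while the data copy still holds the old value. In this toy model the frame stays allocated, so the stale read is merely wrong rather than undefined.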
    • MDEV-18879/MDEV-18972 Corrupted record inserted by FOREIGN KEY operation · b47cec6c
      Marko Mäkelä authored
      row_ins_foreign_check_on_constraint(): When constructing
      cascade->historical_row for tables WITH SYSTEM VERSIONING,
      use the appropriate mode ROW_COPY_DATA, because the pointers
      will be stale after mtr_commit() is invoked.
      b47cec6c
    • Merge 10.3 into 10.4 · 514b305d
      Marko Mäkelä authored
      The MDEV-17262 commit 26432e49
      was skipped. In Galera 4, the implementation would seem to require
      changes to the streaming replication.
      
      In the tests archive.rnd_pos and main.profiling, disable_ps_protocol
      for SHOW STATUS and SHOW PROFILE commands until MDEV-18974
      has been fixed.
      514b305d
  2. 19 Mar, 2019 7 commits
    • Merge 10.2 into 10.3 · 117291db
      Marko Mäkelä authored
      117291db
    • trx_purge_rseg_get_next_history_log(): Remove a parameter · 26e5bff0
      Marko Mäkelä authored
      Access purge_sys.rseg directly, instead of obscuring it with a parameter.
      26e5bff0
    • MDEV-18084: Crash on UPDATE after upgrade from 10.0 or 10.1 · a77e2668
      Marko Mäkelä authored
      MariaDB Server 10.0 and 10.1 support non-indexed virtual columns,
      which are hidden from the storage engine. Starting with MDEV-5800
      in MariaDB 10.2.2, the virtual columns are visible to storage engines.
      
      calc_row_difference(): Follow up the MDEV-17199 fix, which forgot
      to increment num_v when skipping virtual columns in tables that
      were created before MariaDB 10.2.2. This caused a corruption of
      the update vector when an updated persistent column is preceded
      by virtual columns.
      a77e2668
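A simplified model of the counting problem, assuming that the storage engine numbers only stored columns, so that field `i` in the SQL layer maps to engine column `i - num_v`. The `Field` struct and `changed_stored_columns()` are hypothetical and are not the actual calc_row_difference() code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified view of a table field.
struct Field {
    bool is_virtual;  // virtual (not stored) column
    bool changed;     // value differs between old and new row
};

// Returns the engine column numbers of the changed stored columns.
// Assumption for this sketch: engine column number = i - num_v, where
// num_v counts the virtual columns that precede field i.
inline std::vector<std::size_t> changed_stored_columns(
    const std::vector<Field>& fields, bool count_skipped_virtual) {
    std::vector<std::size_t> result;
    std::size_t num_v = 0;
    for (std::size_t i = 0; i < fields.size(); i++) {
        if (fields[i].is_virtual) {
            if (count_skipped_virtual) {
                num_v++;  // the increment that must not be forgotten
            }
            continue;     // virtual columns are skipped either way
        }
        if (fields[i].changed) {
            result.push_back(i - num_v);
        }
    }
    return result;
}
```

With one virtual column followed by one updated stored column, the correct variant reports engine column 0, while the variant that skips the increment reports column 1, which is the kind of wrong entry that would corrupt an update vector.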
    • Replace innobase_is_v_fld() with Field::stored_in_db() · 1efda582
      Marko Mäkelä authored
      The macro innobase_is_v_fld() turns out to be equivalent to
      the negation of Field::stored_in_db(). Remove the macro and
      invoke the member function directly.
      
      innodb_base_col_setup_for_stored(): Simplify a condition to only
      check Field::vcol_info.
      
      innobase_create_index_def(): Replace some redundant code with
      DBUG_ASSERT().
      1efda582
    • MDEV-18960: Assertion !omits_virtual_cols(*form->s) on TRUNCATE · 9471dbaf
      Marko Mäkelä authored
      MariaDB before MDEV-5800 in version 10.2.2 did not support
      indexed virtual columns. Non-persistent virtual columns were
      hidden from storage engines. Only starting with MDEV-5800, InnoDB
      would create internal metadata on virtual columns.
      
      On TRUNCATE TABLE, an old .frm file from before MDEV-5800 may be
      used as the table schema. When the table is being re-created by
      InnoDB, the old schema must be used. That is, we may hide
      the existence of virtual columns from InnoDB.
      
      create_table_check_doc_id_col(): Remove the assertion that failed.
      This function can actually correctly deal with virtual columns
      that could have been created before MariaDB 10.2.2 introduced MDEV-5800.
      
      create_table_info_t::create_table_def(): Do not create metadata for
      virtual columns if the table definition was created before MariaDB 10.2.2.
      9471dbaf
    • MDEV-18966 Transaction recovery may be broken after upgrade to 10.3 · cdb2208c
      Marko Mäkelä authored
      This bug was introduced by MDEV-12288, which made InnoDB use
      a single undo log for persistent transactions, instead of
      maintaining separate insert_undo and update_undo logs.
      
      trx_undo_reuse_cached(): Initialize the TRX_UNDO_PAGE_TYPE
      after reusing a cached undo log page for undo log.
      Failure to do so can cause trx_undo_mem_create_at_db_start()
      to misclassify new undo log records as TRX_UNDO_INSERT.
      This in turn would trigger an assertion failure in
      trx_roll_pop_top_rec_of_trx() due to undo==insert.
      cdb2208c
    • trx_purge_add_undo_to_history(): Non-functional cleanup · 6893e994
      Marko Mäkelä authored
      Simplify the debug code, and use mach_read_from_4() instead of
      the wrapper function mtr_read_ulint().
      6893e994
  3. 18 Mar, 2019 9 commits
    • MDEV-18726: innodb buffer pool size not consistent with large pages · de51acd0
      Daniel Black authored
      Rather than adding a small extra amount to the size of each chunk, keep
      chunks at the specified size. The rest of the chunk initialization code
      adapts to this small size reduction. The change applies to the general
      case, not just to large pages, to keep it simple.
      
      The chunk size is controlled by innodb-buffer-pool-chunk-size. In the
      code, increasing this by the length of a descriptor table makes things
      difficult with large pages. With innodb-buffer-pool-chunk-size set to 2M,
      the code before this commit would have added a small extra amount to this
      value when it tried to allocate the chunk. While this is not normally a
      problem, with large pages it requires additional space: a whole extra
      large page. With a number of pools, or with 1G or 16G large pages, this
      is quite significant.
      
      By removing this additional amount, DBAs can set
      innodb-buffer-pool-chunk-size to the large page size, or a multiple of
      it, and actually get that amount allocated. Previously they had to fudge
      a slightly smaller value.
      
      The innodb.test results show how this is fudged over a number of tests. With
      this change the values are just between 488 and 500 depending on architecture
      and build options.
      
      Tested with --large-pages --innodb-buffer-pool-size=256M
      --innodb-buffer-pool-chunk-size=2M on x86_64 with 2M default large page
      size. Breaking before buf_pool init, one large page was allocated in
      MyISAM; by the end of the function, 128 huge pages were allocated as
      expected. A further 16 pages were allocated for a 32M log buffer, and
      during startup 1 page was allocated briefly to the redo log.
      de51acd0
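The page counts in the test above follow from simple rounding arithmetic, sketched here in C++. The function names and the 1024-byte per-chunk overhead used in the usage note are hypothetical; the commit message does not state the actual size of the extra amount.

```cpp
#include <cstddef>

constexpr std::size_t MiB = 1024 * 1024;

// Number of large pages needed to back one allocation of `bytes` bytes
// (rounded up to whole large pages).
constexpr std::size_t large_pages_needed(std::size_t bytes,
                                         std::size_t large_page_size) {
    return (bytes + large_page_size - 1) / large_page_size;
}

// Large pages needed for a whole buffer pool that is allocated chunk by
// chunk, where each chunk allocation is padded by `per_chunk_overhead`
// bytes (a hypothetical figure standing in for the descriptor table).
constexpr std::size_t pool_large_pages(std::size_t pool_size,
                                       std::size_t chunk_size,
                                       std::size_t per_chunk_overhead,
                                       std::size_t large_page_size) {
    return (pool_size / chunk_size) *
           large_pages_needed(chunk_size + per_chunk_overhead, large_page_size);
}
```

With a 256M pool, 2M chunks and 2M large pages, `pool_large_pages(256 * MiB, 2 * MiB, 0, 2 * MiB)` gives 128 pages, matching the test result, and a 32M log buffer needs 16 pages. Any nonzero overhead, such as an assumed 1024 bytes, pushes every chunk onto a second large page and doubles the total to 256.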
    • MDEV-18644: Support full_crc32 for page_compressed · 6b6fa3cd
      Marko Mäkelä authored
      This is a follow-up task to MDEV-12026, which introduced
      innodb_checksum_algorithm=full_crc32 and a simpler page format.
      MDEV-12026 did not enable full_crc32 for page_compressed tables,
      which we will be doing now.
      
      This is joint work with Thirunarayanan Balathandayuthapani.
      
      For innodb_checksum_algorithm=full_crc32 we change the
      page_compressed format as follows:
      
      FIL_PAGE_TYPE: The most significant bit will be set to indicate
      page_compressed format. The least significant bits will contain
      the compressed page size, rounded up to a multiple of 256 bytes.
      
      The checksum will be stored in the last 4 bytes of the page
      (whether it is the full page or a page_compressed page whose
      size is determined by FIL_PAGE_TYPE), covering all preceding
      bytes of the page. If encryption is used, then the page will
      be encrypted between compression and computing the checksum.
      For page_compressed, FIL_PAGE_LSN will not be repeated at
      the end of the page.
      
      FSP_SPACE_FLAGS (already implemented as part of MDEV-12026):
      We will store the innodb_compression_algorithm that may be used
      to compress pages. Previously, the choice of algorithm was written
      to each compressed data page separately, and one would be unable
      to know in advance which compression algorithm(s) are used.
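The FIL_PAGE_TYPE encoding described above can be sketched as follows. This is an illustration under stated assumptions, not the server code: it assumes that FIL_PAGE_TYPE is a 16-bit field, that the marker is bit 15, and that the remaining bits hold the size in units of 256 bytes. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed for illustration: bit 15 of the 16-bit page type marks
// page_compressed, and the low 15 bits hold the size / 256.
constexpr std::uint16_t kCompressedMarker = 0x8000;
constexpr std::uint32_t kSizeUnit = 256;

// Encode a compressed payload size into a page type value, rounding the
// size up to the next multiple of 256 bytes.
inline std::uint16_t encode_page_type(std::uint32_t compressed_size) {
    const std::uint32_t units = (compressed_size + kSizeUnit - 1) / kSizeUnit;
    assert(units > 0 && units <= 0x7FFFu);
    return static_cast<std::uint16_t>(kCompressedMarker | units);
}

// True if the marker bit is set.
inline bool is_page_compressed(std::uint16_t page_type) {
    return (page_type & kCompressedMarker) != 0;
}

// Decode the stored (rounded-up) size in bytes.
inline std::uint32_t stored_size(std::uint16_t page_type) {
    return static_cast<std::uint32_t>(page_type & 0x7FFFu) * kSizeUnit;
}
```

For example, a 1000-byte compressed payload is rounded up to 1024 bytes, so under these assumptions the type field would hold 0x8004, and a reader can recover the physical size from the page header alone.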
      
      fil_space_t::full_crc32_page_compressed_len(): Determine if the
      page_compressed algorithm of the tablespace needs to know the
      exact length of the compressed data. If yes, we will reserve and
      write an extra byte for this right before the checksum.
      
      buf_page_is_compressed(): Determine if a page uses page_compressed
      (in any innodb_checksum_algorithm).
      
      fil_page_decompress(): Pass also fil_space_t::flags so that the
      format can be determined.
      
      buf_page_is_zeroes(): Check if a page is full of zero bytes.
      
      buf_page_full_crc32_is_corrupted(): Renamed from
      buf_encrypted_full_crc32_page_is_corrupted(). For full_crc32,
      we always simply validate the checksum to the page contents,
      while the physical page size is explicitly specified by an
      unencrypted part of the page header.
      
      buf_page_full_crc32_size(): Determine the size of a full_crc32 page.
      
      buf_dblwr_check_page_lsn(): Make this a debug-only function, because
      it involves potentially costly lookups of fil_space_t.
      
      create_table_info_t::check_table_options(),
      ha_innobase::check_if_supported_inplace_alter(): Do allow the creation
      of SPATIAL INDEX with full_crc32 also when page_compressed is used.
      
      commit_cache_norebuild(): Preserve the compression algorithm when
      updating the page_compression_level.
      
      dict_tf_to_fsp_flags(): Set the flags for page compression algorithm.
      FIXME: Maybe there should be a table option page_compression_algorithm
      and a session variable to back it?
      6b6fa3cd
    • Follow-up fix to MDEV-12026: FIL_SPACE_FLAGS trump fil_space_t::flags · 2151aed4
      Marko Mäkelä authored
      Whenever we are reading the first page of a data file, we may have to
      adjust the provisionally created fil_space_t::flags to match what is
      actually inside the data files. In this way, we will never accidentally
      change the format of a data file.
      
      fil_node_t::read_page0(): After validating the FIL_SPACE_FLAGS,
      always assign them to space->flags.
      
      btr_root_adjust_on_import(), Datafile::validate_to_dd(),
      fil_space_for_table_exists_in_mem(): Adapt to the fix
      in fil_node_t::read_page0().
      
      fsp_flags_try_adjust(): Skip the adjustment if full_crc32 is being
      used. This adjustment was introduced in MDEV-11623 for upgrading
      from MariaDB 10.1.0 to 10.1.20, which used an accidentally changed
      format of FIL_SPACE_FLAGS. MariaDB before 10.4.3 never set the
      flag that now indicates the full_crc32 format.
      2151aed4
    • MDEV-17482 InnoDB fails to say which fatal error fsync() returned · 00572a0b
      Marko Mäkelä authored
      os_file_fsync_posix(): If fsync() returns a fatal error,
      do include errno in the error message.
      
      In the future, we might handle fsync() or write or allocation failures
      on InnoDB data files a little more gracefully: flag the affected index
      or table as corrupted, and deny any subsequent writes to the table.
      
      If a write to the undo log or redo log fails, an alternative to
      killing the server could be to deny any writes to InnoDB tables
      until the server has been restarted.
      00572a0b
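The idea of including errno in the message can be sketched with plain POSIX calls. The helper `fsync_or_describe()` and the message wording are hypothetical; this is not the code of os_file_fsync_posix().

```cpp
#include <cerrno>
#include <cstring>
#include <string>
#include <unistd.h>

// Hypothetical helper: returns an empty string if fsync() succeeds,
// otherwise a message that names the error code and its description.
inline std::string fsync_or_describe(int fd) {
    if (fsync(fd) == 0) {
        return std::string();
    }
    const int err = errno;  // save errno before any other call can change it
    return "fsync() returned " + std::to_string(err) + " (" +
           std::strerror(err) + ")";
}
```

Calling the helper with an invalid descriptor such as -1 fails with EBADF, so the returned message contains that error number instead of a bare "fsync failed".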
    • Marko Mäkelä's avatar
      1d728a98
    • Post-merge fix after 0508d327 · e3618a33
      Marko Mäkelä authored
      e3618a33
    • MDEV-18946 munmap of 1 byte during shutdown is EINVAL · 397b6b13
      Marko Mäkelä authored
      In MDEV-10814, a missing argument caused a later optional argument
      (bool true) to be treated as a size. The unmap of this memory occurs
      during shutdown and when resizing the InnoDB buffer pool. As a result,
      the memory is lost but remains allocated until shutdown is completed.
      397b6b13
    • MDEV-17262: mysql crashed on galera while node rejoined cluster (#895) · 26432e49
      sysprg authored
      This patch contains a fix for the MDEV-17262/17243 issues and
      a new mtr test.
      
      These issues (MDEV-17262/17243) have two causes:
      
      1) After an intermediate commit, a transaction loses its status
      of "transaction that registered in the MySQL for 2pc coordinator"
      (in InnoDB), because since version 10.2 the write_row() function
      (which is located in ha_innodb.cc) does not call
      trx_register_for_2pc(m_prebuilt->trx) during the processing
      of split transactions. It is necessary to restore this call inside
      write_row() when an intermediate commit has been made (for a split
      transaction).
      
      Similarly, we need to set the flag of the started transaction
      (m_prebuilt->sql_stat_start) after an intermediate commit.
      
      The table->file->extra(HA_EXTRA_FAKE_START_STMT) call made from the
      wsrep_load_data_split() function (which is located in sql_load.cc)
      will also do this, but it will be too late. As a result, the call
      to the wsrep_append_keys() function from the InnoDB engine may be
      lost, or the function may be called with an invalid transaction
      identifier.
      
      2) If a transaction with the LOAD DATA statement is divided into
      logical mini-transactions (of 10K rows each) and the binlog is rotated,
      then in rare cases, due to the wsrep handler re-registration at the
      boundary of the split, the last portion of data may be lost. Since
      the splitting of LOAD DATA into mini-transactions is a technical detail,
      I believe that we should not allow these mini-transactions to fall
      into separate binlogs. Therefore, it is necessary to prohibit
      binlog rotation in the middle of processing a LOAD DATA statement.
      
      https://jira.mariadb.org/browse/MDEV-17262 and
      https://jira.mariadb.org/browse/MDEV-17243
      26432e49
    • sachinsetia1001@gmail.com's avatar
  4. 17 Mar, 2019 6 commits
  5. 16 Mar, 2019 3 commits
    • MDEV-18945 Assertion `fixed == 1' failed in Item_cond_and::val_int · 5e044f78
      Igor Babaev authored
      In the function make_cond_for_table_from_pred(), a call of fix_fields()
      did not check the return code. As a result, an extracted constant
      condition might not be well formed, which caused an assertion failure.
      5e044f78
    • MDEV-18946: innodb: {de|}allocate_large_{dodump|dontdump} added · a9056a2b
      Daniel Black authored
      In 1dc78d35a0beb9620bae1f4841cc07389b425707, a call to
      deallocate_large(dontdump=true) was passed a wrong argument value.
      
      To avoid accidentally calling large memory functions that have
      DODUMP/DONTDUMP options with missing arguments, the functions
      have been given distinct names.
      a9056a2b
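Why a trailing optional bool is risky next to a size parameter can be shown with a small C++ sketch. `LargeAllocator` is a hypothetical recorder, and its signatures are simplified; they are not the real InnoDB allocator interface.

```cpp
#include <cstddef>

// Hypothetical allocator that only records what it was asked to do.
struct LargeAllocator {
    std::size_t last_size = 0;
    bool last_dontdump = false;

    // Risky shape: an optional bool follows the size parameter. If the
    // caller forgets the size and passes only `true`, the bool converts
    // implicitly to a size of 1 and the option silently stays false.
    void deallocate_large(void* ptr, std::size_t size, bool dontdump = false) {
        (void)ptr;  // unused in this sketch
        last_size = size;
        last_dontdump = dontdump;
    }

    // Safer shape: the option is part of the function name, so callers
    // have no bool argument that could slip into the size position.
    void deallocate_large_dodump(void* ptr, std::size_t size) {
        deallocate_large(ptr, size, true);
    }
};
```

A call such as `alloc.deallocate_large(ptr, true)` compiles without error and records a size of 1, which mirrors the reported 1-byte munmap. With the distinctly named variant, the call site has to supply the size explicitly.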
    • MDEV-18946: innodb: buffer_pool - unallocate large pages requires size · 8678a105
      Daniel Black authored
      MDEV-10814 introduced a bug where true, which evaluates to 1, was
      passed as the size argument to deallocate_large.
      
      When this was passed to munmap, it resulted in EINVAL and the
      page not being released. This only occurred in buf_pool_free_instance
      when called on shutdown, so there is no impact, as process termination
      correctly frees the memory.
      8678a105
  6. 15 Mar, 2019 10 commits