1. 14 Jun, 2020 2 commits
    • MDEV-22889: Disable innodb.innodb_force_recovery_rollback · ad5edf3c
      Marko Mäkelä authored
      The test case that was added for MDEV-21217
      (commit b68f1d84)
      should have only two possible outcomes for the locking SELECT statement:
      
      (1) The statement is blocked, and the test will eventually fail
      with a lock wait timeout. This is what I observed when the
      code fix for MDEV-21217 was missing.
      
      (2) The lock conflict will ensure that the statement will execute
      after the rollback has completed, and an empty table will be observed.
      This is the expected outcome with the recovery fix.
      
      What occasionally happens (in some of our CI environments only, so far)
      is that the locking SELECT will return all 1,000 rows of the table that
      had been inserted by the transaction that was never supposed to be
      committed. One possibility is that the transaction was unexpectedly
      committed when the server was killed.
      
      Let us disable the test until the reason for the failure has been
      determined and addressed.
    • Merge 10.4 into 10.5 · 3dbc49f0
      Marko Mäkelä authored
  2. 13 Jun, 2020 7 commits
    • MDEV-22884 Assertion `grant_table || grant_table_role' failed on perfschema · 9ed08f35
      Sergei Golubchik authored
      when allowing access via perfschema callbacks, update
      the cached GRANT_INFO to match
    • MDEV-21560 Assertion `grant_table || grant_table_role' failed in check_grant_all_columns · b58586aa
      Sergei Golubchik authored
      With RETURNING it can happen that the user has some privileges on
      the table (namely, DELETE), but later needs different privileges
      on individual columns (namely, SELECT).
      
      Do the same as in check_grant_column() - ER_COLUMNACCESS_DENIED_ERROR,
      not an assert.
    • Merge 10.3 into 10.4 · 80534093
      Marko Mäkelä authored
    • Merge 10.2 into 10.3 · d83a4432
      Marko Mäkelä authored
    • MDEV-21217 innodb_force_recovery=2 may wrongly abort rollback · b68f1d84
      Marko Mäkelä authored
      trx_roll_must_shutdown(): Correct the condition that detects
      the start of shutdown.
    • MDEV-22190 InnoDB: Apparent corruption of an index page ... to be written · 574ef380
      Marko Mäkelä authored
      An InnoDB check for the validity of index pages would occasionally fail
      in the test encryption.innodb_encryption_discard_import.
      
      An analysis of a "rr replay" failure trace revealed that the problem
      basically is a combination of two old anomalies, and a recently
      implemented optimization in MariaDB 10.5.
      
      MDEV-15528 allows InnoDB to discard buffer pool pages that were freed.
      
      PageBulk::init() will disable the InnoDB validity check, because
      during native ALTER TABLE (rebuilding tables or creating indexes)
      we could write inconsistent index pages to data files.
      
      In the occasional test failure, page 8:6 would have been written
      from the buffer pool to the data file and subsequently freed.
      
      However, fil_crypt_thread may perform dummy writes to pages that
      have been freed. In case we are causing an inconsistent page to
      be re-encrypted on page flush, we should disable the check.
      
      In the analyzed "rr replay" trace, a fil_crypt_thread attempted
      to access page 8:6 twice after it had been freed.
      On the first call, buf_page_get_gen(..., BUF_PEEK_IF_IN_POOL, ...)
      returned NULL. The second call succeeded, and shortly thereafter,
      the server intentionally crashed due to writing the corrupted page.
    • MDEV-22268 virtual longlong Item_func_div::int_op(): Assertion `0' failed in Item_func_div::int_op · 6c30bc21
      Alexander Barkov authored
      Item_func_div::fix_length_and_dec_temporal() set the return data type to
      integer in case of @@div_precision_increment==0 for temporal input with FSP=0.
      This caused Item_func_div to call int_op(), which is not implemented,
      so a crash on DBUG_ASSERT(0) happened.
      
      Fixing fix_length_and_dec_temporal() to set the result type to DECIMAL.
  3. 12 Jun, 2020 20 commits
    • when printing Item_in_optimizer, use precedence of wrapped Item · 114a8436
      Sidney Cammeresi authored
      when Item::print() is called with the QT_PARSABLE flag, WHERE i NOT IN
      (SELECT ...) gets printed as WHERE !i IN (SELECT ...) instead of WHERE
      !(i IN (SELECT ...)), because Item_in_optimizer returns DEFAULT_PRECEDENCE.
      It should return the precedence of the inner operation.
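The precedence bug above can be sketched with a toy expression printer. This is illustrative Python only, not the server's code; the node class and precedence table are hypothetical stand-ins for Item::print() and Item precedence.

```python
# Toy printer: parenthesize a child only when it binds more loosely
# than its parent. A wrapper that reports DEFAULT_PRECEDENCE (the
# highest value here) never looks "looser", so parentheses are lost.

PREC = {"!": 5, "IN": 3}
DEFAULT_PRECEDENCE = 10  # hypothetical "binds tightest" value

class Node:
    def __init__(self, op, operands, precedence=None):
        self.op = op
        self.operands = operands
        self.precedence = (PREC.get(op, DEFAULT_PRECEDENCE)
                           if precedence is None else precedence)

def show(node, parent_prec=0):
    if isinstance(node, str):
        return node
    if node.op == "!":
        s = "!" + show(node.operands[0], node.precedence)
    else:
        s = (f"{show(node.operands[0], node.precedence)} {node.op} "
             f"{show(node.operands[1], node.precedence)}")
    # parenthesize only if this node binds more loosely than its parent
    return f"({s})" if node.precedence < parent_prec else s

# Buggy: the wrapped IN claims DEFAULT_PRECEDENCE, so no parens appear.
buggy = Node("!", [Node("IN", ["i", "(SELECT ...)"],
                        precedence=DEFAULT_PRECEDENCE)])
# Fixed: the wrapper forwards the inner operation's real precedence.
fixed = Node("!", [Node("IN", ["i", "(SELECT ...)"])])
```

Printing `buggy` yields `!i IN (SELECT ...)`, while `fixed` yields `!(i IN (SELECT ...))`, mirroring the wrong and corrected output of QT_PARSABLE printing.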
    • MDEV-22840: JSON_ARRAYAGG gives wrong results with NULL values and ORDER by clause · ab9bd628
      Varun Gupta authored
      The problem here is similar to the case with DISTINCT: the tree used
      for ORDER BY needs to also hold the null bytes of the record. This was
      not done for GROUP_CONCAT, as NULLs are rejected by GROUP_CONCAT.
      
      Also introduced a comparator function for the order by tree to handle null
      values with JSON_ARRAYAGG.
    • MDEV-22011: DISTINCT with JSON_ARRAYAGG gives wrong results · 0f6f0daa
      Varun Gupta authored
      For DISTINCT to be handled with JSON_ARRAYAGG, we need to make sure
      that the Unique tree also holds the NULL bytes of a table record
      inside the node of the tree. This behaviour for JSON_ARRAYAGG is
      different from GROUP_CONCAT because in GROUP_CONCAT we just reject
      NULL values for columns.
      
      Also introduced a comparator function for the unique tree to handle null
      values for distinct inside JSON_ARRAYAGG.
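Why the unique tree must hold the NULL bytes can be shown with a minimal sketch (illustrative Python; the key layout is an assumption, not the server's record format). If the key is only the value bytes, a NULL — typically zeroed-out value bytes plus a separate null flag — collides with a genuine zero value.

```python
# Distinct keys with and without the record's NULL indicator byte.

def make_key(value, with_null_byte):
    null_flag = b"\x01" if value is None else b"\x00"
    value_bytes = b"\x00\x00" if value is None else value.to_bytes(2, "big")
    return (null_flag if with_null_byte else b"") + value_bytes

records = [0, None, 0, None]   # two distinct values: 0 and NULL

naive = {make_key(v, with_null_byte=False) for v in records}  # 0 == NULL!
fixed = {make_key(v, with_null_byte=True) for v in records}   # 0 != NULL
```

The naive key set collapses `0` and `NULL` into one entry, which is the kind of wrong DISTINCT result the fix addresses by storing the null bytes inside the tree node.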
    • MDEV-11563: GROUP_CONCAT(DISTINCT ...) may produce a non-distinct list · a006e88c
      Varun Gupta authored
      Backported from MYSQL
      Bug #25331425: DISTINCT CLAUSE DOES NOT WORK IN GROUP_CONCAT
      Issue:
      ------
      The problem occurs when:
      1) GROUP_CONCAT (DISTINCT ....) is used in the query.
      2) Data size greater than value of system variable:
      tmp_table_size.
      
      The result would contain values that are non-unique.
      
      Root cause:
      -----------
      An in-memory structure is used to filter out non-unique
      values. When the data size exceeds tmp_table_size, the
      overflow is written to disk as a separate file. The
      expectation here is that when all such files are merged,
      the full set of unique values can be obtained.
      
      But the Item_func_group_concat::add function is in a bit of a
      hurry. Even as it is adding values to the tree, it wants to
      decide if a value is unique and write it to the result
      buffer. This works fine if the configured maximum size is
      greater than the size of the data. But since tmp_table_size
      is set to a low value, the size of the tree is smaller and
      hence requires the creation of multiple copies on disk.
      
      Item_func_group_concat currently has no mechanism to merge
      all the copies on disk and then generate the result. This
      results in duplicate values.
      
      Solution:
      ---------
      In case of the DISTINCT clause, don't write to the result
      buffer immediately. Do the merge and only then put the
      unique values in the result buffer. This has been done in
      Item_func_group_concat::val_str.
      
      Note regarding result file changes:
      -----------------------------------
      Earlier, when a unique value was seen in
      Item_func_group_concat::add, it was dumped to the output,
      so the result was in the order the rows were stored in the
      storage engine. With this fix, we wait until all the data
      is read and the final set of unique values is written to
      the output buffer, so the data appears in sorted order.
      
      This only fixes the cases when we have DISTINCT without ORDER BY clause
      in GROUP_CONCAT.
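The root cause above can be modeled in a few lines (a toy sketch, not server code; the capacity constant stands in for tmp_table_size). Emitting values while inserting into a size-limited in-memory set that is flushed to disk leaks duplicates across the flushed runs; merging all runs first does not.

```python
# Toy model of GROUP_CONCAT(DISTINCT ...) overflowing to disk runs.

CAPACITY = 3  # stand-in for the tmp_table_size limit

def buggy_distinct(values):
    output, tree, runs = [], set(), []
    for v in values:
        if v not in tree:
            tree.add(v)
            output.append(v)       # bug: emitted before all runs are merged
        if len(tree) == CAPACITY:  # simulate overflow to a disk run
            runs.append(sorted(tree))
            tree.clear()           # a later duplicate looks "new" again
    return output

def fixed_distinct(values):
    tree, runs = set(), []
    for v in values:
        tree.add(v)
        if len(tree) == CAPACITY:
            runs.append(sorted(tree))
            tree.clear()
    runs.append(sorted(tree))
    # merge all runs, then emit (hence the sorted output noted above)
    return sorted(set().union(*runs))

data = [1, 2, 3, 1, 2, 4]
```

Here `buggy_distinct(data)` returns `[1, 2, 3, 1, 2, 4]` — the duplicates reappear after the first flush — while `fixed_distinct(data)` returns `[1, 2, 3, 4]`.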
    • MDEV-15101: Stop ANALYZE TABLE from flushing table definition cache · fd1755e4
      Sergei Petrunia authored
      Part#2: forgot to commit the adjustments for the testcases.
    • MDEV-22867 Assertion instant.n_core_fields == n_core_fields failed · 43120009
      Marko Mäkelä authored
      This is a race condition where a table on which a 10.3-style
      instant ADD COLUMN is emptied during the execution of
      ALTER TABLE ... DROP COLUMN ..., DROP INDEX ..., ALGORITHM=NOCOPY.
      
      In commit 2c4844c9 the
      function instant_metadata_lock() would prevent this race condition.
      But, it would also hold a page latch on the leftmost leaf page of
      clustered index for the duration of a possible DROP INDEX operation.
      
      The race could be fixed by restoring the function
      instant_metadata_lock() that was removed in
      commit ea37b144
      but it would be more future-proof to prevent the
      dict_index_t::clear_instant_add() call from being issued at all.
      
      We may at some point support DROP COLUMN ..., ADD INDEX ...,
      ALGORITHM=NOCOPY, which would spend a non-trivial amount of
      execution time in ha_innobase::inplace_alter(),
      making a server hang possible. Currently this is not supported,
      and our added test case will notice when the support is introduced.
      
      dict_index_t::must_avoid_clear_instant_add(): Determine if
      a call to clear_instant_add() must be avoided.
      
      btr_discard_only_page_on_level(): Preserve the metadata record
      if must_avoid_clear_instant_add() holds.
      
      btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete():
      Do not remove the metadata record, even if the table becomes empty,
      while must_avoid_clear_instant_add() holds.
      
      btr_pcur_store_position(): Relax a debug assertion.
      
      This is joint work with Thirunarayanan Balathandayuthapani.
    • MDEV-15101: Stop ANALYZE TABLE from flushing table definition cache · d7d80689
      Sergei Petrunia authored
      Apply this patch from Percona Server (amended for 10.5):
      
      commit cd7201514fee78aaf7d3eb2b28d2573c76f53b84
      Author: Laurynas Biveinis <laurynas.biveinis@gmail.com>
      Date:   Tue Nov 14 06:34:19 2017 +0200
      
          Fix bug 1704195 / 87065 / TDB-83 (Stop ANALYZE TABLE from flushing table definition cache)
      
          Make ANALYZE TABLE stop flushing affected tables from the table
          definition cache, which has the effect of not blocking any subsequent
          new queries involving the table if there's a parallel long-running
          query:
      
          - new table flag HA_ONLINE_ANALYZE, return it for InnoDB and TokuDB
            tables;
          - in mysql_admin_table, if we are performing ANALYZE TABLE, and the
            table flag is set, do not remove the table from the table
            definition cache, do not invalidate query cache;
          - in partitioning handler, refresh the query optimizer statistics
            after ANALYZE if the underlying handler supports HA_ONLINE_ANALYZE;
          - new testcases main.percona_nonflushing_analyze_debug,
            parts.percona_nonflushing_analyze_debug and a supporting debug sync
            point.
      
          For TokuDB, this change exposes bug TDB-83 (Index cardinality stats
          updated for handler::info(HA_STATUS_CONST), not often enough for
          tokudb_cardinality_scale_percent). TokuDB may return different
          rec_per_key values depending on dynamic variable
          tokudb_cardinality_scale_percent value. The server does not have a way
          of knowing that changing this variable invalidates the previous
          rec_per_key values in any opened table shares, and so does not call
          info(HA_STATUS_CONST) again. Fix by updating rec_per_key for both
          HA_STATUS_CONST and HA_STATUS_VARIABLE. This also forces a re-record
          of tokudb.bugs.db756_card_part_hash_1_pick, with the new output
          seeming to be more correct.
    • MDEV-22877 Avoid unnecessary buf_pool.page_hash S-latch acquisition · d2c593c2
      Marko Mäkelä authored
      MDEV-15053 did not remove all unnecessary buf_pool.page_hash S-latch
      acquisition. There are code paths where we are holding buf_pool.mutex
      (which will sufficiently protect buf_pool.page_hash against changes)
      and unnecessarily acquire the latch. Many invocations of
      buf_page_hash_get_locked() can be replaced with the much simpler
      buf_pool.page_hash_get_low().
      
      In the worst case the thread that is holding buf_pool.mutex will become
      a victim of MDEV-22871, suffering from a spurious reader-reader conflict
      with another thread that genuinely needs to acquire a buf_pool.page_hash
      S-latch.
      
      In many places, we were also evaluating page_id_t::fold() while holding
      buf_pool.mutex. Low-level functions such as buf_pool.page_hash_get_low()
      must get the page_id_t::fold() as a parameter.
      
      buf_buddy_relocate(): Defer the hash_lock acquisition to the critical
      section that starts by calling buf_page_t::can_relocate().
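The "pass fold() as a parameter" idea can be sketched generically (illustrative Python; the fold formula and class are hypothetical stand-ins for page_id_t::fold() and buf_pool.page_hash): compute the hash fold once, outside the critical section, and hand it to the low-level lookup instead of recomputing it while buf_pool.mutex is held.

```python
# Sketch: precomputed fold passed into a mutex-protected low-level lookup.

def fold(page_id):
    """Stand-in for page_id_t::fold(): mix space id and page number."""
    space, page_no = page_id
    return (space << 20) + space + page_no

class PageHash:
    def __init__(self):
        self.cells = {}

    def insert(self, page_id, block):
        self.cells[fold(page_id)] = (page_id, block)

    def get_low(self, fold_value, page_id):
        # caller is assumed to hold buf_pool.mutex (modeled, not enforced),
        # so no extra hash latch is taken and fold is not recomputed here
        hit = self.cells.get(fold_value)
        return hit[1] if hit and hit[0] == page_id else None

page_hash = PageHash()
page_hash.insert((8, 6), "block-8:6")
f = fold((8, 6))                  # computed once, outside the critical section
block = page_hash.get_low(f, (8, 6))
```

The design point is that the mutex already protects the table, so the S-latch and the repeated fold computation are pure overhead on these paths.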
    • more mysql_create_view link/unlink woes · 0b5dc626
      Sergei Golubchik authored
    • MDEV-22878 galera.wsrep_strict_ddl hangs in 10.5 after merge · fb70eb77
      Sergei Golubchik authored
      if mysql_create_view is aborted when `view` isn't unlinked,
      it should not be linked back on cleanup
    • MDEV-22834: Disks plugin - change datatype to bigint · 8ec21afc
      Vicențiu Ciorbaru authored
      On large hard disks (> 2TB), the plugin won't function correctly, always
      showing 2 TB of available space due to integer overflow. Upgrade table
      fields to bigint to resolve this problem.
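The overflow is plain 32-bit arithmetic, sketched below (illustrative Python; the unit choice is an assumption, not the plugin's actual column definition). Sizes above 2^31-1 units no longer fit a signed 32-bit INT column, so stored values wrap, while a 64-bit BIGINT holds them exactly.

```python
# Signed 32-bit truncation, as storing a too-large size into INT would do.

INT32_MAX = 2**31 - 1

def as_int32(n):
    """Truncate n to a signed 32-bit value."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n > INT32_MAX else n

four_tb_in_kib = 4 * 2**40 // 1024   # 4 TiB expressed in 1 KiB units = 2**32
stored = as_int32(four_tb_in_kib)    # wraps to 0 in a 32-bit field
stored_ok = four_tb_in_kib           # a 64-bit BIGINT keeps the real value
```

For example `as_int32(2**31)` (2 TiB in KiB units) wraps to -2**31, which is why sizes past the 2 TB boundary came out nonsensical before the column type change.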
    • MDEV-21851: Error in BINLOG_BASE64_EVENT is always error-logged as if it is done by Slave · e156a8da
      Andrei Elkin authored
      The prefix of the error log message produced by a failed BINLOG
      statement is corrected to be the SQL command name.
    • MDEV-22602 Disable UPDATE CASCADE for SQL constraints · 762bf7a0
      Aleksey Midenkov authored
      CHECK constraint is checked by check_expression() which walks its
      items and gets into Item_field::check_vcol_func_processor() to check
      for conformity with foreign key list.
      
      WITHOUT OVERLAPS is checked for same conformity in
      mysql_prepare_create_table().
      
      Long uniques are already impossible with InnoDB foreign keys. See
      ER_CANT_CREATE_TABLE in test case.
      
      2 accompanying bugs fixed (test main.constraints failed):
      
      1. check->name.str lived on SP execute mem_root while "check" obj
      itself lives on SP main mem_root. On second SP execute check->name.str
      had garbage data. Fixed by allocating from thd->stmt_arena->mem_root
      which is SP main mem_root.
      
      2. CHECK_CONSTRAINT_IF_NOT_EXISTS value was mixed with
      VCOL_FIELD_REF. VCOL_FIELD_REF is assigned in check_expression() and
      then detected as CHECK_CONSTRAINT_IF_NOT_EXISTS in
      handle_if_exists_options().
      
      Existing cases for MDEV-16932 in main.constraints cover both fixes.
    • Fix wrong merge of commit d218d1aa · 2fd2fd77
      Vicențiu Ciorbaru authored
    • MDEV-22119: main.innodb_ext_key fails sporadically · 02c255d1
      Varun Gupta authored
      Made the test stable by adding more rows, so that the range scan is
      cheaper than the table scan.
    • MDEV-22499 Assertion `(uint) (table_check_constraints -... · f9e53a65
      Alexander Barkov authored
      MDEV-22499 Assertion `(uint) (table_check_constraints - share->check_constraints) == (uint) (share->table_check_constraints - share->field_check_constraints)' failed in TABLE_SHARE::init_from_binary_frm_image
      
      The patch for MDEV-22111 fixed MDEV-22499 as well.
      Adding tests only.
    • MDEV-8139 Fix Scrubbing · c92f7e28
      Thirunarayanan Balathandayuthapani authored
      fil_space_t::freed_ranges: Store ranges of freed page numbers.
      
      fil_space_t::last_freed_lsn: Store the most recent LSN of
      freeing a page.
      
      fil_space_t::freed_mutex: Protects freed_ranges, last_freed_lsn.
      
      fil_space_create(): Initialize the freed_range mutex.
      
      fil_space_free_low(): Frees the freed_range mutex.
      
      range_set: Ranges of page numbers.
      
      buf_page_create(): Removes the page from freed_ranges when page
      is being reused.
      
      btr_free_root(): Remove the PAGE_INDEX_ID invalidation. Because
      btr_free_root() and dict_drop_index_tree() are executed in
      the same atomic mini-transaction, there is no need to
      invalidate the root page.
      
      buf_release_freed_page(): Split from buf_flush_freed_page().
      Skip any I/O.
      
      buf_flush_freed_pages(): Get the freed ranges from the tablespace
      and write punch holes or zeroes for the freed ranges.
      
      buf_flush_try_neighbors(): Handles the flushing of freed ranges.
      
      mtr_t::freed_pages: Variable to store the list of freed pages.
      
      mtr_t::add_freed_pages(): To add freed pages.
      
      mtr_t::clear_freed_pages(): To clear the freed pages.
      
      mtr_t::m_freed_in_system_tablespace: Variable to indicate whether a page
      has been freed in the system tablespace.
      
      mtr_t::m_trim_pages: Variable to indicate whether the space has been trimmed.
      
      mtr_t::commit(): Add the freed page and update the last freed lsn
      in the tablespace and clear the tablespace freed range if space is
      trimmed.
      
      file_name_t::freed_pages: Store the freed pages during recovery.
      
      file_name_t::add_freed_page(), file_name_t::remove_freed_page(): To
      add and remove freed page during recovery.
      
      store_freed_or_init_rec(): Store or remove the freed pages while
      encountering FREE_PAGE or INIT_PAGE redo log record.
      
      recv_init_crash_recovery_spaces(): Add the freed page encountered
      during recovery to respective tablespace.
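The fil_space_t::freed_ranges / range_set idea can be sketched minimally (illustrative Python; the class below is a hypothetical model, not InnoDB's range_set): freed page numbers are tracked as disjoint inclusive ranges, a page is dropped from the set when it is reused (as buf_page_create() does), and adjacent pages coalesce into one range.

```python
# Toy model of a set of freed page-number ranges.

class RangeSet:
    """Freed page numbers, reported as disjoint inclusive (lo, hi) ranges."""
    def __init__(self):
        self.pages = set()

    def add(self, page_no):          # a page was freed
        self.pages.add(page_no)

    def remove(self, page_no):       # the page is being reused
        self.pages.discard(page_no)

    def ranges(self):
        out = []
        for p in sorted(self.pages):
            if out and out[-1][1] == p - 1:
                out[-1] = (out[-1][0], p)   # coalesce with previous range
            else:
                out.append((p, p))
        return out

freed = RangeSet()
for p in (6, 7, 8, 10):
    freed.add(p)
before = freed.ranges()   # pages 6-8 coalesce; 10 stands alone
freed.remove(7)           # reuse splits the 6-8 range
after = freed.ranges()
```

Flushing can then punch holes or write zeroes per range rather than per page, which is the point of keeping ranges instead of individual page numbers.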
  4. 11 Jun, 2020 11 commits
    • post-fix for #1504 · 07d1c856
      Sergei Golubchik authored
    • MDEV-22812 "failed to create symbolic link" during the build · d3f47482
      Sergei Golubchik authored
      as cmake manual says
      
        If a sequential execution of multiple commands is required, use multiple
        ``execute_process()`` calls with a single ``COMMAND`` argument.
    • Merge branch '10.1' into 10.2 · 8c67ffff
      Vicențiu Ciorbaru authored
    • MDEV-21831: Assertion `length == pack_length()' failed in... · 35acf39b
      Varun Gupta authored
      MDEV-21831: Assertion `length == pack_length()' failed in Field_inet6::sort_string upon INSERT into RocksDB table
      
      For INET6 columns the values are stored as BINARY columns and returned
      to the client in TEXT format. For RocksDB, indexes store mem-comparable
      images of columns, so use pack_length() to store the mem-comparable form
      for INET6 columns. This also remains consistent with CHAR columns.
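What "mem-comparable" means here can be illustrated with standard-library tools (a sketch, not RocksDB's key format): encode each INET6 value as its full fixed-length 16-byte binary image (the pack_length-style form), so that plain byte comparison sorts values the same way the type itself does, whereas the TEXT form does not.

```python
import ipaddress

def mem_comparable_key(addr: str) -> bytes:
    """Fixed 16-byte binary image of an IPv6 address; memcmp-sortable."""
    return ipaddress.IPv6Address(addr).packed

addrs = ["::2", "::10", "::1"]
by_key = sorted(addrs, key=mem_comparable_key)  # true numeric address order
by_text = sorted(addrs)                          # text order disagrees
```

Byte-wise order on the packed form puts `::2` before `::10` (2 < 16), while text order puts `::10` before `::2` — the mismatch an index must avoid.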
    • MDEV-22850 Reduce buf_pool.page_hash latch contention · 757e756d
      Marko Mäkelä authored
      For reads, the buf_pool.page_hash is protected by buf_pool.mutex or
      by the hash_lock. There is no need to compute or acquire hash_lock
      if we are not modifying the buf_pool.page_hash.
      
      However, the buf_pool.page_hash latch must be held exclusively
      when changing buf_page_t::in_file(), or if we desire to prevent
      buf_page_t::can_relocate() or buf_page_t::buf_fix_count()
      from changing.
      
      rw_lock_lock_word_decr(): Add a comment that explains the polling logic.
      
      buf_page_t::set_state(): When in_file() is to be changed, assert that
      an exclusive buf_pool.page_hash latch is being held. Unfortunately
      we cannot assert this for set_state(BUF_BLOCK_REMOVE_HASH) because
      set_corrupt_id() may already have been called.
      
      buf_LRU_free_page(): Check buf_page_t::can_relocate() before
      acquiring the hash_lock.
      
      buf_block_t::initialise(): Initialize also page.buf_fix_count().
      
      buf_page_create(): Initialize buf_fix_count while not holding
      any mutex or hash_lock. Acquire the hash_lock only for the
      duration of inserting the block to the buf_pool.page_hash.
      
      buf_LRU_old_init(), buf_LRU_add_block(),
      buf_page_t::belongs_to_unzip_LRU(): Do not assert buf_page_t::in_file(),
      because buf_page_create() will invoke buf_LRU_add_block()
      before acquiring hash_lock and buf_page_t::set_state().
      
      buf_pool_t::validate(): Rely on the buf_pool.mutex and do not
      unnecessarily acquire any buf_pool.page_hash latches.
      
      buf_page_init_for_read(): Clarify that we must acquire the hash_lock
      upfront in order to prevent a race with buf_pool_t::watch_remove().
    • MDEV-21619 Server crash or assertion failures in my_datetime_to_str · e835881c
      Alexander Barkov authored
      Item_cache_datetime::decimals was always copied from example->decimals
      without limiting to 6 (maximum possible fractional digits), so
      val_str() later crashed on asserts inside my_time_to_str() and
      my_datetime_to_str().
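The principle of the fix is a simple clamp, sketched below (illustrative Python; the function names are hypothetical, not the server's): a cached temporal item must limit its fractional-seconds precision to the supported maximum of 6 instead of blindly copying the example item's decimals.

```python
# Clamping fractional-seconds precision to the temporal maximum.

TIME_MAX_FSP = 6  # maximum fractional digits for temporal types

def cache_decimals(example_decimals: int) -> int:
    """What Item_cache_datetime::decimals should have done: cap at 6."""
    return min(example_decimals, TIME_MAX_FSP)

def format_time(seconds: float, decimals: int) -> str:
    decimals = cache_decimals(decimals)   # without the cap, formatting
    return f"{seconds:.{decimals}f}"      # with e.g. 31 digits would assert
```

With the clamp, an absurd `decimals` like 31 (copied from an example item) is reduced to 6 before any string conversion, which is the condition the asserts in my_time_to_str() / my_datetime_to_str() expect.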
    • MDEV-22863: Fix GCC 4.8.5 -Wconversion · c9f262ee
      Marko Mäkelä authored
      This regression was introduced in
      commit dd77f072 (MDEV-22841).
    • MDEV-22865 compilation failure on win32-debug · 7de4458d
      Marko Mäkelä authored
      ut_filename_hash(): Add better casts to please the compiler:
      
      warning C4307: '*': integral constant overflow
      
      This regression was introduced in
      commit dd77f072 (MDEV-22841).
    • MDEV-22864: cmake/libutils account for cmake-2.8.12.1 · d6af055c
      Daniel Black authored
      That cmake version doesn't support STRING(APPEND ..).
    • MDEV-22819: Wrong result or Assertion `ix > 0' failed in read_to_buffer upon... · ade0f40f
      Varun Gupta authored
      MDEV-22819: Wrong result or Assertion `ix > 0' failed in read_to_buffer upon select with GROUP BY and GROUP_CONCAT
      
      In the merge_buffers phase of sorting, the sort buffer size is divided
      between the number of chunks. Each chunk has a start and end position
      (m_buffer_start and m_buffer_end). We then read as many records as fit
      in this buffer from a chunk of the file. The issue was that we were
      resetting the end of the buffer (m_buffer_end) to the number of bytes
      read; with dynamic sort key sizes, this could later make it impossible
      to accommodate even one key inside a chunk of the file. The fix is to
      not reset the end of the buffer for a chunk of the file.
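The buffer-end bug can be modeled abstractly (a toy sketch, not the server's merge_buffers code; record sizes and capacities are made up): a chunk reader that shrinks its buffer end to the number of bytes last read can later fail to fit even one variable-length key, while a reader that keeps the original capacity succeeds.

```python
# Toy chunk reader: records are just sizes; refill passes until the chunk
# is exhausted or nothing fits.

def read_records(chunk, capacity, shrink_end):
    """Return (records read per refill pass, whether the chunk finished)."""
    passes, i, end = [], 0, capacity
    while i < len(chunk):
        used, count = 0, 0
        while i < len(chunk) and used + chunk[i] <= end:
            used += chunk[i]
            count += 1
            i += 1
        if count == 0:          # not even a single record fits any more
            return passes, False
        if shrink_end:
            end = used          # the bug: m_buffer_end reset to bytes read
        passes.append(count)
    return passes, True

chunk = [2, 2, 8]               # two small keys, then one large key
ok_passes, ok = read_records(chunk, capacity=10, shrink_end=False)
bad_passes, bad_ok = read_records(chunk, capacity=10, shrink_end=True)
```

With the full capacity kept, the large key is read on the second pass; with the shrunken end (4 bytes after the first pass), the 8-byte key can never fit, which corresponds to the wrong result / `ix > 0` assertion.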