1. 28 Aug, 2024 3 commits
    • Marko Mäkelä's avatar
      Merge 10.5 into 10.6 · 0e76c1ba
      Marko Mäkelä authored
      0e76c1ba
    • Marko Mäkelä's avatar
      MDEV-34802 Recovery fails to note some log corruption · 1ff6b6f0
      Marko Mäkelä authored
      recv_recovery_from_checkpoint_start(): Abort startup due to log
      corruption if we were unable to parse the entire log between
      the latest log checkpoint and the corresponding FILE_CHECKPOINT record.
      
      Also, reduce some code bloat related to log output and log_sys.mutex.
      
      Reviewed by: Debarun Banerjee
      1ff6b6f0
    • Marko Mäkelä's avatar
      MDEV-34803 innodb_lru_flush_size is no longer used · bda40ccb
      Marko Mäkelä authored
      In commit fa8a46eb (MDEV-33613)
      the parameter innodb_lru_flush_size ceased to have any effect.
      
      Let us declare the parameter as deprecated and additionally as
      MARIADB_REMOVED_OPTION, so that there will be a warning written
      to the error log in case the option is specified in the command line.
      
      Let us also do the same for the parameter
      innodb_purge_rseg_truncate_frequency
      that was deprecated&ignored earlier in MDEV-32050.
      
      Reviewed by: Debarun Banerjee
      bda40ccb
  2. 27 Aug, 2024 5 commits
  3. 26 Aug, 2024 9 commits
    • Kristian Nielsen's avatar
      Fix sporadic failure of test case rpl.rpl_start_stop_slave · 8642453c
      Kristian Nielsen authored
      The test was expecting the I/O thread to be in a specific state, but thread
      scheduling may cause it to not yet have reached that state. So just have a
      loop that waits for the expected state to occur.
      Signed-off-by: default avatarKristian Nielsen <knielsen@knielsen-hq.org>
      8642453c
    • Kristian Nielsen's avatar
    • Kristian Nielsen's avatar
      Fix sporadic failure of test case rpl.rpl_old_master · 214e6c5b
      Kristian Nielsen authored
      Remove the test for MDEV-14528. This is supposed to test that parallel
      replication from pre-10.0 master will update Seconds_Behind_Master. But
      after MDEV-12179 the SQL thread is blocked from even beginning to fetch
      events from the relay log due to FLUSH TABLES WITH READ LOCK, so the test
      case is no longer testing what is was intended to. And pre-10.0 versions are
      long since out of support, so does not seem worthwhile to try to rewrite the
      test to work another way.
      
      The root cause of the test failure is MDEV-34778. Briefly, depending on
      exact timing during slave stop, the rli->sql_thread_caught_up flag may end
      up with different value. If it ends up as "true", this causes
      Seconds_Behind_Master to be 0 during next slave start; and this caused test
      case timeout as the test was waiting for Seconds_Behind_Master to become
      non-zero.
      Signed-off-by: default avatarKristian Nielsen <knielsen@knielsen-hq.org>
      214e6c5b
    • Kristian Nielsen's avatar
      Fix sporadic test failure in rpl.rpl_create_drop_event · 7dc4ea56
      Kristian Nielsen authored
      Depending on timing, an extra event run could start just when the event
      scheduler is shut down and delay running until after the table has been
      dropped; this would cause the test to fail with a "table does not exist"
      error in the log.
      Signed-off-by: default avatarKristian Nielsen <knielsen@knielsen-hq.org>
      7dc4ea56
    • Kristian Nielsen's avatar
      Restore skiping rpl.rpl_mdev6020 under Valgrind · 33854d73
      Kristian Nielsen authored
      (Revert a change done by mistake when XtraDB was removed.)
      Signed-off-by: default avatarKristian Nielsen <knielsen@knielsen-hq.org>
      33854d73
    • Kristian Nielsen's avatar
      MDEV-34696: do_gco_wait() completes too early on InnoDB dict stats updates · b4c2e239
      Kristian Nielsen authored
      Before doing mark_start_commit(), check that there is no pending deadlock
      kill. If there is a pending kill, we won't commit (we will abort, roll back,
      and retry). Then we should not mark the commit as started, since that could
      potentially make the following GCO start too early, before we completed the
      commit after the retry.
      
      This condition could trigger in some corner cases, where InnoDB would take
      temporarily table/row locks that are released again immediately, not held
      until the transaction commits. This happens with dict_stats updates and
      possibly auto-increment locks.
      
      Such locks can be passed to thd_rpl_deadlock_check() and cause a deadlock
      kill to be scheduled in the background. But since the blocking locks are
      held only temporarily, they can be released before the background kill
      happens. This way, the kill can be delayed until after mark_start_commit()
      has been called. Thus we need to check the synchronous indication
      rgi->killed_for_retry, not just the asynchroneous thd->killed.
      Signed-off-by: default avatarKristian Nielsen <knielsen@knielsen-hq.org>
      b4c2e239
    • Marko Mäkelä's avatar
      MDEV-34515: Reduce context switching in purge · 76f6b6d8
      Marko Mäkelä authored
      Before this patch, the InnoDB purge coordinator task submitted
      innodb_purge_threads-1 tasks even if there was not sufficient amount
      of work for all of them. For example, if there are undo log records
      only for 1 table, only 1 task can be employed, and that task had better
      be the purge coordinator.
      
      srv_purge_worker_task_low(): Split from purge_worker_callback().
      
      trx_purge_attach_undo_recs(): Remove the parameter n_purge_threads,
      and add the parameter n_work_items, to keep track of the amount of
      work.
      
      trx_purge(): Launch purge worker tasks only if necessary. The work of
      one thread will be executed by this purge coordinator thread.
      
      que_fork_scheduler_round_robin(): Merged to trx_purge().
      
      Thanks to Vladislav Vaintroub for supplying a prototype of this.
      
      Reviewed by: Debarun Banerjee
      76f6b6d8
    • Marko Mäkelä's avatar
      MDEV-34515: Contention between purge and workload · b7b9f3ce
      Marko Mäkelä authored
      In a Sysbench oltp_update_index workload that involves 1 table,
      a serious contention between the workload and the purge of history
      was observed. This was the worst when the table contained only 1 record.
      
      This turned out to be fixed by setting innodb_purge_batch_size=128,
      which corresponds to the number of usable persistent rollback segments.
      When we go above that, there would be contention between row_purge_poss_sec()
      and the workload, typically on the clustered index page latch, sometimes
      also on a secondary index page latch. It might be that with smaller
      batches, trx_sys.history_size() will end up pausing all concurrent
      transaction start/commit frequently enough so that purge will be able
      to make some progress, so that there would be less contention on the
      index page latches between purge and SQL execution.
      
      In commit aa719b50 (part of MDEV-32050)
      the interpretation of the parameter innodb_purge_batch_size was slightly
      changed. It would correspond to the maximum desired size of the
      purge_sys.pages cache. Before that change, the parameter was referring to
      a number of undo log pages, but the accounting might have been inaccurate.
      
      To avoid a regression, we will reduce the default value to
      innodb_purge_batch_size=127, which will also be compatible with
      innodb_undo_tablespaces>1 (which will disable rollback segment 0).
      
      Additionally, some logic in the purge and MVCC checks is simplified.
      The purge tasks will make use of purge_sys.pages when accessing undo
      log pages to find out if a secondary index record can be removed.
      If an undo page needs to be looked up in buf_pool.page_hash, we will
      merely buffer-fix it. This is correct, because the undo pages are
      append-only in nature. Holding purge_sys.latch or purge_sys.end_latch
      or the fact that the current thread is executing as a part of an
      in-progress purge batch will prevent the contents of the undo page from
      being freed and subsequently reused. The buffer-fix will prevent the
      page from being evicted form the buffer pool. Thanks to this logic,
      we can refer to the undo log record directly in the buffer pool page
      and avoid copying the record.
      
      buf_pool_t::page_fix(): Look up and buffer-fix a page. This is useful
      for accessing undo log pages, which are append-only by nature.
      There will be no need to deal with change buffer or ROW_FORMAT=COMPRESSED
      in that case.
      
      purge_sys_t::view_guard::view_guard(): Allow the type of guard to be
      acquired: end_latch, latch, or no latch (in case we are a purge thread).
      
      purge_sys_t::view_guard::get(): Read-only accessor to purge_sys.pages.
      
      purge_sys_t::get_page(): Invoke buf_pool_t::page_fix().
      
      row_vers_old_has_index_entry(): Replaced with row_purge_is_unsafe()
      and row_undo_mod_sec_unsafe().
      
      trx_undo_get_undo_rec(): Merged to trx_undo_prev_version_build().
      
      row_purge_poss_sec(): Add the parameter mtr and remove redundant
      or unused parameters sec_pcur, sec_mtr, is_tree. We will use the
      caller's mtr object but release any acquired page latches before
      returning.
      
      btr_cur_get_page(), page_cur_get_page(): Do not invoke page_align().
      
      row_purge_remove_sec_if_poss_leaf(): Return the value of PAGE_MAX_TRX_ID
      to be checked against the page in row_purge_remove_sec_if_poss_tree().
      If the secondary index page was not changed meanwhile, it will be
      unnecessary to invoke row_purge_poss_sec() again.
      
      trx_undo_prev_version_build(): Access any undo log pages using
      the caller's mini-transaction object.
      
      row_purge_vc_matches_cluster(): Moved to the only compilation unit that
      needs it.
      
      Reviewed by: Debarun Banerjee
      b7b9f3ce
    • Marko Mäkelä's avatar
      MDEV-34520 purge_sys_t::wait_FTS sleeps 10ms, even if it does not have to · d58734d7
      Marko Mäkelä authored
      There were two separate Atomic_counter<uint32_t>, purge_sys.m_SYS_paused
      and purge_sys.m_FTS_paused. In purge_sys.wait_FTS() we have to read both
      atomically. We used to use an overkill solution for this, acquiring
      purge_sys.latch and waiting 10 milliseconds between samples. To make
      matters worse, the 10-millisecond wait was unconditional, which would
      unnecessarily suspend the purge_coordinator_task every now and then.
      
      It turns out that we can fold both "reference counts" into a single
      Atomic_relaxed<uint32_t> and avoid the purge_sys.latch.
      To assess whether std::memory_order_relaxed is acceptable, we should
      consider the operations that read these "reference counts", that is,
      purge_sys_t::wait_FTS(bool) and purge_sys_t::must_wait_FTS().
      
      Outside debug assertions, purge_sys.must_wait_FTS() is only invoked in
      trx_purge_table_acquire(), which is covered by a shared dict_sys.latch.
      We would increment the counter as part of a DDL operation, but before
      acquiring an exclusive dict_sys.latch. So, a
      purge_sys_t::close_and_reopen() loop could be triggered slightly
      prematurely, before a problematic DDL operation is actually executed.
      Decrementing the counter is less of an issue; purge_sys.resume_FTS()
      or purge_sys.resume_SYS() would mostly be invoked while holding an
      exclusive dict_sys.latch; ha_innobase::delete_table() does it outside
      that critical section. Still, this would only cause some extra wait in
      the purge_coordinator_task, just like at the start of a DDL operation.
      
      There are two calls to purge_sys_t::wait_FTS(bool): in the above mentioned
      purge_sys_t::close_and_reopen() and in purge_sys_t::clone_oldest_view(),
      both invoked by the purge_coordinator_task. There is also a
      purge_sys.clone_oldest_view<true>() call at startup when no DDL operation
      can be in progress.
      
      purge_sys_t::m_SYS_paused: Merged into m_FTS_paused, using a new
      multiplier PAUSED_SYS = 65536.
      
      purge_sys_t::wait_FTS(): Remove an unnecessary sleep as well as the
      access to purge_sys.latch. It suffices to poll purge_sys.m_FTS_paused.
      
      purge_sys_t::stop_FTS(): Do not acquire purge_sys.latch.
      
      Reviewed by: Debarun Banerjee
      d58734d7
  4. 25 Aug, 2024 1 commit
    • Sergei Petrunia's avatar
      Trivial fix: Make test_if_cheaper_ordering() use actual_rec_per_key() · 9020baf1
      Sergei Petrunia authored
      Discovered this while working on MDEV-34720: test_if_cheaper_ordering()
      uses rec_per_key, while the original estimate for the access method
      is produced in best_access_path() by using actual_rec_per_key().
      
      Make test_if_cheaper_ordering() also use actual_rec_per_key().
      Also make several getter function "const" to make this compile.
      Also adjusted the testcase to handle this (the change backported from
      11.0)
      9020baf1
  5. 23 Aug, 2024 1 commit
    • Marko Mäkelä's avatar
      MDEV-34759: buf_page_get_low() is unnecessarily acquiring exclusive latch · 9db2b327
      Marko Mäkelä authored
      buf_page_ibuf_merge_try(): A new, separate function for invoking
      ibuf_merge_or_delete_for_page() when needed. Use the already requested
      page latch for determining if the call is necessary. If it is and
      if we are currently holding rw_latch==RW_S_LATCH, upgrading to an exclusive
      latch may involve waiting that another thread acquires and releases
      a U or X latch on the page. If we have to wait, we must recheck if the
      call to ibuf_merge_or_delete_for_page() is still needed. If the page
      turns out to be corrupted, we will release and fail the operation.
      Finally, the exclusive page latch will be downgraded to the originally
      requested latch.
      
      ssux_lock_impl::rd_u_upgrade_try(): Attempt to upgrade a shared lock to
      an update lock.
      
      sux_lock::s_x_upgrade_try(): Attempt to upgrade a shared lock to
      exclusive.
      
      sux_lock::s_x_upgrade(): Upgrade a shared lock to exclusive.
      Return whether a wait was elided.
      
      ssux_lock_impl::u_rd_downgrade(), sux_lock::u_s_downgrade():
      Downgrade an update lock to shared.
      9db2b327
  6. 22 Aug, 2024 1 commit
    • Brandon Nesterenko's avatar
      MDEV-34799: "Could not write packet" err message args off by 1 · 3e5e97b2
      Brandon Nesterenko authored
      MDEV-33582 (3541bd63) changed the "Could not write packet"
      error message in net_serv.cc to use the function
      sql_print_warning(), instead of my_printf_error(). The flags
      argument was not removed in this change though, so the old
      flags were printed in place of the file descriptor, and all
      other args are presenting for the wrong field (and length is
      never showed).
      
      This patch removes flags as a parameter to sql_print_warning().
      3e5e97b2
  7. 21 Aug, 2024 3 commits
    • Monty's avatar
      MDEV-34043 Drastically slower query performance between CentOS (2sec) and Rocky (48sec) · 1f040ae0
      Monty authored
      One cause of the slowdown is because the ftruncate call can be much
      slower on some systems.  ftruncate() is called by Aria for internal
      temporary tables, tables created by the optimizer, when the upper level
      asks Aria to delete the previous result set. This is needed when some
      content from previous tables changes.
      
      I have now changed Aria so that for internal temporary tables we don't
      call ftruncate() anymore for maria_delete_all_rows().
      
      I also had to update the Aria repair code to use the logical datafile
      size and not the on-disk datafile size, which may contain data from a
      previous result set.  The repair code is called to create indexes for
      the internal temporary table after it is filled.
      I also replaced a call to mysql_file_size() with a pwrite() in
      _ma_bitmap_create_first().
      
      Reviewer: Sergei Petrunia <sergey@mariadb.com>
      Tester: Dave Gosselin <dave.gosselin@mariadb.com>
      1f040ae0
    • Oleksandr Byelkin's avatar
      fix MDEV-34771 & MDEV-34776 · eadf0f63
      Oleksandr Byelkin authored
      removed duplicated methods
      eadf0f63
    • Oleksandr Byelkin's avatar
  8. 20 Aug, 2024 2 commits
  9. 19 Aug, 2024 5 commits
    • Oleksandr Byelkin's avatar
      MDEV-34776 Assertion failure in Item_string::do_build_clone · ae02999c
      Oleksandr Byelkin authored
      Added missed methods to Item_string child.
      ae02999c
    • Oleksandr Byelkin's avatar
      MDEV-34771 Types mismatch when cloning items causes debug assertion · fccfdc28
      Oleksandr Byelkin authored
      Missing methods added to Item_bin_string
      fccfdc28
    • Monty's avatar
      Sort result from table_statistics and index_statistics · db8ab4ac
      Monty authored
      This is needed as the order of rows are not deterministic,
      especially in future versions of table statistics.
      db8ab4ac
    • Monty's avatar
      Revert "mtr: remove not_valgrind_build" · e51d55a6
      Monty authored
      The original code is correct.
      
      valgrind and asan binaries should be built with a specialiced version of
      mem_root that makes it easier to find memory overwrites.
      This is what the BUILD scripts is doing.
      
      The specialiced mem_root code allocates a new block for every allocation
      which is visiable for any test that depenmds on the default original malloc
      size and usage.
      e51d55a6
    • Dmitry Shulga's avatar
      MDEV-34718: Trigger doesn't work correctly with bulk update · ba5482ff
      Dmitry Shulga authored
      Running an UPDATE statement in PS mode and having positional
      parameter(s) bound with an array of actual values (that is
      prepared to be run in bulk mode) results in incorrect behaviour
      in presence of on update trigger that also executes an UPDATE
      statement. The same is true for handling a DELETE statement in
      presence of on delete trigger. Typically, the visible effect of
      such incorrect behaviour is expressed in a wrong number of
      updated/deleted rows of a target table. Additionally, in case UPDATE
      statement, a number of modified rows and a state message returned
      by a statement contains wrong information about a number of modified rows.
      
      The reason for incorrect number of updated/deleted rows is that
      a data structure used for binding positional argument with its
      actual values is stored in THD (this is thd->bulk_param) and reused
      on processing every INSERT/UPDATE/DELETE statement. It leads to
      consuming actual values bound with top-level UPDATE/DELETE statement
      by other DML statements used by triggers' body.
      
      To fix the issue, reset the thd->bulk_param temporary to the value
      nullptr before invoking triggers and restore its value on finishing
      its execution.
      
      The second part of the problem relating with wrong value of affected
      rows reported by Connector/C API is caused by the fact that diagnostics
      area is reused by an original DML statement and a statement invoked
      by a trigger. This fact should be take into account on finalizing a
      state of diagnostics area on completion running of a statement.
      
      Important remark: in case the macros DBUG_OFF is on, call of the method
        Diagnostics_area::reset_diagnostics_area()
      results in reset of the data members
        m_affected_rows, m_statement_warn_count.
      Values of these data members of the class Diagnostics_area are used on
      sending OK and EOF messages. In case DML statement is executed in PS bulk
      mode such resetting results in sending wrong result values to a client
      for affected rows in case the DML statement fires a triggers. So, reset
      these data members only in case the current statement being processed
      is not run in bulk mode.
      ba5482ff
  10. 15 Aug, 2024 3 commits
  11. 14 Aug, 2024 3 commits
    • Marko Mäkelä's avatar
      Merge 10.5 into 10.6 · 757c3681
      Marko Mäkelä authored
      757c3681
    • Marko Mäkelä's avatar
      MDEV-34755 g++- -Wstringop-truncation due to safe_strcpy() · ecd910ae
      Marko Mäkelä authored
      The #pragma that was removed in
      commit e255837e (MDEV-34266)
      turns out to be necessary for silencing all cases of
      -Wstringop-truncation.
      ecd910ae
    • Marko Mäkelä's avatar
      MDEV-34678 pthread_mutex_init() without pthread_mutex_destroy() · 4f8803c0
      Marko Mäkelä authored
      When SUX_LOCK_GENERIC is defined, the srw_mutex, srw_lock, sux_lock are
      implemented based on pthread_mutex_t and pthread_cond_t.  This is the
      only option for systems that lack a futex-like system call.
      
      In the SUX_LOCK_GENERIC mode, if pthread_mutex_init() is allocating
      some resources that need to be freed by pthread_mutex_destroy(),
      a memory leak could occur when we are repeatedly invoking
      pthread_mutex_init() without a pthread_mutex_destroy() in between.
      
      pthread_mutex_wrapper::initialized: A debug field to track whether
      pthread_mutex_init() has been invoked.  This also helps find bugs
      like the one that was fixed by
      commit 1c8af2ae (MDEV-34422);
      one simply needs to add -DSUX_LOCK_GENERIC to the CMAKE_CXX_FLAGS
      to catch that particular bug on the initial server bootstrap.
      
      buf_block_init(), buf_page_init_for_read(): Invoke block_lock::init()
      because buf_page_t::init() will no longer do that.
      
      buf_page_t::init(): Instead of invoking lock.init(), assert that it
      has already been invoked (the lock is vacant).
      
      add_fts_index(), build_fts_hidden_table(): Explicitly invoke
      index_lock::init() in order to avoid a pthread_mutex_destroy()
      invocation on an uninitialized object.
      
      srw_lock_debug::destroy(): Invoke readers_lock.destroy().
      
      trx_sys_t::create(): Invoke trx_rseg_t::init() on all rollback segments
      in order to guarantee a deterministic state for shutdown, even if
      InnoDB fails to start up.
      
      trx_rseg_array_init(), trx_temp_rseg_create(), trx_rseg_create():
      Invoke trx_rseg_t::destroy() before trx_rseg_t::init() in order to
      balance pthread_mutex_init() and pthread_mutex_destroy() calls.
      4f8803c0
  12. 13 Aug, 2024 1 commit
    • Thirunarayanan Balathandayuthapani's avatar
      MDEV-14231 MATCH() AGAINST( IN BOOLEAN MODE), results mismatch · b304ec30
      Thirunarayanan Balathandayuthapani authored
      - Added plugin_debug.test, multiple_index.test to innodb_fts suite
      from mysql-5.7.
      
      - commit c5b28e55 removes the warning
      for InnoDB rebuilding table to add FTS_DOC_ID
      
      - multiple_index test case  has MATCH(a) values are smaller
      than in MySQL because ROLLBACK updates the stat_n_rows.
      
      - st_mysql_ftparser_boolean_info structure conveys boolean
      metadata to mysql search engine for every word in the query.
      This structure misses the position value to store the correct
      offset of every word. So phrase search queries in plugin_debug
      test case with boolean mode for simple parser throws
      wrong result.
      b304ec30
  13. 12 Aug, 2024 3 commits
    • Julius Goryavsky's avatar
      MDEV-30686: Endless loop when trying to establish connection · 2c5d8376
      Julius Goryavsky authored
      With wsrep_sst_rsync, node goes into endless loop when trying
      to establish connection to donor for IST/SST if the database
      is bind on specific IP address, not the "*".
      
      This commit fixes this problem. Separate tests are not
      required - the problem can occur in normal configurations
      on a number of systems when selecting a bing address other
      than "*", especially on FreeBSD and with the IPv6 addresses.
      2c5d8376
    • Jan Lindström's avatar
      MDEV-34594 : Assertion `client_state.transaction().active()' failed in · cd8b8bb9
      Jan Lindström authored
      int wsrep_thd_append_key(THD*, const wsrep_key*, int, Wsrep_service_key_type)
      
      CREATE TABLE [SELECT|REPLACE SELECT] is CTAS and idea was that
      we force ROW format. However, it was not correctly enforced
      and keys were appended before wsrep transaction was started.
      
      At THD::decide_logging_format we should force used stmt binlog
      format to ROW in CTAS case and produce a warning if used
      binlog format was not ROW.
      
      At ha_innodb::update_row we should not append keys similarly
      as in ha_innodb::write_row if sql_command is SQLCOM_CREATE_TABLE.
      Improved error logging on ::write_row, ::update_row and ::delete_row
      if wsrep key append fails.
      Signed-off-by: default avatarJulius Goryavsky <julius.goryavsky@mariadb.com>
      cd8b8bb9
    • Alexander Barkov's avatar
      MDEV-34376 Wrong data types when mixing an utf8 *TEXT column and a short binary · 0e273510
      Alexander Barkov authored
      A mixture of a multi-byte *TEXT column and a short binary column
      produced a too large column.
      For example, COALESCE(tinytext_utf8mb4, short_varbinary)
      produced a BLOB column instead of an expected TINYBLOB.
      
      - Adding a virtual method Type_all_attributes::character_octet_length(),
        returning max_length by default.
      - Overriding Item_field::character_octet_length() to extract
        the octet length from the underlying Field.
      - Overriding Item_ref::character_octet_length() to extract
        the octet length from the references Item (e.g. as VIEW fields).
      - Fixing Type_numeric_attributes::find_max_octet_length() to
        take the octet length using the new method character_octet_length()
        instead of accessing max_length directly.
      0e273510