  1. 17 Sep, 2024 1 commit
    • MDEV-33500 (part 2): rpl.rpl_parallel_sbm can still fail · 68938d2b
      Brandon Nesterenko authored
      The failing test case validates Seconds_Behind_Master for a delayed
      slave when STOP SLAVE is executed during a delay. The initial test
      fix (commit b04c8575) added a table lock to ensure a transaction
      could not finish before validating the Seconds_Behind_Master field
      after START SLAVE, but did not address the possibility that the
      transaction could finish before the STOP SLAVE command runs, which
      invalidates the validations for the rest of the test case.
      Specifically, this would result in 1) a timeout in
      “Waiting for table metadata lock” on the replica, which expects the
      transaction to retry after slave restart and hit a lock conflict on
      the locked tables (added in b04c8575), and 2) a failed check that
      Seconds_Behind_Master should have increased.
      
      The failure can be reproduced by synchronizing the slave to the master
      before the MDEV-32265 echo statement (i.e. before the STOP SLAVE).
      
      This patch fixes the test by adding a mechanism to use DEBUG_SYNC to
      synchronize a MASTER_DELAY, rather than continually increase the
      duration of the delay each time the test fails on buildbot. This is
      to ensure that on slow machines, a delay does not pass before the
      test gets a chance to validate results. Additionally, it decreases
      overall test time because the test can continue immediately after
      validation, thereby bypassing the remainder of a full delay for each
      transaction.
      68938d2b
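The idea in the commit above, replacing a wall-clock delay with an explicit synchronization signal, can be sketched outside the server. This is only an analogy under stated assumptions: `threading.Event` stands in for a DEBUG_SYNC point, and the names `delayed_apply`, `release`, and `applied` are hypothetical, not MariaDB identifiers.

```python
import threading

def delayed_apply(release, applied, timeout=5.0):
    """Hypothetical replica worker: hold a transaction until released."""
    # Block on an explicit signal instead of sleeping for a fixed delay.
    # The timeout is only a safety net so this sketch can never hang.
    if release.wait(timeout):
        applied.append("trx1")

release = threading.Event()
applied = []
worker = threading.Thread(target=delayed_apply, args=(release, applied))
worker.start()

# The "test" can validate here for as long as it needs: the transaction
# cannot finish early, however slow the machine is.
state_during_delay = list(applied)

release.set()   # validation done, let the transaction proceed at once
worker.join()
state_after_release = list(applied)
```

Because the worker is released by a signal rather than by elapsed time, the validation window cannot close early, and no time is spent waiting out the rest of a delay.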
  2. 08 Jul, 2024 1 commit
    • MDEV-32892: IO Thread Reports False Error When Stopped During Connecting to Primary · 744580d5
      Brandon Nesterenko authored
      The IO thread can report error code 2013 into the error log when it
      is stopped during the initial connection process to the primary, as
      well as when trying to read an event. However, because the IO thread
      is being stopped, its connection to the primary is force-killed by
      the signaling thread (see THD::awake_no_mutex()), and therefore these
      connection errors should be ignored.
      
      Reviewed By:
      ============
      Kristian Nielsen <knielsen@knielsen-hq.org>
      744580d5
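The rule described in the commit above amounts to a small predicate. The sketch below is a simplified illustration, not the server code: the function name `should_report_io_error` and the `io_thread_stopping` flag are hypothetical, and only the error code 2013 comes from the commit message.

```python
CR_SERVER_LOST = 2013  # client error code named in the commit message

def should_report_io_error(errno, io_thread_stopping):
    """Return True if the IO thread should write this error to the log."""
    if io_thread_stopping and errno == CR_SERVER_LOST:
        # The connection was force-killed as part of the stop, so the
        # lost-connection error is expected and is not reported.
        return False
    return True
```

A lost connection on a running IO thread is still reported; only the combination of "being stopped" and error 2013 is suppressed.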
  3. 06 May, 2024 1 commit
    • MDEV-34071: Failure during the galera_3nodes_sr.GCF-336 test · 52c45332
      Julius Goryavsky authored
      This commit fixes sporadic failures in galera_3nodes_sr.GCF-336
      test. The following changes have been made here:
      
      1) A small addition to the test itself which should make
         it more deterministic by waiting for non-primary state
         before COMMIT;
      2) More careful handling of the wsrep_ready variable in
         the server code (it should always be protected with mutex).
      
      No additional tests are required.
      52c45332
  4. 05 May, 2024 1 commit
  5. 20 Apr, 2024 1 commit
    • MDEV-19415: use-after-free on charsets_dir from slave connect · 57f6a1ca
      Kristian Nielsen authored
      The slave IO thread sets MYSQL_SET_CHARSET_DIR. However, the code for
      this option in sql-common/client.c is not thread-safe. The value set is
      temporarily written to the mysys global variable `charsets_dir` and can
      be seen by other threads running in parallel, which can result in a
      use-after-free error.
      
      The problem was visible as random failures of test cases in the
      multi_source suite with Valgrind or MSAN.
      
      Work around this by not setting the option for slave connect; it is
      redundant anyway, as it just sets the default value.
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
      57f6a1ca
  6. 13 Feb, 2024 1 commit
  7. 30 Jan, 2024 1 commit
    • MDEV-33327: rpl_seconds_behind_master_spike Sensitive to IO Thread Stop Position · c75905ca
      Brandon Nesterenko authored
      rpl.rpl_seconds_behind_master_spike uses the DEBUG_SYNC mechanism to
      count how many format descriptor events (FDEs) have been executed,
      to attempt to pause on a specific relay log FDE after executing
      transactions. However, depending on when the IO thread is stopped,
      it can send an extra FDE before sending the transactions, forcing
      the test to pause before executing any transactions. The test then
      tries to read COUNT from a table that does not exist yet.
      
      This patch fixes this by no longer counting FDEs, but rather by
      programmatically waiting until the SQL thread has executed the
      transaction and then automatically activating the DEBUG_SYNC point
      to trigger at the next relay log FDE.
      c75905ca
  8. 29 Jan, 2024 1 commit
    • MDEV-33327: rpl_seconds_behind_master_spike Sensitive to IO Thread Stop Position · e4f221a5
      Brandon Nesterenko authored
      rpl.rpl_seconds_behind_master_spike uses the DEBUG_SYNC mechanism to
      count how many format descriptor events (FDEs) have been executed,
      to attempt to pause on a specific relay log FDE after executing
      transactions. However, depending on when the IO thread is stopped,
      it can send an extra FDE before sending the transactions, forcing
      the test to pause before executing any transactions. The test then
      tries to read COUNT from a table that does not exist yet.
      
      This patch fixes this by no longer counting FDEs, but rather by
      programmatically waiting until the SQL thread has executed the
      transaction and then automatically activating the DEBUG_SYNC point
      to trigger at the next relay log FDE.
      e4f221a5
  9. 19 Dec, 2023 1 commit
  10. 11 Dec, 2023 1 commit
    • MDEV-10653: SHOW SLAVE STATUS Can Deadlock an Errored Slave · 8dad5148
      Brandon Nesterenko authored
      AKA rpl.rpl_parallel, binlog_encryption.rpl_parallel fails in
      buildbot with timeout in include
      
      A replication parallel worker thread can deadlock with another
      connection running SHOW SLAVE STATUS. That is, if the replication
      worker thread is in do_gco_wait() and is killed, it will already
      hold the LOCK_parallel_entry, and during error reporting, try to
      grab the err_lock. SHOW SLAVE STATUS, however, grabs these locks in
      reverse order. It will initially grab the err_lock, and then try to
      grab LOCK_parallel_entry. This leads to a deadlock when both threads
      have grabbed their first lock without the second.
      
      This patch implements the MDEV-31894 proposed fix to optimize the
      workers_idle() check to compare the last in-use relay log’s
      queued_count==dequeued_count for idleness. This removes the need for
      workers_idle() to grab LOCK_parallel_entry, as these values are
      atomically updated.
      
      Huge thanks to Kristian Nielsen for diagnosing the problem!
      
      Reviewed By:
      ============
      Kristian Nielsen <knielsen@knielsen-hq.org>
      Andrei Elkin <andrei.elkin@mariadb.com>
      8dad5148
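The lock-order problem and the counter-based fix in the commit above can be illustrated with a small model. This is a sketch under assumptions: Python locks stand in for LOCK_parallel_entry and err_lock, the class `RelayLogCounters` is hypothetical, and plain integers stand in for the atomically updated counters.

```python
import threading

lock_parallel_entry = threading.Lock()  # stand-in for LOCK_parallel_entry
err_lock = threading.Lock()             # stand-in for err_lock

class RelayLogCounters:
    """Hypothetical holder for the queued/dequeued event counters."""
    def __init__(self):
        self.queued_count = 0
        self.dequeued_count = 0

def workers_idle(counters):
    # No lock is taken: the commit relies on these values being updated
    # atomically, so a plain comparison is enough for an idleness check.
    return counters.queued_count == counters.dequeued_count

def show_slave_status(counters):
    # Only err_lock is needed now, so the reverse lock order is gone.
    with err_lock:
        return workers_idle(counters)

counters = RelayLogCounters()
counters.queued_count = 3
counters.dequeued_count = 2

# Even while another party holds LOCK_parallel_entry, the status
# query completes, because it no longer waits for that lock.
with lock_parallel_entry:
    busy = show_slave_status(counters)

counters.dequeued_count = 3
idle = show_slave_status(counters)
```

With only one lock left on the status path, the two threads can no longer each hold one lock while waiting for the other.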
  11. 28 Nov, 2023 1 commit
  12. 16 Nov, 2023 1 commit
  13. 23 Oct, 2023 1 commit
    • MDEV-32265: seconds_behind_master is inaccurate for Delayed replication · c5f776e9
      Brandon Nesterenko authored
      If a replica is actively delaying a transaction when restarted (STOP
      SLAVE/START SLAVE), when the sql thread is back up,
      Seconds_Behind_Master will present as 0 until the configured
      MASTER_DELAY has passed. That is, before the restart,
      last_master_timestamp is updated to the timestamp of the delayed
      event. Then after the restart, the negation of sql_thread_caught_up
      is skipped because the timestamp of the event has already been used
      for the last_master_timestamp, and their update is grouped together
      in the same conditional block.
      
      This patch fixes this by separating the negation of
      sql_thread_caught_up out of the timestamp-dependent block, so it is
      called any time an idle parallel slave queues an event to a worker.
      
      Note that sql_thread_caught_up is still left in the check for internal
      events, as SBM should remain idle in such case to not "magically" begin
      incrementing.
      
      Reviewed By:
      ============
      Andrei Elkin <andrei.elkin@mariadb.com>
      c5f776e9
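The effect of moving the negation of sql_thread_caught_up out of the timestamp-dependent block, as the commit above describes, can be shown with a toy model. The exact conditions in the server are not given in the commit message, so the comparison used here, the `ReplicaState` class, and both function names are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class ReplicaState:
    last_master_timestamp: int = 0
    sql_thread_caught_up: bool = True

def queue_event_before_fix(state, event_ts):
    # Both updates share one timestamp-dependent block (assumed condition).
    if state.sql_thread_caught_up and event_ts > state.last_master_timestamp:
        state.last_master_timestamp = event_ts
        state.sql_thread_caught_up = False

def queue_event_after_fix(state, event_ts):
    # The flag is cleared whenever an idle replica queues an event,
    # whether or not the timestamp needs updating.
    if state.sql_thread_caught_up:
        if event_ts > state.last_master_timestamp:
            state.last_master_timestamp = event_ts
        state.sql_thread_caught_up = False

def seconds_behind_master(state, now):
    if state.sql_thread_caught_up:
        return 0
    return max(0, now - state.last_master_timestamp)

# After a restart the timestamp of the delayed event (100) is already stored.
before = ReplicaState(last_master_timestamp=100)
queue_event_before_fix(before, event_ts=100)
after = ReplicaState(last_master_timestamp=100)
queue_event_after_fix(after, event_ts=100)

sbm_before = seconds_behind_master(before, now=130)
sbm_after = seconds_behind_master(after, now=130)
```

In this model the pre-fix logic keeps reporting 0 because the flag is never cleared, while the post-fix logic reports the 30 seconds that have actually elapsed.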
  14. 13 Sep, 2023 1 commit
    • MDEV-31177: SHOW SLAVE STATUS Last_SQL_Errno Race Condition on Errored Slave Restart · 1407f999
      Brandon Nesterenko authored
      The SQL thread and a user connection executing SHOW SLAVE STATUS
      have a race condition on Last_SQL_Errno, such that a slave which
      previously errored and stopped, on its next start, SHOW SLAVE STATUS
      can show that the SQL Thread is running while the previous error is
      also showing.
      
      The fix is to clear the last error earlier during SQL thread
      startup, before the status of Slave_SQL_Running is set.
      
      Thanks to Kristian Nielsen for his work diagnosing the problem!
      
      Reviewed By:
      ============
      Andrei Elkin <andrei.elkin@mariadb.com>
      Kristian Nielsen <knielsen@knielsen-hq.org>
      1407f999
  15. 12 Sep, 2023 1 commit
    • MDEV-31833 replication breaks when using optimistic replication and replica is a galera node · a3cbc44b
      sjaakola authored
      The MariaDB async replication SQL thread was stopped on any failure
      to apply replication events, and the error message logged for the
      failure was: "Node has dropped from cluster". The assumption was that
      an event applying failure is always due to the node dropping out.
      With optimistic parallel replication, event applying can fail for natural
      reasons, and applying should be retried to handle the failure. This retry
      logic was never exercised because the slave SQL thread was stopped at the
      first applying failure.
      
      To support the optimistic parallel replication retry logic, this commit
      skips the replication slave abort if the node remains in the cluster
      (wsrep_ready==ON) and replication is configured for optimistic or
      aggressive retry logic.
      
      During the development of this fix, galera.galera_as_slave_nonprim test showed
      some problems. The test was analyzed, and it appears to need some attention.
      One excessive sleep command was removed in this commit, but it will need more
      fixes still to be fully deterministic. After this commit galera_as_slave_nonprim
      is successful, though.
      Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
      a3cbc44b
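The abort decision described in the commit above reduces to a short predicate. The sketch below is illustrative only: the function name and its parameters are hypothetical, and the mode strings follow the values of the `slave_parallel_mode` system variable.

```python
RETRY_MODES = {"optimistic", "aggressive"}

def should_abort_slave(apply_failed, wsrep_ready, slave_parallel_mode):
    """Decide whether an apply failure should stop the slave SQL thread."""
    if not apply_failed:
        return False
    if wsrep_ready and slave_parallel_mode in RETRY_MODES:
        # The node is still in the cluster, so leave the failure to the
        # parallel replication retry logic instead of aborting.
        return False
    return True
```

Under these assumptions a failure on a node that has really dropped out (wsrep_ready OFF) still aborts, as does any failure in a mode without retry logic.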
  16. 15 Aug, 2023 1 commit
    • MDEV-31655: Parallel replication deadlock victim preference code erroneously removed · 900c4d69
      Kristian Nielsen authored
      Restore code to make InnoDB choose the second transaction as a deadlock
      victim if two transactions deadlock that need to commit in-order for
      parallel replication. This code was erroneously removed when VATS was
      implemented in InnoDB.
      
      Also add a test case for InnoDB choosing the right deadlock victim.
      Also fixes this bug, with testcase that reliably reproduces:
      
      MDEV-28776: rpl.rpl_mark_optimize_tbl_ddl fails with timeout on sync_with_master
      
      Note: This should be null-merged to 10.6, as a different fix is needed
      there due to InnoDB locking code changes.
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
      900c4d69
  17. 08 Aug, 2023 1 commit
    • MDEV-31413 : Node has been dropped from the cluster on Startup / Shutdown with async replica · 277968aa
      Jan Lindström authored
      There were two related problems:
      
      (1) A Galera node that is defined as a slave to an async MariaDB
      master might do an SST (state transfer) at restart and, as
      part of that, copy the mysql.gtid_slave_pos table.
      The problem is that updates to that table are not replicated
      within the cluster. Therefore, the table is copied from a donor
      that is not a slave, so the joiner loses the GTID position it was at
      and starts executing events from the wrong position in the binlog.
      This incorrect position could break replication and
      cause the node to be dropped, requiring user action.
      
      (2) The slave SQL thread might start executing events before
      Galera is ready (wsrep_ready=ON), which could also
      cause the node to be dropped from the cluster.
      
      In this fix we enable replication of mysql.gtid_slave_pos
      table on a cluster. In this way all nodes in a cluster
      will know gtid slave position and even after SST joiner
      knows correct gtid position to start.
      
      Furthermore, we wait for Galera to be ready before the slave
      SQL thread executes any events, to prevent too-early
      execution.
      Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
      277968aa
  18. 25 Jul, 2023 1 commit
    • MDEV-30619: Parallel Slave SQL Thread Can Update Seconds_Behind_Master with Active Workers · 063f4ac2
      Brandon Nesterenko authored
      MDEV-31749 sporadic assert in MDEV-30619 new test
      
      If the workers of a parallel replica are busy (potentially with long
      queues), but the SQL thread has no events left to distribute (so it
      goes idle), then the next event that comes from the primary will
      update mi->last_master_timestamp with its timestamp, even if the
      workers have not yet finished.
      
      This patch changes the parallel replica logic which updates
      last_master_timestamp after idling from using solely sql_thread_caught_up
      (added in MDEV-29639) to using it together with the rli queued/dequeued
      event counters.
      That is, if the queued count is equal to the dequeued count, all
      events have been processed, and the replica is considered idle
      when the driver thread has also distributed all events.
      
      Low level details of the commit include
      - to make a more generalized test for Seconds_Behind_Master on
        the parallel replica, rpl_delayed_parallel_slave_sbm.test
        is renamed to rpl_parallel_sbm.test for this purpose.
      - pause_sql_thread_on_next_event usage was removed
        with the MDEV-30619 fixes. Rather than remove it, we adapt it
        to the needs of this test case
      - added test case to cover SBM spike of relay log read and LMT
        update that was fixed by MDEV-29639
      - rpl_seconds_behind_master_spike.test is made to use
        the negate_clock_diff_with_master debug eval.
      
      Reviewed By:
      ============
      Andrei Elkin <andrei.elkin@mariadb.com>
      063f4ac2
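The idleness rule from the commit above, and its effect on last_master_timestamp, can be written as two small functions. This is a simplified sketch: the function names are hypothetical, and the real server logic has more conditions than the two named in the commit message.

```python
def replica_is_idle(sql_thread_caught_up, queued_count, dequeued_count):
    # Idle only if the driver thread has distributed everything AND the
    # workers have finished everything that was queued to them.
    return sql_thread_caught_up and queued_count == dequeued_count

def next_last_master_timestamp(current, event_ts, sql_thread_caught_up,
                               queued_count, dequeued_count):
    """Return the value last_master_timestamp should hold after an event."""
    if replica_is_idle(sql_thread_caught_up, queued_count, dequeued_count):
        return event_ts   # first event after a true idle period
    return current        # workers still busy: keep the old timestamp
```

The second function captures the bug fix: with busy workers (queued count ahead of dequeued count), a newly arrived event no longer overwrites the timestamp.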
  19. 06 Jun, 2023 1 commit
  20. 05 Jun, 2023 1 commit
    • MDEV-13915: STOP SLAVE takes very long time on a busy system · 0a99d457
      Brandon Nesterenko authored
      The problem is that a parallel replica would not immediately stop
      running/queued transactions when issued STOP SLAVE. That is, it
      allowed the current group of transactions to run, and sometimes the
      transactions which belong to the next group could be started and run
      through commit after STOP SLAVE was issued too, if the last group
      had started committing. This would lead to long periods to wait for
      all waiting transactions to finish.
      
      This patch updates a parallel replica to try to abort immediately
      and roll back any ongoing transactions. The exception is that any
      transactions which are non-transactional (e.g. those modifying
      sequences or non-transactional tables), and any transactions prior
      to them, will be run to completion.
      
      The specifics are as follows:
      
       1. A new stage was added to SHOW PROCESSLIST output for the SQL
      Thread when it is waiting for a replica thread to either rollback or
      finish its transaction before stopping. This stage presents as
      “Waiting for worker thread to stop”
      
       2. Worker threads which error or are killed no longer perform GCO
      cleanup if there is a concurrently running prior transaction. This
      is because a worker thread scheduled to run in a future GCO could be
      killed and incorrectly perform cleanup of the active GCO.
      
       3. Refined cases when the FL_TRANSACTIONAL flag is added to GTID
      binlog events to disallow adding it to transactions which modify
      both transactional and non-transactional engines when the binlogging
      configuration allows the modifications to exist in the same event,
      i.e. when using binlog_direct_non_trans_update == 0 and
      binlog_format == statement.
      
       4. A few existing MTR tests relied on the completion of certain
      transactions after issuing STOP SLAVE, and were re-recorded
      (potentially with added synchronizations) under the new rollback
      behavior.
      
      Reviewed By
      ===========
      Andrei Elkin <andrei.elkin@mariadb.com>
      0a99d457
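The stop rule stated in the commit above (roll back ongoing transactions, except non-transactional ones and everything before them) can be sketched as a planning function. The `WorkerTrx` class, the `seq_no` ordering field, and `plan_stop` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass

@dataclass
class WorkerTrx:
    seq_no: int          # commit order of the transaction
    transactional: bool  # False if it touches sequences or
                         # non-transactional tables

def plan_stop(trxs):
    """Map each seq_no to 'finish' or 'rollback' when STOP SLAVE arrives."""
    non_trans = [t.seq_no for t in trxs if not t.transactional]
    last_must_finish = max(non_trans, default=None)
    plan = {}
    for t in trxs:
        if last_must_finish is not None and t.seq_no <= last_must_finish:
            # Non-transactional work cannot be rolled back, and anything
            # prior to it must also complete to keep the commit order.
            plan[t.seq_no] = "finish"
        else:
            plan[t.seq_no] = "rollback"
    return plan

mixed = plan_stop([WorkerTrx(1, True), WorkerTrx(2, False), WorkerTrx(3, True)])
all_trans = plan_stop([WorkerTrx(1, True), WorkerTrx(2, True)])
```

In the mixed case, transactions 1 and 2 run to completion and only transaction 3 is rolled back; when every transaction is transactional, all of them are rolled back.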
  21. 27 Apr, 2023 1 commit
    • MDEV-29621: Replica stopped by locks on sequence · 55a53949
      Andrei authored
      When using binlog_row_image=FULL with sequence table inserts, a
      replica can deadlock because it treats full inserts in a sequence as DDL
      statements by getting an exclusive lock on the sequence table. It
      has been observed that with parallel replication, this exclusive
      lock on the sequence table can lead to a deadlock where one
      transaction has the exclusive lock and is waiting on a prior
      transaction to commit, whereas this prior transaction is waiting on
      the MDL lock.
      
      The fix for this is on the master side: raise the FL_DDL
      flag on the GTID of a full binlog_row_image write of a sequence table.
      This forces the slave to execute the statement serially so a deadlock
      cannot happen.
      
      A test also verifies that the deadlock happens on the OLD (pre-fix)
      slave.
      
      Support for OLD (buggy master) -replication-> NEW (fixed slave) is provided.
      As the pre-fix master's full row-image may represent both
      SELECT NEXT VALUE and INSERT, the parallel slave pessimistically
      waits for the prior transaction to have committed before taking on the
      critical part of the second (like INSERT in the test) event execution.
      The waiting exploits a parallel slave's retry mechanism, which is
      controlled by `@@global.slave_transaction_retries`.
      
      Note that in order to avoid any persistent 'Deadlock found' 1213 error
      in OLD -> NEW, `slave_transaction_retries` may need to be set to a
      value higher than the default.
      START SLAVE is an effective work-around if this still happens.
      55a53949
  22. 28 Mar, 2023 1 commit
    • MDEV-30936 clang 15.0.7 -fsanitize=memory fails massively · dfa90257
      Marko Mäkelä authored
      handle_slave_io(), handle_slave_sql(), os_thread_exit():
      Remove a redundant pthread_exit(nullptr) call, because it
      would cause SIGSEGV.
      
      mysql_print_status(): Add MEM_MAKE_DEFINED() to work around
      some missing instrumentation around mallinfo2().
      
      que_graph_free_stat_list(): Invoke que_node_get_next(node) before
      que_graph_free_recursive(node). That is the logical and
      MSAN_OPTIONS=poison_in_dtor=1 compatible way of freeing memory.
      
      ins_node_t::~ins_node_t(): Invoke mem_heap_free(entry_sys_heap).
      
      que_graph_free_recursive(): Rely on ins_node_t::~ins_node_t().
      
      fts_t::~fts_t(): Invoke mem_heap_free(fts_heap).
      
      fts_free(): Replace with direct calls to fts_t::~fts_t().
      
      The failures in free_root() due to MSAN_OPTIONS=poison_in_dtor=1
      will be covered in MDEV-30942.
      dfa90257
  23. 09 Feb, 2023 1 commit
    • MDEV-30608: rpl.rpl_delayed_parallel_slave_sbm sometimes fails with... · eecd4f14
      Brandon Nesterenko authored
      MDEV-30608: rpl.rpl_delayed_parallel_slave_sbm sometimes fails with Seconds_Behind_Master should not have used second transaction timestamp
      
      One of the constraints added in the MDEV-29639 patch, is that only
      the first event after idling should update last_master_timestamp;
      and as long as the replica has more events to execute, the variable
      should not be updated. The corresponding test,
      rpl_delayed_parallel_slave_sbm.test, aims to verify this; however,
      if the IO thread takes too long to queue events, the SQL thread can
      appear to catch up too fast.
      
      This fix ensures that the relay log has been fully written before
      executing the events.
      
      Note that the underlying cause of this test failure needs to be
      addressed as a bug-fix, this is a temporary fix to stop test
      failures. To track work on the bug-fix for the underlying issue,
      please see MDEV-30619.
      eecd4f14
  24. 24 Jan, 2023 1 commit
    • MDEV-29639: Seconds_Behind_Master is incorrect for Delayed, Parallel Replicas · d69e8357
      Brandon Nesterenko authored
      Problem
      ========
      On a parallel, delayed replica, Seconds_Behind_Master will not be
      calculated until after MASTER_DELAY seconds have passed and the
      event has finished executing, resulting in potentially very large
      values of Seconds_Behind_Master (which could be much larger than the
      MASTER_DELAY parameter) for the entire duration the event is
      delayed. This contradicts the documented MASTER_DELAY behavior,
      which specifies how many seconds to withhold replicated events from
      execution.
      
      Solution
      ========
      After a parallel replica idles, the first event after idling should
      immediately update last_master_timestamp with the time that it began
      execution on the primary.
      
      Reviewed By
      ===========
      Andrei Elkin <andrei.elkin@mariadb.com>
      d69e8357
  25. 30 Nov, 2022 1 commit
  26. 22 Nov, 2022 2 commits
    • MDEV-29817: Issues with handling options for SSL CRLs (and some others) · 1ebf0b73
      Julius Goryavsky authored
      This patch adds the correct setting of the "--tls-version" and
      "--ssl-verify-server-cert" options in the client-side utilities
      such as mysqltest, mysqlcheck and mysqlslap, as well as the correct
      setting of the "--ssl-crl" option when executing queries on the
      slave side, and also the correct option codes in the "sslopts-logopts.h"
      file (in the latter case, incorrect values are not a problem right
      now, but may cause subtle test failures in the future, if the option
      handling code changes).
      1ebf0b73
    • MDEV-29817: Issues with handling options for SSL CRLs (and some others) · f0820400
      Julius Goryavsky authored
      This patch adds the correct setting of the "--ssl-verify-server-cert"
      option in the client-side utilities such as mysqlcheck and mysqlslap,
      as well as the correct setting of the "--ssl-crl" option when executing
      queries on the slave side, and also add the correct option codes in
      the "sslopts-logopts.h" file (in the latter case, incorrect values
      are not a problem right now, but may cause subtle test failures in
      the future, if the option handling code changes).
      f0820400
  27. 23 Sep, 2022 2 commits
    • Fix build without either ENABLED_DEBUG_SYNC or DBUG_OFF · 3c92050d
      Marko Mäkelä authored
      There are separate flags DBUG_OFF for disabling the DBUG facility
      and ENABLED_DEBUG_SYNC for enabling the DEBUG_SYNC facility.
      Let us allow debug builds without DEBUG_SYNC.
      
      Note: For CMAKE_BUILD_TYPE=Debug, CMakeLists.txt will continue to
      define ENABLED_DEBUG_SYNC.
      3c92050d
    • MDEV-29613 Improve WITH_DBUG_TRACE=OFF · a69cf6f0
      Marko Mäkelä authored
      In commit 28325b08
      a compile-time option was introduced to disable the macros
      DBUG_ENTER and DBUG_RETURN or DBUG_VOID_RETURN.
      
      The parameter name WITH_DBUG_TRACE would hint that it also
      covers DBUG_PRINT statements. Let us do that: WITH_DBUG_TRACE=OFF
      shall disable DBUG_PRINT() as well.
      
      A few InnoDB recovery tests used to check that some output from
      DBUG_PRINT("ib_log", ...) is present. We can live without those checks.
      
      Reviewed by: Vladislav Vaintroub
      a69cf6f0
  28. 03 Jun, 2022 1 commit
  29. 13 May, 2022 1 commit
    • MDEV-28550 improper handling of replication event group that contains · 726bd8c9
      Andrei authored
      GTID_LIST_EVENT or INCIDENT_EVENT.
      
      It's legal to have either of the two inside a group. E.g.
        Gtid_event, Gtid_log_list_event, Query_1, ... Xid_log_event
      is permitted.
      However, the slave IO thread treated both
      as the terminal event when the group represents a DDL query.
      That causes a premature Gtid state update, so the slave IO thread would think
      the whole group has been collected while in fact Query_1 etc. are yet to be processed.
      
      Fixed with correcting a condition to compute the terminal event
      of the group.
      Tested with rpl_mysqlbinlog_slave_consistency (of 10.9) and
      rpl_gtid_errorlog.test.
      726bd8c9
  30. 28 Apr, 2022 1 commit
  31. 26 Apr, 2022 2 commits
  32. 25 Apr, 2022 1 commit
    • MDEV-27697 slave must recognize incomplete replication event group · 1bcdc3e9
      Andrei authored
      When a faulty master, or an incorrect binlog event producer that the slave
      is working with, sends an incomplete group of events, the slave must react
      with an error so that it does not log into the relay log any new events
      that do not belong to the incomplete group.
      
      Fixed by extending the received-event property checks when the slave
      connects to the master in GTID mode.
      Specifically, for an event that can be part of a group, relay-logging is
      permitted only when its position within the group is validated.
      Otherwise the slave IO thread stops with ER_SLAVE_RELAY_LOG_WRITE_FAILURE.
      1bcdc3e9
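The position check described in the commit above can be modeled as a small state machine over an event stream. This is a deliberately reduced sketch: it only handles transactional groups of the form GTID, body events, XID; the event names are plain strings rather than server event types; and standalone (DDL) groups are not modeled.

```python
GROUP_BODY = {"QUERY", "TABLE_MAP", "WRITE_ROWS"}  # events valid only inside a group

def first_invalid_event(events):
    """Return the index of the first misplaced event, or None if valid."""
    in_group = False
    for i, ev in enumerate(events):
        if ev == "GTID":
            if in_group:
                return i      # new group starts inside an incomplete one
            in_group = True
        elif ev == "XID":
            if not in_group:
                return i      # terminator without a group
            in_group = False
        elif ev in GROUP_BODY:
            if not in_group:
                return i      # body event outside any group
        # Other events (e.g. ROTATE) are not part of a group and pass through.
    return None

complete = ["GTID", "QUERY", "XID", "GTID", "WRITE_ROWS", "XID"]
truncated = ["GTID", "QUERY", "GTID", "QUERY", "XID"]
```

For the `truncated` stream the check flags index 2, the point where a new group begins before the previous one has terminated; in the server, the corresponding reaction is for the IO thread to stop with an error instead of writing the event to the relay log.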
  33. 22 Apr, 2022 1 commit
    • MDEV-11853: semisync thread can be killed after sync binlog but before ACK in the sync state · a83c7ab1
      Brandon Nesterenko authored
      Problem:
      ========
      If a primary is shut down during an active semi-sync connection
      while it is awaiting an ACK, the primary hard kills the active
      communication thread and does not ensure the transaction was
      received by a replica. This can lead to an inconsistent
      replication state.
      
      Solution:
      ========
      During shutdown, the primary should wait for an ACK or timeout
      before hard killing a thread which is awaiting a communication. We
      extend the `SHUTDOWN WAIT FOR SLAVES` logic to identify and ignore
      any threads waiting for a semi-sync ACK in phase 1. Then, before
      stopping the ack receiver thread, the shutdown is delayed until all
      waiting semi-sync connections receive an ACK or time out. The
      connections are then killed in phase 2.
      
      Notes:
       1) There remains an unresolved corner case that affects this
      patch. MDEV-28141: Slave crashes with Packets out of order when
      connecting to a shutting down master. Specifically, if a slave is
      connecting to a master which is actively shutting down, the slave
      can crash with a "Packets out of order" assertion error. To get
      around this issue in the MTR tests, the primary will wait a small
      amount of time before killing threads in phase 1 to let the replicas
      safely stop (if applicable).
       2) This patch also fixes MDEV-28114: Semi-sync Master ACK Receiver
      Thread Can Error on COM_QUIT
      
      Reviewed By
      ============
      Andrei Elkin <andrei.elkin@mariadb.com>
      a83c7ab1
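The shutdown sequence described in the commit above (phase 1 skips threads awaiting an ACK, then wait, then phase 2) can be sketched as follows. The `Conn` class, the `killed_in_phase` field, and the `wait_for_ack` callback are hypothetical; the callback stands in for "block until an ACK arrives or the semi-sync timeout expires".

```python
from dataclasses import dataclass

@dataclass
class Conn:
    name: str
    awaiting_ack: bool
    killed_in_phase: int = 0   # 0 means still alive

def shutdown_wait_for_slaves(conns, wait_for_ack):
    # Phase 1: kill everything except threads awaiting a semi-sync ACK.
    for c in conns:
        if not c.awaiting_ack:
            c.killed_in_phase = 1
    # Delay the shutdown until each waiting connection gets its ACK
    # or times out (the callback returns in either case).
    for c in conns:
        if c.killed_in_phase == 0:
            wait_for_ack(c)
    # Phase 2: kill the connections that were spared in phase 1.
    for c in conns:
        if c.killed_in_phase == 0:
            c.killed_in_phase = 2

waited = []
conns = [Conn("idle_client", False), Conn("semi_sync_trx", True)]
shutdown_wait_for_slaves(conns, lambda c: waited.append(c.name))
phases = {c.name: c.killed_in_phase for c in conns}
```

The ordering is the point of the sketch: the connection waiting for an ACK is never killed before it has had the chance to receive one.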
  34. 04 Jan, 2022 1 commit
    • MDEV-16091: Seconds_Behind_Master spikes to millions of seconds · 96de6bfd
      Brandon Nesterenko authored
      Problem:
      ========
      A slave’s relay log format description event is used when
      calculating Seconds_Behind_Master (SBM). This forces the SBM
      value to spike when processing these events, as their creation
      date is set to the timestamp that the IO thread begins.
      
      Solution:
      ========
      When the slave generates a format description event, mark the
      event as a relay log event so it does not update the
      rli->last_master_timestamp variable.
      
      Reviewed By:
      ============
      Andrei Elkin <andrei.elkin@mariadb.com>
      96de6bfd
  35. 29 Oct, 2021 3 commits
    • MDEV-23328 Server hang due to Galera lock conflict resolution · ef2dbb8d
      sjaakola authored
      Mutex order violation when wsrep bf thread kills a conflicting trx,
      the stack is
      
                wsrep_thd_LOCK()
                wsrep_kill_victim()
                lock_rec_other_has_conflicting()
                lock_clust_rec_read_check_and_lock()
                row_search_mvcc()
                ha_innobase::index_read()
                ha_innobase::rnd_pos()
                handler::ha_rnd_pos()
                handler::rnd_pos_by_record()
                handler::ha_rnd_pos_by_record()
                Rows_log_event::find_row()
                Update_rows_log_event::do_exec_row()
                Rows_log_event::do_apply_event()
                Log_event::apply_event()
                wsrep_apply_events()
      
      and mutexes are taken in the order
      
                lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data
      
      When a normal KILL statement is executed, the stack is
      
                innobase_kill_query()
                kill_handlerton()
                plugin_foreach_with_mask()
                ha_kill_query()
                THD::awake()
                kill_one_thread()
      
              and mutexes are
      
                victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex
      
      This patch is the plan D variant for fixing the potential mutex locking
      order violation exercised by BF aborting and KILL command execution.
      
      In this approach, KILL command is replicated as TOI operation.
      This guarantees total isolation for the KILL command execution
      in the first node: there is no concurrent replication applying
      and no concurrent DDL executing. Therefore there is no risk of
      BF aborting to happen in parallel with KILL command execution
      either. Potential mutex deadlocks between the different mutex
      access paths with KILL command execution and BF aborting cannot
      therefore happen.
      
      TOI replication is used, in this approach, purely as a means
      to provide isolated KILL command execution in the first node.
      KILL command should not (and must not) be applied in secondary
      nodes. In this patch, we ensure this by skipping KILL
      execution in secondary nodes, in the applying phase, where we
      bail out if the applier thread is trying to execute a KILL command.
      This is effective, but skipping the applying of the KILL command
      could happen much earlier as well.
      
      This also fixed unprotected calls to wsrep_thd_abort
      that will use wsrep_abort_transaction. This is fixed
      by holding THD::LOCK_thd_data while we abort transaction.
      Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
      ef2dbb8d
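The mutex order violation in the commit above can be checked mechanically by comparing the two acquisition sequences quoted in the message. The helper below is a generic sketch, not a MariaDB tool; `find_order_conflicts` is a hypothetical name, and the lock names are copied from the commit text as plain strings.

```python
def find_order_conflicts(*paths):
    """Return the lock pairs that different paths acquire in opposite order."""
    before = set()
    for path in paths:
        for i, first in enumerate(path):
            for later in path[i + 1:]:
                before.add((first, later))   # 'first' is held before 'later'
    # A pair is a deadlock risk if both orders have been observed.
    return {frozenset(pair) for pair in before if (pair[1], pair[0]) in before}

LOCK_SYS = "lock_sys->mutex"
TRX = "victim_trx->mutex"
THD = "victim_thread->LOCK_thd_data"

bf_abort_path = [LOCK_SYS, TRX, THD]   # wsrep BF thread killing a victim
kill_path = [THD, LOCK_SYS, TRX]       # normal KILL statement

conflicts = find_order_conflicts(bf_abort_path, kill_path)
```

The two conflicting pairs both involve LOCK_thd_data, which matches the commit's approach of isolating KILL execution so that the two paths cannot run concurrently.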
    • MDEV-25114: Crash: WSREP: invalid state ROLLED_BACK (FATAL) · d5bc0579
      Jan Lindström authored
      Revert "MDEV-23328 Server hang due to Galera lock conflict resolution"
      
      This reverts commit eac8341d.
      d5bc0579
    • MDEV-23328 Server hang due to Galera lock conflict resolution · 5c230b21
      sjaakola authored
      Mutex order violation when wsrep bf thread kills a conflicting trx,
      the stack is
      
                wsrep_thd_LOCK()
                wsrep_kill_victim()
                lock_rec_other_has_conflicting()
                lock_clust_rec_read_check_and_lock()
                row_search_mvcc()
                ha_innobase::index_read()
                ha_innobase::rnd_pos()
                handler::ha_rnd_pos()
                handler::rnd_pos_by_record()
                handler::ha_rnd_pos_by_record()
                Rows_log_event::find_row()
                Update_rows_log_event::do_exec_row()
                Rows_log_event::do_apply_event()
                Log_event::apply_event()
                wsrep_apply_events()
      
      and mutexes are taken in the order
      
                lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data
      
      When a normal KILL statement is executed, the stack is
      
                innobase_kill_query()
                kill_handlerton()
                plugin_foreach_with_mask()
                ha_kill_query()
                THD::awake()
                kill_one_thread()
      
              and mutexes are
      
                victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex
      
      This patch is the plan D variant for fixing the potential mutex locking
      order violation exercised by BF aborting and KILL command execution.
      
      In this approach, KILL command is replicated as TOI operation.
      This guarantees total isolation for the KILL command execution
      in the first node: there is no concurrent replication applying
      and no concurrent DDL executing. Therefore there is no risk of
      BF aborting to happen in parallel with KILL command execution
      either. Potential mutex deadlocks between the different mutex
      access paths with KILL command execution and BF aborting cannot
      therefore happen.
      
      TOI replication is used, in this approach, purely as a means
      to provide isolated KILL command execution in the first node.
      KILL command should not (and must not) be applied in secondary
      nodes. In this patch, we ensure this by skipping KILL
      execution in secondary nodes, in the applying phase, where we
      bail out if the applier thread is trying to execute a KILL command.
      This is effective, but skipping the applying of the KILL command
      could happen much earlier as well.
      
      This also fixed unprotected calls to wsrep_thd_abort
      that will use wsrep_abort_transaction. This is fixed
      by holding THD::LOCK_thd_data while we abort transaction.
      Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
      5c230b21