1. 18 Sep, 2024 6 commits
    • The tests are extended to cover a review note, as well as .. · f97cfb78
      Andrei authored
      .. a need spotted later: handling parallel XAP_1 and T_2, where
      T_2 does not find XAP_1 in the prepared state and should still skip
      its locks.
    • "Eager" review notes are addressed and more · 52e6e0b3
      Andrei authored
      - the non-unique index examination is moved out of
        thd_rpl_deadlock_check into a new thd_rpl_xa_non_uniq_index_hit
      - thd_rpl_xa_non_uniq_index_hit also covers the case of an ongoing
        XA-Prepare parent (whose locks are allowed to be skipped at
        scanning)
      - preparation for a simulated delay for the next commit's tests.
      
      The earlier added
        rpl_group_info::exec_flags
        remains. The reasons are that its alternative, a new HA_ error,
        would be slave-specific, while, as pointed out in the review mail
        thread, the added bit flag may provide more service in the future.
    • MDEV-34481 optimize away waiting for owned by prepared xa non-unique index record · ab73829a
      Andrei authored
      This work partly implements the ROW binlog format part of MDEV-33455,
      that is, it makes XA replication of tables that have only non-unique
      indexes safe in RBR.
      The existence of at least one non-null unique index has always
      guaranteed safety (no hang ending in an error).
      
      Two transactions that update a table with only non-unique indexes
      could not be isolated on the slave when on the slave they used
      different non-unique indexes than on the master.
      An unresolvable hang could be seen when the first of the two is a
      prepared XA
      
      --connection slave_worker_1
      
        xa start 'xid'; /* ... lock here ... */ ; xa prepare 'xid'
      
      while the 2nd, being of any kind including the normal type,
      
      --connection slave_worker_2
      
        begin; /* ... get lock ... => wait/hang...error out  */
      
      was unable to wait out the conflicting lock, even though
      the XA transaction did not really lock the target records of the 2nd.
      
      This type of hang was caused by the method the 2nd transaction
      uses to reach the target record, which is an index scan. Conventional
      scanning simply could not step over a record in its way that was
      locked by the XA.
      However, the in-the-way record cannot be targeted by the 2nd
      transaction, otherwise the transactions would have sensed the conflict
      back on the master, *and* the other possibility, of the 'xid' collecting
      extra locks on non-modified records, is tackled by MDEV-33454/MDEV-34466.
      Therefore the non-unique index scanning server/slave-applier layer must
      not panic at seeing a timeout error from the engine. Instead the scan
      just proceeds to the next, possibly free, index records of the same key
      value and ultimately must reach the target one.
      More generally, on the way to its target, the current non-unique index
      scanning transaction need not try to lock any busy records that belong
      to earlier (in binlog order) prepared XA transactions.
      
      This patch implements the plan for InnoDB.
      The server layer expects the engine to mark an attempt to wait for
      a conflicting lock that belongs to a transaction in the prepared state.
      The engine does not, and need not, perform the timeout wait.
      Once the marking is done, the timeout error is ignored by the server
      and the next index record is tried.
      
      An mtr test checks a scenario in sequential and parallel modes.
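
The scanning rule described in this commit message can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `IndexRecord`, `LockOwner`, `ScanResult` and `scan_for_row` are hypothetical names invented for the illustration and are not MariaDB or InnoDB structures. The model simply steps over records whose conflicting lock belongs to a prepared XA instead of waiting for them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one non-unique index record (not an InnoDB type).
enum class LockOwner { None, ActiveTrx, PreparedXa };

struct IndexRecord {
  int key;          // non-unique key value
  int row_id;       // identifies the row behind the index entry
  LockOwner owner;  // who holds a conflicting lock on the record, if anyone
};

enum class ScanResult { Found, NotFound, LockWaitTimeout };

// Scan all records carrying `key`, looking for `target_row`.
// A record locked by a prepared XA is stepped over rather than waited for,
// because (per the commit message) it cannot be the target.
// A record locked by an ordinary active transaction is modelled as a timeout.
ScanResult scan_for_row(const std::vector<IndexRecord>& index, int key,
                        int target_row, std::size_t* skipped) {
  *skipped = 0;
  for (const IndexRecord& rec : index) {
    if (rec.key != key) continue;
    if (rec.owner == LockOwner::PreparedXa) {
      ++*skipped;  // the "marked" case: ignore the would-be timeout, go on
      continue;
    }
    if (rec.owner == LockOwner::ActiveTrx) return ScanResult::LockWaitTimeout;
    if (rec.row_id == target_row) return ScanResult::Found;
  }
  return ScanResult::NotFound;
}
```

In this model a scan over key 1 with one record held by a prepared XA and a free target record behind it reports `Found` with one skipped record, whereas the same record held by an ordinary active transaction still ends in `LockWaitTimeout`.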
    • MDEV-34481 optimize away waiting for owned by prepared xa non-unique index · 9771d564
      Andrei authored
      Regression tests.
    • MDEV-34690 lock_rec_unlock_unmodified() causes deadlock · 852f2e03
      Vlad Lesin authored
      lock_rec_unlock_unmodified() is executed either under lock_sys.wr_lock()
      or under a combination of lock_sys.rd_lock() + record locks hash table
      cell latch. It also requests a page latch to check whether locked records
      were changed by the current transaction or not.
      
      Usually InnoDB requests a page latch to find a certain record on the
      page, and then requests the lock_sys and/or record lock hash cell latch
      to request a record lock. lock_rec_unlock_unmodified() requests the
      latches in the opposite order, which causes deadlocks. One possible
      scenario for the deadlock is the following:
      
      thread 1 - lock_rec_unlock_unmodified() is invoked under locks hash table
                 cell latch, the latch is acquired;
      thread 2 - purge thread acquires page latch and tries to remove
                 delete-marked record, it invokes lock_update_delete(), which
                 requests locks hash table cell latch, held by thread 1;
      thread 1 - requests page latch, held by thread 2.
      
      To fix it we need to release lock_sys.latch and/or the lock hash cell
      latch, acquire the page latch and re-acquire the lock_sys related latches.
      
      THIS COMMIT DOES NOT PASS RQG TESTING, DON'T PUSH IT WITHOUT FIXING.
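
The release-and-reacquire ordering named as the fix can be sketched with plain `std::mutex` objects. This is a simplified model, not the InnoDB code: `LockSysModel`, its `version` counter and `relatch_in_page_first_order` are hypothetical names, and the re-validation through a version counter is an assumption added for the illustration (the commit message only states that the latches are released and re-acquired).

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-ins for the two latches discussed in the commit message.
struct LockSysModel {
  std::mutex page_latch;      // models a buffer page latch
  std::mutex cell_latch;      // models a lock hash table cell latch
  unsigned long version = 0;  // assumed: bumped under cell_latch when the
                              // lock list of the cell changes
};

// Precondition: `cell` is held and `page` is not.
// Releases the cell latch, takes the page latch, then re-takes the cell
// latch, so the final order matches the usual "page first, cell second".
// Returns true if the modelled lock list did not change while the cell
// latch was released; a caller would have to restart its work otherwise.
bool relatch_in_page_first_order(LockSysModel& m,
                                 std::unique_lock<std::mutex>& cell,
                                 std::unique_lock<std::mutex>& page) {
  assert(cell.owns_lock() && !page.owns_lock());
  const unsigned long seen = m.version;  // read while still holding the cell
  cell.unlock();
  page.lock();
  cell.lock();
  return seen == m.version;
}
```

The point of the ordering is that no thread ever waits for the page latch while holding the cell latch, which removes the cycle shown in the thread 1 / thread 2 scenario above.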
    • MDEV-34466 XA prepare don't release unmodified records for some cases · 1bf2e4b4
      Vlad Lesin authored
      There is no need to exclude exclusive non-gap locks from the lock
      release procedure executed on XA PREPARE in
      lock_release_on_prepare_try() after commit
      17e59ed3 (MDEV-33454), because
      lock_rec_unlock_unmodified() should check whether the record was
      modified by the XA, and release the lock if it was not.
      
      lock_release_on_prepare_try(): don't skip X-locks, let
      lock_rec_unlock_unmodified() process them.
      
      lock_sec_rec_some_has_impl(): add a template parameter for not acquiring
      trx_t::mutex in case the caller already holds the mutex; don't
      crash if the lock's bitmap is clean.
      
      row_vers_impl_x_locked(), row_vers_impl_x_locked_low(): add a new
      argument to skip acquiring trx_t::mutex.
      
      rw_trx_hash_t::validate_element(): don't acquire trx_t::mutex if the
      current thread already holds it.
      
      Thanks to Andrei Elkin for finding the bug.
      Reviewed by Marko Mäkelä, Debarun Banerjee.
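
The "template parameter for not acquiring the mutex" idea can be shown in isolation. The following is a minimal sketch, assuming a made-up `TrxModel` type and a made-up `read_state` function; it does not reproduce the signatures of lock_sec_rec_some_has_impl() or row_vers_impl_x_locked().

```cpp
#include <cassert>
#include <mutex>

// Hypothetical transaction object with a non-recursive mutex.
struct TrxModel {
  std::mutex mutex;
  int state = 0;  // protected by mutex
};

// caller_holds_mutex is a compile-time flag: when true, the function must
// not lock trx.mutex again, since re-locking a non-recursive mutex from the
// owning thread would be undefined behaviour (in practice, a hang).
template <bool caller_holds_mutex>
int read_state(TrxModel& trx) {
  std::unique_lock<std::mutex> guard(trx.mutex, std::defer_lock);
  if (!caller_holds_mutex) guard.lock();  // branch is constant per instantiation
  return trx.state;
}
```

A caller that already holds the mutex uses `read_state<true>(trx)`, every other caller uses `read_state<false>(trx)`. Making the choice a template parameter keeps it visible at each call site and lets the compiler drop the unused branch.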
  2. 16 Sep, 2024 2 commits
  3. 15 Sep, 2024 12 commits
  4. 14 Sep, 2024 1 commit
    • mtr_t::log_file_op(): Fix -Wnonnull · 4010dff0
      Marko Mäkelä authored
      GCC 12.2.0 could issue -Wnonnull for an unreachable call to
      strlen(new_path).  Let us prevent that by replacing the condition
      (type == FILE_RENAME) with the equivalent (new_path).
      This should also optimize the generated code, because the lifetime
      of the parameter "type" will be reduced.
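
The kind of change described here can be illustrated on a toy function. This is a sketch only: `logged_name_bytes`, the reduced `FileOp` enum and the byte layout are assumptions made up for the example and do not reproduce mtr_t::log_file_op().

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Simplified stand-in for the file operation type.
enum FileOp { FILE_CREATE, FILE_RENAME, FILE_DELETE };

// Assumed contract of this sketch: new_path is non-null exactly when
// type == FILE_RENAME. Testing the pointer itself, rather than the type,
// lets the compiler see that strlen() never receives a null argument.
std::size_t logged_name_bytes(FileOp type, const char* path,
                              const char* new_path) {
  assert((type == FILE_RENAME) == (new_path != nullptr));
  (void) type;  // only needed for the assertion above
  std::size_t len = std::strlen(path);
  if (new_path)  // equivalent to (type == FILE_RENAME) under the contract
    len += 1 + std::strlen(new_path);  // hypothetical separator + new name
  return len;
}
```

Because the condition now depends only on `new_path`, the compiler no longer has to reason about the relationship between `type` and the pointer in order to rule out a null dereference.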
  5. 13 Sep, 2024 1 commit
    • MDEV-34921 MemorySanitizer reports errors for non-debug builds · b331cde2
      Marko Mäkelä authored
      my_b_encr_write(): Initialize also block_length, and at the same time
      last_block_length, so that all 128 bits can be initialized with fewer
      writes. This fixes an error that was caught in the test
      encryption.tempfiles_encrypted.
      
      test_my_safe_print_str(): Skip a test that would attempt to
      display uninitialized data in the test unit.stacktrace.
      Previously, our CI did not build unit tests with MemorySanitizer.
      
      handle_delayed_insert(): Remove a redundant call to pthread_exit(0),
      which would for some reason cause MemorySanitizer in clang-19 to
      report a stack overflow in a RelWithDebInfo build. This fixes a
      failure of several tests.
      
      Reviewed by: Vladislav Vaintroub
  6. 12 Sep, 2024 5 commits
  7. 11 Sep, 2024 5 commits
  8. 10 Sep, 2024 8 commits