1. 06 Jul, 2023 2 commits
    • MDEV-10962 Deadlock with 3 concurrent DELETEs by unique key · 1bfd3cc4
      Vlad Lesin authored
      PROBLEM:
      A deadlock was possible when a transaction tried to "upgrade" an already
      held Record Lock to Next Key Lock.
      
      SOLUTION:
      This patch is based on observations that:
      (1) a Next Key Lock is equivalent to Record Lock combined with Gap Lock
      (2) a GAP Lock never has to wait for any other lock
      In case we request a Next Key Lock, we check if we already own a Record
      Lock of equal or stronger mode, and if so, then we change the requested
      lock type to GAP Lock, which we either already have, or can be granted
      immediately, as GAP locks don't conflict with any other lock types.
      (We don't consider Insert Intention Locks a Gap Lock in above statements).
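
A minimal sketch of the decision described above, in Python rather than the server's C++. The names `Mode`, `REC_NOT_GAP`, `GAP`, `NEXT_KEY` and `effective_request` are hypothetical and exist only for this illustration; they are not InnoDB identifiers.

```python
from enum import IntEnum


class Mode(IntEnum):
    """Lock modes ordered by strength for this sketch: X is stronger than S."""
    S = 1
    X = 2


# Hypothetical lock types, used only in this illustration.
REC_NOT_GAP = "REC_NOT_GAP"  # lock on the record only
GAP = "GAP"                  # lock on the gap before the record only
NEXT_KEY = "NEXT_KEY"        # record lock combined with gap lock


def effective_request(held, mode, lock_type):
    """Return the (mode, type) pair that should actually be requested.

    held is an iterable of (Mode, type) pairs that the transaction already
    owns on this record.
    """
    if lock_type == NEXT_KEY:
        for held_mode, held_type in held:
            # Already own a record lock of equal or stronger mode:
            # only the gap part is still missing, so ask for that alone.
            if held_type == REC_NOT_GAP and held_mode >= mode:
                return (mode, GAP)
    return (mode, lock_type)
```

Because the request is narrowed to a gap lock, it never has to wait, which is what removes the deadlock described in the problem statement.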
      
      The reason why we don't upgrade a Record Lock to a Next Key Lock is the
      following.
      
      Imagine a transaction which does something like this:
      
      for each row {
          request lock in LOCK_X|LOCK_REC_NOT_GAP mode
          request lock in LOCK_S mode
      }
      
      Ideally, only two lock_t structs would be created for each page, one for
      LOCK_X|LOCK_REC_NOT_GAP mode and one for LOCK_S mode, and their bitmaps
      would be used to mark all records from the same page.
      
      If we upgraded the lock from Record Lock to Next Key Lock, the situation
      would instead look like this:
      
      request lock in LOCK_X|LOCK_REC_NOT_GAP mode on row 1:
      // -> creates new lock_t for LOCK_X|LOCK_REC_NOT_GAP mode and sets bit for
      // 1
      request lock in LOCK_S mode on row 1:
      // -> notices that we already have LOCK_X|LOCK_REC_NOT_GAP on the row 1,
      // so it upgrades it to X
      request lock in LOCK_X|LOCK_REC_NOT_GAP mode on row 2:
      // -> creates a new lock_t for LOCK_X|LOCK_REC_NOT_GAP mode (because we
      // don't have any after we've upgraded!) and sets bit for 2
      request lock in LOCK_S mode on row 2:
      // -> notices that we already have LOCK_X|LOCK_REC_NOT_GAP on the row 2,
      // so it upgrades it to X
          ...etc...etc..
      
      Each iteration of the loop creates a new lock_t struct, and in the end we
      have a lot (one for each record!) of LOCK_X locks, each with single bit
      set in the bitmap. Soon we run out of space for lock_t structs.
      
      If we create LOCK_GAP instead of lock upgrading, the above scenario works
      like the following:
      
      request lock in LOCK_X|LOCK_REC_NOT_GAP mode on row 1:
      // -> creates new lock_t for LOCK_X|LOCK_REC_NOT_GAP mode and sets bit for
      // 1
      request lock in LOCK_S mode on row 1:
      // -> notices that we already have LOCK_X|LOCK_REC_NOT_GAP on the row 1,
      // so it creates LOCK_S|LOCK_GAP only and sets bit for 1
      request lock in LOCK_X|LOCK_REC_NOT_GAP mode on row 2:
      // -> reuses the lock_t for LOCK_X|LOCK_REC_NOT_GAP by setting bit for 2
      request lock in LOCK_S mode on row 2:
      // -> notices that we already have LOCK_X|LOCK_REC_NOT_GAP on the row 2,
      // so it reuses LOCK_S|LOCK_GAP setting bit for 2
      
      In the end we have just two locks per page, one for each mode:
      LOCK_X|LOCK_REC_NOT_GAP and LOCK_S|LOCK_GAP.
      Another benefit of this solution is that it avoids the not entirely
      const-correct (and otherwise risky-looking) "upgrading".
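
The two scenarios above can be reproduced with a small model. This is only a sketch: a lock_t is modelled as a Python list holding a mode string and a set of row numbers standing in for the bitmap, and `count_lock_structs` is a hypothetical helper, not InnoDB code.

```python
def count_lock_structs(rows, upgrade):
    """Count the lock_t-like structs created on one page.

    Each struct is modelled as [mode, bits], where bits is the set of rows
    whose bit is set in the struct's bitmap.
    """
    structs = []

    def set_bit(mode, row):
        # Reuse an existing struct of the same mode, else create a new one.
        for struct in structs:
            if struct[0] == mode:
                struct[1].add(row)
                return struct
        struct = [mode, {row}]
        structs.append(struct)
        return struct

    for row in rows:
        record_lock = set_bit("LOCK_X|LOCK_REC_NOT_GAP", row)
        if upgrade:
            # In-place upgrade: the struct stops being a REC_NOT_GAP lock,
            # so the next row cannot reuse it.
            record_lock[0] = "LOCK_X"
        else:
            # Keep the record lock as it is and add only the gap part.
            set_bit("LOCK_S|LOCK_GAP", row)
    return len(structs)
```

For a page with 100 locked rows, the model yields 100 structs with upgrading and 2 structs with the separate gap lock.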
      
      The fix was ported from
      mysql/mysql-server@bfba840dfa7794b988c59c94658920dbe556075d
      mysql/mysql-server@75cefdb1f73b8f8ac8e22b10dfb5073adbdfdfb0
      
      Reviewed by: Marko Mäkelä
    • A cleanup for MDEV-30932 UBSAN: negation of -X cannot be represented in type .. · 19cdddf1
      Alexander Barkov authored
      "mtr --view-protocol func_math" failed because of overly long
      column names implicitly generated for the underlying expressions.
      
      With --view-protocol they were replaced with "Name_exp_1".
      
      Adding column aliases for these expressions.
  2. 05 Jul, 2023 1 commit
  3. 04 Jul, 2023 2 commits
  4. 03 Jul, 2023 7 commits
  5. 29 Jun, 2023 3 commits
    • MDEV-30932 UBSAN: negation of -X cannot be represented in type .. · 67657a01
      Alexander Barkov authored
        'long long int'; cast to an unsigned type to negate this value ..
        to itself in Item_func_mul::int_op and Item_func_round::int_op
      
      Problems:
      
        The code in multiple places in the following methods:
          - Item_func_mul::int_op()
          - Item_func_int_div::val_int()
          - Item_func_mod::int_op()
          - Item_func_round::int_op()
      
        did not properly check for corner values LONGLONG_MIN
        and (LONGLONG_MAX+1) before doing negation.
        This caused UBSAN to complain about undefined behaviour.
      
      Fix summary:
      
        - Adding helper classes ULonglong, ULonglong_null, ULonglong_hybrid
          (in addition to their signed counterparts in sql/sql_type_int.h).
      
        - Moving the code performing multiplication of ulonglong numbers
          from Item_func_mul::int_op() to ULonglong_hybrid::ullmul().
      
        - Moving the code responsible for extracting absolute values
          from negative numbers to Longlong::abs().
          It makes sure to perform negation without undefined behavior:
          LONGLONG_MIN is handled in a special way.
      
        - Moving negation related code to ULonglong::operator-().
          It makes sure to perform negation without undefined behavior:
          (LONGLONG_MAX + 1) is handled in a special way.
      
        - Moving signed<=>unsigned conversion code to
          Longlong_hybrid::val_int() and ULonglong_hybrid::val_int().
      
        - Reusing old and new sql_type_int.h classes in multiple
          places in Item_func_xxx::int_op().
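
A rough model of the corner cases listed above, written in Python. Python integers are unbounded, so nothing here is undefined; the sketch only shows where the special cases sit. `ll_abs`, `ull_neg` and `mul_signed` are hypothetical names, not the sql_type_int.h API, and the result is assumed to be a signed BIGINT.

```python
LONGLONG_MIN = -2**63
LONGLONG_MAX = 2**63 - 1
ULONGLONG_MAX = 2**64 - 1


def ll_abs(value):
    """Magnitude of a signed 64-bit value, as an unsigned 64-bit value."""
    assert LONGLONG_MIN <= value <= LONGLONG_MAX
    if value == LONGLONG_MIN:
        # Special case: -LONGLONG_MIN does not fit into a signed 64-bit int.
        return LONGLONG_MAX + 1
    return -value if value < 0 else value


def ull_neg(magnitude):
    """Negate an unsigned magnitude into a signed 64-bit value.

    Returns None when the result is out of range.
    """
    if magnitude > LONGLONG_MAX + 1:
        return None
    if magnitude == LONGLONG_MAX + 1:
        # Special case: the only magnitude above LONGLONG_MAX that negates.
        return LONGLONG_MIN
    return -magnitude


def mul_signed(a, b):
    """Multiply two signed 64-bit values through their unsigned magnitudes.

    Returns None on overflow.
    """
    magnitude = ll_abs(a) * ll_abs(b)
    if magnitude > ULONGLONG_MAX:
        return None
    if (a < 0) != (b < 0):
        return ull_neg(magnitude)
    return magnitude if magnitude <= LONGLONG_MAX else None
```

In C++ the same two special cases are what keep the negation from overflowing a `long long`.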
      
      Fix details (explain how sql_type_int.h classes are reused):
      
        - Instead of straight negation of negative "longlong" arguments
          *before* performing unsigned multiplication,
          Item_func_mul::int_op() now calls ULonglong_null::ullmul()
          using Longlong_hybrid_null::abs() to pass arguments.
          This fixes undefined behavior N1.
      
        - Instead of straight negation of "ulonglong" result
          *after* performing unsigned multiplication,
          Item_func_mul::int_op() now calls ULonglong_hybrid::val_int(),
          which recursively calls ULonglong::operator-().
          This fixes undefined behavior N2.
      
        - Removing duplicate negating code from Item_func_mod::int_op().
          Using ULonglong_hybrid::val_int() instead.
          This fixes undefined behavior N3.
      
        - Removing literal "longlong" negation from Item_func_round::int_op().
          Using Longlong::abs() instead, which correctly handles LONGLONG_MIN.
          This fixes undefined behavior N4.
      
        - Removing the duplicate (negation related) code from
          Item_func_int_div::val_int(). Reusing class ULonglong_hybrid.
          There was no undefined behavior here.
          However, this change revealed a bug in
          "-9223372036854775808 DIV 1".
          The removed negation code turned out to be incorrect when
          negating +9223372036854775808: it returned the "out of range" error.
          ULonglong_hybrid::operator-() now handles all values correctly
          and returns +9223372036854775808 as a negation for -9223372036854775808.
      
          Re-recording wrong results for
            SELECT -9223372036854775808 DIV  1;
          Now instead of "out of range", it returns -9223372036854775808,
          which is the smallest possible value for the expression data type
          (signed) BIGINT.
      
        - Removing "no UBSAN" branch from Item_func_splus::int_opt()
          and Item_func_minus::int_opt(), as it made UBSAN happy but
          in RelWithDebInfo some MTR tests started to fail.
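
The re-recorded DIV result can be checked with a similar model. This is a sketch only: `int_div` is a hypothetical function, the divisor is assumed to be non-zero, and the result type is assumed to be signed BIGINT.

```python
LONGLONG_MIN = -2**63
LONGLONG_MAX = 2**63 - 1


def int_div(a, b):
    """Integer division of signed 64-bit values via unsigned magnitudes.

    Assumes b != 0. Raises OverflowError when the quotient does not fit
    into a signed 64-bit integer.
    """
    magnitude = abs(a) // abs(b)  # quotient of magnitudes truncates toward zero
    if (a < 0) != (b < 0):
        # Negative result: a magnitude of 2**63 is still representable.
        if magnitude > LONGLONG_MAX + 1:
            raise OverflowError("BIGINT value is out of range")
        return -magnitude
    if magnitude > LONGLONG_MAX:
        raise OverflowError("BIGINT value is out of range")
    return magnitude
```

Here `int_div(-9223372036854775808, 1)` returns -9223372036854775808, while dividing the same value by -1 is reported as out of range.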
  6. 28 Jun, 2023 1 commit
  7. 27 Jun, 2023 3 commits
    • mtr: fix the help text for debuggers · d214628a
      Sergei Golubchik authored
    • MDEV-31086 MODIFY COLUMN can break FK constraints, and lead to unrestorable dumps · 5f09b53b
      Thirunarayanan Balathandayuthapani authored
      - When foreign_key_checks is disabled, allowing modification of a
      column that is part of a foreign key constraint can lead to
      refusal of TRUNCATE TABLE or OPTIMIZE TABLE later. So it makes
      sense to block the column modify operation when a foreign key
      is involved, irrespective of the foreign_key_checks variable.
      
      Correct way to modify the charset of the column when fk is involved:
      
      SET foreign_key_checks=OFF;
      ALTER TABLE child DROP FOREIGN KEY fk, MODIFY m VARCHAR(200) CHARSET utf8mb4;
      ALTER TABLE parent MODIFY m VARCHAR(200) CHARSET utf8mb4;
      ALTER TABLE child ADD CONSTRAINT FOREIGN KEY (m) REFERENCES parent(m);
      SET foreign_key_checks=ON;
      
      fk_check_column_changes(): Remove the FOREIGN_KEY_CHECKS condition when
      checking the column change for a foreign key constraint. This
      is a partial revert of commit 5f1f2fc0
      and it changes the behaviour of the copy alter algorithm.
      
      ha_innobase::prepare_inplace_alter_table(): Find the modified
      column and check whether it is part of an existing or newly
      added foreign key constraint.
    • MDEV-29447 MDEV-26285 MDEV-31338 Refactor spider_db_mbase_util::open_item_func · 423c28f0
      Yuchen Pei authored
      spider_db_mbase_util::open_item_func() is a monster function.
      It is difficult to maintain, yet we expect to have to modify it
      whenever a new SQL function or a new func_type is added.
      
      We split the function into two distinct functions: one handles the
      case of str != NULL and the other handles the case of str == NULL.
      
      This refactoring was done in a conservative way because we do not
      have comprehensive tests on the function.
      
      It also fixes MDEV-29447 and MDEV-31338, where field items that are
      arguments of a func item may be used before being created / initialised.
      
      Note this commit is adapted from a patch by Nayuta for MDEV-26285.
  8. 26 Jun, 2023 1 commit
  9. 22 Jun, 2023 2 commits
  10. 19 Jun, 2023 1 commit
  11. 08 Jun, 2023 2 commits
  12. 07 Jun, 2023 3 commits
  13. 06 Jun, 2023 2 commits
  14. 05 Jun, 2023 2 commits
    • MDEV-13915: STOP SLAVE takes very long time on a busy system · 0a99d457
      Brandon Nesterenko authored
      The problem is that a parallel replica would not immediately stop
      running/queued transactions when STOP SLAVE was issued. That is, it
      allowed the current group of transactions to run, and sometimes
      transactions belonging to the next group could also be started and run
      through commit after STOP SLAVE was issued, if the last group
      had started committing. This would lead to long waits for
      all waiting transactions to finish.
      
      This patch updates a parallel replica to try to abort immediately
      and roll back any ongoing transactions. The exception is
      transactions which are non-transactional (e.g. those modifying
      sequences or non-transactional tables), and any transactions prior to
      them, which will be run to completion.
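
The rule in the paragraph above can be sketched as a simple partition of the in-flight transactions. `Txn` and `plan_stop` are hypothetical names for this illustration and do not correspond to replication code in the server.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    seq: int             # position in commit order
    transactional: bool  # False if it modifies sequences or
                         # non-transactional tables


def plan_stop(in_flight):
    """Split in-flight transactions into (finish, roll_back) lists of seq.

    A non-transactional transaction cannot be rolled back, so it and every
    transaction before it in commit order must run to completion. Anything
    after the last such transaction is rolled back.
    """
    ordered = sorted(in_flight, key=lambda txn: txn.seq)
    last_must_finish = max(
        (index for index, txn in enumerate(ordered) if not txn.transactional),
        default=-1,
    )
    finish = [txn.seq for txn in ordered[: last_must_finish + 1]]
    roll_back = [txn.seq for txn in ordered[last_must_finish + 1:]]
    return finish, roll_back
```

With transactions 1 to 4 in flight and only transaction 2 non-transactional, transactions 1 and 2 finish and transactions 3 and 4 are rolled back.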
      
      The specifics are as follows:
      
       1. A new stage was added to SHOW PROCESSLIST output for the SQL
      Thread when it is waiting for a replica thread to either roll back or
      finish its transaction before stopping. This stage presents as
      “Waiting for worker thread to stop”
      
       2. Worker threads which error or are killed no longer perform GCO
      cleanup if there is a concurrently running prior transaction. This
      is because a worker thread scheduled to run in a future GCO could be
      killed and incorrectly perform cleanup of the active GCO.
      
       3. Refined cases when the FL_TRANSACTIONAL flag is added to GTID
      binlog events to disallow adding it to transactions which modify
      both transactional and non-transactional engines when the binlogging
      configuration allows the modifications to exist in the same event,
      i.e. when using binlog_direct_non_trans_update == 0 and
      binlog_format == statement.
      
       4. A few existing MTR tests relied on the completion of certain
      transactions after issuing STOP SLAVE, and were re-recorded
      (potentially with added synchronizations) under the new rollback
      behavior.
      
      Reviewed By
      ===========
      Andrei Elkin <andrei.elkin@mariadb.com>
    • MDEV-31403: Server crashes in st_join_table::choose_best_splitting · 928012a2
      Sergei Petrunia authored
      The code in choose_best_splitting() assumed that the join prefix is
      in join->positions[].
      
      This is not necessarily the case. This function might be called when
      the join prefix is in join->best_positions[], too.
      Follow the approach from best_access_path(), which calls this function:
      pass the current join prefix as an argument,
      "const POSITION *join_positions" and use that.
  15. 04 Jun, 2023 1 commit
  16. 03 Jun, 2023 5 commits
  17. 02 Jun, 2023 2 commits