1. 11 Mar, 2024 3 commits
  2. 08 Mar, 2024 1 commit
    • Daniele Sciascia's avatar
      MDEV-33540 Avoid writes to TRX_SYS page during mariabackup operations · 648d2da8
      Daniele Sciascia authored
      Fix a scenario where `mariabackup --prepare` fails with assertion
      `!m_modifications || !recv_no_log_write` in `mtr_t::commit()`. This
      happens if the prepare step of the backup encounters a data directory
      which happens to store wsrep xid position in TRX SYS page (this is no
      longer the case since 10.3.5). And since MDEV-17458,
      `trx_rseg_array_init()` handles this case by copying the xid position
      to rollback segments, before clearing the xid from TRX SYS page.
      However, this step should be avoided when `trx_rseg_array_init()` is
      invoked from mariabackup. The relevant code was surrounded by the
      condition `srv_operation == SRV_OPERATION_NORMAL`. An additional check
      ensures that we are not trying to copy a xid position which has
      already been zeroed.
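
      The guard described above can be sketched roughly as follows. This is
      a simplified model, not the actual InnoDB code: only `srv_operation`
      and `SRV_OPERATION_NORMAL` are names from the commit message; the
      structs, the other enum values and `migrate_wsrep_xid()` are
      hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstdint>

// Only SRV_OPERATION_NORMAL comes from the commit message; the other
// enumerators are illustrative placeholders.
enum srv_operation_mode
{
  SRV_OPERATION_NORMAL,
  SRV_OPERATION_BACKUP,
  SRV_OPERATION_RESTORE
};

// Hypothetical model of the xid slot in the TRX_SYS page (0 == zeroed).
struct TrxSysPageModel { uint64_t wsrep_xid_seqno; };

// Hypothetical model of the xid position kept in a rollback segment.
struct RsegModel { uint64_t wsrep_xid_seqno= 0; };

// Returns true if the page was modified, i.e. a write was generated.
bool migrate_wsrep_xid(srv_operation_mode srv_operation,
                       TrxSysPageModel &page, RsegModel &rseg)
{
  if (srv_operation != SRV_OPERATION_NORMAL)
    return false;                 // backup/prepare: never write to the page
  if (page.wsrep_xid_seqno == 0)
    return false;                 // already zeroed: nothing to copy
  rseg.wsrep_xid_seqno= page.wsrep_xid_seqno;
  page.wsrep_xid_seqno= 0;        // clear the xid from the TRX_SYS page
  return true;
}
```

      In this model a non-normal operation mode leaves the page untouched,
      so no modification is produced while log writes are forbidden.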
  3. 07 Mar, 2024 1 commit
    • mariadb-DebarunBanerjee's avatar
      MDEV-33593 Auto increment deadlock error causes ASSERT in subsequent save point · afe96329
      mariadb-DebarunBanerjee authored
      The issue here is that ha_innobase::get_auto_increment() could cause a
      deadlock involving the auto-increment lock and roll back the transaction
      implicitly. For such cases, storage engines usually call
      thd_mark_transaction_to_rollback() to inform the SQL engine, which
      in turn takes appropriate action and closes the transaction. In InnoDB,
      we call it while converting the InnoDB error code to a MySQL one.
      
      However, since ::innobase_get_autoinc() returns void, we skip the call
      for error code conversion and also miss marking the transaction for
      rollback for deadlock error. We assert eventually while releasing a
      savepoint as the transaction state is not active.
      
      Since convert_error_code_to_mysql() also performs some generic error
      handling, like invoking the callback when needed, we should call
      that function in ha_innobase::get_auto_increment() even if we don't
      return the resulting MySQL error code.
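
      The idea can be illustrated with a small model, assuming a conversion
      helper whose side effect is to mark the session for rollback on
      deadlock. All types, names and numeric codes below are hypothetical;
      they are not the InnoDB handler API.

```cpp
#include <cassert>

// Hypothetical engine-level error codes.
enum ModelErr { MODEL_SUCCESS, MODEL_DEADLOCK, MODEL_OTHER };

// Hypothetical session state seen by the SQL layer.
struct SessionModel { bool marked_for_rollback= false; };

// Stand-in for the generic conversion step: besides mapping the code, it
// marks the transaction for rollback when a deadlock is reported.
// The returned numbers are arbitrary placeholders.
int convert_error_model(ModelErr err, SessionModel &session)
{
  switch (err)
  {
  case MODEL_SUCCESS:
    return 0;
  case MODEL_DEADLOCK:
    session.marked_for_rollback= true;
    return 1;
  default:
    return 2;
  }
}

// A void function, like the auto-increment getter: it cannot return the
// converted code, but it still calls the conversion for its side effect.
void get_auto_increment_model(ModelErr lock_result, SessionModel &session,
                              unsigned long long *first_value)
{
  if (lock_result != MODEL_SUCCESS)
  {
    (void) convert_error_model(lock_result, session); // result discarded
    *first_value= ~0ULL;                              // signal failure
    return;
  }
  *first_value= 1;
}
```

      Without the conversion call, the deadlock branch would leave
      `marked_for_rollback` unset, which corresponds to the state that
      later trips the savepoint assertion.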
  4. 06 Mar, 2024 3 commits
  5. 04 Mar, 2024 1 commit
  6. 03 Mar, 2024 1 commit
  7. 02 Mar, 2024 1 commit
  8. 01 Mar, 2024 2 commits
    • Monty's avatar
      Fixed random failure in main.kill_processlist-6619 · 8b3f470c
      Monty authored
      The problem was that SHOW PROCESSLIST was done before the command of
      the default connection was cleared.
      
      Reviewer: Sergei Golubchik <serg@mariadb.org>
    • Tony Chen's avatar
      MDEV-26923 Check all invalid config options · 32546877
      Tony Chen authored
      Previously, the behavior was to error out on the first invalid option
      encountered. With this change, a best effort approach is made so that
      all invalid options processed will be printed before exiting.
      
      There is a caveat. The options are processed many times at varying
      stages of server startup because the server is not aware of all valid
      options immediately (e.g. plugins have to be loaded first before the
      server knows what are the available plugin options). So, there are some
      options that the server can determine are invalid "early" on, and there
      are some options that the server cannot determine are invalid until
      "later" on. For example, the server can determine an option such as
      `--a` is an ambiguous option very early on but an option such as
      `--this-does-not-match-any-option` cannot be labelled as invalid until
      the server is aware of all available options.
      
      Thus, it is possible that the server will still fail before printing out
      all "invalid" options. You can see this by passing `--a
      --obvious-invalid-option`.
      
      Test cases were added to `mysqld_option_err.test` to validate that
      multiple invalid options will be displayed in the error message.
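
      A minimal sketch of the "collect everything, then report" approach for
      a single processing stage is shown here. It does not use the server's
      option parser; `find_unknown_options()` and the option names are
      hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Collect every long option whose name is not in `known`, instead of
// stopping at the first invalid one. Hypothetical helper, not server code.
std::vector<std::string>
find_unknown_options(const std::vector<std::string> &args,
                     const std::set<std::string> &known)
{
  std::vector<std::string> unknown;
  for (const std::string &arg : args)
  {
    if (arg.compare(0, 2, "--") != 0)
      continue;                               // not a long option
    const std::string::size_type eq= arg.find('=');
    // Strip the leading "--" and any "=value" suffix.
    const std::string name=
      arg.substr(2, eq == std::string::npos ? std::string::npos : eq - 2);
    if (known.count(name) == 0)
      unknown.push_back(name);                // remember it and keep going
  }
  return unknown;
}
```

      The caller would print all returned names and only then exit. As noted
      above, the real server runs several such stages with different sets of
      known options, so an early stage can still abort before a later stage
      gets to report its own invalid options.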
      
      All new code of the whole pull request, including one or several files
      that are either new files or modified ones, are contributed under the
      BSD-new license. I am contributing on behalf of my employer Amazon Web
      Services.
  9. 27 Feb, 2024 4 commits
    • mariadb-DebarunBanerjee's avatar
      MDEV-33011 mariabackup --backup: FATAL ERROR: ... Can't open datafile cool_down/t3 · 96966976
      mariadb-DebarunBanerjee authored
      The root cause is the WAL logging of a file operation when the actual
      operation fails afterwards. It creates a situation with a log entry for
      an operation that would always fail. I could simulate both the backup
      scenario error and the InnoDB recovery failure by exploiting this weakness.
      
      We follow WAL for the file rename operation, and once logged, the
      operation must eventually complete successfully, or it is a major
      catastrophe. Right now, we let the rename fail and handle it as a normal
      error, and that is the problem.
      
      I created a patch to address RENAME operation to a non existing schema
      where the destination schema directory is missing. The patch checks for
      the missing schema before logging in an attempt to avoid the failure
      after WAL log is written/flushed. I also checked that the schema cannot
      be dropped or there cannot be any race with other rename to the same
      file. This is protected by the MDL lock in SQL today.
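
      The ordering of "check first, log afterwards" can be sketched as
      follows. This is an illustration with std::filesystem and an in-memory
      vector standing in for the redo log; `rename_with_precheck()` is a
      hypothetical name, and the locking that rules out races is assumed to
      be provided by the caller.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs= std::filesystem;

// Hypothetical model: `wal` stands in for the write-ahead log.
// Returns true if the rename was logged and carried out.
bool rename_with_precheck(const fs::path &from, const fs::path &to,
                          std::vector<std::string> &wal)
{
  std::error_code ec;
  const fs::path parent= to.parent_path();
  // Check for the missing destination directory BEFORE logging, so that
  // no log record is written for a rename that is bound to fail.
  if (!parent.empty() && !fs::is_directory(parent, ec))
    return false;
  wal.push_back("RENAME " + from.string() + " -> " + to.string());
  fs::rename(from, to, ec);   // once logged, this is expected to succeed
  return !ec;
}
```

      In this sketch, a rename into a non-existent directory returns early
      and leaves the log empty.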
      
      The patch should thus be a good improvement over the current situation
      and solves the issue at hand.
    • Julius Goryavsky's avatar
      galera: correction after wsrep-lib update · c9b0c006
      Julius Goryavsky authored
      Correction to ensure compatibility with the updated wsrep-lib library.
    • Julius Goryavsky's avatar
      galera: wsrep-lib submodule update · 87abae46
      Julius Goryavsky authored
    • Thirunarayanan Balathandayuthapani's avatar
      MDEV-19044 Alter table corrupts while applying the modification log · 57cc8605
      Thirunarayanan Balathandayuthapani authored
      Problem:
      ========
      - InnoDB reads the length of the variable length field wrongly
      while applying the modification log of instant table.
      
      Solution:
      ========
      rec_init_offsets_comp_ordinary(): For the temporary instant
      file record, InnoDB should read the length of the variable length
      field from the record itself.
  10. 26 Feb, 2024 2 commits
    • Igor Babaev's avatar
      MDEV-31276 Wrong warnings on 2-nd execution of PS for query with GROUP_CONCAT · 8778a83e
      Igor Babaev authored
      If a query with GROUP_CONCAT is executed then the server reports a warning
      every time when the length of the result of this function exceeds the set
      value of the system variable group_concat_max_len. This bug led to the set
      of warnings from the second execution of the prepared statement that did
      not coincide with the one from the first execution if the executed query
      was a grouping query over a join of tables using GROUP_CONCAT function and
      join cache was not allowed to be employed.
      The discrepancy between the sets of warnings was due to the lack of cleanup for
      Item_func_group_concat::row_count after execution of the query.
      
      Approved by Oleksandr Byelkin <sanja@mariadb.com>
    • Alexander Barkov's avatar
      MDEV-32975 Default charset doesn't work with PHP MySQLi extension · 1b37cb71
      Alexander Barkov authored
      When sending the server default collation ID to the client
      in the handshake packet, translate a 2-byte collation ID
      to the ID of the default collation for the character set.
  11. 23 Feb, 2024 1 commit
    • Alexander Barkov's avatar
      MDEV-33496 Out of range error in AVG(YEAR(datetime)) due to a wrong data type · e63311c2
      Alexander Barkov authored
      Functions extracting non-negative datetime components:
      
      - YEAR(dt),        EXTRACT(YEAR FROM dt)
      - QUARTER(dt),     EXTRACT(QUARTER FROM dt)
      - MONTH(dt),       EXTRACT(MONTH FROM dt)
      - WEEK(dt),        EXTRACT(WEEK FROM dt)
      - HOUR(dt),
      - MINUTE(dt),
      - SECOND(dt),
      - MICROSECOND(dt),
      - DAYOFYEAR(dt)
      - EXTRACT(YEAR_MONTH FROM dt)
      
      did not set their max_length properly, so in the DECIMAL
      context they created a too small DECIMAL column, which
      led to the 'Out of range value' error.
      
      The problem is that most of these functions historically
      returned the signed INT data type.
      
      There were two simple ways to fix these functions:
      1. Add +1 to max_length.
         But this would also change their size in the string context
         and create too long VARCHAR columns, with +1 excessive size.
      
      2. Preserve max_length, but change the data type from INT to INT UNSIGNED.
         But this would break backward compatibility.
         Also, using UNSIGNED is generally not desirable,
         it's better to stay with signed when possible.
      
      This fix implements another solution, which makes all these functions
      work well in all contexts: int, decimal, string.
      
      Fix details:
      
      - Adding a new special class Type_handler_long_ge0 - the data type
        handler for expressions which:
        * should look like normal signed INT
        * but which are known not to return negative values
        Expressions handled by Type_handler_long_ge0 store in Item::max_length
        only the number of digits, without adding +1 for the sign.
      
      - Fixing Item_extract to use Type_handler_long_ge0
        for non-negative datetime components:
         YEAR, YEAR_MONTH, QUARTER, MONTH, WEEK
      
      - Adding a new abstract class Item_long_ge0_func, for functions
        returning non-negative datetime components.
        Item_long_ge0_func uses Type_handler_long_ge0 as the type handler.
        The class hierarchy now looks as follows:
      
      Item_long_ge0_func
        Item_long_func_date_field
          Item_func_to_days
          Item_func_dayofmonth
          Item_func_dayofyear
          Item_func_quarter
          Item_func_year
        Item_long_func_time_field
          Item_func_hour
          Item_func_minute
          Item_func_second
          Item_func_microsecond
      
      - Cleanup: EXTRACT(QUARTER FROM dt) created an excessive VARCHAR column
        in string context. Changing its length from 2 to 1.
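
      The max_length problem described above can be modelled in a few lines.
      This is a deliberately simplified sketch that assumes a signed INT
      handler reserves one character of max_length for the sign when deriving
      the DECIMAL precision; the two handler structs are hypothetical models,
      not the real Type_handler classes.

```cpp
#include <cassert>

// Model of a signed INT handler: one character of max_length is assumed
// to be reserved for the sign.
struct SignedIntModel
{
  static unsigned decimal_precision(unsigned max_length)
  { return max_length > 1 ? max_length - 1 : 1; }
};

// Model of the "signed, but known to be >= 0" handler: max_length holds
// only the digits, so nothing is subtracted.
struct IntGe0Model
{
  static unsigned decimal_precision(unsigned max_length)
  { return max_length; }
};

// Number of decimal digits in v.
unsigned digits_in(unsigned long long v)
{
  unsigned n= 1;
  while (v >= 10)
  {
    v/= 10;
    ++n;
  }
  return n;
}

// Does `value` fit a DECIMAL column sized from `max_length`?
template <class Handler>
bool fits_decimal(unsigned long long value, unsigned max_length)
{
  return digits_in(value) <= Handler::decimal_precision(max_length);
}
```

      Under these assumptions a four-digit year with max_length 4 does not
      fit with the signed model but does with the non-negative one, while
      the string width stays at 4 characters.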
  12. 21 Feb, 2024 2 commits
    • Igor Babaev's avatar
      MDEV-31277 Wrong result on 2-nd execution of PS to select from view using derived · d57c44f6
      Igor Babaev authored
      As a result of this bug the second execution of the prepared statement
      created for select from materialized view could return a wrong result set if
      - the specification of the view used a left join
      - an inner table of the left join was a mergeable derived table
      - the derived table contained a constant column.
      
      The problem appeared because the flag 'maybe-null' of the wrapper
      Item_direct_view_ref constructed for the constant field of the mergeable
      derived table was not set to 'true' on the second execution of the
      prepared statement.
      
      The patch always sets this flag properly when calling the function
      Item_direct_view_ref::set_null_ref_table(). The latter is invoked in
      Item_direct_view_ref constructor if it is created for some reference of
      a constant column belonging to a mergeable derived table.
      
      Approved by Oleksandr Byelkin <sanja@mariadb.com>
    • Yuchen Pei's avatar
      0f0da95d
  13. 20 Feb, 2024 5 commits
  14. 19 Feb, 2024 1 commit
  15. 18 Feb, 2024 1 commit
    • Vladislav Vaintroub's avatar
      MDEV-33482: Optimize WolfSSL for improved performance · d510f805
      Vladislav Vaintroub authored
      - Use the "new" math library WOLFSSL_SP_MATH_ALL, which is now promoted by
        WolfSSL for faster performance. The "fastmath" library we used previously
        is going to be deprecated, and it was not always fast.
      - Optimize common RSA math operations with WOLFSSL_HAVE_SP_RSA
      - Incorporate assembly optimizations, currently for Intel x64 only
      
      This patch significantly reduces execution time for SSL tests like
      main.ssl-big and main.ssl_connect, which now run 2 to 3 times faster.
      
      Notably, when this patch is applied to 11.4, server startup with
      ephemeral certificates becomes approximately 10x faster due to optimized
      wolfSSL_EVP_PKEY_keygen().
      
      Additionally, refactored WolfSSL by removing old workarounds and
      consolidating wolfssl and wolfcrypt into a single library wolfssl, just
      like it was done in WolfSSL's own CMake.
  16. 16 Feb, 2024 4 commits
    • Xiaotong Niu's avatar
      MDEV-28430: Fix memory barrier missing of lf_alloc on Arm64 · 8a505980
      Xiaotong Niu authored
      When testing MariaDB on Arm64, a stall can occur; see
      https://jira.mariadb.org/browse/MDEV-28430.
      
      The stall occurs because of an unexpected circular reference in the
      LF_PINS->purgatory list which is traversed in lf_pinbox_real_free().
      
      We found that on Arm64, the ABA problem in the LF_ALLOCATOR->top list was
      not solved, so various undefined problems can occur, including a circular
      reference in the LF_PINS->purgatory list.
      
      The following code is intended to solve the ABA problem; it is copied
      from the link below.
      https://github.com/MariaDB/server/blob/cb4c2713553c5f522d2a4ebf186c6505384c748d/mysys/lf_alloc-pin.c#L501-#L505
      
           do
           {
      503     node= allocator->top;
      504     lf_pin(pins, 0, node);
      505  } while (node != allocator->top && LF_BACKOFF());
      
      1. ABA problem on Arm64
      The steps below show how the ABA problem occurs on Arm64. The
      relevant code in each step is simplified, and the code line numbers
      refer to MariaDB v10.4.
      ------------------------------------------------------------------------
      Abnormal case.
      Initial state: pin = 0, top = A, top list: A->B
      
      T1                              T2
                                      step1. write top=B //seq-cst, #L517
                                      step2. write A->next= "any"
                                      step3. read pin==0 //relaxed, #L295
      step1. write pin=A  //seq-cst, #L504
      step2. read old value of top==A  //relaxed, #L505
      step3. next=A->next="any" //#L517
                                      step4. write A->next=B,top=A //#L420-435
      step4. CAS(top,A,next) //#L517
      step5. write pin=0     //#L521
      ------------------------------------------------------------------------
      The above case is due to T1.step2 reading the old value of top, which lets
      "T1.step3, T1.step4" and "T2.step4" occur at the same time; in other
      words, they are not mutually exclusive.
      
      It may happen that T2.step4 is sandwiched between T1.step3 and T1.step4,
      which causes top to be updated to "any", which may be an in-use or
      invalid address.
      
      2. Analyze above issue with Dekker's algorithm
      The above problem can be mapped to Dekker's algorithm; see
      https://en.wikipedia.org/wiki/Dekker%27s_algorithm.
      The following extracts the read and write operations on 'top' and 'pin',
      and maps them to Dekker's algorithm to analyze the root cause.
      ------------------------------------------------------------------------
      Initial state: top = A, pin = 0
      T1                                    T2
      store_seq_cst(pin, A) // write pin    store_seq_cst(top, B)  //write top
      rt= load_relaxed(top) // read top     rp= load_relaxed(pin)  //read pin
      
      if (rt == A && rp == 0) printf("oops\n"); // will "oops" be printed?
      ------------------------------------------------------------------------
      How T1 and T2 enter their critical sections:
      (1) T1 writes pin. If T1 then reads that top has not been updated, T1
      enters its critical section (T1.step3 and T1.step4, trying to obtain 'A',
      #L517); otherwise it just gives up (T1 has no priority).
      (2) T2 writes top. If T2 then reads that pin has not been updated, T2
      enters its critical section (T2.step4, trying to add 'A' to the top list
      again, #L420-435); otherwise it waits until pin!=A (T2 has priority).
      
      In the previous code, because 'top' and 'pin' are loaded with relaxed
      semantics, on arm and ppc there is no guarantee that the above critical
      sections are mutually exclusive; in other words, "oops" can be printed.
      
      This bug only happens on arm and ppc, not x86. On current x86
      implementations, a load is always seq-cst (relaxed and seq-cst loads
      generate the same machine code), as shown in https://godbolt.org/z/sEzMvnjd9
      
      3. Fix method
      Add sequential-consistency semantics to the read of 'top' in #L505
      (T1.step2), and to the reads of "el->pin[i]" in #L295 and #L320.
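
      The Dekker-style pattern from section 2 can be written with C++11
      atomics as a litmus test. This is only a sketch of the ordering
      argument, not the lf_alloc code: `dekker_trial()` and `count_oops()`
      are hypothetical names, and 'A', 'B' and 0 are placeholder values for
      the node pointers and the empty pin.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// One trial of the pattern: each thread writes its own variable and then
// reads the other one. Returns true if the "oops" outcome was observed.
bool dekker_trial()
{
  std::atomic<int> pin{0};
  std::atomic<int> top{'A'};
  int rt= 0, rp= 0;

  std::thread t1([&] {
    pin.store('A', std::memory_order_seq_cst);   // T1: write pin
    rt= top.load(std::memory_order_seq_cst);     // T1: read top
  });
  std::thread t2([&] {
    top.store('B', std::memory_order_seq_cst);   // T2: write top
    rp= pin.load(std::memory_order_seq_cst);     // T2: read pin
  });
  t1.join();
  t2.join();

  // With seq_cst on all four operations, at least one thread must observe
  // the other's store, so this combination cannot happen.
  return rt == 'A' && rp == 0;
}

// Run several trials and count how often "oops" was seen.
int count_oops(int trials)
{
  int oops= 0;
  for (int i= 0; i < trials; i++)
    oops+= dekker_trial();
  return oops;
}
```

      If the two loads used std::memory_order_relaxed instead, the C++
      memory model would permit the "oops" outcome, which matches the
      behaviour described above for arm and ppc.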
      
      4. Issue reproduction
      Add a "delay" after #L503 in lf_alloc-pin.c. When running unit.lf, a
      segmentation fault then occurs quickly because "top" points to an invalid
      address. For details, see the comment area of the link below.
      https://jira.mariadb.org/browse/MDEV-28430.
      
      5. Further improvement
      To make this code more robust and safe on all platforms, we recommend
      replacing volatile with C11 atomics and fixing all data races. This will
      also make the code easier to reason about.
      Signed-off-by: Xiaotong Niu <xiaotong.niu@arm.com>
    • Kristian Nielsen's avatar
      MDEV-33468: Crash due to missing stack overrun check in two recursive functions · 5707f1ef
      Kristian Nielsen authored
      Thanks to Yury Chaikou for finding this problem (and the fix).
      Reviewed-by: Monty <monty@mariadb.org>
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
    • Kristian Nielsen's avatar
      MDEV-33443: Unsafe use of LOCK_thd_kill in my_malloc_size_cb_func() · fdaa7a96
      Kristian Nielsen authored
      my_malloc_size_cb_func() can be called from contexts where it is not safe to
      wait for LOCK_thd_kill, for example while holding LOCK_plugin. This could
      lead to (probably very unlikely) deadlock of the server.
      
      Fix by skipping the enforcement of --max-session-mem-used in the rare cases
      when LOCK_thd_kill cannot be obtained. The limit will instead be enforced on
      the following memory allocation. This does not significantly degrade the
      behaviour of --max-session-mem-used; that limit is in any case only enforced
      "softly", not taking effect until the next point at which the thread does a
      check_killed().
      Reviewed-by: Monty <monty@mariadb.org>
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
    • Kristian Nielsen's avatar
      MDEV-33426: Aria temptables wrong thread-specific memory accounting in slave thread · c73c6aea
      Kristian Nielsen authored
      Aria temporary tables account allocated memory as specific to the current
      THD. But this fails for slave threads, where the temporary tables need to be
      detached from any specific THD.
      
      Introduce a new flag to mark temporary tables in replication as "global",
      and use that inside Aria to not account memory allocations as thread
      specific for such tables.
      
      Based on original suggestion by Monty.
      Reviewed-by: Monty <monty@mariadb.org>
      Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
  17. 15 Feb, 2024 1 commit
    • Daniel Black's avatar
      MDEV-33301 memlock with systemd still not working · 8a1904d7
      Daniel Black authored
      CapabilityBoundingSet included CAP_IPC_LOCK in MDEV-9095, however
      it requires that the executable has the capability marked in extended
      attributes also.
      
      The alternative to this is raising RLIMIT_MEMLOCK for the service/
      process so that it can complete the mlockall system call. This needs to
      be adjusted to whatever the MariaDB server was going to allocate.
      Rather than leave the non-obvious mapping of settings and tuning,
      add the capability so it's easier for the user.
      
      We set the capability, if possible, but it may never be used depending
      on user settings. As such, in the Debian postinst script, don't
      complain if this fails.
      
      CAP_IPC_LOCK also facilitates the mmapping of huge memory pages
      (see man mmap), as MariaDB uses with --large-pages.
  18. 14 Feb, 2024 1 commit
    • Dave Gosselin's avatar
      MDEV-30975 Wrong result with cross Join given join order · 0185ac64
      Dave Gosselin authored
      For queries with derived tables whose population has some side effect, we
      will fill such a derived table more than once, but without clearing
      its rows.  Consequently it will have duplicate rows.
      An example query exhibiting the problem is
        SELECT STRAIGHT_JOIN c1 FROM t1 JOIN (SELECT @A := 0) x;
      Since mysql_derived_fill will, for UNCACHEABLE_DEPENDENT tables, drop
      all rows and repopulate, we relax the condition at line 1204: rather
      than assume all uncacheable values prevent early return, we now
      allow an early return for uncacheable values other than
      UNCACHEABLE_DEPENDENT.  In general, we only populate derived tables
      once unless they're dependent tables.
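
      The relaxed early-return condition can be modelled as follows. This is
      a sketch, not the code of mysql_derived_fill: the struct, the function
      and the MODEL_* flag values are hypothetical stand-ins for the
      uncacheable flags mentioned above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical flag bits modelling the "uncacheable" reasons.
constexpr unsigned MODEL_UNCACHEABLE_DEPENDENT=  1u << 0;
constexpr unsigned MODEL_UNCACHEABLE_SIDEEFFECT= 1u << 1;

// Hypothetical model of a materialized derived table.
struct DerivedTableModel
{
  unsigned uncacheable= 0;
  bool filled= false;
  std::vector<int> rows;
};

void fill_derived_model(DerivedTableModel &d, const std::vector<int> &source)
{
  const bool dependent= (d.uncacheable & MODEL_UNCACHEABLE_DEPENDENT) != 0;
  // Relaxed condition: an already filled table is left alone unless it is
  // a dependent one, whatever other uncacheable bits are set.
  if (d.filled && !dependent)
    return;
  if (dependent)
    d.rows.clear();              // dependent tables are dropped and refilled
  d.rows.insert(d.rows.end(), source.begin(), source.end());
  d.filled= true;
}
```

      In this model a table that is uncacheable only because of a side
      effect is populated once, so a second fill no longer appends duplicate
      rows.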
  19. 13 Feb, 2024 5 commits