  1. 07 Jun, 2023 1 commit
  2. 06 Jun, 2023 2 commits
  3. 05 Jun, 2023 1 commit
    • MDEV-13915: STOP SLAVE takes very long time on a busy system · 0a99d457
      Brandon Nesterenko authored
      The problem is that a parallel replica would not immediately stop
      running/queued transactions when issued STOP SLAVE. That is, it
      allowed the current group of transactions to run, and sometimes the
      transactions which belong to the next group could be started and run
      through commit after STOP SLAVE was issued too, if the last group
      had started committing. This could make STOP SLAVE wait a long time for
      all of these transactions to finish.
      
      This patch updates a parallel replica to try to abort immediately
      and roll back any ongoing transactions. The exception is
      non-transactional transactions (e.g. those modifying sequences or
      non-transactional tables): these, and any transactions prior to them,
      are run to completion.
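
The stop policy described above can be sketched roughly as follows. This is an illustrative model only, not the server's code: the `Txn` class and the `on_stop_slave` function are hypothetical names, and the sketch assumes the in-flight transactions are listed in commit order.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    name: str
    transactional: bool  # False if it modifies sequences or non-transactional tables


def on_stop_slave(in_flight):
    """Split in-flight transactions (in commit order) into those that must
    run to completion and those that can be rolled back."""
    # Index of the last transaction that cannot be rolled back, or -1 if none.
    last_non_trans = -1
    for i, txn in enumerate(in_flight):
        if not txn.transactional:
            last_non_trans = i
    # A non-transactional transaction and everything before it must finish.
    finish = in_flight[: last_non_trans + 1]
    roll_back = in_flight[last_non_trans + 1:]
    return finish, roll_back


queue = [Txn("t1", True), Txn("t2", False), Txn("t3", True), Txn("t4", True)]
finish, roll_back = on_stop_slave(queue)
print([t.name for t in finish])     # ['t1', 't2']
print([t.name for t in roll_back])  # ['t3', 't4']
```

When every in-flight transaction is transactional, the sketch rolls all of them back, which corresponds to the "abort immediately" case.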
      
      The specifics are as follows:
      
       1. A new stage was added to SHOW PROCESSLIST output for the SQL
      Thread when it is waiting for a replica thread to either roll back or
      finish its transaction before stopping. This stage presents as
      “Waiting for worker thread to stop”.
      
       2. Worker threads which error or are killed no longer perform GCO
      cleanup if there is a concurrently running prior transaction. This
      is because a worker thread scheduled to run in a future GCO could be
      killed and incorrectly perform cleanup of the active GCO.
      
       3. Refined cases when the FL_TRANSACTIONAL flag is added to GTID
      binlog events to disallow adding it to transactions which modify
      both transactional and non-transactional engines when the binlogging
      configuration allows the modifications to exist in the same event,
      i.e. when using binlog_direct_non_trans_update == 0 and
      binlog_format == statement.
      
       4. A few existing MTR tests relied on the completion of certain
      transactions after issuing STOP SLAVE, and were re-recorded
      (potentially with added synchronizations) under the new rollback
      behavior.
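
Item 3 above can be read as a simple predicate. The sketch below only encodes the rule as stated in this message; the function names are hypothetical, and how the flag is decided in every other situation is deliberately left out.

```python
def mixed_in_same_event(binlog_format: str, binlog_direct_non_trans_update: int) -> bool:
    """True when the configuration lets transactional and non-transactional
    modifications exist in the same binlog event (per the commit message)."""
    return binlog_format.lower() == "statement" and binlog_direct_non_trans_update == 0


def refinement_blocks_flag(modifies_trans: bool, modifies_non_trans: bool,
                           binlog_format: str,
                           binlog_direct_non_trans_update: int) -> bool:
    """True when FL_TRANSACTIONAL must not be added under the refined rule."""
    return (modifies_trans and modifies_non_trans
            and mixed_in_same_event(binlog_format, binlog_direct_non_trans_update))


print(refinement_blocks_flag(True, True, "STATEMENT", 0))  # True
print(refinement_blocks_flag(True, True, "ROW", 0))        # False
```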
      
      Reviewed By
      ===========
      Andrei Elkin <andrei.elkin@mariadb.com>
  4. 04 Jun, 2023 1 commit
  5. 02 Jun, 2023 9 commits
  6. 01 Jun, 2023 1 commit
  7. 31 May, 2023 2 commits
  8. 27 May, 2023 1 commit
  9. 25 May, 2023 1 commit
  10. 24 May, 2023 1 commit
  11. 22 May, 2023 2 commits
    • MDEV-30197 : Missing DBUG_RETURN or DBUG_VOID_RETURN macro in function... · 9f909e54
      Jan Lindström authored
      MDEV-30197 : Missing DBUG_RETURN or DBUG_VOID_RETURN macro in function "Wsrep_schema::restore_view()"
      
      Here the user starts the server with an unsupported client charset.
      We need to create the wsrep_schema tables using an explicit latin1
      charset to avoid errors when restoring the view.
    • MDEV-30855 Remove test galera.galera_bf_abort_group_commit · 1ac00c5e
      Daniele Sciascia authored
      This test was re-enabled in commit 0174a9ff, and
      has been failing since then.
      The test is configured such that Galera runs with commit ordering
      disabled, a configuration which was meant for testing the
      performance penalty of commit ordering (not meant to be used in
      practice).
      Moreover, we have test galera_sr.galera_sr_bf_abort, which is
      identical, but runs with commit ordering enabled.
      There is no reason to keep the failing test around.
  12. 21 May, 2023 1 commit
    • MDEV-29293 MariaDB stuck on starting commit state · 6966d7fe
      Teemu Ollakka authored
      This is a backport from 10.5.
      
      The problem seems to be a deadlock between KILL command execution
      and BF abort issued by an applier, where:
      * KILL has locked the victim's LOCK_thd_kill and LOCK_thd_data.
      * Applier holds the InnoDB-side global lock mutex and the victim's trx mutex.
      * KILL is calling innobase_kill_query, and is blocked by the InnoDB
        global lock mutex.
      * Applier is in wsrep_innobase_kill_one_trx and is blocked by
        the victim's LOCK_thd_kill.
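
The four bullets above describe a cycle in the wait-for graph. A minimal sketch of that reasoning follows; `find_cycle` is a hypothetical helper, and the lock names `innodb_lock_mutex` and `victim_trx_mutex` are shorthand labels for the mutexes named in the text, not actual identifiers.

```python
def find_cycle(holds, waits_for):
    """holds: thread -> set of locks it holds.
    waits_for: thread -> the single lock it is blocked on.
    Returns the list of threads forming a wait cycle, or None."""
    owner = {lock: thread for thread, locks in holds.items() for lock in locks}
    for start in waits_for:
        seen = [start]
        cur = start
        while True:
            # Follow the edge: cur waits for a lock, which is owned by nxt.
            nxt = owner.get(waits_for.get(cur))
            if nxt is None:
                break  # the chain ends, no cycle from this start
            if nxt in seen:
                return seen[seen.index(nxt):]
            seen.append(nxt)
            cur = nxt
    return None


holds = {
    "KILL": {"LOCK_thd_kill", "LOCK_thd_data"},
    "applier": {"innodb_lock_mutex", "victim_trx_mutex"},
}
waits_for = {
    "KILL": "innodb_lock_mutex",   # blocked inside innobase_kill_query
    "applier": "LOCK_thd_kill",    # blocked inside wsrep_innobase_kill_one_trx
}
print(find_cycle(holds, waits_for))  # ['KILL', 'applier']
```

Each thread holds a lock the other one needs, so neither can make progress.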
      
      The fix in this commit removes the TOI replication of the KILL command
      and makes KILL execution a less intrusive operation. Aborting the
      victim now happens by using awake_no_mutex() and ha_abort_transaction().
      If the KILL happens while the transaction is committing, the
      KILL operation is postponed until after the statement
      has completed, so that KILL does not interrupt commit
      processing.
      
      Notable changes in this commit:
      * A wsrep client connection's error state may remain sticky after
        the client connection is closed. The error message will then pop
        up for the next client session when it issues its first SQL statement.
        This problem arose with test galera.galera_bf_kill.
        The fix is to reset the wsrep client error state before a THD is
        reused for the next connection.
      * Release THD locks in wsrep_abort_transaction when locking
        InnoDB mutexes. This guarantees the same locking order as with applier
        BF aborting.
      * BF abort from MDL was changed to do BF abort on server/wsrep-lib
        side first, and only then do the BF abort on InnoDB side. This
        removes the need to call back from InnoDB for BF aborts which originate
        from MDL and simplifies the locking.
      * Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h.
        The manipulation of the wsrep_aborter can be done solely on
        the server side. Moreover, it is now a debug-only variable and
        could be excluded from optimized builds.
      * Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more
        fine-grained locking for SR BF abort, which may require locking
        of the victim's LOCK_thd_kill. Added explicit calls to
        wsrep_thd_kill_LOCK/UNLOCK where appropriate.
      * Wsrep-lib was updated to a version which allows external
        locking for BF abort calls.
      
      Changes to MTR tests:
      * Disable galera_bf_abort_group_commit. This test is going to
        be removed (MDEV-30855).
      * Record galera_gcache_recover_manytrx as result file was incomplete.
        Trivial change.
      * Make galera_create_table_as_select more deterministic:
        Wait until CTAS execution has reached the MDL wait for the multi-master
        conflict case. The expected error from a multi-master conflict is
        ER_QUERY_INTERRUPTED. This is because CTAS does not yet have an open
        wsrep transaction when it is waiting for MDL, so the query gets interrupted
        instead of BF aborted. This should be addressed in a separate task.
      * A new test galera_kill_group_commit to verify correct behavior
        when KILL is executed while the transaction is committing.
      Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi>
      Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com>
      Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
  13. 20 May, 2023 1 commit
    • MDEV-30143 Segfault on select query using index for group-by and filesort · 60f0765b
      Oleg Smirnov authored
      The problem was an attempt to access JOIN_TAB::select, which is set to NULL
      when filesort is used. The correct way is to access either
      JOIN_TAB::select or JOIN_TAB::filesort->select, depending on whether
      filesort is used.
      This commit introduces the member function JOIN_TAB::get_sql_select(),
      which encapsulates that check so that code duplication is eliminated.
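
The idea of the accessor can be modelled in a few lines. This is a toy Python model, not the server's C++ code: the `JoinTab` and `Filesort` classes are hypothetical stand-ins, and the condition is represented as a plain string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Filesort:
    select: Optional[str] = None


@dataclass
class JoinTab:
    select: Optional[str] = None         # None once filesort takes the condition over
    filesort: Optional[Filesort] = None

    def get_sql_select(self) -> Optional[str]:
        """Return the condition from wherever it currently lives."""
        if self.filesort is not None:
            return self.filesort.select
        return self.select


plain = JoinTab(select="a = 1")
sorted_tab = JoinTab(select=None, filesort=Filesort(select="a = 1"))
print(plain.get_sql_select())       # a = 1
print(sorted_tab.get_sql_select())  # a = 1
```

Callers that previously read `select` directly would go through the accessor, so the NULL case is handled in one place.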
      
      The new condition (s->table->quick_keys.is_set(best_key->key))
      was added to best_access_path() to eliminate a Valgrind error.
      The cause of that error was using TRASH_ALLOC(quick_key_parts)
      instead of bzero(quick_key_parts); hence, accessing
      s->table->quick_key_parts[best_key->key] without first checking
      quick_keys.is_set() might have read "dirty" memory.
  14. 19 May, 2023 5 commits
    • Fix ./mtr --view-protocol opt_trace · 131ef14a
      Sergei Petrunia authored
      Follow the approach taken in the rest of the test.
    • MDEV-31185 rw_trx_hash_t::find() unpins pins too early · b54e7b0c
      Vlad Lesin authored
      rw_trx_hash_t::find() acquires element->mutex, then unpins the pins used
      for the lf_hash element search. After that the "element" can be deallocated
      and reused by some other thread.
      
      If we take a look at the rw_trx_hash_t::insert()->lf_hash_insert()->lf_alloc_new()
      calls, we will not find any element->mutex acquisition, as the mutex is not
      yet initialized before the element's allocation. rw_trx_hash_t::insert() can reuse
      the chunk unpinned in rw_trx_hash_t::find().
      
      The scenario is the following:
      
      1. Thread 1 has just executed lf_hash_search() in
      rw_trx_hash_t::find(), but has not acquired element->mutex yet.
      2. Thread 2 has removed the element from the hash table with
      an rw_trx_hash_t::erase() call.
      3. Thread 1 acquired element->mutex and unpinned pin 2 with
      an lf_hash_search_unpin(pins) call.
      4. Some thread purged the memory of the element.
      5. Thread 3 reused the memory for the element and filled element->id and
      element->trx.
      6. Thread 1 crashes with a failed "DBUG_ASSERT(trx_id == trx->id)"
      assertion.
      
      Note that trx_t objects are also reused, see the code around trx_pools
      for details.
      
      The fix is to invoke "lf_hash_search_unpin(pins);" after element->trx is
      stored in local variable in rw_trx_hash_t::find().
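
The effect of moving the unpin can be shown with a deterministic, single-threaded toy model. Everything here (`Element`, `Pool`, `scenario`) is hypothetical and only mimics the rule that memory of an erased element may be reused once it is no longer pinned; the mutex is left out because it does not prevent the reuse.

```python
class Element:
    def __init__(self, trx_id):
        self.id = trx_id
        self.trx = f"trx-{trx_id}"


class Pool:
    """Toy allocator: an erased element is reused only when it is not pinned."""

    def __init__(self):
        self.pinned = set()
        self.free = []

    def pin(self, e):
        self.pinned.add(id(e))

    def unpin(self, e):
        self.pinned.discard(id(e))

    def erase(self, e):
        self.free.append(e)

    def alloc(self, trx_id):
        for e in self.free:
            if id(e) not in self.pinned:
                self.free.remove(e)
                e.id, e.trx = trx_id, f"trx-{trx_id}"  # memory is reused
                return e
        return Element(trx_id)


def scenario(fixed):
    pool = Pool()
    e = pool.alloc(1)
    pool.pin(e)        # thread 1: the search pins the element
    pool.erase(e)      # thread 2: erases it from the hash
    if fixed:
        trx = e.trx    # copy the field while the element is still pinned
        pool.unpin(e)
        pool.alloc(2)  # thread 3 reuses the memory, the local copy is unaffected
        return trx
    pool.unpin(e)      # unpinned too early
    pool.alloc(2)      # thread 3 reuses the memory
    return e.trx       # now reads another transaction's data


print(scenario(fixed=False))  # trx-2
print(scenario(fixed=True))   # trx-1
```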
      
      Reviewed by: Nikita Malyavin, Marko Mäkelä.
    • All-green GitLab CI in 10.4 branch · f4ce1e48
      Robin Newhouse authored
      Note to mergers: Do not merge this commit to 10.5+. An additional PR
      will be created for the 10.5 branch which is compatible with later
      branches.
      
      Include cppcheck and FlawFinder for SAST scanning.
      
      From 10.6, cherry-picked 12bf5c46 (Remove unused French translations in
      Connect engine) and c6072ed9 (Ensure that source files contain only
      valid UTF8 encodings). Necessary for FlawFinder to execute and useful
      anyway.
      
      Remove the MSAN build and test, as MSAN was not introduced until 10.5 and
      does not build successfully.
      
      Remove the failing upgrade test, since Fedora installs MariaDB 10.5 and the
      10.5->10.4 upgrade rightfully complains.
      
      Add to skiplist failing test: main.func_math (MDEV-20966)
      
      All new code of the whole pull request, including one or several files
      that are either new files or modified ones, are contributed under the
      BSD-new license. I am contributing on behalf of my employer
      Amazon Web Services, Inc.
    • Ensure that source files contain only valid UTF8 encodings (#2188) · 1db4fc54
      anson1014 authored
      Modern software (including text editors, static analysis software,
      and web-based code review interfaces) often requires source code files
      to be interpretable via a consistent character encoding, with UTF-8 or
      ASCII (a strict subset of UTF-8) as the default. Several of the MariaDB
      source files contain bytes that are not valid in either the UTF-8 or
      ASCII encodings, but instead represent strings encoded in the
      ISO-8859-1/Latin-1 or ISO-8859-2/Latin-2 encodings.
      
      These inconsistent encodings may prevent software from correctly
      presenting or processing such files. Converting all source files to
      valid UTF8 characters will ensure correct handling.
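
A check for such files can be sketched with the standard library alone. The helper name `first_invalid_utf8_offset` is hypothetical; it relies only on `bytes.decode` raising `UnicodeDecodeError`, whose `start` attribute gives the offset of the first offending byte.

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the offset of the first byte that is not valid UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


# The same text is valid when encoded as UTF-8 but not when encoded as Latin-1.
print(first_invalid_utf8_offset("Mäkelä".encode("utf-8")))    # None
print(first_invalid_utf8_offset("Mäkelä".encode("latin-1")))  # 1
```

To scan a tree, the same function could be applied to the bytes of each source file, for example via `pathlib.Path.read_bytes()`.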
      
      Comments written in Czech were replaced with lightly-corrected
      translations from Google Translate. Additionally, comments describing
      the proper handling of special characters were changed so that the
      comments are now purely UTF8.
      
      All new code of the whole pull request, including one or several files
      that are either new files or modified ones, are contributed under the
      BSD-new license. I am contributing on behalf of my employer
      Amazon Web Services, Inc.
      Co-authored-by: Andrew Hutchings <andrew@linuxjedi.co.uk>
    • Remove unused French translations in Connect engine (#2252) · c205f6c1
      anson1014 authored
      These files are currently neither used nor compiled in MariaDB. The
      use of large lists of 'case' statements in these source files is also
      not a great way to represent translated strings. The git history can
      be referred to when a better translation interface is implemented
      in the future.
      
      Therefore, these files can be removed to clean up the MariaDB codebase.
      
      All new code of the whole pull request, including one or several files
      that are either new files or modified ones, are contributed under the
      BSD-new license. I am contributing on behalf of my employer
      Amazon Web Services, Inc.
  15. 16 May, 2023 2 commits
  16. 15 May, 2023 3 commits
  17. 12 May, 2023 6 commits
    • Fix insecure use of strcpy, strcat and sprintf in Connect · 2ff01e76
      Mikhail Chalov authored
      Old-style C functions `strcpy()`, `strcat()` and `sprintf()` are vulnerable to
      security issues because they lack memory boundary checks. Replace these in the
      Connect storage engine with safe new and/or custom functions such as
      `snprintf()`, `safe_strcpy()` and `safe_strcat()`.
      
      With this change FlawFinder and other static security analyzers report 287
      fewer findings.
      
      All new code of the whole pull request, including one or several files that are
      either new files or modified ones, are contributed under the BSD-new license. I
      am contributing on behalf of my employer Amazon Web Services, Inc.
    • MDEV-31250 ROW variables do not get assigned from subselects · b3cdb612
      Alexander Barkov authored
      ROW variables did not get assigned from subselects in these contexts:
      
      BEGIN
        DECLARE r ROW TYPE OF t1;
        SET r=(SELECT * FROM t1 WHERE a=1);
      END;
      
      BEGIN
        DECLARE r ROW TYPE OF t1 DEFAULT (SELECT * FROM t1 WHERE a=1);
      END;
      
      All fields of the ROW variable remained NULL.
    • MDEV-31240 Crash with condition pushable into derived and containing outer reference · 0474466b
      Igor Babaev authored
      This bug could affect queries containing a subquery over splittable derived
      tables and having an outer reference in its WHERE clause. If such a subquery
      contained an equality condition whose left part was a reference to a column
      of the derived table and whose right part referred only to outer columns,
      then the server crashed in the function st_join_table::choose_best_splitting().
      The crashing code was added in commit ce7ffe61,
      which made the code of the function sensitive to the presence of the flag
      OUTER_REF_TABLE_BIT in the KEYUSE_EXT::needed_in_prefix fields.
      
      The field needed_in_prefix of the KEYUSE_EXT structure should not contain
      table maps with OUTER_REF_TABLE_BIT or RAND_TABLE_BIT.
      
      Note that this fix is quite conservative: for affected queries it just
      returns the query plans that were used before the above mentioned commit.
      In fact the equalities causing crashes should be pushed into derived tables
      without any usage of split optimization.
      
      Approved by Sergei Petrunia <sergey@mariadb.com>
    • MDEV-28433 : Server crashes when wsrep_sst_donor and wsrep_cluster_address set to NULL · f102b595
      Jan Lindström authored
      Do not allow setting wsrep_sst_donor to NULL, as it is
      an incorrect value. The user can use the value '' (default), which
      means the same as NULL. Setting wsrep_cluster_address to NULL is
      already handled correctly.
      Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
    • MDEV-30473 Remove test galera.MDEV-27713 · 7d55eb00
      Daniele Sciascia authored
      Remove test galera.MDEV-27713. This test relies on GET_LOCK() and has
      stopped working since commit 844ddb11 (see MDEV-30473), which
      disabled GET_LOCK() in combination with Galera.
      Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
    • Julius Goryavsky's avatar
      3a7b3113