1. 15 Feb, 2021 3 commits
  2. 14 Feb, 2021 1 commit
    • updating @@wsrep_cluster_address deadlocks · 26965387
      Sergei Golubchik authored
      wsrep_cluster_address_update() causes LOCK_wsrep_slave_threads
      to be locked under LOCK_wsrep_cluster_config, while normally
      the order should be the opposite.
      
      Fix: don't protect @@wsrep_cluster_address value with the
      LOCK_wsrep_cluster_config, LOCK_global_system_variables is enough.
      
      Only protect wsrep reinitialization with the LOCK_wsrep_cluster_config.
      And make it use a local copy of the global @@wsrep_cluster_address.
      
      Also, introduce a helper function that checks whether
      wsrep_cluster_address is set and also asserts that it can be safely
      read by the caller.
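The locking scheme described above can be sketched roughly as follows. This is a minimal illustration, not the server code: `global_vars_mutex` and `cluster_config_mutex` are hypothetical stand-ins for LOCK_global_system_variables and LOCK_wsrep_cluster_config, `address_is_set()` is an invented name for the helper, and releasing the first mutex before taking the second is an assumption about how the fix avoids the inverted order.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical stand-ins for the server's mutexes and variables.
static std::mutex global_vars_mutex;     // role of LOCK_global_system_variables
static std::mutex cluster_config_mutex;  // role of LOCK_wsrep_cluster_config
static std::string cluster_address;      // protected by global_vars_mutex only
static std::string active_address;       // what the "reinitialization" last used

// Helper (invented name): reports whether the address is set and asserts
// that the caller holds the mutex that makes reading it safe.
static bool address_is_set(const std::unique_lock<std::mutex>& held)
{
  assert(held.owns_lock() && held.mutex() == &global_vars_mutex);
  return !cluster_address.empty();
}

// Stores the new value, then reinitializes from a local copy.
// The two mutexes are never held at the same time (an assumption of this sketch).
static bool update_cluster_address(const std::string& new_value)
{
  std::string local_copy;
  {
    std::unique_lock<std::mutex> vars(global_vars_mutex);
    cluster_address = new_value;
    if (!address_is_set(vars))
      return false;                 // nothing to connect to
    local_copy = cluster_address;   // snapshot taken under the lock
  }                                 // global_vars_mutex released here

  // Only the reinitialization itself runs under the config mutex,
  // and it reads the snapshot, not the global variable.
  std::lock_guard<std::mutex> config(cluster_config_mutex);
  active_address = local_copy;
  return true;
}
```

Because the reinitialization reads only `local_copy`, a concurrent SET can change the global afterwards without the reinitialization ever touching unprotected state.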
  3. 12 Feb, 2021 7 commits
    • fix a 3-way deadlock in galera_sr.galera-features#56 · b91e77cf
      Sergei Golubchik authored
      rarely (try --repeat 1000), the following happens:
      
      * from wsrep_bf_abort (when a thread is being killed), wsrep-lib
      starts streaming_rollback that wants to
      convert_streaming_client_to_applier. wsrep_create_streaming_applier
      creates a new THD(). All while the other THD is being killed,
      so under LOCK_thd_kill and LOCK_thd_data. In particular, THD::init()
      takes LOCK_global_system_variables under LOCK_thd_kill.
      
      * updating @@wsrep_slave_threads takes LOCK_global_system_variables
      and LOCK_wsrep_cluster_config (in that order) and invokes
      wsrep_slave_threads_update() that takes LOCK_wsrep_slave_threads
      
      * wsrep_replication_process() takes LOCK_wsrep_slave_threads and
      invokes wsrep_close_applier(), that does thd->set_killed() which
      takes LOCK_thd_kill.
      
      et voilà.
      
      As a fix I copied a workaround from wsrep_cluster_address_update()
      to wsrep_slave_threads_update(). It seems to be safe: without mutexes
      a race condition is possible and a concurrent SET might change
      wsrep_slave_threads, but wsrep_slave_threads_update() always verifies
      if there's a need to do something, so it will not run twice in this case,
      it'll be a no-op.
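The three acquisition orders listed above form a cycle, which can be checked mechanically. The sketch below is only an illustration: it models "lock B is taken while lock A is held" as a directed edge A -> B and looks for a cycle with a depth-first search. All function names are hypothetical, and `graph_after_fix()` assumes the workaround means that wsrep_slave_threads_update() no longer takes LOCK_wsrep_slave_threads while the other two mutexes are held.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Edge A -> B means: some thread takes lock B while holding lock A.
using LockGraph = std::map<std::string, std::vector<std::string>>;

static void add_edge(LockGraph& g, const std::string& held, const std::string& wanted)
{
  assert(held != wanted);  // a lock is not taken under itself in this model
  g[held].push_back(wanted);
}

// Depth-first search; returns true if a cycle is reachable from `node`.
static bool reaches_cycle(const LockGraph& g, const std::string& node,
                          std::set<std::string>& on_path,
                          std::set<std::string>& done)
{
  if (on_path.count(node)) return true;   // back edge: cycle found
  if (done.count(node)) return false;     // already fully explored
  on_path.insert(node);
  auto it = g.find(node);
  if (it != g.end())
    for (const std::string& next : it->second)
      if (reaches_cycle(g, next, on_path, done)) return true;
  on_path.erase(node);
  done.insert(node);
  return false;
}

static bool has_lock_cycle(const LockGraph& g)
{
  std::set<std::string> on_path, done;
  for (const auto& entry : g)
    if (reaches_cycle(g, entry.first, on_path, done)) return true;
  return false;
}

// The orders described in the commit message.
static LockGraph graph_before_fix()
{
  LockGraph g;
  add_edge(g, "LOCK_thd_kill", "LOCK_global_system_variables");            // THD::init()
  add_edge(g, "LOCK_global_system_variables", "LOCK_wsrep_cluster_config"); // SET
  add_edge(g, "LOCK_wsrep_cluster_config", "LOCK_wsrep_slave_threads");     // ..._update()
  add_edge(g, "LOCK_wsrep_slave_threads", "LOCK_thd_kill");                 // set_killed()
  return g;
}

// Assumption: after the fix the update path no longer nests
// LOCK_wsrep_slave_threads inside the other two mutexes.
static LockGraph graph_after_fix()
{
  LockGraph g;
  add_edge(g, "LOCK_thd_kill", "LOCK_global_system_variables");
  add_edge(g, "LOCK_global_system_variables", "LOCK_wsrep_cluster_config");
  add_edge(g, "LOCK_wsrep_slave_threads", "LOCK_thd_kill");
  return g;
}
```

Calling `has_lock_cycle()` on the two graphs shows the cycle in the first and none in the second, which is the property the fix is after.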
    • remove find_thread_with_thd_data_lock_callback · 259b9452
      Sergei Golubchik authored
      let the caller take the lock if needed
    • MDEV-23328 Server hang due to Galera lock conflict resolution · eac8341d
      Sergei Golubchik authored
      adaptation of 29bbcac0 for 10.4
    • don't take mutexes conditionally · 9703cffa
      Sergei Golubchik authored
    • cleanup: THD::abort_current_cond_wait() · 259a1902
      Sergei Golubchik authored
      * reuse the loop in THD::abort_current_cond_wait, don't duplicate it
      * find_thread_by_id should return whatever it has found, it's the
        caller's task not to kill COM_DAEMON (if the caller's a killer)
      
      and other minor changes
    • List of unstable tests for 10.4.18 release · cbbcc8fa
      Elena Stepanova authored
      Test code modifications and new failures from buildbot registered
      only for the main suite. The rest was updated partially,
      based on the status of existing JIRA items
    • Merge branch 'bb-10.3-release' into bb-10.4-release · 00a313ec
      Sergei Golubchik authored
      Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution"
      was null-merged. 10.4 version of the fix is coming up separately
  4. 09 Feb, 2021 1 commit
  5. 08 Feb, 2021 4 commits
    • MDEV-24087 s3.replication_partition fails in buildbot wiht replication failure · ffc5d064
      Monty authored
      A few of the failures were because of a missing sync_slave_to_master in
      the test suite.
      
      However, the main reason for most failures was that in the case of
      ALTER PARTITION the master writes the query to the binary log before
      it has updated the .frm and .par files. This causes a problem for an
      S3 slave, as it will start executing the ALTER PARTITION but get old .frm and
      .par files from S3, which causes "open table" to fail, either with an error
      or in some cases with a crash.
      Fixed
    • Make maria_data_root const char* · bd5ac038
      Monty authored
      This allows one to remove some casts like:
      maria_data_root= (char *)".";
      
      It also removes warnings from icc.
    • Added 'const' to arguments in get_one_option and find_typeset() · 5d6ad2ad
      Monty authored
      One should not change the program arguments!
      This change also reduces warnings from the icc compiler.
      
      Almost all changes are just syntax changes (adding const to
      'get_one_option function' declarations).
      
      Other changes:
      - Added a few casts of 'argument' from 'const char*' to 'char *'. This
        was mainly in calls to 'external' functions we don't have control of.
      - Ensure that all resets of 'password command line argument' are similar.
        (In almost all cases it was just adding a comment and a cast)
      - In mysqlbinlog.cc and mysqld.cc there were a few cases that changed
        the command line argument. These places were changed to instead allocate
        the option in a MEM_ROOT to avoid changing the argument. Some of this
        code was changed to ensure that different programs parse options the
        same way. Added a test case for the changes in mysqlbinlog.cc
      - Changed a few variables that took their value from command line options
        from 'char *' to 'const char *'.
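The idea of leaving the command-line argument untouched and working on an owned copy can be sketched as below. This is a hypothetical illustration, not code from mysqlbinlog.cc or mysqld.cc: `parse_host_port()` and `HostPort` are invented names, the "host:port" format is just an example, and `std::string` stands in for an allocation in a MEM_ROOT.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Invented result type for the example.
struct HostPort
{
  std::string host;
  int port;
};

// Takes the option value as 'const char *' and never writes through it.
// Any splitting is done on a private copy (std::string here, where the
// server code would allocate in a MEM_ROOT).
static HostPort parse_host_port(const char* argument)
{
  assert(argument != nullptr);
  const std::string owned(argument);      // private copy of the option value
  HostPort result{owned, 0};              // default: whole value is the host
  const std::string::size_type colon = owned.find(':');
  if (colon != std::string::npos)
  {
    result.host = owned.substr(0, colon);
    result.port = std::atoi(owned.c_str() + colon + 1);
  }
  return result;
}
```

With the parameter declared `const char *`, an accidental write such as `argument[colon] = 0` no longer compiles, which is the kind of mistake the const-qualified signatures are meant to catch.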
    • Ensure that mysqlbinlog frees all memory at exit · e30a3048
      Monty authored
  6. 07 Feb, 2021 3 commits
  7. 06 Feb, 2021 1 commit
  8. 05 Feb, 2021 6 commits
  9. 03 Feb, 2021 1 commit
    • MDEV-24750 Various corruptions caused by Aria subsystem... · eacefbca
      Monty authored
      The test case was setting aria_sort_buffer_size to MAX_ULONGLONG-1
      which was not handled gracefully by my_malloc() or safemalloc().
      Fixed by ensuring that the malloc functions return 0 if the size
      is too big.
      I also added some protection to Aria repair:
      - Limit sort_buffer_size to 16G (after that a bigger sort buffer will
        not help that much anyway)
      - Limit sort_buffer_size also according to sort file size. This will
        help by allocating less memory if someone sets the buffer size too
        high.
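Both protections can be illustrated with a small sketch. It is not the my_malloc() or Aria repair code: `checked_malloc()`, `clamp_sort_buffer()`, the bookkeeping `overhead` parameter, and the 4096-byte floor are all assumptions made for the example; only the 16G cap and the idea of bounding the buffer by the sort file size come from the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdlib>

constexpr std::uint64_t kMaxSortBuffer = 16ULL << 30;  // 16G cap from the commit message
constexpr std::uint64_t kMinSortBuffer = 4096;         // assumed floor, for illustration

// Returns nullptr instead of wrapping around when size + overhead
// cannot be represented in size_t (e.g. a request of MAX_ULONGLONG-1).
static void* checked_malloc(std::uint64_t size, std::uint64_t overhead)
{
  if (overhead > SIZE_MAX || size > SIZE_MAX - overhead)
    return nullptr;
  return std::malloc(static_cast<std::size_t>(size + overhead));
}

// Limits the requested sort buffer by the fixed cap and by the size of
// the file to be sorted (a larger buffer than the data is of no use).
static std::uint64_t clamp_sort_buffer(std::uint64_t requested,
                                       std::uint64_t sort_file_size)
{
  const std::uint64_t by_file = std::max(sort_file_size, kMinSortBuffer);
  const std::uint64_t result =
      std::min(requested, std::min(kMaxSortBuffer, by_file));
  assert(result <= kMaxSortBuffer);
  return result;
}
```

Checking `size > SIZE_MAX - overhead` before adding avoids the unsigned wrap-around that would otherwise turn a huge request into a tiny allocation.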
  10. 02 Feb, 2021 6 commits
  11. 01 Feb, 2021 7 commits