1. 27 Jan, 2021 9 commits
    • Marko Mäkelä's avatar
      b32f057d
    • Marko Mäkelä's avatar
      462cb666
    • MDEV-24671: Replace lock_wait_timeout_task with mysql_cond_timedwait() · e71e6133
      Marko Mäkelä authored
      lock_wait(): Replaces lock_wait_suspend_thread(). Wait for the lock to
      be granted or the transaction to be killed using mysql_cond_timedwait()
      or mysql_cond_wait().
      
      lock_wait_end(): Replaces que_thr_end_lock_wait() and
      lock_wait_release_thread_if_suspended().
      
      lock_wait_timeout_task: Remove. The operating system kernel will
      resume the mysql_cond_timedwait() in lock_wait(). An added benefit
      is that innodb_lock_wait_timeout no longer has a 'jitter' of 1 second,
      which was caused by this wake-up task waking up only once per second,
      and then waking up any threads for which the timeout (which was only
      measured in seconds) was exceeded.
      
      innobase_kill_query(): Set trx->error_state=DB_INTERRUPTED,
      so that a call to trx_is_interrupted(trx) in lock_wait() can be avoided.
      
      We will protect things more consistently with lock_sys.wait_mutex,
      which will be moved below lock_sys.mutex in the latching order.
      
      trx_lock_t::cond: Condition variable for !wait_lock, used with
      lock_sys.wait_mutex.
      
      srv_slot_t: Remove. Replaced by trx_lock_t::cond.
      
      lock_grant_after_reset(): Merged into lock_grant().
      
      lock_rec_get_index_name(): Remove.
      
      lock_sys_t: Introduce wait_pending, wait_count, wait_time, wait_time_max
      that are protected by wait_mutex.
      
      trx_lock_t::que_state: Remove.
      
      que_thr_state_t: Remove QUE_THR_COMMAND_WAIT, QUE_THR_LOCK_WAIT.
      
      que_thr_t: Remove is_active, start_running(), stop_no_error().
      
      que_fork_t::n_active_thrs, trx_lock_t::n_active_thrs: Remove.
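
      As a rough illustration of the approach described in this commit (a
      minimal sketch, not InnoDB code), a waiter can block on a condition
      variable with a deadline instead of being polled once per second by a
      timer task. The sketch uses std::condition_variable in place of
      mysql_cond_timedwait(); Waiter, WaitResult, lock_wait_sketch() and
      grant_sketch() are hypothetical names.

      ```cpp
      #include <cassert>
      #include <chrono>
      #include <condition_variable>
      #include <mutex>

      // Hypothetical stand-in for a waiting transaction; not InnoDB's trx_lock_t.
      struct Waiter {
        std::mutex wait_mutex;         // plays the role of lock_sys.wait_mutex
        std::condition_variable cond;  // plays the role of trx_lock_t::cond
        bool granted = false;          // corresponds to "!wait_lock"
        bool killed = false;           // corresponds to error_state == DB_INTERRUPTED
      };

      enum class WaitResult { Granted, Killed, Timeout };

      // Block until the lock is granted, the waiter is killed, or the timeout
      // expires. The deadline is handled by the wait itself, so no periodic
      // wake-up task is needed.
      WaitResult lock_wait_sketch(Waiter& w, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> guard(w.wait_mutex);
        const bool woken = w.cond.wait_for(
            guard, timeout, [&w] { return w.granted || w.killed; });
        if (!woken) return WaitResult::Timeout;
        return w.killed ? WaitResult::Killed : WaitResult::Granted;
      }

      // Called by the thread that grants the lock (compare lock_wait_end()).
      void grant_sketch(Waiter& w) {
        {
          std::lock_guard<std::mutex> guard(w.wait_mutex);
          w.granted = true;
        }
        w.cond.notify_one();
      }
      ```

      Because the deadline is passed to the wait itself, the wake-up is
      requested for the timeout rather than for the next one-second tick.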
    • Cleanups: · 7f1ab8f7
      Marko Mäkelä authored
      que_thr_t::fork_type: Remove.
      
      QUE_THR_SUSPENDED, TRX_QUE_COMMITTING: Remove.
      
      Cleanup lock_cancel_waiting_and_release()
    • Marko Mäkelä's avatar
      ff3f07ce
    • Cleanup the lock creation · 898dcf93
      Marko Mäkelä authored
      LOCK_MAX_N_STEPS_IN_DEADLOCK_CHECK, LOCK_MAX_DEPTH_IN_DEADLOCK_CHECK,
      LOCK_RELEASE_INTERVAL: Replace with the bare use of the constants.
      
      lock_rec_create_low(): Remove LOCK_PAGE_BITMAP_MARGIN altogether.
      We already have REDZONE_SIZE as a 'safety margin' in AddressSanitizer
      builds, to catch any out-of-bounds access.
      
      lock_prdt_add_to_queue(): Avoid a useless search when enqueueing
      a waiting lock request.
      
      lock_prdt_lock(): Reduce the size of the trx->mutex critical section.
    • Cleanup: Remove trx_get_id_for_print() · 469da6c3
      Marko Mäkelä authored
      Any transaction that has requested a lock must have trx->id!=0.
      
      trx_print_low(): Distinguish non-locking or inactive transaction
      objects by displaying the pointer in parentheses.
      
      fill_trx_row(): Do not try to map trx->id to a pointer-based value.
    • MDEV-23959 GSSAPI plugin - support AD or local group name , and SIDs on Windows · 7ebabea5
      Vladislav Vaintroub authored
      Support membership tests in SSPI with special prefix form
      
      CREATE USER u IDENTIFIED WITH gssapi AS "GROUP:<group_name>"
      or
      CREATE USER u IDENTIFIED WITH gssapi AS "SID:<sid>"
      
      If the user is created in one of the above forms, the following happens
      after a successful SSPI handshake:
      
      1) If the "GROUP:" prefix is used, <group_name> is translated to a SID
      using the LookupAccountName() API.
      
      2) The SSPI user is checked for SID membership with the
      ImpersonateSecurityContext() and CheckMembership() APIs.
      
      Note that <group_name>/<sid> does not strictly need to refer to an actual
      group. An identity test is also supported, e.g. "GROUP:<users_name>" or
      "SID:<user_sid>" will work too.
      
      
      Well-known SIDs (in SDDL syntax) appear to be supported, e.g.
      "SID:WD" refers to World/Everyone (== "SID:S-1-1-0")
      and
      "SID:BA" refers to Administrators (== "SID:S-1-5-32-544").
      
      In UAC environments, for successful checks against the Administrators group,
      elevation (Run As Administrator) might be necessary, since CheckMembership()
      needs groups to be marked as enabled in the token group list.
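
      The prefix forms above can be sketched as a small classifier. This is an
      illustrative sketch only, not the plugin's code: PrincipalKind, Principal
      and parse_principal() are hypothetical names, and treating the prefixes as
      case-sensitive is an assumption. The Windows lookup and membership checks
      are not reproduced here.

      ```cpp
      #include <cassert>
      #include <string>

      // Hypothetical classification of the string given after AS in CREATE USER.
      enum class PrincipalKind { User, Group, Sid };

      struct Principal {
        PrincipalKind kind;
        std::string name;  // text after the prefix, or the whole string
      };

      // True if s starts with prefix (case-sensitive; an assumption).
      static bool has_prefix(const std::string& s, const std::string& prefix) {
        return s.compare(0, prefix.size(), prefix) == 0;
      }

      // Split "GROUP:<group_name>" or "SID:<sid>" into kind and value; anything
      // else is treated as a plain user principal name.
      Principal parse_principal(const std::string& auth_string) {
        if (has_prefix(auth_string, "GROUP:"))
          return {PrincipalKind::Group, auth_string.substr(6)};
        if (has_prefix(auth_string, "SID:"))
          return {PrincipalKind::Sid, auth_string.substr(4)};
        return {PrincipalKind::User, auth_string};
      }
      ```

      In this sketch a Group value would still have to be translated to a SID
      before the membership test, as step 1 describes.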
    • MDEV-24685 - remove IO thread states output from SHOW ENGINE INNODB STATUS · c310f4c3
      Vladislav Vaintroub authored
      There are no IO threads anymore.
  2. 26 Jan, 2021 1 commit
    • MDEV-20008: Galera strict mode · 95a2bca0
      mkaruza authored
      Added a new enum variable `wsrep_mode`, which can be used to turn on WSREP
      features that are not part of the default behaviour.
      Added the enum values `BINLOG_ROW_FORMAT_ONLY`, `REQUIRED_PRIMARY_KEY` and
      `STRICT_REPLICATION`. `wsrep-mode=STRICT_REPLICATION` behaves
      like the variable `wsrep_strict_ddl`.
      
      The variable wsrep_strict_ddl is deprecated; if it is set, the
      new wsrep_mode setting is used instead.
      
      Reviewed and improved by: Jan Lindström <jan.lindstrom@mariadb.com>
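
      One way to picture the mapping from the deprecated variable to the new
      setting is the sketch below. It is an assumption-laden illustration, not
      the server's code: modelling the values as combinable bit flags, the type
      WsrepModeSketch, and the helpers effective_wsrep_mode() and mode_enabled()
      are all hypothetical.

      ```cpp
      #include <cassert>
      #include <cstdint>

      // Hypothetical bit flags named after the values listed in this commit.
      enum WsrepModeSketch : std::uint32_t {
        BINLOG_ROW_FORMAT_ONLY = 1u << 0,
        REQUIRED_PRIMARY_KEY   = 1u << 1,
        STRICT_REPLICATION     = 1u << 2
      };

      // Fold the deprecated boolean wsrep_strict_ddl into the new setting:
      // if it is set, behave as if STRICT_REPLICATION had been requested.
      std::uint32_t effective_wsrep_mode(std::uint32_t wsrep_mode,
                                         bool wsrep_strict_ddl) {
        return wsrep_strict_ddl ? (wsrep_mode | STRICT_REPLICATION) : wsrep_mode;
      }

      // True if the given feature is turned on in the mode value.
      bool mode_enabled(std::uint32_t mode, WsrepModeSketch flag) {
        return (mode & flag) != 0;
      }
      ```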
  3. 25 Jan, 2021 13 commits
    • MDEV-515 Reduce InnoDB undo logging for insert into empty table · 3cef4f8f
      Marko Mäkelä authored
      We implement an idea that was suggested by Michael 'Monty' Widenius
      in October 2017: When InnoDB is inserting into an empty table or partition,
      we can write a single undo log record TRX_UNDO_EMPTY, which will cause
      ROLLBACK to clear the table.
      
      For this to work, the insert into an empty table or partition must be
      covered by an exclusive table lock that will be held until the transaction
      has been committed or rolled back, or the INSERT operation has been
      rolled back (and the table is empty again), in lock_table_x_unlock().
      
      Clustered index records that are covered by the TRX_UNDO_EMPTY record
      will carry DB_TRX_ID=0 and DB_ROLL_PTR=1<<55, and thus they cannot
      be distinguished from what MDEV-12288 leaves behind after purging the
      history of row-logged operations.
      
      Concurrent non-locking reads must be adjusted: If the read view was
      created before the INSERT into an empty table, then we must continue
      to imagine that the table is empty, and not try to read any records.
      If the read view was created after the INSERT was committed, then
      all records must be visible normally. To implement this, we introduce
      the field dict_table_t::bulk_trx_id.
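
      The read-view rule in the preceding paragraph can be sketched as follows.
      This is a simplified illustration under stated assumptions, not InnoDB's
      ReadView: ReadViewSketch, TableSketch and must_read_as_empty() are
      hypothetical, the visibility test is reduced to its bare idea, and using
      0 to mean "no bulk insert recorded" is an assumption.

      ```cpp
      #include <cassert>
      #include <cstdint>
      #include <set>

      using trx_id_t = std::uint64_t;

      // Simplified read view: a transaction's changes are visible if it had
      // committed before the view was created.
      struct ReadViewSketch {
        trx_id_t low_limit_id;          // ids >= this started after view creation
        std::set<trx_id_t> active_ids;  // transactions active at view creation

        bool changes_visible(trx_id_t id) const {
          return id < low_limit_id && active_ids.count(id) == 0;
        }
      };

      struct TableSketch {
        trx_id_t bulk_trx_id = 0;  // assumption: 0 means no bulk insert recorded
      };

      // True if a non-locking read must keep treating the table as empty,
      // because the bulk-inserting transaction is not visible to the view.
      bool must_read_as_empty(const ReadViewSketch& view, const TableSketch& table) {
        return table.bulk_trx_id != 0 && !view.changes_visible(table.bulk_trx_id);
      }
      ```

      A view created before or during the INSERT skips the records entirely; a
      view created after the commit falls through to the normal per-record
      visibility rules.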
      
      This special handling only applies to the very first INSERT statement
      of a transaction for the empty table or partition. If a subsequent
      statement in the transaction is modifying the initially empty table again,
      we must enable row-level undo logging, so that we will be able to
      roll back to the start of the statement in case of an error (such as
      duplicate key).
      
      INSERT IGNORE will continue to use row-level logging and locking, because
      implementing it would require the ability to roll back the latest row.
      Since the undo log that we write only allows us to roll back the entire
      statement, we cannot support INSERT IGNORE. We will introduce a
      handler::extra() parameter HA_EXTRA_IGNORE_INSERT to indicate to storage
      engines that INSERT IGNORE is being executed.
      
      In many test cases, we add an extra record to the table, so that during
      the 'interesting' part of the test, row-level locking and logging will
      be used.
      
      Replicas will continue to use row-level logging and locking until
      MDEV-24622 has been addressed. Likewise, this optimization will be
      disabled in Galera cluster until MDEV-24623 enables it.
      
      dict_table_t::bulk_trx_id: The latest active or committed transaction
      that initiated an insert into an empty table or partition.
      Protected by exclusive table lock and a clustered index leaf page latch.
      
      ins_node_t::bulk_insert: Whether bulk insert was initiated.
      
      trx_t::mod_tables: Use C++11 style accessors (emplace instead of insert).
      Unlike earlier, this collection will cover also temporary tables.
      
      trx_mod_table_time_t: Add start_bulk_insert(), end_bulk_insert(),
      is_bulk_insert(), was_bulk_insert().
      
      trx_undo_report_row_operation(): Before accessing any undo log pages,
      invoke trx->mod_tables.emplace() in order to determine whether undo
      logging was disabled, or whether this is the first INSERT and we are
      supposed to write a TRX_UNDO_EMPTY record.
      
      row_ins_clust_index_entry_low(): If we are inserting into an empty
      clustered index leaf page, set the ins_node_t::bulk_insert flag for
      the subsequent trx_undo_report_row_operation() call.
      
      lock_rec_insert_check_and_lock(), lock_prdt_insert_check_and_lock():
      Remove the redundant parameter 'flags' that can be checked in the caller.
      
      btr_cur_ins_lock_and_undo(): Simplify the logic. Correctly write
      DB_TRX_ID,DB_ROLL_PTR after invoking trx_undo_report_row_operation().
      
      trx_mark_sql_stat_end(), ha_innobase::extra(HA_EXTRA_IGNORE_INSERT),
      ha_innobase::external_lock(): Invoke trx_t::end_bulk_insert() so that
      the next statement will not be covered by table-level undo logging.
      
      ReadView::changes_visible(trx_id_t) const: New accessor for the case
      where the trx_id_t is not read from a potentially corrupted index page
      but directly from the memory. In this case, we can skip a sanity check.
      
      row_sel(), row_sel_try_search_shortcut(), row_search_mvcc(),
      row_sel_try_search_shortcut_for_mysql(),
      row_merge_read_clustered_index(): Check dict_table_t::bulk_trx_id.
      
      row_sel_clust_sees(): Replaces lock_clust_rec_cons_read_sees().
      
      lock_sec_rec_cons_read_sees(): Replaced with lower-level code.
      
      btr_root_page_init(): Refactored from btr_create().
      
      dict_index_t::clear(), dict_table_t::clear(): Empty an index or table,
      for the ROLLBACK of an INSERT operation.
      
      ROW_T_EMPTY, ROW_OP_EMPTY: Note a concurrent ROLLBACK of an INSERT
      into an empty table.
      
      This is joint work with Thirunarayanan Balathandayuthapani,
      who created a working prototype.
      Thanks to Matthias Leich for extensive testing.
    • MDEV-24642 Assertion r->emplace... failed in sux_lock::s_lock_register() · 7aed5eb7
      Marko Mäkelä authored
      In commit 03ca6495 (MDEV-24142)
      we replaced a debug data structure that holds information about
      S-latch holders with a std::set, which does not allow duplicates.
      
      The assertion failed in btr_search_guess_on_hash() in an
      s_lock_try() operation.
      
      The reason why recursive S-latch requests are not normally allowed
      is that if some other thread has enqueued a waiting X-lock, then
      further S-latch requests will block until the exclusive lock has been
      granted and released. If a thread were already holding one S-latch
      while waiting for the X-latch to be granted and released by another
      thread, the two threads would deadlock.
      
      However, the nonblocking s_lock_try() is perfectly fine;
      it will immediately return failure in case of conflict.
      
      sux_lock::readers: Use std::unordered_multiset instead of std::set.
      
      sux_lock::s_lock_register(): Allow 'duplicate' requests. Blocking-mode
      latch acquisitions are already covered by !have_s() assertions.
      
      sux_lock::s_unlock(): Erase only one element from readers.
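
      The difference between the two containers can be shown with a small
      sketch. It is not the sux_lock code: ReadersSketch and its integer holder
      ids are hypothetical. The point is that std::unordered_multiset accepts
      duplicate registrations, and that erasing by iterator removes one entry
      while erasing by key would remove all of them.

      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <unordered_set>

      // Hypothetical debug bookkeeping of S-latch holders, keyed by a holder id.
      class ReadersSketch {
        std::unordered_multiset<unsigned long> readers_;

       public:
        // Duplicates are allowed, e.g. after a successful nonblocking try-lock.
        void s_lock_register(unsigned long id) { readers_.insert(id); }

        // Erase exactly one entry for this holder. Note that
        // readers_.erase(id) would remove every duplicate at once.
        bool s_unlock(unsigned long id) {
          auto it = readers_.find(id);
          if (it == readers_.end()) return false;
          readers_.erase(it);
          return true;
        }

        // Number of S-latch registrations currently recorded for this holder.
        std::size_t holds(unsigned long id) const { return readers_.count(id); }
      };
      ```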
      
      buf_page_try_get(): Revert to s_lock_try(). It had been previously
      changed to the more intrusive u_lock_try() in response to the
      debug check failing.
    • Merge 10.5 into 10.6 · e9fc6105
      Marko Mäkelä authored
    • Merge 10.4 into 10.5 · 927a8823
      Marko Mäkelä authored
    • Marko Mäkelä's avatar
      e626f511
    • Merge 10.3 into 10.4 · 5db38276
      Marko Mäkelä authored
    • Marko Mäkelä's avatar
      75538f94
    • Merge 10.5 into 10.6 · 46234f03
      Marko Mäkelä authored
    • Merge 10.4 into 10.5 · 961c7938
      Marko Mäkelä authored
    • Merge 10.3 into 10.4 · 3467f637
      Marko Mäkelä authored
    • MDEV-24653 Assertion block->page.id.page_no() == index->page failed in innobase_add_instant_try() · eaeb8ec4
      Marko Mäkelä authored
      We may end up with an empty leaf page (containing only an ADD COLUMN
      metadata record) that is not the root page.
      
      innobase_add_instant_try(): Disable an optimization for a non-canonical
      empty table that contains a metadata record somewhere else than in
      the root page.
      
      btr_pcur_store_position(): Tolerate a non-canonical empty table.
  4. 23 Jan, 2021 2 commits
    • MDEV-24661: Disable an unstable test · 5adcb2e7
      Marko Mäkelä authored
    • MDEV-24659 Assertion !fsp_is_system_temporary(bpage->id().space()) failed in... · 84b8f529
      Marko Mäkelä authored
      MDEV-24659 Assertion !fsp_is_system_temporary(bpage->id().space()) failed in buf_flush_relocate_on_flush_list()
      
      When commit 5eb53955 (MDEV-12227)
      removed the pages of temporary tables from the buf_pool.flush_list,
      an adjustment to the buffer pool resizing was forgotten.
      
      buf_pool_t::realloc(): Do not invoke buf_flush_relocate_on_flush_list()
      for pages that belong to the temporary tablespace. Also, deduplicate
      some code at the end.
      
      buf_page_t::set_corrupt_id(): Tolerate oldest_modification()==1
      (the dummy value) for temporary tablespace pages. The revised
      buf_pool_t::realloc() may invoke this on dirty temporary tablespace pages.
  5. 22 Jan, 2021 5 commits
  6. 21 Jan, 2021 6 commits
    • MDEV-24593 Signal 11 when group by primary key of table joined to information_schema.columns · 4e503aec
      Sergei Golubchik authored
      I_S tables were materialized too late; an attempt to use table
      statistics before the table was created caused a crash.
      
      Let's move table creation up. It only needs read_set to
      be calculated properly, which happens in JOIN::optimize_inner(),
      after semijoin transformation.
      
      Note that tables are not populated at that point, so most of the
      statistics would make no sense anyway. But at least field sizes
      will be correct. And it won't crash.
    • remove now-unused rdiff file · 61feb568
      Sergei Golubchik authored
    • MDEV-24452 ALTER TABLE event take infinite time which for example breaks mysql_upgrade · 6eb1eed5
      Monty authored
      The problem was that update_timing_fields_for_event() didn't release all
      MDL locks it took.
    • Karel Picman's avatar
      1936b3c8
    • MDEV-24596 : Assertion `state_ == s_exec || state_ == s_quitting' failed in... · be5fce16
      Jan Lindström authored
      MDEV-24596 : Assertion `state_ == s_exec || state_ == s_quitting' failed in wsrep::client_state::disable_streaming
      
      There were multiple problems here:
      * wsrep_trx_fragment_size should not be set when wsrep is disabled or the provider is not loaded
      * wsrep_trx_fragment_unit should not be set when wsrep is disabled or the provider is not loaded
      * wsrep_debug has no effect if wsrep is disabled or the provider is not loaded
      * wsrep_start_position should not be set to any value other than the default when wsrep is disabled or the provider is not loaded
      * wsrep_start_position should be changed only when we are a joiner or initialized
      * wsrep_start_position should be allowed to be set only to a value that exists, thus
      we need to add error handling to wsrep_sst_complete
    • MDEV-10271: add master host/port info to slave thread exit messages · fa14c423
      Hartmut Holzgraefe authored
      Sample log error message generated:
      
      mysql-test/var/log/mysqld.2.err:2021-01-21 13:02:30 8 [Note] Slave SQL thread exiting, replication stopped in log 'master-bin.000001' at position 329, master: 127.0.0.1:16000
      mysql-test/var/log/mysqld.2.err:2021-01-21 13:02:30 7 [Note] Slave I/O thread exiting, read up to log 'master-bin.000001', position 329, master 127.0.0.1:16000
      mysql-test/var/log/mysqld.2.err:2021-01-21 13:02:30 12 [Note] Slave SQL thread exiting, replication stopped in log 'master-bin.000001' at position 329; GTID position '', master: 127.0.0.1:16000
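
      For illustration, the text of the first sample message (without the log
      file prefix, timestamp and thread id) could be produced by a formatter
      like the one below. This is a sketch, not the server's code:
      sql_thread_exit_message() and its parameters are hypothetical.

      ```cpp
      #include <cassert>
      #include <cstdio>
      #include <string>

      // Hypothetical formatter that appends the master host and port to the
      // SQL thread exit note, following the layout of the first sample line.
      std::string sql_thread_exit_message(const std::string& log_name,
                                          unsigned long long position,
                                          const std::string& master_host,
                                          unsigned master_port) {
        char buf[512];
        std::snprintf(buf, sizeof buf,
                      "Slave SQL thread exiting, replication stopped in log '%s' "
                      "at position %llu, master: %s:%u",
                      log_name.c_str(), position, master_host.c_str(),
                      master_port);
        return buf;
      }
      ```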
      
      Reviewer: knielsen@knielsen-hq.org, Andrei and Sachin
  7. 20 Jan, 2021 1 commit
    • MDEV-21153 Replica nodes crash due to indexed virtual columns and FK cascading delete · 9377e9ba
      sjaakola authored
      The fix for MDEV-23033 addresses a problem in replication applying of transactions that contain a cascading foreign key delete for a table that has an indexed virtual column.
      That fix adds the slave_fk_event_map flag to the table, to mark when prelocking is needed for applying a transaction.
      See commit 608b0ee5 for more details.
      However, that fix targets async replication only: Rows_log_event::do_apply_event() has a condition that excludes Galera replication from the fix domain, so use cases suffering from MDEV-23033 and the related MDEV-21153 will fail in a Galera cluster.
      
      The fix in this commit removes the condition that excluded Galera replication from setting the slave_fk_event_map flag, and makes the fix for MDEV-23033 effective for Galera replication as well.
      
      However, the above fix has caused regressions in some galera_sr suite tests, which test streaming replication.
      This regression can be observed e.g. by: ./mtr galera_sr.galera_sr_multirow_rollback  --mysqld=--slave_run_triggers_for_rbr=yes
      These galera_sr suite tests were failing in the last phase of replication applying, where the actual transaction is already applied, and streaming replication related metadata needs to be updated in wsrep system tables.
      Opening the wsrep system tables failed because of corrupt data in THD::lex:query_tables_list. The fix in this commit backs up the query table list for the duration of the fragment update operation.
      
      Finally, an mtr test for virtual column support has been added. The first test in galera.galera_virtual_column.test is a scenario from MDEV-21153.
      
      new fix
      Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
  8. 19 Jan, 2021 3 commits
    • MDEV-21153 Replica nodes crash due to indexed virtual columns and FK cascading delete · 7d04ce6a
      sjaakola authored
      The fix for MDEV-23033 addresses a problem in replication applying of transactions that contain a cascading foreign key delete for a table that has an indexed virtual column.
      That fix adds the slave_fk_event_map flag to the table, to mark when prelocking is needed for applying a transaction.
      See commit 608b0ee5 for more details.
      However, that fix targets async replication only: Rows_log_event::do_apply_event() has a condition that excludes Galera replication from the fix domain, so use cases suffering from MDEV-23033 and the related MDEV-21153 will fail in a Galera cluster.
      
      The fix in this commit removes the condition that excluded Galera replication from setting the slave_fk_event_map flag, and makes the fix for MDEV-23033 effective for Galera replication as well.
      
      Finally, an mtr test for virtual column support has been added. The first test in galera.galera_virtual_column.test is a scenario from MDEV-21153.
      Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
    • MDEV-24577: Fix warnings generated during compilation of... · 8bcddb02
      Dmitry Shulga authored
      MDEV-24577: Fix warnings generated during compilation of plugin/auth_pam/testing/pam_mariadb_mtr.c on FreeBSD
      
      The compiler warnings generated when building MariaDB server on BSD have the
      same cause as when building on MacOS: both platforms use clang as the
      C/C++ compiler. So, fix the compiler warnings whenever the compiler
      is clang, regardless of the platform used for building.
      
      This is a follow-up patch for the following bug reports:
        MDEV-23564: CMAKE failing due to deprecated Apple GSS method
        MDEV-23935: Fix warnings generated during compilation of
                    plugin/auth_pam/testing/pam_mariadb_mtr.c on MacOS
    • Merge 10.2 into 10.3 · 049811ec
      Marko Mäkelä authored