1. 02 Feb, 2021 2 commits
  2. 01 Feb, 2021 3 commits
  3. 30 Jan, 2021 3 commits
  4. 29 Jan, 2021 5 commits
    • MDEV-24739: Assertion `root->weight >= ...' failed in SEL_ARG::tree_delete · 73c43ee9
      Sergei Petrunia authored
      Also update the SEL_ARG graph weight in:
      - sel_add()
      - SEL_ARG::clone()
      
      Make key_{and,or}_with_limit() also verify the weight of their arguments.
      (There is no single point at which to verify SEL_ARG graphs constructed
      from conditions that are not AND-OR formulas, so we hope that those are
      connected with AND/OR and verify them here.)
      73c43ee9
    • MDEV-24661: Remove the test innodb.innodb_wl6326_big · a70a47f2
      Marko Mäkelä authored
      The purpose of the test was to ensure that the SX (update) mode of
      index tree and buffer page latches is being used.
      
      The test has become unstable, possibly due to changes related to
      buf_pool.mutex and buf_pool.page_hash, or to the use of MDL in the
      purge of transaction history.
      
      In 10.6, the test depends on instrumentation that was refactored
      or removed in MDEV-24142.
      
      The use of different latching modes can better be indirectly observed
      through high-concurrency benchmarks. For MDEV-14637, a performance test
      was conducted where the finer-grained latching and
      BTR_CUR_FINE_HISTORY_LENGTH were removed. It caused a 20% performance
      regression for UPDATE and a somewhat smaller one for INSERT.
      
      Any new problem with latching granularity should be easily caught by
      performance testing, or by stress tests with Random Query Generator.
      a70a47f2
    • MDEV-24685 - remove IO thread states output from SHOW ENGINE INNODB STATUS · d8373fea
      Vladislav Vaintroub authored
      There are no IO threads anymore.
      d8373fea
    • MDEV-9750: Quick memory exhaustion with 'extended_keys=on' ... · c3672038
      Sergei Petrunia authored
      (Variant #5, full patch, for 10.5)
      
      Do not produce SEL_ARG graphs that would yield huge numbers of ranges.
      Introduce the concept of a SEL_ARG graph's "weight". If we are about to
      produce a graph whose "weight" exceeds the limit, remove the parts
      of the SEL_ARG graph that represent the biggest key parts. Do so until
      the graph's weight is within the limit.
      
      Includes:
      - Debug code to verify the SEL_ARG graph weight
      - A user-visible @@optimizer_max_sel_arg_weight to control the optimization
      - Logging of the optimization in the optimizer trace
      c3672038
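The pruning loop described in MDEV-9750 can be illustrated with a minimal sketch. This is not the code from the patch: `KeyPartNodes`, `graph_weight()` and `prune_to_limit()` are hypothetical names, the graph is flattened to one node count per key part, "weight" is assumed to be the total node count, and "biggest key part" is assumed to mean the highest-numbered one.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical, simplified model of a SEL_ARG graph: element i holds the
// number of range nodes contributed by key part i (0 = first key part).
using KeyPartNodes = std::vector<std::size_t>;

// Assumed weight definition for this sketch: the total number of nodes.
std::size_t graph_weight(const KeyPartNodes &g)
{
  return std::accumulate(g.begin(), g.end(), std::size_t{0});
}

// Drop the highest-numbered key part until the weight is within the limit.
// Returns the number of key parts that were removed.
std::size_t prune_to_limit(KeyPartNodes &g, std::size_t max_weight)
{
  std::size_t removed = 0;
  while (!g.empty() && graph_weight(g) > max_weight)
  {
    g.pop_back();
    ++removed;
  }
  return removed;
}
```

Dropping a trailing key part only makes the resulting range condition less selective, which is why a sketch like this can discard it without changing query results.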
    • MDEV-24721 galera.mysql-wsrep-bugs-607 test failure · a2eb974b
      sjaakola authored
      The implementation for MDEV-17048 appears to be a direct copy of the MySQL version.
      Group commit works differently in MariaDB, and the assert in wsrep_unregister_from_group_commit() is too strict.
      
      The reason is that in Wsrep_high_priority_service::log_dummy_write_set(), the transaction undergoes a full rollback:
          {
            cs.before_rollback();
            cs.after_rollback();
          }
      
      After that, the client's transaction state is set to wsrep::transaction::s_aborted.
      Execution then continues with:
      
      ...
       wsrep_register_for_group_commit(m_thd);
      ...
       wsrep_unregister_from_group_commit(m_thd);
      
      The bogus assert in wsrep_unregister_from_group_commit() allows only the transaction states s_ordered_commit and s_aborting.
      
      As the fix, I brought back the same assert that is present in the MariaDB 10.4 version.
      a2eb974b
  5. 28 Jan, 2021 6 commits
    • MDEV-24093: Detect during mysql_upgrade if type_mysql_json.so is needed and load it · 85130c5a
      Anel Husakovic authored
      a. Make `mariadb-upgrade` detect whether the `MYSQL_JSON` data type is needed.
      b. Install the data type if it is not installed.
      c. Uninstall the data type once finished.
      d. Create `.opt` and `.inc` files `have_type_mysql_json` and adapt the
      tests.
      
      Reviewed by: vicentiu@mariadb.org
      85130c5a
    • MDEV-24715 Assertion !node->table->skip_alter_undo in CREATE...REPLACE SELECT · c411393a
      Marko Mäkelä authored
      In commit 3cef4f8f (MDEV-515)
      we inadvertently broke CREATE TABLE...REPLACE SELECT statements
      by wrongly disabling row-level undo logging.
      
      select_create::prepare(): Only invoke extra(HA_EXTRA_BEGIN_ALTER_COPY)
      if no special treatment of duplicates is needed.
      c411393a
    • MDEV-24564 Statistics are lost after ALTER TABLE · 6d1f1b61
      Marko Mäkelä authored
      Ever since commit 007f68c3,
      ALTER TABLE no longer invokes handler::open() after
      handler::commit_inplace_alter_table().
      
      ha_innobase::reload_statistics(): Reload or recompute statistics
      after ALTER TABLE.
      
      innodb_notify_tabledef_changed(): A new function to invoke
      ha_innobase::reload_statistics().
      
      handlerton::notify_tabledef_changed(): Add the parameter handler*
      so that ha_innobase::reload_statistics() can be invoked.
      
      ha_partition::notify_tabledef_changed(),
      partition_notify_tabledef_changed(): Pass through the call
      to any partitions or subpartitions.
      
      This is based on code that was supplied by Monty.
      6d1f1b61
    • Vlad Lesin's avatar
      744e9752
    • MDEV-24695 Encryption modifies a freed page · 6e80a34d
      Thirunarayanan Balathandayuthapani authored
      During recovery, InnoDB fails if it tries to apply a FREE_PAGE
      and WRITE record to the page. The InnoDB encryption thread accesses
      the freed page and writes redo log for it.
      
      This is similar to commit deadec4e (MDEV-24569): InnoDB is missing
      a call to buf_page_free() while freeing the segment.
      To avoid accessing a freed page in the buffer pool, InnoDB should
      mark the pages as FREED while freeing the segment. Also, to
      avoid reading a freed page, InnoDB should check the
      allocation bitmap page.
      
      fseg_free_step(): Mark the page in buffer pool as FREED
      
      fseg_free_step_not_header(): Mark the page in buffer pool as FREED
      
      buf_dump(): Ignore the freed pages while dumping the buffer pool content
      
      fil_crypt_get_page_throttle_func(): Skip the rotation for FREED page
      to avoid the assert failure during recovery
      
      fil_crypt_rotate_page(): Skip the rotation for the freed page
      
      Reviewed-by: Marko Mäkelä
      6e80a34d
    • Marko Mäkelä's avatar
      c6308355
  6. 27 Jan, 2021 17 commits
    • Cleanup: Remove many C-style lock_get_ accessors · 68b28193
      Marko Mäkelä authored
      Let us prefer member functions to the old C-style accessor functions.
      Also, prefer bitwise AND operations for checking multiple flags.
      68b28193
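As a small illustration of the two preferences stated in this commit (member functions over C-style accessors, and a single bitwise AND to test several flags), here is a sketch. The flag constants and `lock_sketch` are hypothetical and do not reproduce the InnoDB definitions.

```cpp
#include <cstdint>

// Hypothetical flag values for illustration; not the InnoDB constants.
constexpr std::uint32_t FLAG_WAIT = 1U << 0;
constexpr std::uint32_t FLAG_GAP = 1U << 1;
constexpr std::uint32_t FLAG_INSERT_INTENTION = 1U << 2;

struct lock_sketch
{
  std::uint32_t type_mode;

  // Member function instead of a free-standing C-style accessor.
  bool is_waiting() const { return type_mode & FLAG_WAIT; }

  // One bitwise AND against a combined mask tests both flags at once,
  // instead of two separate comparisons joined with ||.
  bool is_gap_or_insert_intention() const
  {
    return type_mode & (FLAG_GAP | FLAG_INSERT_INTENTION);
  }
};
```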
    • Cleanup: Remove lock_get_size() · cbb0a60c
      Marko Mäkelä authored
      cbb0a60c
    • MDEV-24693 LeakSanitizer: detected memory leaks in mem_heap_create_block_func... · 700ae20d
      Thirunarayanan Balathandayuthapani authored
      MDEV-24693 LeakSanitizer: detected memory leaks in mem_heap_create_block_func / fts_optimize_create_msg
      
      - This issue is caused by commit bf1f9b59 (MDEV-24638). Delay the
      creation of the SYNC message in fts_optimize_request_sync_table(), so
      that InnoDB can avoid creating the message if the table already has
      a SYNC message in the fts_optimize_wq queue.
      700ae20d
    • MDEV-24700 Assertion "lock not found"==0 in lock_table_x_unlock() · 5dd028f8
      Marko Mäkelä authored
      After an ignored INSERT IGNORE statement into an empty table, we would
      wrongly use the MDEV-515 table-level undo logging for a subsequent
      REPLACE statement.
      
      ha_innobase::reset_template(): Clear m_prebuilt->ins_node->bulk_insert
      on every statement boundary.
      
      ha_innobase::start_stmt(): Invoke end_bulk_insert().
      
      ha_innobase::extra(): Avoid accessing m_prebuilt->trx. Do not call
      thd_to_trx(). Invoke end_bulk_insert() and try to reset bulk_insert
      when changing the REPLACE or IGNORE settings.
      
      trx_mod_table_time_t::WAS_BULK: Use a distinct value from BULK.
      
      trx_undo_report_row_operation(): Add debug assertions.
      
      Note: Some calls to end_bulk_insert() may be redundant, but statement
      boundaries are not always clear in the API (especially in the
      presence of LOCK TABLES or stored procedures).
      5dd028f8
    • MDEV-20612: Speed up lock_table_other_has_incompatible() · 121d0f7f
      Marko Mäkelä authored
      dict_table_t::n_lock_x_or_s: Keep track of LOCK_S or LOCK_X on the table.
      
      lock_table_other_has_incompatible(): In the likely case that no
      transaction is waiting for or holding LOCK_S or LOCK_X on the table,
      return early: conflicts cannot exist.
      
      This is based on the idea of Zhai Weixiang, who reported MySQL Bug #72948.
      
      lock_table_has_to_wait_in_queue(), lock_table_dequeue():
      Extend the optimization, inspired by
      mysql/mysql-server@bb7191d6cbe47e15923143e194c03406cff9024b
      by Jakub Łopuszański.
      121d0f7f
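The early return described in MDEV-20612 can be sketched as follows. All types and function names here are hypothetical simplifications. Limiting the fast path to intention-lock requests (IS, IX) is an assumption of this sketch, based on the standard table-lock compatibility matrix in which intention locks conflict only with S or X.

```cpp
#include <cstdint>
#include <vector>

enum class Mode { IS, IX, S, X };

struct table_lock_sketch { std::uint64_t trx_id; Mode mode; };

struct locked_table_sketch
{
  std::vector<table_lock_sketch> locks;
  std::uint32_t n_lock_x_or_s = 0;  // count of S or X requests on the table

  void add_lock(std::uint64_t trx_id, Mode mode)
  {
    locks.push_back({trx_id, mode});
    if (mode == Mode::S || mode == Mode::X)
      ++n_lock_x_or_s;
  }
};

// Standard multi-granularity compatibility matrix for IS, IX, S, X.
bool compatible(Mode a, Mode b)
{
  if (a == Mode::X || b == Mode::X) return false;
  if (a == Mode::S) return b == Mode::IS || b == Mode::S;
  if (b == Mode::S) return a == Mode::IS;
  return true;  // both are intention locks
}

bool other_has_incompatible(const locked_table_sketch &t,
                            std::uint64_t trx_id, Mode mode)
{
  // Fast path: an intention lock can only conflict with S or X, so if
  // nobody holds or waits for S or X, skip the list traversal entirely.
  if ((mode == Mode::IS || mode == Mode::IX) && t.n_lock_x_or_s == 0)
    return false;
  for (const table_lock_sketch &l : t.locks)
    if (l.trx_id != trx_id && !compatible(l.mode, mode))
      return true;
  return false;
}
```

The counter turns the common case (many transactions holding only IX or IS on the same table) into a constant-time check instead of a walk over every table lock.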
    • Marko Mäkelä's avatar
    • Marko Mäkelä's avatar
      b32f057d
    • Marko Mäkelä's avatar
      462cb666
    • MDEV-24671: Replace lock_wait_timeout_task with mysql_cond_timedwait() · e71e6133
      Marko Mäkelä authored
      lock_wait(): Replaces lock_wait_suspend_thread(). Wait for the lock to
      be granted or the transaction to be killed using mysql_cond_timedwait()
      or mysql_cond_wait().
      
      lock_wait_end(): Replaces que_thr_end_lock_wait() and
      lock_wait_release_thread_if_suspended().
      
      lock_wait_timeout_task: Remove. The operating system kernel will
      resume the mysql_cond_timedwait() in lock_wait(). An added benefit
      is that innodb_lock_wait_timeout no longer has a 'jitter' of 1 second,
      which was caused by this wake-up task waking up only once per second,
      and then waking up any threads for which the timeout (which was only
      measured in seconds) was exceeded.
      
      innobase_kill_query(): Set trx->error_state=DB_INTERRUPTED,
      so that a call trx_is_interrupted(trx) in lock_wait() can be avoided.
      
      We will protect things more consistently with lock_sys.wait_mutex,
      which will be moved below lock_sys.mutex in the latching order.
      
      trx_lock_t::cond: Condition variable for !wait_lock, used with
      lock_sys.wait_mutex.
      
      srv_slot_t: Remove. Replaced by trx_lock_t::cond.
      
      lock_grant_after_reset(): Merged into lock_grant().
      
      lock_rec_get_index_name(): Remove.
      
      lock_sys_t: Introduce wait_pending, wait_count, wait_time, wait_time_max
      that are protected by wait_mutex.
      
      trx_lock_t::que_state: Remove.
      
      que_thr_state_t: Remove QUE_THR_COMMAND_WAIT, QUE_THR_LOCK_WAIT.
      
      que_thr_t: Remove is_active, start_running(), stop_no_error().
      
      que_fork_t::n_active_thrs, trx_lock_t::n_active_thrs: Remove.
      e71e6133
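The idea behind MDEV-24671, letting each waiter sleep on its own condition variable with a timeout instead of relying on a once-per-second wake-up task, can be sketched with the C++ standard library. This is only an analogy: the commit uses mysql_cond_timedwait(), while this sketch substitutes std::condition_variable, and `wait_sketch` and its members are hypothetical.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

struct wait_sketch
{
  std::mutex wait_mutex;         // stands in for lock_sys.wait_mutex
  std::condition_variable cond;  // stands in for trx_lock_t::cond
  bool waiting = true;           // true while the lock has not been granted

  // Block until the lock is granted or the timeout expires.
  // Returns true if granted, false on timeout.
  bool lock_wait(std::chrono::milliseconds timeout)
  {
    std::unique_lock<std::mutex> lk(wait_mutex);
    return cond.wait_for(lk, timeout, [this] { return !waiting; });
  }

  // Called by the thread that grants the lock: clear the flag under the
  // mutex, then wake the waiter.
  void lock_wait_end()
  {
    {
      std::lock_guard<std::mutex> lk(wait_mutex);
      waiting = false;
    }
    cond.notify_one();
  }
};
```

Because the timeout is enforced by the wait call itself, the waiter resumes when its own deadline passes rather than at the next tick of a periodic task, which is where the one-second jitter came from.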
    • Cleanups: · 7f1ab8f7
      Marko Mäkelä authored
      que_thr_t::fork_type: Remove.
      
      QUE_THR_SUSPENDED, TRX_QUE_COMMITTING: Remove.
      
      Cleanup lock_cancel_waiting_and_release()
      7f1ab8f7
    • Marko Mäkelä's avatar
      ff3f07ce
    • Cleanup the lock creation · 898dcf93
      Marko Mäkelä authored
      LOCK_MAX_N_STEPS_IN_DEADLOCK_CHECK, LOCK_MAX_DEPTH_IN_DEADLOCK_CHECK,
      LOCK_RELEASE_INTERVAL: Replace with the bare use of the constants.
      
      lock_rec_create_low(): Remove LOCK_PAGE_BITMAP_MARGIN altogether.
      We already have REDZONE_SIZE as a 'safety margin' in AddressSanitizer
      builds, to catch any out-of-bounds access.
      
      lock_prdt_add_to_queue(): Avoid a useless search when enqueueing
      a waiting lock request.
      
      lock_prdt_lock(): Reduce the size of the trx->mutex critical section.
      898dcf93
    • Cleanup: Remove trx_get_id_for_print() · 469da6c3
      Marko Mäkelä authored
      Any transaction that has requested a lock must have trx->id!=0.
      
      trx_print_low(): Distinguish non-locking or inactive transaction
      objects by displaying the pointer in parentheses.
      
      fill_trx_row(): Do not try to map trx->id to a pointer-based value.
      469da6c3
    • MDEV-23959 GSSAPI plugin - support AD or local group name, and SIDs on Windows · 7ebabea5
      Vladislav Vaintroub authored
      Support membership tests in SSPI with a special prefix form:
      
      CREATE USER u IDENTIFIED WITH gssapi AS "GROUP:<group_name>"
      or
      CREATE USER u IDENTIFIED WITH gssapi AS "SID:<sid>"
      
      If a user is created in one of the above forms, then after a successful
      SSPI handshake the following happens:
      
      1) If the "GROUP:" prefix is used, <group_name> is translated to a SID
      using the LookupAccountName() API.
      
      2) The SSPI user is checked for SID membership with the
      ImpersonateSecurityContext() and CheckMembership() APIs.
      
      Note that <group_name>/<sid> does not strictly need to refer to an
      actual group.
      An identity test is also supported, e.g. "GROUP:<users_name>" or
      "SID:<user_sid>" will work too.
      
      Well-known SIDs (in SDDL syntax) appear to be supported, for example
      "SID:WD" refers to World/Everyone (== "SID:S-1-1-0")
      and
      "SID:BA" refers to Administrators (== "SID:S-1-5-32-544").
      
      In UAC environments, successful checks against the Administrators group
      might require elevation (Run As Administrator), since CheckMembership()
      needs groups to be marked as enabled in the token group list.
      7ebabea5
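The prefix convention from MDEV-23959 can be illustrated with a small parser. This is a sketch, not the plugin's code: `principal_kind`, `parsed_principal` and `parse_principal()` are hypothetical, case-sensitive prefix matching is an assumption, and the Windows calls that translate names to SIDs and check membership are deliberately left out.

```cpp
#include <string>

enum class principal_kind { NAME, GROUP, SID };

struct parsed_principal
{
  principal_kind kind;
  std::string value;  // the text after the prefix, or the whole string
};

// Classify the string given after AS in CREATE USER ... IDENTIFIED WITH gssapi.
parsed_principal parse_principal(const std::string &s)
{
  static const std::string group_prefix = "GROUP:";
  static const std::string sid_prefix = "SID:";
  if (s.compare(0, group_prefix.size(), group_prefix) == 0)
    return {principal_kind::GROUP, s.substr(group_prefix.size())};
  if (s.compare(0, sid_prefix.size(), sid_prefix) == 0)
    return {principal_kind::SID, s.substr(sid_prefix.size())};
  // No recognized prefix: treat the string as an ordinary principal name.
  return {principal_kind::NAME, s};
}
```

In this sketch a GROUP value would still have to be resolved to a SID before the membership check, matching step 1) of the commit message.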
    • MDEV-24685 - remove IO thread states output from SHOW ENGINE INNODB STATUS · c310f4c3
      Vladislav Vaintroub authored
      There are no IO threads anymore.
      c310f4c3
    • Update disabled.def in suites · 30379d90
      Jan Lindström authored
      * galera
      * galera_sr
      * galera_3nodes
      30379d90
    • Jan Lindström's avatar
  7. 26 Jan, 2021 2 commits
    • Roman Nozdrin's avatar
      0565d199
    • MDEV-20008: Galera strict mode · 95a2bca0
      mkaruza authored
      Added a new enum variable `wsrep_mode`, which can be used to turn on WSREP
      features that are not part of the default behaviour.
      Added the enum values `BINLOG_ROW_FORMAT_ONLY`, `REQUIRED_PRIMARY_KEY` and
      `STRICT_REPLICATION`. `wsrep-mode=STRICT_REPLICATION` behaves
      like the variable `wsrep_strict_ddl`.
      
      The variable wsrep_strict_ddl is deprecated; if it is set, the
      new wsrep_mode setting is used instead.
      
      Reviewed and improved by: Jan Lindström <jan.lindstrom@mariadb.com>
      95a2bca0
  8. 25 Jan, 2021 2 commits
    • Marko Mäkelä's avatar
    • MDEV-515 Reduce InnoDB undo logging for insert into empty table · 3cef4f8f
      Marko Mäkelä authored
      We implement an idea that was suggested by Michael 'Monty' Widenius
      in October 2017: When InnoDB is inserting into an empty table or partition,
      we can write a single undo log record TRX_UNDO_EMPTY, which will cause
      ROLLBACK to clear the table.
      
      For this to work, the insert into an empty table or partition must be
      covered by an exclusive table lock that will be held until the transaction
      has been committed or rolled back, or the INSERT operation has been
      rolled back (and the table is empty again), in lock_table_x_unlock().
      
      Clustered index records that are covered by the TRX_UNDO_EMPTY record
      will carry DB_TRX_ID=0 and DB_ROLL_PTR=1<<55, and thus they cannot
      be distinguished from what MDEV-12288 leaves behind after purging the
      history of row-logged operations.
      
      Concurrent non-locking reads must be adjusted: If the read view was
      created before the INSERT into an empty table, then we must continue
      to imagine that the table is empty, and not try to read any records.
      If the read view was created after the INSERT was committed, then
      all records must be visible normally. To implement this, we introduce
      the field dict_table_t::bulk_trx_id.
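The read-view rule in the paragraph above can be sketched as follows. This is a simplified model, not InnoDB code: `read_view_sketch`, `bulk_table_sketch` and `must_treat_as_empty()` are hypothetical, and the read view is reduced to an upper ID limit plus a list of transaction IDs that were still active when the snapshot was taken.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using trx_id_sketch = std::uint64_t;

// Simplified read view: a snapshot of which transactions are visible.
struct read_view_sketch
{
  trx_id_sketch low_limit_id;         // IDs >= this began after the snapshot
  std::vector<trx_id_sketch> active;  // IDs uncommitted at snapshot time

  bool changes_visible(trx_id_sketch id) const
  {
    if (id >= low_limit_id)
      return false;
    return std::find(active.begin(), active.end(), id) == active.end();
  }
};

struct bulk_table_sketch
{
  // ID of the latest transaction that inserted into the empty table,
  // or 0 if no such insert has happened (assumption of this sketch).
  trx_id_sketch bulk_trx_id = 0;
};

// True if a non-locking read must behave as if the table were empty,
// because the bulk-inserting transaction is not visible in this view.
bool must_treat_as_empty(const read_view_sketch &view,
                         const bulk_table_sketch &t)
{
  return t.bulk_trx_id != 0 && !view.changes_visible(t.bulk_trx_id);
}
```

A view taken before the insert, or while the inserting transaction was still active, reports an empty table; a view taken after the commit reads the records normally.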
      
      This special handling only applies to the very first INSERT statement
      of a transaction for the empty table or partition. If a subsequent
      statement in the transaction is modifying the initially empty table again,
      we must enable row-level undo logging, so that we will be able to
      roll back to the start of the statement in case of an error (such as
      duplicate key).
      
      INSERT IGNORE will continue to use row-level logging and locking, because
      supporting it would require the ability to roll back only the latest row.
      Since the undo log that we write only allows us to roll back the entire
      statement, we cannot support INSERT IGNORE. We will introduce a
      handler::extra() parameter HA_EXTRA_IGNORE_INSERT to indicate to storage
      engines that INSERT IGNORE is being executed.
      
      In many test cases, we add an extra record to the table, so that during
      the 'interesting' part of the test, row-level locking and logging will
      be used.
      
      Replicas will continue to use row-level logging and locking until
      MDEV-24622 has been addressed. Likewise, this optimization will be
      disabled in Galera cluster until MDEV-24623 enables it.
      
      dict_table_t::bulk_trx_id: The latest active or committed transaction
      that initiated an insert into an empty table or partition.
      Protected by exclusive table lock and a clustered index leaf page latch.
      
      ins_node_t::bulk_insert: Whether bulk insert was initiated.
      
      trx_t::mod_tables: Use C++11 style accessors (emplace instead of insert).
      Unlike earlier, this collection will cover also temporary tables.
      
      trx_mod_table_time_t: Add start_bulk_insert(), end_bulk_insert(),
      is_bulk_insert(), was_bulk_insert().
      
      trx_undo_report_row_operation(): Before accessing any undo log pages,
      invoke trx->mod_tables.emplace() in order to determine whether undo
      logging was disabled, or whether this is the first INSERT and we are
      supposed to write a TRX_UNDO_EMPTY record.
      
      row_ins_clust_index_entry_low(): If we are inserting into an empty
      clustered index leaf page, set the ins_node_t::bulk_insert flag for
      the subsequent trx_undo_report_row_operation() call.
      
      lock_rec_insert_check_and_lock(), lock_prdt_insert_check_and_lock():
      Remove the redundant parameter 'flags' that can be checked in the caller.
      
      btr_cur_ins_lock_and_undo(): Simplify the logic. Correctly write
      DB_TRX_ID,DB_ROLL_PTR after invoking trx_undo_report_row_operation().
      
      trx_mark_sql_stat_end(), ha_innobase::extra(HA_EXTRA_IGNORE_INSERT),
      ha_innobase::external_lock(): Invoke trx_t::end_bulk_insert() so that
      the next statement will not be covered by table-level undo logging.
      
      ReadView::changes_visible(trx_id_t) const: New accessor for the case
      where the trx_id_t is not read from a potentially corrupted index page
      but directly from the memory. In this case, we can skip a sanity check.
      
      row_sel(), row_sel_try_search_shortcut(), row_search_mvcc(),
      row_sel_try_search_shortcut_for_mysql(),
      row_merge_read_clustered_index(): Check dict_table_t::bulk_trx_id.
      
      row_sel_clust_sees(): Replaces lock_clust_rec_cons_read_sees().
      
      lock_sec_rec_cons_read_sees(): Replaced with lower-level code.
      
      btr_root_page_init(): Refactored from btr_create().
      
      dict_index_t::clear(), dict_table_t::clear(): Empty an index or table,
      for the ROLLBACK of an INSERT operation.
      
      ROW_T_EMPTY, ROW_OP_EMPTY: Note a concurrent ROLLBACK of an INSERT
      into an empty table.
      
      This is joint work with Thirunarayanan Balathandayuthapani,
      who created a working prototype.
      Thanks to Matthias Leich for extensive testing.
      3cef4f8f