1. 02 Feb, 2023 7 commits
    • Sergei Petrunia authored
      d9d9c90a
    • Fix compile on Windows · 2a79abcd
      Sergei Petrunia authored
    • Update cost for hash and cached joins · 95698097
      Monty authored
      The old code did not correctly add TIME_FOR_COMPARE to rows that are
      part of the scan that will be compared with the attached where clause.
      
      The cost calculations for hash join and full join cache join are now
      identical, except for HASH_FANOUT (10%).
      
      The cost for a join with keys is now also uniform.
      The total cost of using a key for lookup is calculated in one place as:
      
      (cost_of_finding_rows_through_key(records) + records/TIME_FOR_COMPARE)*
      record_count_of_previous_row_combinations + startup_cost
      
      startup_cost is the cost of creating a temporary table (if needed).
      
      Best_cost now includes the cost of comparing all WHERE clauses and also
      cost of joining with previous row combinations.
      
      Other things:
      - Optimizer trace is now printing the total costs, including testing the
        WHERE clause (TIME_FOR_COMPARE) and comparing with all previous rows.
      - In optimizer trace, include also total cost of query together with the
        final join order. This makes it easier to find out where the cost was
        calculated.
      - Old code used filter even if the cost for it was higher than not using a
        filter. This is not corrected.
      - When rebasing on 10.11, I noticed some changes to the access_cost_factor
        calculation. These changes were not picked, as the coming changes
        to filtering will make that code obsolete.
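The key lookup formula quoted in this commit message can be written out as a small function. This is only a sketch of the arithmetic: the struct, the function name and the idea of passing TIME_FOR_COMPARE as a parameter are assumptions made for illustration, not the server's actual code.

```cpp
// Illustrative only: mirrors the formula from the commit message.
// TIME_FOR_COMPARE is passed in as a parameter, so no particular value is assumed.
struct KeyLookupInput {
  double cost_of_finding_rows_through_key;           // cost of one lookup that finds `records` rows
  double records;                                    // rows expected per lookup
  double record_count_of_previous_row_combinations;  // how many times the lookup is repeated
  double startup_cost;                               // e.g. creating a temporary table; 0 if not needed
};

inline double key_lookup_total_cost(const KeyLookupInput &in,
                                    double time_for_compare) {
  // Cost for one previous row combination: find the rows, then test the WHERE clause.
  const double per_combination =
      in.cost_of_finding_rows_through_key + in.records / time_for_compare;
  // Repeat for every previous row combination and add the one-time startup cost.
  return per_combination * in.record_count_of_previous_row_combinations +
         in.startup_cost;
}
```

With a lookup cost of 2, 10 records, 4 previous row combinations, a startup cost of 1 and a compare divisor of 5, the result is (2 + 10/5) * 4 + 1 = 17.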
    • Adjust costs for doing index scan in cost_group_min_max() · 6fa74517
      Monty authored
      The idea is that when doing a tree dive (once per group), we need to
      compare key values, which is fast.  For each new group, we have to
      compare the full where clause for the row.
      Compared to the original code, the cost of group_min_max() has slightly
      increased, which affects some tests with only a few rows.
      main.group_min_max and main.distinct have been modified to show the
      effect of the change.
      
      The patch also adjusts the number of groups in case of quick selects:
      - For simple WHERE clauses, ensure that we have at least as many groups
        as we have conditions on the used group-by key parts.
        The assumption is that each condition will create at least one group.
      - Ensure that there are no more groups than rows found by quick_select
      
      Test changes:
      - For some small tables there has been a change of
        Using index for group-by -> Using index for group-by (scanning)
        Range -> Index and Using index for group-by -> Using index
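The two bounds on the number of groups described in this commit can be sketched as a clamp. The function and parameter names are hypothetical, and applying the upper bound last (so that it wins if the two bounds conflict) is an assumption; the commit message does not say which bound takes precedence.

```cpp
#include <algorithm>

// Hypothetical helper, not the server's code: bound an estimated group count.
inline double adjust_group_estimate(double estimated_groups,
                                    double conditions_on_group_key_parts,
                                    double quick_select_rows) {
  // Each condition on a used group-by key part is assumed to create
  // at least one group.
  double groups = std::max(estimated_groups, conditions_on_group_key_parts);
  // There cannot be more groups than rows found by the quick select.
  return std::min(groups, quick_select_rows);
}
```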
    • Return >= 1 from matching_candidates_in_table if records > 0.0 · bc9805e9
      Monty authored
      Having rows >= 1.0 helps ensure that when we calculate total rows of joins
      the number of resulting rows will not be less after the join.
      
      Changes in test cases:
      - Join order change for some tables with few records
      - 'Filtered' is much higher for tables with few rows, as 1 row is a high
        percentage of a table with few rows.
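The effect of the lower bound can be illustrated with a toy estimate. The function below is a hypothetical stand-in, assuming the estimate is simply records multiplied by a selectivity; it is not the body of matching_candidates_in_table().

```cpp
// Hypothetical stand-in for a row estimate with a lower bound of one row.
inline double estimated_matching_rows(double records, double selectivity) {
  if (records <= 0.0)
    return 0.0;  // an empty table still yields no rows
  const double rows = records * selectivity;
  // A non-empty table never contributes less than one row, so multiplying
  // estimates across a join cannot shrink the running row count.
  return rows < 1.0 ? 1.0 : rows;
}
```

For a table with 4 rows and a selectivity of 0.1 the estimate becomes 1 row, which is why 'filtered' rises to 25% for such a table.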
    • Update matching_candidates_in_table() to treat all conditions similar · b6714489
      Monty authored
      Also fixed the 'with_found_constraint' parameter to
      matching_candidates_in_table() so that it behaves as documented: it is
      now true only if there is a reference to a previous table in the WHERE
      condition for the currently examined table.
      
      Changes in test results:
      - Filtered was 25% smaller for some queries (expected).
      - Some join order changed (probably because the tables had very few rows).
      - Some more table scans, probably because there would be fewer returned
        rows.
      - Some tests expose a bug: if there are more filtered rows, the cost of
        a table scan will be higher. This will be fixed in a later commit.
    • Fix calculation of selectivity · dc2f0d13
      Monty authored
      calculate_cond_selectivity_for_table() is largely rewritten:
      - Process keys in the order of rows found, smaller ranges first. If two
        ranges have an equal number of rows, use the one with more key parts.
        This helps us to mark more used fields to not be used for further
        selectivity calculations. See cmp_quick_ranges().
      - Ignore keys with fields that were used by previous keys
      - Don't use rec_per_key[] to calculate selectivity for smaller
        secondary key parts.  This does not work as rec_per_key[] value
        is calculated in the context of the previous key parts, not for the
        key part itself. The one exception is if the previous key parts
        are all constants.
      
      Other things:
      - Ensure that select->cond_selectivity is always between 0 and 1.
      - Ensure that select->opt_range_condition_rows is never updated to
        a higher value. It is initially set to the number of rows in table.
      - We now store in table->opt_range_condition_rows the lowest number of
        rows that any row-read-method has found so far. Before it was only done
        for QUICK_SELECT_I::QS_TYPE_ROR_UNION and
        QUICK_SELECT_I::QS_TYPE_INDEX_MERGE.
        Now it is done for a lot more methods. See
        calculate_cond_selectivity_for_table() for details.
      - Calculate and use selectivity for the first key part of a multiple key
        part if the first key part is a constant.
        Example: WHERE key1_part1=5 and key2_part1=5. If key1 is used, we can
        still use the selectivity for key2.
      
      Changes in test results:
      - 'filtered' is slightly changed, usually to something slightly smaller.
      - A few cases where for group by queries the table order changed. This was
        because the number of resulting rows from a group by query with MIN/MAX
        is now set to be smaller.
      - A few indexes were changed, as we now prefer the index with more key
        parts if the number of resulting rows is the same.
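The processing order described for the ranges (fewest rows first, ties broken in favour of more key parts) and the bound on selectivity can be sketched as follows. The struct and function names are hypothetical; the real comparator is cmp_quick_ranges(), whose exact signature is not shown in this log.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical summary of one range estimate for a key.
struct RangeEstimate {
  unsigned key_no;     // index number
  double rows;         // rows the range is expected to return
  unsigned key_parts;  // number of key parts used by the range
};

// Smaller ranges first; on equal rows prefer the range with more key parts.
inline bool range_before(const RangeEstimate &a, const RangeEstimate &b) {
  if (a.rows != b.rows)
    return a.rows < b.rows;
  return a.key_parts > b.key_parts;
}

inline std::vector<RangeEstimate> order_ranges(std::vector<RangeEstimate> ranges) {
  std::stable_sort(ranges.begin(), ranges.end(), range_before);
  return ranges;
}

// Keep a computed selectivity inside [0, 1].
inline double clamp_selectivity(double s) {
  return std::min(1.0, std::max(0.0, s));
}
```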
  2. 30 Jan, 2023 11 commits
    • Fixed bug in SQL_SELECT_LIMIT · 7d0bef6c
      Monty authored
      We were comparing costs when we should have been comparing the number of
      rows that will be examined.
    • Simple optimization to speed up some handler functions when checking killed · fc0c157a
      Monty authored
      - Avoid checking for has_transactions if killed flag is not checked
      - Simplify code (checked with gcc -O3 that there are improvements)
      - Added handler::fast_increment_statistics() to be used when a handler
        function wants to increase two statistics for one row access.
      - Made check_limit_rows_examined() inline (even if it didn't make any
        difference for gcc 7.5.0); it is still the right thing to do.
    • Adjusted Range_rowid_filter_cost_info lookup cost slightly. · 07b0d1a3
      Monty authored
      If the array size would be 1, the cost would be 0 which is wrong.
      Fixed by adding a small (0.001) base value to the lookup cost.
      
      This causes no changes in any result files.
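A cost that is 0 for an array of one element is what a logarithmic lookup cost produces, since log2(1) = 0. The sketch below assumes such a logarithmic model purely for illustration; the function name and the cost_per_compare parameter are hypothetical, and only the 0.001 base value comes from the commit message.

```cpp
#include <cmath>

// Illustrative lookup cost for a sorted array of `elements` row ids,
// assuming a binary-search style (logarithmic) cost model.
inline double filter_lookup_cost(double elements, double cost_per_compare) {
  const double base_lookup_cost = 0.001;  // small base value from the commit message
  // Without the base value, elements == 1 would give log2(1) * c == 0.
  return base_lookup_cost + std::log2(elements) * cost_per_compare;
}
```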
    • Vicențiu Ciorbaru authored
      987fcf91
    • Change class variable names in rowid_filter to longer, more clear names · bcd5454b
      Monty authored
      No code logic changes were made.
      
      a     -> gain
      b     -> cost_of_building_range_filter
      a_adj -> gain_adj
      r     -> row_combinations
      
      Other things:
      - Optimized the layout of class Range_rowid_filter_cost_info.
        One effect was that I moved key_no to the private section to get
        better alignment and had to introduce a get_key_no() function.
      - Indentation changes in rowid_filter.cc to avoid long rows.
    • Updated convert-debug-for-diff · 2cc5750c
      Monty authored
    • Optimizer code cleanups, no logic changes · 4062fc28
      Monty authored
      - Updated comments
      - Added some extra DEBUG
      - Indentation changes and break long lines
      - Trivial code changes like:
        - Combining 2 statements in one
        - Reorder DBUG lines
        - Use a variable to store a pointer that is used multiple times
      - Moved declaration of variables to start of loop/function
      - Removed dead or commented code
      - Removed wrong DBUG_EXECUTE code in best_extension_by_limited_search()
    • Limit calculated rows to the number of rows in the table · 87d4d723
      Monty authored
      The result file changes are mainly that number of rows is one smaller
      for some queries with DISTINCT or GROUP BY
    • Ensure that test_quick_select doesn't return more rows than in the table · c443dbff
      Monty authored
      Other changes:
      - In test_quick_select(), assume that if table->used_stats_records is 0
        then the table has 0 rows.
      - Fixed prepare_simple_select() to populate table->used_stat_records
      - Ensure that set_statistics_for_tables() doesn't cause used_stats_records
        to be 0 when using stat_tables.
      - To get blackhole to work with replication, set stats.records to 2 so
        that test_quick_select() doesn't assume the table is empty.
    • MDEV-14907 FEDERATEDX doesn't respect DISTINCT · 8b977a6c
      Monty authored
      This is a minor cleanup of the original commit
    • Improve comments in the optimizer · 9d0fbcc4
      Monty authored
  3. 25 Jan, 2023 3 commits
  4. 24 Jan, 2023 14 commits
    • MDEV-26790 InnoDB read-ahead may cause page writes · a30d4250
      Marko Mäkelä authored
      buf_LRU_get_free_block(): Replace the Boolean parameter with a
      ternary parameter, so that have_no_mutex_soft can be specified to
      reduce the chances of initiating page eviction flushing in read-ahead.
      
      buf_read_acquire(): Invoke buf_LRU_get_free_block(have_no_mutex_soft)
      and check in each caller for a nullptr return value.
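The idea of a "soft" allocation mode that returns a null pointer instead of forcing eviction can be sketched with a toy pool. Everything here (the enum values, TinyPool, read_ahead_one) is hypothetical and greatly simplified; it only illustrates why each caller has to check for nullptr.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical three-valued mode replacing a former bool parameter.
enum class FreeBlockMode { HaveNoMutex, HaveMutex, HaveNoMutexSoft };

struct Block { bool in_use = false; };

class TinyPool {
 public:
  explicit TinyPool(std::size_t n) : blocks_(n) {}

  // Returns a free block. In the soft mode, gives up with nullptr rather
  // than evicting a block that is in use.
  Block *get_free_block(FreeBlockMode mode) {
    for (Block &b : blocks_)
      if (!b.in_use) { b.in_use = true; return &b; }
    if (mode == FreeBlockMode::HaveNoMutexSoft || blocks_.empty())
      return nullptr;
    ++evictions;              // the other modes force an eviction (simulated)
    return &blocks_.front();
  }

  int evictions = 0;

 private:
  std::vector<Block> blocks_;
};

// A read-ahead style caller: skip the optional work if no block is free.
inline bool read_ahead_one(TinyPool &pool) {
  Block *block = pool.get_free_block(FreeBlockMode::HaveNoMutexSoft);
  if (block == nullptr)
    return false;  // read-ahead is only an optimization, so just skip it
  return true;
}
```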
    • MDEV-30216 Read-ahead unnecessarily allocates and frees pages when a page is in the buffer pool · d6aed216
      Marko Mäkelä authored
      buf_pool_t::page_hash_contains(): Check if a page is cached.
      
      buf_read_ahead_random(), buf_read_page_background(),
      buf_read_ahead_linear(): Before invoking buf_read_page_low(),
      preallocate a buffer page for the read request.
      
      buf_read_page(), buf_page_init_for_read(), buf_read_page_low():
      Add a parameter for the buf_pool.page_hash chain, to avoid duplicated
      computations.
      
      buf_page_t::read_complete(): Only attempt recovery if an uncompressed
      page frame has been allocated.
      
      buf_page_init_for_read(): Before trying to acquire buf_pool.mutex, acquire
      an exclusive buf_pool.page_hash latch and check if the page is already
      located in the buffer pool. If the buf_pool.mutex is not immediately
      available, release both latches and acquire them in the correct order,
      and then recheck if the page is already in the buffer pool. This should
      hopefully reduce some contention on buf_pool.mutex.
      
      buf_page_init_for_read(), buf_read_page_low(): Input the "recovery needed"
      flag in the least significant bit of zip_size.
      
      buf_read_acquire(), buf_read_release(): Interface for allocating and
      freeing buffer pages for reading.
      
      buf_read_recv_pages(): Set the flag that recovery is needed.
      Other ROW_FORMAT=COMPRESSED reads during recovery
      will not need any recovery.
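Passing a Boolean in the least significant bit of a size works when every valid size is zero or an even number, so bit 0 is otherwise unused. The sketch below assumes that property for zip_size; the helper names are hypothetical and do not come from the InnoDB source.

```cpp
#include <cstddef>

// Bit 0 carries the hypothetical "recovery needed" flag.
constexpr std::size_t RECOVERY_FLAG = 1;

// Combine a compressed page size (assumed to be 0 or even) with the flag.
constexpr std::size_t pack_zip_size(std::size_t zip_size, bool recovery_needed) {
  return zip_size | (recovery_needed ? RECOVERY_FLAG : std::size_t{0});
}

// Recover the original size by clearing bit 0.
constexpr std::size_t unpack_zip_size(std::size_t packed) {
  return packed & ~RECOVERY_FLAG;
}

// Recover the flag by testing bit 0.
constexpr bool unpack_recovery_needed(std::size_t packed) {
  return (packed & RECOVERY_FLAG) != 0;
}
```

The benefit of this encoding is that no extra function parameter is needed along the call chain; the cost is that every reader of the value has to mask the flag off before using the size.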
    • Merge 10.10 into 10.11 · 10635c28
      Marko Mäkelä authored
    • Merge 10.9 into 10.10 · 51fc6b91
      Marko Mäkelä authored
    • Merge 10.8 into 10.9 · 4d9fe403
      Marko Mäkelä authored
    • Merge 10.7 into 10.8 · fa543a0f
      Marko Mäkelä authored
    • Merge 10.6 into 10.7 · cea50896
      Marko Mäkelä authored
    • MDEV-30400 Assertion height == btr_page_get_level(...) on INSERT · de4030e4
      Marko Mäkelä authored
      This also fixes part of MDEV-29835 Partial server freeze
      which is caused by violations of the latching order that was
      defined in https://dev.mysql.com/worklog/task/?id=6326
      (WL#6326: InnoDB: fix index->lock contention). Unless the
      current thread is holding an exclusive dict_index_t::lock,
      it must acquire page latches in a strict parent-to-child,
      left-to-right order. Not all cases of MDEV-29835 are fixed yet.
      Failure to follow the correct latching order will cause deadlocks
      of threads due to lock order inversion.
      
      As part of these changes, the BTR_MODIFY_TREE mode is modified
      so that an Update latch (U a.k.a. SX) will be acquired on the
      root page, and eXclusive latches (X) will be acquired on all pages
      leading to the leaf page, as well as any left and right siblings
      of the pages along the path. The DEBUG_SYNC test innodb.innodb_wl6326
      will be removed, because at the time the DEBUG_SYNC point is hit,
      the thread is actually holding several page latches that will be
      blocking a concurrent SELECT statement.
      
      We also remove double bookkeeping that was caused by excessive
      information hiding in mtr_t::m_memo. We simply let mtr_t::m_memo
      store information of latched pages, and ensure that
      mtr_memo_slot_t::object is never a null pointer.
      The tree_blocks[] and tree_savepoints[] were redundant.
      
      buf_page_get_low(): If innodb_change_buffering_debug=1, to avoid
      a hang, do not try to evict blocks if we are holding a latch on
      a modified page. The test innodb.innodb-change-buffer-recovery
      will be removed, because change buffering may no longer be forced
      by debug injection when the change buffer comprises multiple pages.
      Remove a debug assertion that could fail when
      innodb_change_buffering_debug=1 fails to evict a page.
      For other cases, the assertion is redundant, because we already
      checked that right after the got_block: label. The test
      innodb.innodb-change-buffering-recovery will be removed, because
      due to this change, we will be unable to evict the desired page.
      
      mtr_t::lock_register(): Register a change of a page latch
      on an unmodified buffer-fixed block.
      
      mtr_t::x_latch_at_savepoint(), mtr_t::sx_latch_at_savepoint():
      Replaced by the use of mtr_t::upgrade_buffer_fix(), which now
      also handles RW_S_LATCH.
      
      mtr_t::set_modified(): For temporary tables, invoke
      buf_page_t::set_modified() here and not in mtr_t::commit().
      We will never set the MTR_MEMO_MODIFY flag on other than
      persistent data pages, nor set mtr_t::m_modifications when
      temporary data pages are modified.
      
      mtr_t::commit(): Only invoke the buf_flush_note_modification() loop
      if persistent data pages were modified.
      
      mtr_t::get_already_latched(): Look up a latched page in mtr_t::m_memo.
      This avoids many redundant entries in mtr_t::m_memo, as well as
      redundant calls to buf_page_get_gen() for blocks that had already
      been looked up in a mini-transaction.
      
      btr_get_latched_root(): Return a pointer to an already latched root page.
      This replaces btr_root_block_get() in cases where the mini-transaction
      has already latched the root page.
      
      btr_page_get_parent(): Fetch a parent page that was already latched
      in BTR_MODIFY_TREE, by invoking mtr_t::get_already_latched().
      If needed, upgrade the root page U latch to X.
      This avoids bloating mtr_t::m_memo as well as performing redundant
      buf_pool.page_hash lookups. For non-QUICK CHECK TABLE as well as for
      B-tree defragmentation, we will invoke btr_cur_search_to_nth_level().
      
      btr_cur_search_to_nth_level(): This will only be used for non-leaf
      (level>0) B-tree searches that were formerly named BTR_CONT_SEARCH_TREE
      or BTR_CONT_MODIFY_TREE. In MDEV-29835, this function could be
      removed altogether, or retained for the case of
      CHECK TABLE without QUICK.
      
      btr_cur_t::left_block: Remove. btr_pcur_move_backward_from_page()
      can retrieve the left sibling from the end of mtr_t::m_memo.
      
      btr_cur_t::open_leaf(): Some clean-up.
      
      btr_cur_t::search_leaf(): Replaces btr_cur_search_to_nth_level()
      for searches to level=0 (the leaf level). We will never release
      parent page latches before acquiring leaf page latches. If we need to
      temporarily release the level=1 page latch in the BTR_SEARCH_PREV or
      BTR_MODIFY_PREV latch_mode, we will reposition the cursor on the
      child node pointer so that we will land on the correct leaf page.
      
      btr_cur_t::pessimistic_search_leaf(): Implement new BTR_MODIFY_TREE
      latching logic in the case that page splits or merges will be needed.
      The parent pages (and their siblings) should already be latched on
      the first dive to the leaf and be present in mtr_t::m_memo; there
      should be no need for BTR_CONT_MODIFY_TREE. This pre-latching almost
      suffices; it must be revised in MDEV-29835 and work-arounds removed
      for cases where mtr_t::get_already_latched() fails to find a block.
      
      rtr_search_to_nth_level(): A SPATIAL INDEX version of
      btr_search_to_nth_level() that can search to any level
      (including the leaf level).
      
      rtr_search_leaf(), rtr_insert_leaf(): Wrappers for
      rtr_search_to_nth_level().
      
      rtr_search(): Replaces rtr_pcur_open().
      
      rtr_latch_leaves(): Replaces btr_cur_latch_leaves(). Note that unlike
      in the B-tree code, there is no error handling in case the sibling
      pages are corrupted.
      
      rtr_cur_restore_position(): Remove an unused constant parameter.
      
      btr_pcur_open_on_user_rec(): Remove the constant parameter
      mode=PAGE_CUR_GE.
      
      row_ins_clust_index_entry_low(): Use a new
      mode=BTR_MODIFY_ROOT_AND_LEAF to gain access to the root page
      when mode!=BTR_MODIFY_TREE, to write the PAGE_ROOT_AUTO_INC.
      
      BTR_SEARCH_TREE, BTR_CONT_SEARCH_TREE: Remove.
      
      BTR_CONT_MODIFY_TREE: Note that this is only used by
      rtr_search_to_nth_level().
      
      btr_pcur_optimistic_latch_leaves(): Replaces
      btr_cur_optimistic_latch_leaves().
      
      ibuf_delete_rec(): Acquire exclusive ibuf.index->lock in order
      to avoid a deadlock with ibuf_insert_low(BTR_MODIFY_PREV).
      
      btr_blob_log_check_t(): Acquire a U latch on the root page,
      so that btr_page_alloc() in btr_store_big_rec_extern_fields()
      will avoid a deadlock.
      
      btr_store_big_rec_extern_fields(): Assert that the root page latch
      is being held.
      
      Tested by: Matthias Leich
      Reviewed by: Vladislav Lesin
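The idea behind looking up an already latched page in the mini-transaction memo, instead of fetching and registering it again, can be sketched with toy types. ToyMtr, PageId, LatchType and MemoSlot are hypothetical simplifications and do not reflect the real mtr_t layout or signatures.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct PageId {
  std::uint32_t space;
  std::uint32_t page_no;
  bool operator==(const PageId &o) const {
    return space == o.space && page_no == o.page_no;
  }
};

enum class LatchType { S, U, X };  // shared, update, exclusive

struct MemoSlot {
  PageId id;
  LatchType type;
};

class ToyMtr {
 public:
  // Record that this mini-transaction holds a latch on a page.
  void latch(PageId id, LatchType type) { memo_.push_back({id, type}); }

  // Return the existing slot if the page is already latched in the
  // requested mode; nullptr means the caller must look the page up.
  const MemoSlot *get_already_latched(PageId id, LatchType type) const {
    for (const MemoSlot &slot : memo_)
      if (slot.id == id && slot.type == type)
        return &slot;
    return nullptr;
  }

  std::size_t memo_size() const { return memo_.size(); }

 private:
  std::vector<MemoSlot> memo_;
};
```

Reusing the slot keeps the memo from growing with duplicate entries for the same page, which is the bloat the commit message refers to.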
    • MDEV-24623 Replicate bulk insert as table-level exclusive key · 39f46745
      Denis Protivensky authored
      - introduce table key construction function in wsrep service interface
      - don't add row keys when replicating bulk insert
      - don't start bulk insert on applier or when transaction is not active
      - don't start bulk insert on system versioned tables
      - implement actual bulk insert table-level key replication
      Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
    • rpm: ignore man3 · a10003bd
      Daniel Black authored
      During testing of RPM packages in MDEV-30203:
        file /usr/share/man/man3 from install of
        MariaDB-devel-11.0.1-1.el7_9.x86_64 conflicts with file from
        package filesystem-3.2-25.el7.x86_64
      
      MariaDB is the first libmariadb to include man3 man pages
      so make the changes here like what is done for man1 and man8.
    • MDEV-26548: replace .mysql_history with .mariadb_history · f8ca355e
      Daniel Black authored
      Fall back to using .mysql_history if .mariadb_history isn't
      present and .mysql_history is present.
      
      Also replace the use of MYSQL_HISTFILE as an environment variable
      with MARIADB_HISTFILE and fall back to using MYSQL_HISTFILE if
      MARIADB_HISTFILE isn't present.
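The fallback rules for the history file can be sketched as a pure function. The function name and parameters are hypothetical, and giving the environment variables precedence over the default files is an assumption; the commit message only states the two fallbacks.

```cpp
#include <string>

// Hypothetical sketch: pick the client history file.
// The *_histfile arguments are the environment variable values, or nullptr
// when the variable is not set (for example, the result of std::getenv).
inline std::string choose_history_file(const char *mariadb_histfile,
                                       const char *mysql_histfile,
                                       const std::string &home,
                                       bool mariadb_history_exists,
                                       bool mysql_history_exists) {
  if (mariadb_histfile != nullptr)
    return mariadb_histfile;  // MARIADB_HISTFILE is preferred
  if (mysql_histfile != nullptr)
    return mysql_histfile;    // fall back to MYSQL_HISTFILE
  // Use the legacy file only if the new one is absent and the old one exists.
  if (!mariadb_history_exists && mysql_history_exists)
    return home + "/.mysql_history";
  return home + "/.mariadb_history";
}
```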
    • MDEV-30393 InnoDB: Assertion failure in dict0dict.cc upon ADD FULLTEXT INDEX · ef6b3806
      Thirunarayanan Balathandayuthapani authored
      Problem:
      ========
      - InnoDB fails to remove the newly created table or index from
      data dictionary and table cache if the alter fails in commit phase
      
      Solution:
      ========
      - InnoDB should restart the transaction to remove the newly
      created table and index when it fails in commit phase of an alter
      operation. innodb_fts.misc_debug tests the scenario with the
      help of debug point "stats_lock_fail"
    • MDEV-30447: use of undeclared identifier O_DIRECT · aafe85ec
      Marko Mäkelä authored
      In commit 24648768, some use of
      O_DIRECT was added without a proper #ifdef guard. That broke the
      compilation in environments that do not define O_DIRECT, such as
      OpenBSD.
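A guard of the kind this fix refers to typically looks like the following. This is a generic sketch for a POSIX system, not the code that was actually changed; the function name is hypothetical.

```cpp
#include <fcntl.h>

// Build open(2) flags, requesting direct I/O only where the platform
// defines O_DIRECT.
inline int data_file_open_flags(bool want_direct_io) {
  int flags = O_RDWR;
#ifdef O_DIRECT
  if (want_direct_io)
    flags |= O_DIRECT;
#else
  (void) want_direct_io;  // no O_DIRECT on this platform; fall back to buffered I/O
#endif
  return flags;
}
```

Because the identifier only appears inside the `#ifdef` block, the translation unit still compiles on systems whose headers do not define O_DIRECT.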
    • Update 10.10 HELP tables · d456c070
      Ian Gilfillan authored
  5. 23 Jan, 2023 5 commits