• Monty's avatar
    Improve pruning in greedy_search by sorting tables during search · b3c74bdc
    Monty authored
    MDEV-28073 Slow query performance in MariaDB when using many tables
    
    The faster we can find a good query plan, the more options we have for
    finding and pruning (ignoring) bad plans.
    
    This patch adds sorting of plans to best_extension_by_limited_search().
    The plans, from best_access_path() are sorted according to the numbers
    of found rows.  This allows us to faster find 'good tables' and we are
    thus able to eliminate 'bad plans' faster.
    
    One side effect of this patch is that if two tables have equal cost,
    the table that which was used earlier in the query is preferred.
    This allows users to improve plans by reordering eq_ref tables in the
    order they would like them to be uses.
    
    Result changes caused by the patch:
    - Traces are different as now we print the cost for using tables before
      we start considering them in the plan.
    - Table order are changed for some plans. In most cases this is because
      the plans are equal and tables are in this case sorted according to
      their usage in the original query.
    - A few plans was changed as the optimizer was able to find a better
      plan (that was pruned by the original code).
    
    Other things:
    
    - Added a new statistic variable: "optimizer_join_prefixes_check_calls",
      which counts number of calls to best_extension_by_limited_search().
      This can be used to check the prune efficiency in greedy_search().
    - Added variable "JOIN_TAB::embedded_dependent" to be able to handle
      XX IN (SELECT..) in the greedy_optimizer.  The idea is that we
      should prune a table if any of the tables in embedded_dependent is
      not yet read.
    - When using many tables in a query, there will be some additional
      memory usage as we need to pre-allocate table of
      table_count*table_count*sizeof(POSITION) objects (POSITION is 312
      bytes for now) to hold the pre-calculated best_access_path()
      information.  This memory usage is offset by the expected
      performance improvement when using many tables in a query.
    - Removed the code from an earlier patch to keep the table order in
      join->best_ref in the original order.  This is not needed anymore as we
      are now sorting the tables for each best_extension_by_limited_search()
      call.
    b3c74bdc
subselect2.result 25.5 KB