An error occurred fetching the project authors.
  1. 16 Sep, 2024 2 commits
    • Christian Gonzalez's avatar
      Make SESSION_USER() comparable with CURRENT_USER() · c88cb0b7
      Christian Gonzalez authored
      Update `SESSION_USER()` behaviour to be comparable with `CURRENT_USER()`.
      `SESSION_USER()` will return the user and host columns from `mysql.user`
      used to authenticate the user when the session was created.
      
      Historically `SESSION_USER()` was an alias of `USER()` function. The
      main difference with `USER()` behaviour after this changes is that
      `SESSION_USER()` now returns the host column from `mysql.user` instead of
      the client host or ip.
      
      NOTE: `SESSION_USER_IS_USER` old mode is added to make the change
      backward compatible.
      
      All new code of the whole pull request, including one or several files
      that are either new files or modified ones, are contributed under the
      BSD-new license. I am contributing on behalf of my employer
      Amazon Web Services, Inc.
      c88cb0b7
    • Monty's avatar
      MDEV-33144 Implement the Percona variable slow_query_log_always_write_time · 295c0ebf
      Monty authored
      This task is inspired by the Percona implementation of
      slow_query_log_always_write_time.
      
      This task implements the variable log_slow_always_query_time (name
      matching other MariaDB variables using the slow query log). The
      default value for the variable is 31536000, which makes MariaDB
      compatible with older installations.
      
      For queries with execution time longer than log_slow_always_query_time
      the variables log_slow_rate_limit and log_slow_min_examined_row_limit
      will be ignored and the query will be written to the slow query log
      if there is no other limitations (like log_slow_filter etc).
      
      Other things:
      - long_query_time internal variable renamed to log_slow_query_time.
      - More descriptive information for "log_slow_query_time".
      295c0ebf
  2. 14 Sep, 2024 3 commits
    • Sergei Golubchik's avatar
      VEC_Distance_Cosine() · 78bd693a
      Sergei Golubchik authored
      78bd693a
    • Sergei Golubchik's avatar
      mhnsw: inter-statement shared cache · 1dc75e2e
      Sergei Golubchik authored
      * preserve the graph in memory between statements
      * keep it in a TABLE_SHARE, available for concurrent searches
      * nodes are generally read-only, walking the graph doesn't change them
      * distance to target is cached, calculated only once
      * SIMD-optimized bloom filter detects visited nodes
      * nodes are stored in an array, not List, to better utilize bloom filter
      * auto-adjusting heuristic to estimate the number of visited nodes
        (to configure the bloom filter)
      * many threads can concurrently walk the graph. MEM_ROOT and Hash_set
        are protected with a mutex, but walking doesn't need them
      * up to 8 threads can concurrently load nodes into the cache,
        nodes are partitioned into 8 mutexes (8 is chosen arbitrarily, might
        need tuning)
      * concurrent editing is not supported though
      * this is fine for MyISAM, TL_WRITE protects the TABLE_SHARE and the
        graph (note that TL_WRITE_CONCURRENT_INSERT is not allowed, because an
        INSERT into the main table means multiple UPDATEs in the graph)
      * InnoDB uses secondary transaction-level caches linked in a list in
        in thd->ha_data via a fake handlerton
      * on rollback the secondary cache is discarded, on commit nodes
        from the secondary cache are invalidated in the shared cache
        while it is exclusively locked
      * on savepoint rollback both caches are flushed. this can be improved
        in the future with a row visibility callback
      * graph size is controlled by @@mhnsw_cache_size, the cache is flushed
        when it reaches the threshold
      1dc75e2e
    • Sergei Golubchik's avatar
      mhnsw: configurable parameters · 365afe70
      Sergei Golubchik authored
      1. introduce alpha. the value of 1.1 is optimal, so hard-code it.
      
      2. hard-code ef_construction=10, best by test
      
      3. rename hnsw_max_connection_per_layer to mhnsw_max_edges_per_node
         (max_connection is rather ambiguous in MariaDB) and add a help text
      
      4. rename hnsw_ef_search to mhnsw_min_limit and add a help text
      365afe70
  3. 28 Aug, 2024 1 commit
    • Brandon Nesterenko's avatar
      MDEV-33756: Deprecate binlog_optimize_thread_scheduling · 9811d23b
      Brandon Nesterenko authored
      The option binlog_optimize_thread_scheduling was initially added
      to provide a safe alternative for the newly added binlog group
      commit logic, such that when 0, it would disable a leader thread
      from performing the binlog write for all transactions that are a
      part of the group commit. Any problems related to the binlog group
      commit optimization should be sorted out by now, so we can
      deprecate-to-eventually-remove the option altogether.
      
      This commit performs the deprecation, and the removal is tracked
      by MDEV-33745. Note, as the option is only able to be provided
      via configuration at startup time, users will not see a
      deprecation message unless looking through the CLI help
      message.
      
      Reviewed By
      ============
      Kristian Nielsen <knielsen@knielsen-hq.org>
      Sergei Golubchik <serg@mariadb.org>
      9811d23b
  4. 29 Jul, 2024 1 commit
    • Monty's avatar
      MDEV-34664: Add an option to fix InnoDB's doubling of secondary index cardinalities · 4bf7c966
      Monty authored
      (With trivial fixes by sergey@mariadb.com)
      Added option fix_innodb_cardinality to optimizer_adjust_secondary_key_costs
      
      Using fix_innodb_cardinality disables the 'divide by 2' of rec_per_key_int
      in InnoDB that in effect doubles the Cardinality for secondary keys.
      This has the biggest effect for indexes where a few rows has the same key
      value. Using this may also cause table scans for very small tables (which
      in some cases may be better than an index scan).
      
      The user visible effect is that 'SHOW INDEX FROM table_name' will for
      InnoDB show the true Cardinality (and not 2x the real value). It will
      also allow the optimizer to chose a better index in some cases as the
      division by 2 could have a bad effect for tables with 2-5 identical values
      per key.
      
      A few notes about using fix_innodb_cardinality:
      - It has direct affect for SHOW INDEX FROM table_name. SHOW INDEX
        will also update the statistics in table share.
      - The effect of fix_innodb_cardinality for query plans or EXPLAIN
        is only visible after first open of the table. This is why one must
        do a flush tables or use SHOW INDEX for the option to take effect.
      - Using fix_innodb_cardinality can thus affect all user in their query
        plans if they are using the same tables.
      
      Because of this, it is strongly recommended that one uses
      optimizer_adjust_secondary_key_costs=fix_innodb_cardinality mainly
      in configuration files to not cause issues for other users.
      4bf7c966
  5. 10 Jul, 2024 1 commit
    • Monty's avatar
      MDEV-34504 PURGE BINARY LOGS not working anymore · dd997809
      Monty authored
      PURGE BINARY LOGS did not always purge binary logs. This commit fixes
      some of the issues and adds notifications if a binary log cannot be
      purged.
      
      User visible changes:
      - 'PURGE BINARY LOG TO log_name' and 'PURGE BINARY LOGS BEFORE date'
        worked differently. 'TO' ignored 'slave_connections_needed_for_purge'
        while 'BEFORE' did not. Now both versions ignores the
        'slave_connections_needed_for_purge variable'.
      - 'PURGE BINARY LOG..' commands now returns 'note' if a binary log cannot
         be deleted like
         Note 1375 Binary log 'master-bin.000004' is not purged because it is
                   the current active binlog
      - Automatic binary log purges, based on date or size, will write a
        note to the error log if a binary log matching the size or date
        cannot yet be deleted.
      - If 'slave_connections_needed_for_purge' is set from a config or
        command line, it is set to 0 if Galera is enabled and 1 otherwise
        (old default). This ensures that automatic binary log purge works
        with Galera as before the addition of
        'slave_connections_needed_for_purge'.
        If the variable is changed to 0, a warning will be printed to the error
        log.
      
      Code changes:
      - Added THD argument to several purge_logs related functions that needed
        THD.
      - Added 'interactive' options to purge_logs functions. This allowed
        me to remove testing of sql_command == SQLCOM_PURGE.
      - Changed purge_logs_before_date() to first check if log is applicable
        before calling can_purge_logs(). This ensures we do not get a
        notification for logs that does not match the remove criteria.
      - MYSQL_BIN_LOG::can_purge_log() will write notifications to the user
        or error log if a log cannot yet be removed.
      - log_in_use() will return reason why a binary log cannot be removed.
      
      Changes to keep code consistent:
      - Moved checking of binlog_format for Galera to be after Galera is
        initialized (The old check never worked). If Galera is enabled
        we now change the binlog_format to ROW, with a warning, instead of
        aborting the server. If this change happens a warning will be printed to
        the error log.
      - Print a warning if Galera or FLASHBACK changes the binlog_format
        to ROW. Before it was done silently.
      
      Reviewed by: Sergei Golubchik <serg@mariadb.com>,
                   Kristian Nielsen <knielsen@knielsen-hq.org>
      dd997809
  6. 02 Jul, 2024 2 commits
    • Monty's avatar
      MDEV-34494 Add server_uid global variable and add it to error log at startup · 2739b5f5
      Monty authored
      The feedback plugin server_uid variable and the calculate_server_uid()
      function is moved from feedback/utils.cc to sql/mysqld.cc
      
      server_uid is added as a global variable (shown in 'show variables') and
      is written to the error log on server startup together with server version
      and server commit id.
      2739b5f5
    • Monty's avatar
      MDEV-34491 Setting log_slow_admin="" at startup should be converted to log_slow_admin=ALL · d8c9c5ea
      Monty authored
      We have an issue if a user have the following in a configuration file:
      log_slow_filter=""                  # Log everything to slow query log
      log_queries_not_using_indexes=ON
      
      This set log_slow_filter to 'not_using_index' which disables
      slow_query_logging of most queries.
      In effect, on should never use log_slow_filter="" in config files but
      instead use log_slow_filter=ALL.
      
      Fixed by changing log_slow_filter="" that comes either from a
      configuration file or from the command line, when starting to the server,
      to log_slow_filter=ALL.
      A warning will be printed when this happens.
      
      Other things:
      - One can now use =ALL for any 'set' variable to set all options at once.
        (backported from 10.6)
      d8c9c5ea
  7. 01 Jul, 2024 1 commit
  8. 27 May, 2024 7 commits
    • Monty's avatar
      MDEV-9101 Limit size of created disk temporary files and tables · b9f57931
      Monty authored
      Two new variables added:
      - max_tmp_space_usage : Limits the the temporary space allowance per user
      - max_total_tmp_space_usage: Limits the temporary space allowance for
        all users.
      
      New status variables: tmp_space_used & max_tmp_space_used
      New field in information_schema.process_list: TMP_SPACE_USED
      
      The temporary space is counted for:
      - All SQL level temporary files. This includes files for filesort,
        transaction temporary space, analyze, binlog_stmt_cache etc.
        It does not include engine internal temporary files used for repair,
        alter table, index pre sorting etc.
      - All internal on disk temporary tables created as part of resolving a
        SELECT, multi-source update etc.
      
      Special cases:
      - When doing a commit, the last flush of the binlog_stmt_cache
        will not cause an error even if the temporary space limit is exceeded.
        This is to avoid giving errors on commit. This means that a user
        can temporary go over the limit with up to binlog_stmt_cache_size.
      
      Noteworthy issue:
      - One has to be careful when using small values for max_tmp_space_limit
        together with binary logging and with non transactional tables.
        If a the binary log entry for the query is bigger than
        binlog_stmt_cache_size and one hits the limit of max_tmp_space_limit
        when flushing the entry to disk, the query will abort and the
        binary log will not contain the last changes to the table.
        This will also stop the slave!
        This is also true for all Aria tables as Aria cannot do rollback
        (except in case of crashes)!
        One way to avoid it is to use @@binlog_format=statement for
        queries that updates a lot of rows.
      
      Implementation:
      - All writes to temporary files or internal temporary tables, that
        increases the file size, are routed through temp_file_size_cb_func()
        which updates and checks the temp space usage.
      - Most of the temporary file monitoring is done inside IO_CACHE.
        Temporary file monitoring is done inside the Aria engine.
      - MY_TRACK and MY_TRACK_WITH_LIMIT are new flags for ini_io_cache().
        MY_TRACK means that we track the file usage. TRACK_WITH_LIMIT means
        that we track the file usage and we give an error if the limit is
        breached. This is used to not give an error on commit when
        binlog_stmp_cache is flushed.
      - global_tmp_space_used contains the total tmp space used so far.
        This is needed quickly check against max_total_tmp_space_usage.
      - Temporary space errors are using EE_LOCAL_TMP_SPACE_FULL and
        handler errors are using HA_ERR_LOCAL_TMP_SPACE_FULL.
        This is needed until we move general errors to it's own error space
        so that they cannot conflict with system error numbers.
      - Return value of my_chsize() and mysql_file_chsize() has changed
        so that -1 is returned in the case my_chsize() could not decrease
        the file size (very unlikely and will not happen on modern systems).
        All calls to _chsize() are updated to check for > 0 as the error
        condition.
      - At the destruction of THD we check that THD::tmp_file_space == 0
      - At server end we check that global_tmp_space_used == 0
      - As a precaution against errors in the tmp_space_used code, one can set
        max_tmp_space_usage and max_total_tmp_space_usage to 0 to disable
        the tmp space quota errors.
      - truncate_io_cache() function added.
      - Aria tables using static or dynamic row length are registered in 8K
        increments to avoid some calls to update_tmp_file_size().
      
      Other things:
      - Ensure that all handler errors are registered.  Before, some engine
        errors could be printed as "Unknown error".
      - Fixed bug in filesort() that causes a assert if there was an error
        when writing to the temporay file.
      - Fixed that compute_window_func() now takes into account write errors.
      - In case of parallel replication, rpl_group_info::cleanup_context()
        could call trans_rollback() with thd->error set, which would cause
        an assert. Fixed by resetting the error before calling trans_rollback().
      - Fixed bug in subselect3.inc which caused following test to use
        heap tables with low value for max_heap_table_size
      - Fixed bug in sql_expression_cache where it did not overflow
        heap table to Aria table.
      - Added Max_tmp_disk_space_used to slow query log.
      - Fixed some bugs in log_slow_innodb.test
      b9f57931
    • Sergei Golubchik's avatar
      MDEV-33145 support for old-mode=OLD_FLUSH_STATUS · 9293d40f
      Sergei Golubchik authored
      add old-mode that restores inconsistent legacy behavior for FLUSH STATUS.
      It doesn't affect FLUSH { SESSION | GLOBAL } STATUS.
      9293d40f
    • Monty's avatar
      MDEV-33655 Remove alter_algorithm · 2464ee75
      Monty authored
      Remove alter_algorithm but keep the variable as no-op (with a warning).
      
      The reasons for removing alter_algorithm are:
      - alter_algorithm was introduced as a replacement for the
        old_alter_table that was used to force the usage of the original
        alter table algorithm (copy) in the cases where the new alter
        algorithm did not work. The new option was added as a way to force
        the usage of a specific algorithm when it should instead have made
        it possible to disable algorithms that would not work for some
        reason.
      - alter_algorithm introduced some cases where ALTER TABLE would not
        work without specifying the ALGORITHM=XXX option together with
        ALTER TABLE.
      - Having different values of alter_algorithm on master and slave could
        cause slave to stop unexpectedly.
      - ALTER TABLE FORCE, as used by mariadb-upgrade, would not always work
        if alter_algorithm was set for the server.
      - As part of the MDEV-33449 "improving repair of tables" it become
        clear that alter- algorithm made it harder to provide a better and
        more consistent ALTER TABLE FORCE and REPAIR TABLE and it would be
        better to remove it.
      2464ee75
    • Monty's avatar
      Fixed warnings when using deprecated variables · 8af7a994
      Monty authored
      Also fixed that all unused variables are using the same variable comment.
      The warning will be tested with the next commit that deprecates the
      variable alter_algorithm.
      8af7a994
    • Monty's avatar
      MDEV-32188 make TIMESTAMP use whole 32-bit unsigned range · dfdedd46
      Monty authored
      This patch extends the timestamp from
      2038-01-19 03:14:07.999999 to 2106-02-07 06:28:15.999999
      for 64 bit hardware and OS where 'long' is 64 bits.
      This is true for 64 bit Linux but not for Windows.
      
      This is done by treating the 32 bit stored int as unsigned instead of
      signed.  This is safe as MariaDB has never accepted dates before the epoch
      (1970).
      The benefit of this approach that for normal timestamp the storage is
      compatible with earlier version.
      
      However for tables using system versioning we before stored a
      timestamp with the year 2038 as the 'max timestamp', which is used to
      detect current values.  This patch stores the new 2106 year max value
      as the max timestamp. This means that old tables using system
      versioning needs to be updated with mariadb-upgrade when moving them
      to 11.4. That will be done in a separate commit.
      dfdedd46
    • Sergei Golubchik's avatar
      MDEV-28671 post-testing fixes · 5296f908
      Sergei Golubchik authored
      Various help message improvements:
      * MySQL->MariaDB, mysqld->mariadbd, "mysqld daemon" -> "mariadbd process"
      * typos
      * don't specify defaults directly in the help message
      * don't say that an option is deprecated, mark is as such
      * missing spaces in the middle of the text
      etc
      5296f908
    • Sergei Golubchik's avatar
      MDEV-28671 post-merge fixes · df10a945
      Sergei Golubchik authored
      * use new deprecated printer for all deprecated server options
      * restore alphabetic option sorting order
      * move deprecated printer from mysqld.cc to my_getopt.c
      * in --help print deprecation message at the end of the option help
      * move 'ALL' help text where it belongs - to other SET options, and
        with a correct indentation.
      * consistently end all or none command-line option help strings
        with a dot - my_print_help() needs that.
        It's about 50/50 now, so let's do none, less line wraps in --help
      * remove trailing spaces from command-line option help strings
      df10a945
  9. 14 Feb, 2024 1 commit
    • Monty's avatar
      MDEV-31404 Implement binlog_space_limit · 18dfcfde
      Monty authored
      binlog_space_limit is a variable in Percona server used to limit the total
      size of all binary logs.
      
      This implementation is based on code from Percona server 5.7.
      
      In MariaDB we decided to call the variable max-binlog-total-size to be
      similar to max-binlog-size. This makes it easier to find in the output
      from 'mariadbd --help --verbose'). MariaDB will also support
      binlog_space_limit for compatibility with Percona.
      
      Some internal notes to explain implementation notes:
      
      - When running MariaDB does not delete binary logs that are either
        used by slaves or have active xid that are not yet committed.
      
      Some implementation notes:
      
      - max-binlog-total-size is by default 0 (no limit).
      - max-binlog-total-size can be changed without server restart.
      - Binlog file sizes are checked on startup, or if
        max-binlog-total-size is set to a value > 0, not for every log write.
        The total size of all binary logs is cached and dynamically updated
        when updating the binary log on binary log rotation.
      - max-binlog-total-size is checked against existing log files during
        serverstart, binlog rotation, FLUSH LOGS, when writing to binary log
        or when max-binlog-total-size changes value.
      - Option --slave-connections-needed-for-purge with 1 as default added.
        This allows one to ensure that we do not delete binary logs if there
        is less than 'slave-connections-needed-for-purge' connected.
        Without this option max-binlog-total-size would potentially delete
        binlogs needed by slaves on server startup or when a slave disconnects
        as there are then no connected slaves to protect active binlogs.
      - PURGE BINARY LOGS TO ... will be executed as if
        slave-connectitons-needed-for-purge would be zero. In other words
        it will do the purge even if there is no slaves connected. If there
        are connected slaves working on the logs, these will be protected.
      - If binary log is on and max-binlog-total_size <> 0 then the status
        variable 'Binlog_disk_use' shows the current size of all old binary
        logs + the state of the current one.
      - Removed test of strcmp(log_file_name, log_info.log_file_name) in
        purge_logs_before_date() as this is tested in can_purge_logs()
      - To avoid expensive calls of log_in_use() we cache the result for the
        last log that is in use by a slave. Future calls to can_purge_logs()
        for this binary log will be quickly detected and false will be returned
        until a slave starts working on a new log.
      - Note that after a binary log rotation caused by max_binlog_size,
        the last log will not be purged directly as it is still in use
        internally. The next binary log write will purge binlogs if needed.
      
      Reviewer:Kristian Nielsen <knielsen@knielsen-hq.org>
      18dfcfde
  10. 12 Feb, 2024 1 commit
    • Monty's avatar
      MDEV-33306 Optimizer choosing incorrect index in 10.6, 10.5 but not in 10.4 · 3907345e
      Monty authored
      In MariaDB up to 10.11, the test_if_cheaper_ordering() code (that tries
      to optimizer how GROUP BY is executed) assumes that if a table scan is used
      then if there is any index usable by GROUP BY it will be used.
      
      The reason MySQL 10.4 provides a better plan is because of two differences:
      - Plans using 'ref' has a cost of 1/10 of what it should be (as a
        protection against table scans). This is why 'ref' is used in 10.4
        and not in 10.5.
      - When 'ref' is used, then GROUP BY will not use an index for GROUP BY.
      
      In MariaDB 10.5 the chosen plan is a table scan (as it calculated to be
      faster) but as 'ref' is not used, the test_if_cheaper_ordering()
      optimizer phase decides (as ref is not usd) to use an index for GROUP BY,
      which has bad performance.
      
      Description of fix:
      - All new code is protected by the "optimizer_adjust_secondary_key_costs"
        variable, which is now a bit map, and is only executed if the option
        "disable_forced_index_in_group_by" set.
      - Corrects GROUP BY handling in test_if_cheaper_ordering() by making
        the choise of using and index with GROUP BY cost based instead of rule
        based.
      - Adds TIME_FOR_COMPARE to all costs, when using group by, to make
        read_time, index_scan_time and range_cost comparable.
      
      Other things:
      - Made optimizer_adjust_secondary_key_costs a bit map (compatible with old
        code).
      
      Notes:
      Current code ignores costs for the algorithm used when doing GROUP
      BY on the first table:
        - Create an in-memory temporary table for handling group by and doing a
          filesort of the result file
      We can probably in 10.6 continue to ignore this cost.
      
      This patch should NOT be merged to 11.0 series (not needed in 11.0).
      3907345e
  11. 27 Jan, 2024 1 commit
    • Kristian Nielsen's avatar
      MDEV-4991: GTID binlog indexing · d039346a
      Kristian Nielsen authored
      Improve the performance of slave connect using B+-Tree indexes on each binlog
      file. The index allows fast lookup of a GTID position to the corresponding
      offset in the binlog file, as well as lookup of a position to find the
      corresponding GTID position.
      
      This eliminates a costly sequential scan of the starting binlog file
      to find the GTID starting position when a slave connects. This is
      especially costly if the binlog file is not cached in memory (IO
      cost), or if it is encrypted or a lot of slaves connect simultaneously
      (CPU cost).
      
      The size of the index files is generally less than 1% of the binlog data, so
      not expected to be an issue.
      
      Most of the work writing the index is done as a background task, in
      the binlog background thread. This minimises the performance impact on
      transaction commit. A simple global mutex is used to protect index
      reads and (background) index writes; this is fine as slave connect is
      a relatively infrequent operation.
      
      Here are the user-visible options and status variables. The feature is on by
      default and is expected to need no tuning or configuration for most users.
      
      binlog_gtid_index
        On by default. Can be used to disable the indexes for testing purposes.
      
      binlog_gtid_index_page_size (default 4096)
        Page size to use for the binlog GTID index. This is the size of the nodes
        in the B+-tree used internally in the index. A very small page-size (64 is
        the minimum) will be less efficient, but can be used to stress the
        BTree-code during testing.
      
      binlog_gtid_index_span_min (default 65536)
        Control sparseness of the binlog GTID index. If set to N, at most one
        index record will be added for every N bytes of binlog file written.
        This can be used to reduce the number of records in the index, at
        the cost only of having to scan a few more events in the binlog file
        before finding the target position
      
      Two status variables are available to monitor the use of the GTID indexes:
      
        Binlog_gtid_index_hit
        Binlog_gtid_index_miss
      
      The "hit" status increments for each successful lookup in a GTID index.
      The "miss" increments when a lookup is not possible. This indicates that the
      index file is missing (eg. binlog written by old server version
      without GTID index support), or corrupt.
      Signed-off-by: default avatarKristian Nielsen <knielsen@knielsen-hq.org>
      d039346a
  12. 23 Jan, 2024 1 commit
    • Monty's avatar
      MDEV-33118 optimizer_adjust_secondary_key_costs variable · 6f65e082
      Monty authored
      optimizer-adjust_secondary_key_costs is added to provide 2 small
      adjustments to the 10.x optimizer cost model. This can be used in the
      case where the optimizer wrongly uses a secondary key instead of a
      clustered primary key.
      
      The reason behind this change is that MariaDB 10.x does not take into
      account that for engines like InnoDB, that scanning a primary key can be
      up to 7x faster than scanning a secondary key + read the row data trough
      the primary key.
      
      The different values for optimizer_adjust_secondary_key_costs are:
      
      optimizer_adjust_secondary_key_costs=0
      - No changes to current model
      
      optimizer_adjust_secondary_key_costs=1
      - Ensure that the cost of of secondary indexes has a cost of at
        least 5x times the cost of a clustered primary key (if one exists).
        This disables part of the worst_seek optimization described below.
      
      optimizer_adjust_secondary_key_costs=2
      - Disable "worst_seek optimization" and adjust filter cost slightly
        (add cost of 1 if filter is used).
      
      The idea behind 'worst_seek optimization' is that we limit the
      cost for all non clustered ref access to the least of:
      - best-rows-by-range (or all rows in no range found) / 10
      - scan-time-table (roughly number of file blocks to scan table) * 3
      
      In addition we also do not try to use rowid_filter if number of rows
      estimated for 'ref' access is less than the worst_seek limitation.
      
      The idea is that worst_seek is trying to take into account that if
      we do a lot of accesses through a key, this is likely to be cached.
      However it only does this for secondary keys, and not for clustered
      keys or index only reads.
      
      The effect of the worst_seek are:
      - In some cases 'ref' will have a much lower cost than range or using
        a clustered key.
      - Some possible rowid filters for secondary keys will be ignored.
      
      When implementing optimizer_adjust_secondary_key_costs=2, I noticed
      that there is a slightly different costs for how ref+filter and
      range+filter are calculated.  This caused a lot of range and
      range+filter to change to ref+filter, which is not good as
      range+filter provides the optimizer a better estimate of how many
      accepted rows there will be in the result set.
      Adding a extra small cost (1 seek) when using filter mitigated the
      above problems in almost all cases.
      
      This patch should not be applied to MariaDB 11.0 as worst_seeks is
      removed in 11.0 and the cost calculation for clustered keys, secondary
      keys, index scan and filter is more exact.
      
      Test case changes for --optimizer-adjust_secondary_key_costs=1
      (Fix secondary key costs to be 5x of primary key):
      
      - stat_tables_innodb:
        - Complex change (probably ok as number of rows are really small)
          - ref over 1 row changed to range over 10 rows with join buffer
          - ref over 5 rows changed to eq_ref
          - secondary ref over 1 row changed to ref of primary key over 4 rows
          - Change of key to use longer key with index pushdown (a little
            bit worse but not significant).
        - Change to use secondary (1 row) -> primary (4 rows)
      - rowid_filter_innodb:
        - index_merge (2 rows) & ref (1) -> all (23 rows) -> primary eq_ref.
      
      Test case changes for --optimizer-adjust_secondary_key_costs=2
      (remove of worst_seeks & adjust filter cost):
      
      - stat_tables_innodb:
        - Join order change (probably ok as number of rows are really small)
        - ref (5 rows) & ref(1 row) changed to range (10 rows & join buffer)
          & eq_ref.
      - selectivity_innodb:
        - ref -> ref|filter  (ok)
      - rowid_filter_innodb:
        - ref -> ref|filter (ok)
        - range|filter (64 rows) changed to ref|filter (128 rows).
          ok as ref|filter outputs wrong number of rows in explain.
      - range, range_mrr_icp:
        -ref (500 rows -> ALL (1000 rows) (ok)
      - select_pkeycache, select, select_jcl6:
        - ref|filter (2 rows) -> ref (2 rows) (ok)
      - selectivity:
        - ref -> ref_filter (ok)
      - range:
        - Change of 'filtered' but no stat or plan change (ok)
      - selectivity:
       - ref -> ref+filter (ok)
       - Change of filtered but no plan change (ok)
      - join_nested_jcl6:
        - range -> ref|filter (ok as only 2 rows)
      - subselect3, subselect3_jcl6:
        - ref_or_null (4 rows) -> ALL (10 rows) (ok)
        - Index_subquery (4 rows) -> ALL (10 rows)  (ok)
      - partition_mrr_myisam, partition_mrr_aria and partition_mrr_innodb:
        - Uses ALL instead of REF for a key value that is the same for > 50%
          of rows.  (good)
      order_by_innodb:
        - range (200 rows) -> ref (20 rows)+filesort (ok)
      - subselect_sj2_mat:
        - One test changed. One ALL removed and replaced with eq_ref. Likely
          to be better.
      - join_cache:
        - Changed ref over 60% of the rows to use hash join (ok)
      - opt_tvc:
        - Changed to use eq_ref instead of ref with plan change (probably ok)
      - opt_trace:
        - No worst/max seeks clipping (good).
        - Almost double range_scan_time and index_scan_time (ok).
      - rowid_filter:
        - ref -> ref|filtered (ok)
        - range|filter (77 rows) changed to ref|filter (151 rows).  Proably
          ok as ref|filter outputs wrong number of rows in explain.
      
      Reviewer: Sergei Petrunia <sergey@mariadb.com>
      6f65e082
  13. 30 Nov, 2023 1 commit
    • Vladislav Vaintroub's avatar
      MDEV-31608 - Connector/NET fails to connect since 10.10 · 9d07b052
      Vladislav Vaintroub authored
      Connector/NET does not expect collation IDs returned by "show collations"
      to be NULL, runs into an exception.
      
      The fix is to determine connector/net using its connection attributes,
      then make sure "show collations" does not output NULL IDs.
      
      The patch introduces new old_mode NO_NULL_COLLATION_IDs, that is
      automatically set, once MySQL Connector/NET connection is determined.
      
      A test was added, that uses MySql.Data from powershell - only works
      if MySql.Data is installed into GAC (i.e with C/NET MSI package)
      9d07b052
  14. 23 Nov, 2023 1 commit
    • Libing Song's avatar
      MDEV-32589 FULL_NODUP mode for binlog_row_image · a119c5f9
      Libing Song authored
      This patch provides a new mode FULL_NODUP to binlog_row_image system
      variable. With FULL_NODUP mode, all columns are included in before
      image, but only updated columns are included in after image for UPDATE.
      While all columns are included in the after image for INSERT.
      
      FULL_NODUP is for replacing FULL mode. It includes all data of
      the before and after image as FULL mode, but it uses less storage
      especially in the case that only a few columns are updated.
      
      Note: It will binlog full before and after image for all modes if the
            table has no primary key. FULL_NODUP follows the behavior.
      a119c5f9
  15. 02 Nov, 2023 1 commit
  16. 19 Oct, 2023 1 commit
    • Sergei Petrunia's avatar
      MDEV-32113: utf8mb3_key_col=utf8mb4_value cannot be used for ref · 4941ac91
      Sergei Petrunia authored
      (Variant#3: Allow cross-charset comparisons, use a special
      CHARSET_INFO to create lookup keys. Review input addressed.)
      
      Equalities that compare utf8mb{3,4}_general_ci strings, like:
      
        WHERE ... utf8mb3_key_col=utf8mb4_value    (MB3-4-CMP)
      
      can now be used to construct ref[const] access and also participate
      in multiple-equalities.
      This means that utf8mb3_key_col can be used for key-lookups when
      compared with an utf8mb4 constant, field or expression using '=' or
      '<=>' comparison operators.
      
      This is controlled by optimizer_switch='cset_narrowing=on', which is
      OFF by default.
      
      IMPLEMENTATION
      Item value comparison in (MB3-4-CMP) is done using utf8mb4_general_ci.
      This is valid as any utf8mb3 value is also an utf8mb4 value.
      
      When making index lookup value for utf8mb3_key_col, we do "Charset
      Narrowing": characters that are in the Basic Multilingual Plane (=BMP) are
      copied as-is, as they can be represented in utf8mb3. Characters that are
      outside the BMP cannot be represented in utf8mb3 and are replaced
      with U+FFFD, the "Replacement Character".
      
      In utf8mb4_general_ci, the Replacement Character compares as equal to any
      character that's not in BMP. Because of this, the constructed lookup value
      will find all index records that would be considered equal by the original
      condition (MB3-4-CMP).
      Approved-by: default avatarMonty <monty@mariadb.org>
      4941ac91
  17. 03 Oct, 2023 2 commits
    • Monty's avatar
      Change SEL_ARG::MAX_SEL_ARGS to a user defined variable optimizer_max_sel_args · d4347177
      Monty authored
      This allows a user to to change the default value of MAX_SEL_ARGS (16000)
      in the rare case where they neeed more generated SEL_ARGS (as part of
      the range optimizer)
      d4347177
    • Monty's avatar
      MDEV-32203 Raise notes when an index cannot be used on data type mismatch · 4e9322e2
      Monty authored
      Raise notes if indexes cannot be used:
      - in case of data type or collation mismatch (diferent error messages).
      - in case if a table field was replaced to something else
        (e.g. Item_func_conv_charset) during a condition rewrite.
      
      Added option to write warnings and notes to the slow query log for
      slow queries.
      
      New variables added/changed:
      
      - note_verbosity, with is a set of the following options:
        basic            - All old notes
        unusable_keys    - Print warnings about keys that cannot be used
                           for select, delete or update.
        explain          - Print unusable_keys warnings for EXPLAIN querys.
      
      The default is 'basic,explain'. This means that for old installations
      the only notable new behavior is that one will get notes about
      unusable keys when one does an EXPLAIN for a query. One can turn all
      of all notes by either setting note_verbosity to "" or setting sql_notes=0.
      
      - log_slow_verbosity has a new option 'warnings'. If this is set
        then warnings and notes generated are printed in the slow query log
        (up to log_slow_max_warnings times per statement).
      
      - log_slow_max_warnings   - Max number of warnings written to
                                  slow query log.
      
      Other things:
      - One can now use =ALL for any 'set' variable to set all options at once.
        For example using "note_verbosity=ALL" in a config file or
        "SET @@note_verbosity=ALL' in SQL.
      - mysqldump will in the future use @@note_verbosity=""' instead of
        @sql_notes=0 to disable notes.
      - Added "enum class Data_type_compatibility" and changing the return type
        of all Field::can_optimize*() methods from "bool" to this new data type.
      
      Reviewer & Co-author: Alexander Barkov <bar@mariadb.com>
      - The code that prints out the notes comes mainly from Alexander
      4e9322e2
  18. 30 Sep, 2023 1 commit
    • Sergei Golubchik's avatar
      MDEV-32104 remove deprecated features · 82174dae
      Sergei Golubchik authored
      In particular:
      
      * @@debug
        deprecated since 5.5.37
      * sr_YU locale
        deprecated since 10.0.11
      * "engine_condition_pushdown" in the @@optimizer_switch
        deprecated since 10.1.1
      * @@date_format, @@datetime_format, @@time_format, @@max_tmp_tables
        deprecated since  10.1.2
      * @@wsrep_causal_reads
        deprecated since 10.1.3
      * "parser" in mroonga table comment
        deprecated since 10.2.11
      82174dae
  19. 12 Sep, 2023 1 commit
    • Sergei Petrunia's avatar
      MDEV-31496: Make optimizer handle UCASE(varchar_col)=... · e987b935
      Sergei Petrunia authored
      (Review input addressed)
      (Added handling of UPDATE/DELETE and partitioning w/o index)
      
      If the properties of the used collation allow, do the following
      equivalent rewrites:
      
      1. UPPER(key_col)=expr  ->  key_col=expr
         expr=UPPER(key_col)  ->  expr=key_col
         (also rewrite both sides of the equality at the same time)
      
      2. UPPER(key_col) IN (constant-list)  -> key_col IN (constant-list)
      
      - Mark utf8mb{3,4}_general_ci as collations that allow this.
      - Add optimizer_switch='sargable_casefold=ON' to control this.
        (ON by default in this patch)
      - Cover the rewrite in Optimizer Trace, rewrite name is
        "sargable_casefold_removal".
      e987b935
  20. 15 Aug, 2023 1 commit
  21. 02 Aug, 2023 2 commits
  22. 17 Jul, 2023 1 commit
    • Alexander Barkov's avatar
      MDEV-30164 System variable for default collations · 75f25e4c
      Alexander Barkov authored
      This patch adds a way to override default collations
      (or "character set collations") for desired character sets.
      
      The SQL standard says:
      > Each collation known in an SQL-environment is applicable to one
      > or more character sets, and for each character set, one or more
      > collations are applicable to it, one of which is associated with
      > it as its character set collation.
      
      In MariaDB, character set collations has been hard-coded so far,
      e.g. utf8mb4_general_ci has been a hard-coded character set collation
      for utf8mb4.
      
      This patch allows to override (globally per server, or per session)
      character set collations, so for example, uca1400_ai_ci can be set as a
      character set collation for Unicode character sets
      (instead of compiled xxx_general_ci).
      
      The array of overridden character set collations is stored in a new
      (session and global) system variable @@character_set_collations and
      can be set as a comma separated list of charset=collation pairs, e.g.:
      
      SET @@character_set_collations='utf8mb3=uca1400_ai_ci,utf8mb4=uca1400_ai_ci';
      
      The variable is empty by default, which mean use the hard-coded
      character set collations (e.g. utf8mb4_general_ci for utf8mb4).
      
      The variable can also be set globally by passing to the server startup command
      line, and/or in my.cnf.
      75f25e4c
  23. 07 Jul, 2023 1 commit
    • Monty's avatar
      MDEV-31558 Add InnoDB engine information to the slow query log · 99bd2260
      Monty authored
      The new statistics is enabled by adding the "engine", "innodb" or "full"
      option to --log-slow-verbosity
      
      Example output:
      
       # Pages_accessed: 184  Pages_read: 95  Pages_updated: 0  Old_rows_read: 1
       # Pages_read_time: 17.0204  Engine_time: 248.1297
      
      Page_read_time is time doing physical reads inside a storage engine.
      (Writes cannot be tracked as these are usually done in the background).
      Engine_time is the time spent inside the storage engine for the full
      duration of the read/write/update calls. It uses the same code as
      'analyze statement' for calculating the time spent.
      
      The engine statistics is done with a generic interface that should be
      easy for any engine to use. It can also easily be extended to provide
      even more statistics.
      
      Currently only InnoDB has counters for Pages_% and Undo_% status.
      Engine_time works for all engines.
      
      Implementation details:
      
      class ha_handler_stats holds all engine stats.  This class is included
      in handler and THD classes.
      While a query is running, all statistics is updated in the handler. In
      close_thread_tables() the statistics is added to the THD.
      
      handler::handler_stats is a pointer to where statistics should be
      collected. This is set to point to handler::active_handler_stats if
      stats are requested. If not, it is set to 0.
      handler_stats has also an element, 'active' that is 1 if stats are
      requested. This is to allow engines to avoid doing any 'if's while
      updating the statistics.
      
      Cloned or partition tables have the pointer set to the base table if
      status are requested.
      
      There is a small performance impact when using --log-slow-verbosity=engine:
      - All engine calls in 'select' will be timed.
      - IO calls for InnoDB reads will be timed.
      - Incrementation of counters are done on local variables and accesses
        are inline, so these should have very little impact.
      - Statistics has to be reset for each statement for the THD and each
        used handler. This is only 40 bytes, which should be neglectable.
      - For partition tables we have to loop over all partitions to update
        the handler_status as part of table_init(). Can be optimized in the
        future to only do this is log-slow-verbosity changes. For this to work
        we have to update handler_status for all opened partitions and
        also for all partitions opened in the future.
      
      Other things:
      - Added options 'engine' and 'full' to log-slow-verbosity.
      - Some of the new files in the test suite comes from Percona server, which
        has similar status information.
      - buf_page_optimistic_get(): Do not increment any counter, since we are
        only validating a pointer, not performing any buf_pool.page_hash lookup.
      - Added THD argument to save_explain_data_intern().
      - Switched arguments for save_explain_.*_data() to have
        always THD first (generates better code as other functions also have THD
        first).
      99bd2260
  24. 03 May, 2023 1 commit
    • Monty's avatar
      MDEV-30812: Improve output cardinality estimates for hash join · 3bdc5542
      Monty authored
      Introduces @@optimizer_switch flag: hash_join_cardinality
      
      When this option is on, use EITS statistics to produce tighter bounds
      for hash join output cardinality.
      
      This patch is an extension / replacement to a similar patch in 10.6
      
      New features:
      - optimizer_switch hash_join_cardinality is on by default
      - records_out is set to fanout when HASH is used
      - Fixed bug in is_eits_usable: The function did not work with views
      3bdc5542
  25. 02 May, 2023 1 commit
    • Monty's avatar
      MDEV-28054 Various crashes upon INSERT/UPDATE after changing Aria settings · 4cb0d43a
      Monty authored
      The cause of the crash was that test was setting
      aria_sort_buffer_size to MAX_LONG_LONG, which caused an overflow in
      my_malloc() when trying to allocate the buffer + 8 bytes.
      
      Fixed by reducing max size of sort_buffer for Aria and MyISAM
      
      Other things:
      - Added code in maria_repair_parallell() to not allocate a big sort buffer
        for small files.
      - Updated size of minumim sort buffer in Aria
      4cb0d43a
  26. 28 Apr, 2023 1 commit
  27. 26 Apr, 2023 1 commit
  28. 12 Apr, 2023 1 commit
    • Junqi Xie's avatar
      MDEV-21921 Make transaction_isolation and transaction_read_only into system variables · d20a96f9
      Junqi Xie authored
      In MariaDB, we have a confusing problem where:
      * The transaction_isolation option can be set in a configuration file, but it cannot be set dynamically.
      * The tx_isolation system variable can be set dynamically, but it cannot be set in a configuration file.
      
      Therefore, we have two different names for the same thing in different contexts. This is needlessly confusing, and it complicates the documentation. The same thing applys for transaction_read_only.
      
      MySQL 5.7 solved this problem by making them into system variables. https://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-20.html
      
      This commit takes a similar approach by adding new system variables and marking the original ones as deprecated. This commit also resolves some legacy problems related to SET STATEMENT and transaction_isolation.
      d20a96f9