  1. 31 Mar, 2020 16 commits
  2. 30 Mar, 2020 9 commits
    • Merge 10.4 into 10.5 · 37c14690
      Marko Mäkelä authored
      37c14690
    • Cleanup recv_sys: Move things to members · aae3f921
      Marko Mäkelä authored
      recv_sys.recovery_on: Replaces recv_recovery_on.
      
      recv_sys_t::apply(): Replaces recv_apply_hashed_log_recs().
      
      recv_sys_var_init(): Remove.
      
      recv_sys_t::recover_low(): Attempt to initialize a page based
      on buffered redo log records.
      aae3f921
    • MDEV-12353: Remove a trace of pre-MDEV-13564 crash-upgrade · a8b04c3e
      Marko Mäkelä authored
      In commit f8a9f906
      we removed support for crash-upgrade from older versions,
      but forgot to remove a check for recovering TRUNCATE TABLE
      if MariaDB 10.2.18 or 10.3.9 or earlier were killed and
      we are attempting to upgrade to MariaDB 10.5.2 or later.
      Already MariaDB 10.4 would refuse to recover such TRUNCATE
      operations.
      a8b04c3e
    • Merge 10.3 into 10.4 · e2f1f88f
      Marko Mäkelä authored
      e2f1f88f
    • MDEV-20590 Introduce a file format constraint to ALTER TABLE · b092d35f
      Marko Mäkelä authored
      If a table is altered using the MDEV-11369/MDEV-15562/MDEV-13134
      ALGORITHM=INSTANT, it can force the table to use a non-canonical
      format:
      
      * A hidden metadata record at the start of the clustered index
      is used to store each column's DEFAULT value. This makes it possible
      to add new columns that have default values without rebuilding the table.
      
      * Starting with MDEV-15562 in MariaDB Server 10.4, a BLOB in the
      hidden metadata record is used to store column mappings. This makes
      it possible to drop or reorder columns without rebuilding the table.
      This also makes it possible to add columns to any position or drop
      columns from any position in the table without rebuilding the table.
      
      If a column is dropped without rebuilding the table, old records
      will contain garbage in that column's former position, and new records
      will be written with NULL values, empty strings, or dummy values.
      
      This is generally not a problem. However, there may be cases where
      users may want to avoid putting a table into this format.
      For example, users may want to ensure that future UPDATE operations
      after an ADD COLUMN will be performed in-place, to reduce write
      amplification. (Instantly added columns are essentially always
      variable-length.) Users might also want to avoid bugs similar to
      MDEV-19916, or they may want to be able to export tables to
      older versions of the server.
      
      We will introduce the option innodb_instant_alter_column_allowed,
      with the following values:
      
      * never (0): Do not allow instant add/drop/reorder,
      to maintain format compatibility with MariaDB 10.x and MySQL 5.x.
      If the table (or partition) is not in the canonical format, then
      any ALTER TABLE (even one that does not involve instant column
      operations) will force a table rebuild.
      
      * add_last (1, default in 10.3): Store a hidden metadata record that
      allows columns to be appended to the table instantly (MDEV-11369).
      In 10.4 or later, if the table (or partition) is not in this format,
      then any ALTER TABLE (even one that does not involve column changes)
      will force a table rebuild.
      
      Starting with 10.4:
      
      * add_drop_reorder (2, default): Like 'add_last', but allow the
      metadata record to store a column map, to support instant
      add/drop/reorder of columns (MDEV-15562).
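
The rebuild rule described by the three values above can be sketched as a simple ordering comparison. This is only an illustration: the enum and function names are hypothetical, not the server's actual types, and it assumes that "not in this format" means the table already uses a more permissive format than the setting allows.

```cpp
// Hypothetical names throughout; not the server's actual types.
enum class InstantAllowed { never = 0, add_last = 1, add_drop_reorder = 2 };

// Assumed ordering of on-disk formats, from canonical to most permissive.
enum class TableFormat { canonical = 0, add_last_metadata = 1, column_map = 2 };

// True when any ALTER TABLE would have to rebuild the table because its
// current format goes beyond what the setting permits.
constexpr bool forces_rebuild(InstantAllowed allowed, TableFormat format)
{
  return static_cast<int>(format) > static_cast<int>(allowed);
}
```

Under this reading, a table with a column map is rebuilt under both 'never' and 'add_last', while a canonical table is never rebuilt for format reasons.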
      b092d35f
    • MDEV-21832 FORCE all partition to rebuild if any one of the partition does rebuild · f8ec3ba0
      Thirunarayanan Balathandayuthapani authored
      ha_innobase::commit_inplace_alter_table() assumes that all partitions
      perform the same kind of ALTER operation. During DDL, if one partition
      requires a table rebuild and another partition does not, then all
      partitions must be forced to rebuild.
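
A minimal sketch of the rule stated in this commit message, assuming a hypothetical per-partition flag; the real logic lives in the InnoDB ALTER TABLE code and is not reproduced here.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: each element says whether one partition needs a rebuild.
// The returned decision is meant to be applied to every partition.
inline bool rebuild_all_partitions(const std::vector<bool>& needs_rebuild)
{
  return std::any_of(needs_rebuild.begin(), needs_rebuild.end(),
                     [](bool b) { return b; });
}
```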
      f8ec3ba0
    • Fix GCC -Wstringop-truncation · 67f27824
      Marko Mäkelä authored
      67f27824
    • Merge 10.2 into 10.3 · 1a9b6c4c
      Marko Mäkelä authored
      1a9b6c4c
    • MDEV-22019: Sig 11 in next_breadth_first_tab | max_sort_length setting +... · b11ff3d4
      Varun Gupta authored
      MDEV-22019: Sig 11 in next_breadth_first_tab | max_sort_length setting + double GROUP BY leads to crash
      
      No need to create a temp table for aggregation if we have encountered some error.
      b11ff3d4
  3. 28 Mar, 2020 5 commits
    • MDEV-20377: Enable MemorySanitizer user-poisoning · 6be56dd1
      Marko Mäkelä authored
      For ENGINE=Aria, we work around bugs in various tests that catch
      writes of uninitialized bytes from the Aria page cache.
      (Why do we even write anything on DROP TABLE?)
      6be56dd1
    • MDEV-20377: Make WITH_MSAN more usable · 94d0bb4d
      Marko Mäkelä authored
      MemorySanitizer (clang -fsanitize=memory) requires that all code
      be compiled with instrumentation enabled. The C runtime library
      is an exception. Failure to use instrumented libraries will cause
      bogus messages about memory being uninitialized.
      
      In WITH_MSAN builds, we must avoid calling getservbyname(),
      because even though it is a standard library function, it is
      not instrumented, not even in clang 10.
      
      The following cmake options were tested:
      
      -DCMAKE_C_FLAGS='-march=native -O2'
      -DCMAKE_CXX_FLAGS='-stdlib=libc++ -march=native -O2'
      -DWITH_EMBEDDED_SERVER=OFF -DWITH_UNIT_TESTS=OFF -DCMAKE_BUILD_TYPE=Debug
      -DWITH_INNODB_{BZIP2,LZ4,LZMA,LZO,SNAPPY}=OFF
      -DPLUGIN_{ARCHIVE,TOKUDB,MROONGA,OQGRAPH,ROCKSDB,CONNECT,SPIDER}=NO
      -DWITH_SAFEMALLOC=OFF
      -DWITH_{ZLIB,SSL,PCRE}=bundled
      -DHAVE_LIBAIO_H=0
      -DWITH_MSAN=ON
      
      MEM_MAKE_DEFINED(): An alias for VALGRIND_MAKE_MEM_DEFINED()
      and in the future, __msan_unpoison().
      
      For now, neither MEM_MAKE_DEFINED() nor MEM_UNDEFINED()
      performs any action under MSAN. Enabling them will catch more bugs, but
      will also require some more fixes or work-arounds.
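
As a rough sketch of what such a wrapper could look like once it is wired to MemorySanitizer: the macro name SKETCH_MAKE_DEFINED is hypothetical and this is not the definition used in the server. It only assumes the documented __msan_unpoison() interface from sanitizer/msan_interface.h and falls back to a no-op in other builds.

```cpp
#include <cstddef>

// Hypothetical macro name; not the server's MEM_MAKE_DEFINED() definition.
#if defined(__has_feature)
# if __has_feature(memory_sanitizer)
#  include <sanitizer/msan_interface.h>
// Under clang -fsanitize=memory, mark the bytes as initialized.
#  define SKETCH_MAKE_DEFINED(addr, size) __msan_unpoison(addr, size)
# endif
#endif

#ifndef SKETCH_MAKE_DEFINED
// In uninstrumented builds the macro evaluates its arguments and does nothing.
# define SKETCH_MAKE_DEFINED(addr, size) ((void) (addr), (void) (size))
#endif
```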
      
      Json_writer::add_double(): Work around a frequently occurring
      failure in optimizer tests, related to EXPLAIN FORMAT=JSON.
      
      dtoa(): Disable MSAN altogether. For some reason, this function
      is triggering a lot of trouble, especially when invoked for
      DBUG functions. The MDL default timeout is dd=86400 seconds,
      and for some reason it is claimed to be uninitialized.
      
      InnoDB: Define UNIV_DEBUG_VALGRIND also WITH_MSAN.
      
      ut_crc32_8_hw(), ut_crc32_64_low_hw(): Use the compiler built-in
      functions instead of inline assembler when building WITH_MSAN.
      This will require at least -msse4.2 when building for IA-32 or AMD64.
      The inline assembler would not be instrumented, and would thus cause
      bogus failures.
      94d0bb4d
    • Do not compare uninitialized data · 6ec6eda4
      Marko Mäkelä authored
      Valgrind only seems to complain about memcmp() operations that
      actually end up reading uninitialized data, while MemorySanitizer
      requires that the entire length of both buffers be defined.
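
One way to stay within the stricter MemorySanitizer rule is to compare only the bytes that are known to be initialized. The helper below is a hypothetical illustration of that idea, not the change made in this commit.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: compares two byte strings whose valid lengths are
// tracked separately from the capacity of the buffers that hold them.
// memcmp() is only asked to read the first a_len bytes of each buffer.
inline bool equal_defined_prefix(const char* a, std::size_t a_len,
                                 const char* b, std::size_t b_len)
{
  return a_len == b_len && std::memcmp(a, b, a_len) == 0;
}
```

Because the lengths are compared first, the bytes beyond the valid prefix of either buffer are never read.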
      6ec6eda4
    • Marko Mäkelä's avatar
    • MDEV-20372 thread_pool_info fails randomly in 10.5 · e1295554
      Vladislav Vaintroub authored
      Rework stats a bit, so we're not missing any queue_get() now.
      
      Don't do stats_reset_table(), if generic threadpool is off.
      e1295554
  4. 27 Mar, 2020 10 commits
    • Alexander Barkov's avatar
    • MDEV-22060 MSAN use-of-uninitialized-value in main.query_cache_innodb · d3bdc30c
      Marko Mäkelä authored
      During the test main.query_cache_innodb, only 16 bytes of
      db_buf are initialized during the memcmp() in
      dict_acquire_mdl_shared<false>(), but db_len was wrongly set to 20 bytes.
      
      Something similar was fixed in MDEV-21344, but only for the table name,
      in commit 0e25a8b4.
      
      dict_table_t::parse_name(): Assign the return value of
      filename_to_tablename() to the output parameters for lengths.
      There is no need to invoke strlen().
      d3bdc30c
    • MDEV-20329 Fix S3 engine OpenSSL race · 0181384a
      Andrew Hutchings authored
      With OpenSSL < 1.1 there is a potential for a race condition to occur.
      This can cause the S3 engine to crash. The workaround is to add locking
      callbacks to OpenSSL so that this doesn't happen.
      
      https://curl.haxx.se/libcurl/c/threadsafe.html
      
      libMariaS3 contains a fix for this: when a certain flag
      (HAVE_CURL_OPENSSL_UNSAFE) is set, it adds the required locks.
      
      This patch adds CMake support so that the flag is set if it is found
      that Curl is compiled with an unsafe OpenSSL version, for example on
      Ubuntu 16.04 with libcurl4-openssl-dev.
      0181384a
    • Fixed failing tests in buildbot · ff64152b
      Monty authored
      - Updated icp_tests.inc and result files
      - Updated usage of the records_in_range() prototype in Cassandra
      ff64152b
    • Sergey Vojtovich's avatar
      9eae063e
    • Merge 10.4 into 10.5 · 53aabda6
      Marko Mäkelä authored
      53aabda6
    • MDEV-21899 INSERT into a secondary index with zero-data-length key is not crash-safe · f614b6ea
      Marko Mäkelä authored
      page_cur_insert_rec_low(): Remove a bogus condition that wrongly
      omitted redo logging when the record contains no data payload bytes.
      We can have such records in secondary indexes, when the values of
      the PRIMARY KEY column(s) are the empty string, and the values of
      secondary key column(s) are NULL or the empty string.
      
      page_apply_delete_dynamic(): Improve the consistency check, and
      do not allow adjacent records to be less than 5 bytes apart from
      each other. The fixed-size part of the record header is 5 bytes.
      Usually there must also be some header or payload bytes, but in
      an extreme case where all columns are CHAR(0) NOT NULL, the
      minimum secondary index record size is 5 bytes, and the table can
      contain at most 1 row. The minimum clustered index record size is
      5+6+7 bytes (header, DB_TRX_ID, DB_ROLL_PTR) or x+5+4 bytes
      (fixed-size header, child page number, and some additional header
      or payload bytes).
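
The size arithmetic above can be written out as constants. The identifiers are hypothetical and chosen for this sketch only; the byte counts are the ones stated in the commit message.

```cpp
#include <cstddef>

// Sizes in bytes, as stated in the commit message. Names are hypothetical.
constexpr std::size_t rec_header_fixed = 5;  // fixed-size part of the record header
constexpr std::size_t db_trx_id_len = 6;
constexpr std::size_t db_roll_ptr_len = 7;
constexpr std::size_t child_page_no_len = 4;

// All columns CHAR(0) NOT NULL: nothing but the fixed-size header.
constexpr std::size_t min_secondary_rec = rec_header_fixed;

// Header, DB_TRX_ID, DB_ROLL_PTR.
constexpr std::size_t min_clustered_rec =
    rec_header_fixed + db_trx_id_len + db_roll_ptr_len;

// x + 5 + 4, where x is the additional header or payload bytes.
constexpr std::size_t node_ptr_rec_size(std::size_t x)
{
  return x + rec_header_fixed + child_page_no_len;
}

// Consistency rule: adjacent records must be at least 5 bytes apart.
constexpr bool far_enough_apart(std::size_t prev_offset, std::size_t next_offset)
{
  return next_offset >= prev_offset + rec_header_fixed;
}

static_assert(min_clustered_rec == 18, "5 + 6 + 7");
```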
      f614b6ea
    • Updated optimizer costs in multi_range_read_info_const() and sql_select.cc · eb483c51
      Monty authored
      - multi_range_read_info_const now uses the new records_in_range interface
      - Added handler::avg_io_cost()
      - Don't calculate avg_io_cost() in get_sweep_read_cost if avg_io_cost is
        not 1.0.  In this case we trust the avg_io_cost() from the handler.
      - Changed test_quick_select to use TIME_FOR_COMPARE instead of
        TIME_FOR_COMPARE_IDX to align this with the rest of the code.
      - Fixed bug when using test_if_cheaper_ordering where we didn't use
        keyread if index was changed
      - Fixed a bug where we didn't use index only read when using order-by-index
      - Added keyread_time() to HEAP.
        The default keyread_time() was optimized for blocks and not suitable for
        HEAP. The effect was that HEAP preferred table scans over ranges for
        btree indexes.
      - Fixed get_sweep_read_cost() for HEAP tables
      - Ensure that range and ref have the same cost for simple ranges.
        Added a small cost (MULTI_RANGE_READ_SETUP_COST) to ranges to ensure
        we favor ref over range for simple queries.
      - Fixed matching_candidates_in_table() to use the same number of records
        as the rest of the optimizer
      - Added avg_io_cost() to JT_EQ_REF cost. This helps calculate the cost for
        HEAP and temporary tables better. A few tests changed because of this.
      - heap::read_time() and heap::keyread_time() adjusted to not add +1.
        This was to ensure that handler::keyread_time() doesn't give
        higher cost for heap tables than for normal tables. One effect of
        this is that heap and derived tables stored in heap will prefer
        key access as this is now regarded as cheap.
      - Changed cost for index read in sql_select.cc to match
        multi_range_read_info_const(). All index cost calculation is now
        done through one function.
      - 'ref' will now use quick_cost for keys if it exists. This is done
        so that for '=' ranges, 'ref' is preferred over 'range'.
      - scan_time() now takes avg_io_cost() into account
      - get_delayed_table_estimates() uses block_size and avg_io_cost()
      - Removed default argument to test_if_order_by_key(); simplifies code
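
The tie-breaking idea behind MULTI_RANGE_READ_SETUP_COST in the list above can be sketched as follows. The constant's value and the function names are hypothetical; the commit message does not state the actual value.

```cpp
// Hypothetical value; the commit message does not state the actual constant.
constexpr double setup_cost_sketch = 0.2;

// For a simple '=' lookup both access methods share the same index read cost.
constexpr double ref_cost_sketch(double index_read_cost)
{
  return index_read_cost;
}

// Range access pays the additional setup cost, so ref wins a tie.
constexpr double range_cost_sketch(double index_read_cost)
{
  return index_read_cost + setup_cost_sketch;
}
```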
      eb483c51
    • Removed double calls to records_in_range from distinct and group by · b3ab3105
      Monty authored
      Fixed by moving testing of get_best_group_min_max() after range testing.
      b3ab3105
    • Added page_range to records_in_range() to improve range statistics · f36ca142
      Monty authored
      Prototype change:
      -  virtual ha_rows records_in_range(uint inx, key_range *min_key,
      -                                   key_range *max_key)
      +  virtual ha_rows records_in_range(uint inx, const key_range *min_key,
      +                                   const key_range *max_key,
      +                                   page_range *res)
      
      The handler can ignore the page_range parameter. If the handler
      updates the parameter, the optimizer can deduce the following:
      - Whether the previous range's last key is on the same block as the
        next range's first key
      - Whether the current key range is in one block
      - We can also assume that the first and last block read are cached!
        This can be used for a better calculation of IO seeks when we
        estimate the cost of a range index scan.
      
      The parameter is fully implemented for MyISAM, Aria and InnoDB.
      A separate patch will update handler::multi_range_read_info_const() to
      take advantage of this change and also remove the double
      records_in_range() calls that are no longer needed.
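
To illustrate how block information of this kind could feed an IO estimate, here is a small sketch. The struct and function are hypothetical stand-ins, not the server's page_range type or its cost code; the sketch assumes that each range reports the first and last block it touches and that the last block of the previous range is still cached.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the result filled in by the handler:
// the first and last block touched by one key range.
struct PageRangeSketch
{
  std::uint64_t first_page;
  std::uint64_t last_page;
};

// Counts boundary-block reads for a sequence of ranges, skipping a first
// block that was the last block of the previous range (assumed cached).
inline std::size_t estimate_boundary_reads(
    const std::vector<PageRangeSketch>& ranges)
{
  std::size_t reads = 0;
  for (std::size_t i = 0; i < ranges.size(); i++)
  {
    const PageRangeSketch& r = ranges[i];
    const bool first_cached =
        i > 0 && ranges[i - 1].last_page == r.first_page;
    if (!first_cached)
      reads++;  // first block of this range
    if (r.last_page != r.first_page)
      reads++;  // the range ends on a different block
  }
  return reads;
}
```

The sketch counts only the boundary blocks of each range; any blocks between the first and last block of a range would have to be estimated separately.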
      f36ca142