1. 31 Jan, 2018 17 commits
    • MDEV-15104 - Optimise MVCC snapshot · bc7a1dc1
      Sergey Vojtovich authored
      With trx_sys_t::rw_trx_ids removal, MVCC snapshot overhead became
      slightly higher. That is, instead of copying an array we now have to
      iterate LF_HASH. All this is done under trx_sys.mutex protection.
      
      This patch moves MVCC snapshot out of trx_sys.mutex.
      
      Clean-ups:
      
      Removed MVCC: doesn't make too much sense to keep it in a separate class
      anymore.
      
      Refactored ReadView so that it now calls register()/deregister() routines
      (it was vice versa before).
      
      ReadView doesn't have friends anymore. :(
      
      Even less trx_sys.mutex references.
    • MDEV-15104 - Remove trx_sys_t::serialisation_list · c0d5d7c0
      Sergey Vojtovich authored
      serialisation_list was supposed to instantly give minimum registered
      transaction serialisation number. However maintaining and accessing
      this list requires global mutex protection.
      
      Since we already take MVCC snapshot by iterating trx_sys_t::rw_trx_hash,
      it is cheap to integrate minimum registered transaction lookup into this
      iteration.
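
      The single-pass idea described above can be sketched roughly as follows.
      This is only an illustration: RwTrx, Snapshot, take_snapshot and
      NOT_SERIALISED are hypothetical names, and std::unordered_map stands in
      for the lock-free LF_HASH, so none of the concurrency aspects of the real
      rw_trx_hash are modelled here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <unordered_map>
#include <vector>

// Marker for a transaction that has not been assigned a serialisation number yet.
constexpr std::uint64_t NOT_SERIALISED = std::numeric_limits<std::uint64_t>::max();

// Hypothetical stand-in for one element of trx_sys_t::rw_trx_hash.
struct RwTrx {
  std::uint64_t id;  // transaction start identifier (trx->id)
  std::uint64_t no;  // serialisation number (trx->no), or NOT_SERIALISED
};

struct Snapshot {
  std::vector<std::uint64_t> ids;  // sorted ids of registered read-write transactions
  std::uint64_t min_no;            // smallest registered serialisation number
};

// One iteration collects the identifiers and tracks the minimum trx->no,
// so no separately maintained serialisation list is needed.
Snapshot take_snapshot(const std::unordered_map<std::uint64_t, RwTrx>& rw_trx_hash) {
  Snapshot s{{}, NOT_SERIALISED};
  s.ids.reserve(rw_trx_hash.size());
  for (const auto& entry : rw_trx_hash) {
    s.ids.push_back(entry.second.id);
    s.min_no = std::min(s.min_no, entry.second.no);
  }
  std::sort(s.ids.begin(), s.ids.end());
  return s;
}
```

      With no serialised transaction registered, min_no simply stays at the
      NOT_SERIALISED marker in this sketch.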
    • MDEV-15104 - Remove trx_sys_t::rw_trx_ids · 53cc9aa5
      Sergey Vojtovich authored
      Take snapshot of registered read-write transaction identifiers directly
      from rw_trx_hash. It immediately saves one trx_sys.mutex lock, reduces
      size of another critical section protected by this mutex, and makes
      further optimisations like removing trx_sys_t::serialisation_list
      possible.
      
      Downside of this approach is bigger overhead for view opening, because
      iterating LF_HASH is more expensive compared to taking snapshot of an
      array. However for low concurrency overhead difference is negligible,
      while for high concurrency mutex is much bigger evil.
      
      Currently we still take trx_sys.mutex to serialise ReadView creation.
      This is required to keep serialisation_list ordered by trx->no as well
      as not to let the purge thread create a more recent snapshot while another
      thread gets suspended during creation of an older snapshot. This will
      become completely mutex free along with serialisation_list removal.
      
      Compared to previous implementation removing element from rw_trx_hash
      and serialisation_list is not atomic. We disregard all possible bad
      consequences (if there're any) since it will be solved along with
      serialisation_list removal.
    • Reduce number of trx_sys.mutex references · af566d8a
      Sergey Vojtovich authored
      trx->state change must be guarded by trx->mutex.
      Moved mutex locking to MVCC::view_close().
    • Follow-up fix to MDEV-15132 Avoid accessing the TRX_SYS page · dcc09afa
      Marko Mäkelä authored
      trx_undo_mem_create_at_db_start(): Do not read TRX_UNDO_TRX_NO
      unless the field is known to be valid, that is, the transaction
      has been serialized and trx_purge_add_undo_to_history() has been
      invoked.
      
      Normally InnoDB pages would be zero-initialized on allocation
      (since MySQL 5.5 or so), but the undo log pages skip that
      mechanism. So, reused undo log pages can contain garbage.
      Undo log headers can start at any offset (there can be
      multiple undo log headers in the same undo log page).
      Therefore, because the TRX_UNDO_TRX_NO is never explicitly
      initialized on undo log header creation, its contents may
      be garbage.
    • MariaBackup: gcc7 - snprintf output overflow warning · 7eb084fe
      Daniel Black authored
      extra/mariabackup/xtrabackup.cc: In function ‘ulint xb_process_datadir(const char*, const char*, handle_datadir_entry_func_t)’:
      extra/mariabackup/xtrabackup.cc:4534:1: warning: ‘snprintf’ output may be truncated before the last format character [-Wformat-truncation=]
       xb_process_datadir(
       ^~~~~~~~~~~~~~~~~~
      mariabackup/xtrabackup.cc:4607:11: note: ‘snprintf’ output 2 or more bytes (assuming 4001) into a destination of size 4000
         snprintf(dbpath, sizeof(dbpath), "%s/%s", path, dbinfo.name);
         ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
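
      The commit text does not show how the warning was silenced. One common
      way to handle possible truncation is to check the return value of
      snprintf, sketched below; join_path is a hypothetical helper and is not
      taken from xtrabackup.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Writes "path/name" into buf and reports whether it fit completely.
// snprintf returns the length the full output would have had, so a value
// greater than or equal to the buffer size means the result was truncated.
bool join_path(char* buf, std::size_t size, const char* path, const char* name) {
  assert(buf != nullptr && size > 0);
  int n = std::snprintf(buf, size, "%s/%s", path, name);
  return n >= 0 && static_cast<std::size_t>(n) < size;
}
```

      A caller can then skip or report an over-long entry instead of silently
      working with a truncated path.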
    • versioning: add explicit fallthrough to prevent gcc warning · 464ba0e9
      Daniel Black authored
      gcc7 warning:
      
      sql/table.cc: In member function ‘int TABLE_SHARE::init_from_binary_frm_image(THD*, bool, const uchar*, size_t)’:
      sql/table.cc:2032:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
                 if (vers_can_native)
                 ^~
      sql/table.cc:2037:9: note: here
               default:
               ^~~~~~~
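
      As a generic illustration of marking an intentional fall-through (the
      actual change in sql/table.cc is not shown here and may use a comment
      instead), C++17 offers the [[fallthrough]] attribute. The function
      classify below is hypothetical.

```cpp
#include <cassert>

// Case 1 optionally does extra work and then deliberately continues into
// the code shared with case 2.
int classify(int kind, bool flag) {
  int score = 0;
  switch (kind) {
  case 1:
    if (flag)
      score += 10;
    [[fallthrough]];  // C++17: tells the compiler the fall-through is intended
  case 2:
    score += 1;
    break;
  default:
    score = -1;
    break;
  }
  return score;
}
```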
    • MDEV-15132 Avoid accessing the TRX_SYS page · 5db9c6e4
      Marko Mäkelä authored
      trx_write_serialisation_history(): Only invoke trx_sysf_get()
      to exclusively lock the TRX_SYS page if some change really
      has to be written to the page.
      
      On transaction commit, we will still write some binlog and
      Galera WSREP XID information.
      
      FIXME: If this information has to be written, it should be
      partitioned into the rollback segment pages.
    • MDEV-15132 Avoid accessing the TRX_SYS page · c7d04487
      Marko Mäkelä authored
      InnoDB maintains an internal persistent sequence of transaction
      identifiers. This sequence is used for assigning both transaction
      start identifiers (DB_TRX_ID=trx->id) and end identifiers (trx->no)
      as well as end identifiers for the mysql.transaction_registry table
      that was introduced in MDEV-12894.
      
      TRX_SYS_TRX_ID_WRITE_MARGIN: Remove. After this many updates of
      the sequence we used to update the TRX_SYS page. We can avoid accessing
      the TRX_SYS page if we modify the InnoDB startup so that it resurrects
      the sequence from other pages of the transaction system.
      
      TRX_SYS_TRX_ID_STORE: Deprecate. The field only exists for the purpose
      of upgrading from an earlier version of MySQL or MariaDB.
      
      Starting with this fix, MariaDB will rely on the fields
      TRX_UNDO_TRX_ID, TRX_UNDO_TRX_NO in the undo log header page of
      each non-committed transaction, and on the new field
      TRX_RSEG_MAX_TRX_ID in rollback segment header pages.
      
      Because of this change, setting innodb_force_recovery=5 or 6 may cause
      the system to recover with trx_sys.get_max_trx_id()==0. We must adjust
      checks for invalid DB_TRX_ID and PAGE_MAX_TRX_ID accordingly.
      
      We will change the startup and shutdown messages to display the
      trx_sys.get_max_trx_id() in addition to the log sequence number.
      
      trx_sys_t::flush_max_trx_id(): Remove.
      
      trx_undo_mem_create_at_db_start(), trx_undo_lists_init():
      Add an output parameter max_trx_id, to be updated from
      TRX_UNDO_TRX_ID, TRX_UNDO_TRX_NO.
      
      TRX_RSEG_MAX_TRX_ID: New field, for persisting
      trx_sys.get_max_trx_id() at the time of the latest transaction commit.
      Startup is not reading the undo log pages of committed transactions.
      We want to avoid additional page accesses on startup, as well as
      trouble when all undo logs have been emptied.
      On startup, we will simply determine the maximum value from all pages
      that are being read anyway.
      
      TRX_RSEG_FORMAT: Redefined from TRX_RSEG_MAX_SIZE.
      
      Old versions of InnoDB wrote uninitialized garbage to unused data fields.
      Because of this, we cannot simply introduce a new field in the
      rollback segment pages and expect it to be always zero, like it would
      if the database was created by a recent enough InnoDB version.
      
      Luckily, it looks like the field TRX_RSEG_MAX_SIZE was always written
      as 0xfffffffe. We will indicate a new subformat of the page by writing
      0 to this field. This has the nice side effect that after a downgrade
      to older versions of InnoDB, transactions should fail to allocate any
      undo log, that is, writes will be blocked. So, there is no problem of
      getting corrupted transaction identifiers after downgrading.
      
      trx_rseg_t::max_size: Remove.
      
      trx_rseg_header_create(): Remove the parameter max_size=ULINT_MAX.
      
      trx_purge_add_undo_to_history(): Update TRX_RSEG_MAX_TRX_ID
      (and TRX_RSEG_FORMAT if needed). This is invoked on transaction commit.
      
      trx_rseg_mem_restore(): If TRX_RSEG_FORMAT contains 0,
      read TRX_RSEG_MAX_TRX_ID.
      
      trx_rseg_array_init(): Invoke trx_sys.init_max_trx_id(max_trx_id + 1)
      where max_trx_id was the maximum that was encountered in the rollback
      segment pages and the undo log pages of recovered active, XA PREPARE,
      or some committed transactions. (See trx_purge_add_undo_to_history()
      which invokes trx_rsegf_set_nth_undo(..., FIL_NULL, ...);
      not all committed transactions will be immediately detached from the
      rollback segment header.)
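
      The format-marker logic described in this message can be sketched as
      follows. This is a simplified model under stated assumptions: RsegHeader
      and next_trx_id are hypothetical names, and the real code reads these
      fields from pages rather than from an in-memory struct.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint32_t OLD_FORMAT = 0xfffffffe;  // what old versions wrote as TRX_RSEG_MAX_SIZE
constexpr std::uint32_t NEW_FORMAT = 0;           // marks the new subformat of the page

// Hypothetical in-memory view of two rollback segment header fields.
struct RsegHeader {
  std::uint32_t format;      // TRX_RSEG_FORMAT (redefined from TRX_RSEG_MAX_SIZE)
  std::uint64_t max_trx_id;  // TRX_RSEG_MAX_TRX_ID; may be garbage unless format is 0
};

// Returns the value to start the identifier sequence from: one more than
// the maximum seen in valid headers and in the recovered undo logs.
std::uint64_t next_trx_id(const std::vector<RsegHeader>& headers,
                          std::uint64_t max_from_undo_logs) {
  std::uint64_t max_trx_id = max_from_undo_logs;
  for (const RsegHeader& h : headers) {
    if (h.format == NEW_FORMAT)  // trust the field only in the new format
      max_trx_id = std::max(max_trx_id, h.max_trx_id);
  }
  return max_trx_id + 1;
}
```

      Headers still carrying the old marker are ignored, which mirrors the
      point that the field cannot be assumed to be zero-initialized there.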
    • Clean up trx_undo_page_get_end() · bb441ca4
      Marko Mäkelä authored
    • Simplify undo log access during InnoDB startup · 6058f92f
      Marko Mäkelä authored
      trx_rseg_mem_restore(): Update the max_trx_id from the undo log pages.
      
      trx_sys_init_at_db_start(): Remove; merge with trx_lists_init_at_db_start().
      
      trx_undo_lists_init(): Move to the only calling module, trx0rseg.cc.
      
      trx_undo_mem_create_at_db_start(): Declare globally. Return the number
      of pages.
    • Do not call trx_rseg_mem_restore() when creating rollback segment · d24229ba
      Marko Mäkelä authored
      trx_rseg_mem_create(): Initialize rseg->curr_size and rseg->max_size.
      
      trx_rseg_create(), trx_temp_rseg_create():
      Do not call trx_rseg_mem_restore().
    • Clean up some undo page accessor functions · 0ead8d95
      Marko Mäkelä authored
      trx_undo_page_get_prev_rec(), trx_undo_page_get_last_rec(),
      trx_undo_page_get_first_rec(), trx_undo_page_get_start():
      Move to the only caller, trx0undo.cc.
      
      Add some const qualifiers.
    • Remove unnecessary function parameters · 648e8c12
      Marko Mäkelä authored
      trx_rseg_get_nth_undo(), trx_rsegf_undo_find_free():
      Add a const qualifier, and remove the unused parameter mtr_t*.
    • Simplify access to the TRX_SYS page · 8d1d38f9
      Marko Mäkelä authored
      trx_sysf_t: Remove.
      
      trx_sysf_get(): Return the TRX_SYS page, not a pointer within it.
      
      trx_sysf_rseg_get_space(), trx_sysf_rseg_get_page_no():
      Remove a parameter, and merge the declaration and definition.
      Take the TRX_SYS page as a parameter.
      
      TRX_SYS_N_RSEGS: Correct the comment.
      
      trx_sysf_rseg_find_free(), trx_sys_update_mysql_binlog_offset(),
      trx_sys_update_wsrep_checkpoint(): Take the TRX_SYS page as a parameter.
      
      trx_rseg_header_create(): Add a parameter for the TRX_SYS page.
      
      trx_sysf_rseg_set_space(), trx_sysf_rseg_set_page_no(): Remove;
      merge to the only caller, trx_rseg_header_create().
    • Avoid an assertion failure on aborted startup · 54c715ac
      Marko Mäkelä authored
      srv_init_abort_low(): Call srv_shutdown_bg_undo_sources() so that if
      startup aborts while creating InnoDB system tables, the shutdown will
      proceed correctly.
    • Fixed MDEV-14994 Assertion `join->best_read < double(1.79...15e+308L)' or · 7a9611ae
      Igor Babaev authored
      server crash in JOIN::fix_all_splittings_in_plan
      
      Cost formulas must take into account the case when a splittable table
      has no rows.
  2. 30 Jan, 2018 20 commits
    • Don't give warning about usage of --language with full path · a1e0e64a
      Monty authored
      Only give warning if warnings > 2, as there is no plan to change
      the current behavior.
    • Remove compiler warnings · f10fae7e
      Monty authored
    • Added some checking that LEX_CSTRING is \0 terminated · 486c86dd
      Monty authored
      - When adding LEX_CSTRING to String, we are now checking that the
        string is \0 terminated (as normally LEX_CSTRING should be
        usable for printf()). In cases where one wants to avoid the
        check, one can use String->append(ptr, length) instead of just
        String->append(LEX_CSTRING*)
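
      A rough sketch of such a check, using simplified stand-ins rather than
      the server's own classes: LexCString, is_nul_terminated, append_checked
      and append_raw are hypothetical names, and std::string takes the place
      of String.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical simplified counterpart of LEX_CSTRING: pointer plus length.
struct LexCString {
  const char* str;
  std::size_t length;
};

// True if the byte right after the counted characters is the terminator,
// which is what makes the value safe to pass to printf("%s").
bool is_nul_terminated(const LexCString& s) {
  return s.str != nullptr && s.str[s.length] == '\0';
}

// Checked variant: asserts the terminator before appending.
void append_checked(std::string& out, const LexCString& s) {
  assert(is_nul_terminated(s));
  out.append(s.str, s.length);
}

// Unchecked variant for callers that pass a slice of a longer buffer.
void append_raw(std::string& out, const char* ptr, std::size_t length) {
  out.append(ptr, length);
}
```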
    • Change C_STRING_WITH_LEN to STRING_WITH_LEN · f55dc7f7
      Monty authored
      This preserves const str for constant strings
      
      Other things
      - A few variables were changed from LEX_STRING to LEX_CSTRING
      - Incident_log_event::Incident_log_event and record_incident were
        changed to take LEX_CSTRING* as an argument instead of LEX_STRING
    • Removed not used functions and variables · 18e22cb6
      Monty authored
    • Added defines for mysqld_error_find_printf_error_used · bbe0055f
      Monty authored
      This is to make it easier to use the
      create_mysqld_error_find_printf_error tool to find wrong print
    • Renamed Item_user_var_as_out_param::name to org_name · 29fd049a
      Monty authored
      The rename was done because the old 'name' hid the original item name.
    • b9b17e63
    • Fixed wrong arguments to printf · a2393ff2
      Monty authored
    • Changed database, tablename and alias to be LEX_CSTRING · a7e352b5
      Monty authored
      This was done in, among other things:
      - thd->db and thd->db_length
      - TABLE_LIST tablename, db, alias and schema_name
      - Audit plugin database name
      - lex->db
      - All db and table names in Alter_table_ctx
      - st_select_lex db
      
      Other things:
      - Changed a lot of functions to take const LEX_CSTRING* as argument
        for db, table_name and alias. See init_one_table() as an example.
      - Changed some function arguments from LEX_CSTRING to const LEX_CSTRING
      - Changed some lists from LEX_STRING to LEX_CSTRING
      - threads_mysql.result changed because process list_db wasn't always
        correctly updated
      - New append_identifier() function that takes LEX_CSTRING* as arguments
      - Added new element tmp_buff to Alter_table_ctx to separate temp name
        handling from temporary space
      - Ensure we store the length after my_casedn_str() of table/db names
      - Removed not used version of rename_table_in_stat_tables()
      - Changed Natural_join_column::table_name and db_name() to never return
        NULL (used for print)
      - thd->get_db() now returns db as a printable string (thd->db.str or "")
    • Merge bb-10.2-ext into 10.3 · 921c5e93
      Marko Mäkelä authored
      MDEV-11415 Remove excessive undo logging during ALTER TABLE…ALGORITHM=COPY
      
      Move a test from innodb.rename_table_debug to innodb.alter_copy.
      
      ha_innobase::extra(HA_EXTRA_BEGIN_ALTER_COPY): Register id-versioned
      tables so that mysql.transaction_registry will be updated, even for
      empty tables that are subjected to ALTER TABLE…ALGORITHM=COPY.
    • Merge bb-10.2-ext into 10.3 · 33714d20
      Marko Mäkelä authored
    • Merge 10.2 into bb-10.2-ext · 0c1f2206
      Marko Mäkelä authored
    • MDEV-11415 Remove excessive undo logging during ALTER TABLE…ALGORITHM=COPY · 0ba6aaf0
      Marko Mäkelä authored
      If a crash occurs during ALTER TABLE…ALGORITHM=COPY, InnoDB would spend
      a lot of time rolling back writes to the intermediate copy of the table.
      To reduce the amount of busy work done, a work-around was introduced in
      commit fd069e2b in MySQL 4.1.8 and 5.0.2,
      to commit the transaction after every 10,000 inserted rows.
      
      A proper fix would have been to disable the undo logging altogether and
      to simply drop the intermediate copy of the table on subsequent server
      startup. This is what happens in MariaDB 10.3 with MDEV-14717, MDEV-14585.
      In MariaDB 10.2, the intermediate copy of the table would be left behind
      with a name starting with the string #sql.
      
      This is a backport of a bug fix from MySQL 8.0.0 to MariaDB,
      contributed by jixianliang <271365745@qq.com>.
      
      Unlike recent MySQL, MariaDB supports ALTER IGNORE. For that operation
      InnoDB must for now keep the undo logging enabled, so that the latest
      row can be rolled back in case of an error.
      
      In Galera cluster, the LOAD DATA statement will retain the existing
      behaviour and commit the transaction after every 10,000 rows if
      the parameter wsrep_load_data_splitting=ON is set. The logic to do
      so (the wsrep_load_data_split() function and the call
      handler::extra(HA_EXTRA_FAKE_START_STMT)) are joint work
      by Ji Xianliang and Marko Mäkelä.
      
      The original fix:
      
      Author: Thirunarayanan Balathandayuthapani <thirunarayanan.balathandayuth@oracle.com>
      Date:   Wed Dec 2 16:09:15 2015 +0530
      
      Bug#17479594 AVOID INTERMEDIATE COMMIT WHILE DOING ALTER TABLE ALGORITHM=COPY
      
      Problem:
      
      During ALTER TABLE, we commit and restart the transaction for every
      10,000 rows, so that the rollback after recovery would not take so long.
      
      Fix:
      
      Suppress the undo logging during copy alter operation. If fts_index is
      present then insert directly into fts auxiliary table rather
      than doing at commit time.
      
      ha_innobase::num_write_row: Remove the variable.
      
      ha_innobase::write_row(): Remove the hack for committing every 10000 rows.
      
      row_lock_table_for_mysql(): Remove the extra 2 parameters.
      
      lock_get_src_table(), lock_is_table_exclusive(): Remove.
      Reviewed-by: Marko Mäkelä <marko.makela@oracle.com>
      Reviewed-by: Shaohua Wang <shaohua.wang@oracle.com>
      Reviewed-by: Jon Olav Hauglid <jon.hauglid@oracle.com>
    • Merge 10.2 into bb-10.2-ext · 6d390bab
      Marko Mäkelä authored
    • MDEV-14875: galera_new_cluster crashes mysqld when existing server contains databases · 446b3d35
      Jan Lindström authored
      Fortify wsrep_hton so that wsrep calls are not done to NULL-pointers.
    • MDEV-14694 ALTER COLUMN IF EXISTS .. causes syntax error. · 926adcfe
      Alexey Botchkov authored
      Implementing the 'IF EXISTS' option for statements
      ALTER TABLE ALTER COLUMN SET/DROP DEFAULT.
    • Fixed failing tests · 5478547c
      Monty authored
      - Galera tests that were not updated with connection change
        messages
      - Test where out of memory error was changed (We are now using the
        standard out of memory error in most places)
      - Removed tokudb tests that use include files that don't exist
        in MariaDB
      - Removed not supported mariadb startup option from option file
    • Fix some wrong test result · cea431e1
      Monty authored
      - Galera tests that were not updated with connection change
        messages
      - Disabled some TokuDB tests that always timed out.
        These should be enabled again when we have an option to
        specify timeouts per test.
    • Fixed mdev-15017 Server crashes in st_join_table::fix_splitting · 775aa554
      Igor Babaev authored
      Do not apply splitting for constant tables.
  3. 29 Jan, 2018 3 commits