1. 22 Jul, 2014 3 commits
  2. 21 Jul, 2014 2 commits
  3. 18 Jul, 2014 2 commits
    • MDEV-6459 - max_relay_log_size and sql_slave_skip_counter misbehave on PPC64 · 54538b48
      Sergey Vojtovich authored
      
      There was a mix of ulong and uint casts/variables which caused an
      incorrect value to be passed to/retrieved from max_relay_log_size
      and sql_slave_skip_counter.
      
      This mix failed to work on big-endian PPC64 where sizeof(int)= 4,
      sizeof(long)= 8. E.g. session_var(thd, uint)= 1 will in fact store
      0x100000000.
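
The 0x100000000 figure can be reproduced on any machine by simulating the big-endian layout by hand. This is a minimal sketch, not the server's code: the helpers `put_be32`, `get_be64` and `store_uint_into_ulong_slot_be` are hypothetical names, and the 8-byte slot is assumed to be zero before the store.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The simulated slot (ulong on PPC64) is twice as wide as the store (uint). */
static_assert(sizeof(uint64_t) == 2 * sizeof(uint32_t), "unexpected type widths");

/* Hypothetical helper: lay out a 32-bit value in big-endian byte order. */
static void put_be32(unsigned char *dst, uint32_t v)
{
    dst[0] = (unsigned char)(v >> 24);
    dst[1] = (unsigned char)(v >> 16);
    dst[2] = (unsigned char)(v >> 8);
    dst[3] = (unsigned char)v;
}

/* Hypothetical helper: read 8 bytes as one big-endian 64-bit value. */
static uint64_t get_be64(const unsigned char *src)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | src[i];
    return v;
}

/*
 * Simulate writing `v` through a 4-byte access into a zeroed 8-byte slot.
 * The 4-byte store lands at the slot's lowest address, which on a
 * big-endian machine holds the most significant half of the 8-byte value.
 */
uint64_t store_uint_into_ulong_slot_be(uint32_t v)
{
    unsigned char slot[8];
    memset(slot, 0, sizeof slot);
    put_be32(slot, v);
    return get_be64(slot);
}
```

On a little-endian machine the same 4-byte store would land in the least significant half, which is why the mix of types went unnoticed there.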
    • MDEV-6450 - MariaDB crash on Power8 when built with advance tool chain · c0ebb3f3
      Sergey Vojtovich authored
      
      The InnoDB mutex_exit() function calls __sync_test_and_set() to release
      the lock. According to the manual, this function is supposed to create
      an "acquire" memory barrier, whereas in fact we need a "release" memory
      barrier at mutex_exit().

      The problem isn't repeatable with gcc because it creates an
      "acquire-release" memory barrier for __sync_test_and_set().
      ATC creates just an "acquire" barrier.
      
      Fixed by creating proper barrier at mutex_exit() by using
      __sync_lock_release() instead of __sync_test_and_set().
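
The pairing of barriers described above can be sketched with a toy spinlock. This is an illustration only and not InnoDB's mutex: `demo_mutex` and its functions are hypothetical names. GCC spells the acquire-side builtin `__sync_lock_test_and_set()`, and documents it as an acquire barrier, while `__sync_lock_release()` is documented as a release barrier that writes 0.

```c
#include <assert.h>

/* Hypothetical lock word: 0 = free, 1 = held. */
typedef struct {
    volatile int lock_word;
} demo_mutex;

void demo_mutex_enter(demo_mutex *m)
{
    /* Acquire side: atomically store 1 and spin while the old value was 1. */
    while (__sync_lock_test_and_set(&m->lock_word, 1))
        ;
}

void demo_mutex_exit(demo_mutex *m)
{
    assert(m->lock_word == 1); /* caller must hold the lock */
    /* Release side: stores 0 with release semantics, so writes made inside
       the critical section are visible before the lock appears free. */
    __sync_lock_release(&m->lock_word);
}

int demo_mutex_is_locked(const demo_mutex *m)
{
    return m->lock_word != 0;
}
```

Releasing with a test-and-set that writes 0 happens to work on compilers that emit a full barrier for it, but only the release builtin states the intended ordering.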
  4. 15 Jul, 2014 1 commit
  5. 11 Jul, 2014 3 commits
    • MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail. · 501c56ef
      Kristian Nielsen authored
      
      Merge the patches into MariaDB 10.0 main.
      
      With this patch, parallel replication will now automatically retry a
      transaction that fails due to deadlock or other temporary error, same as
      single-threaded replication.
      
      We catch deadlocks with InnoDB transactions due to enforced commit order. If
      T1 must commit before T2 in parallel replication and T1 ends up waiting for T2
      inside InnoDB, we kill T2 and retry it later to resolve the deadlock
      automatically.
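
The retry behaviour can be pictured as a bounded loop around the transaction. The sketch below is a simplified model, not the server's implementation: `apply_with_retry`, `txn_result`, `flaky_txn` and `flaky_apply` are hypothetical names, and the retry limit is simply passed in as a parameter.

```c
#include <assert.h>
#include <stddef.h>

enum txn_result { TXN_OK, TXN_TEMPORARY_ERROR, TXN_FATAL_ERROR };

/* Hypothetical callback that applies one transaction. */
typedef enum txn_result (*apply_fn)(void *ctx);

/*
 * Apply a transaction, retrying only on temporary errors (for example a
 * deadlock kill). Makes at most 1 + max_retries attempts and reports the
 * number of attempts through attempts_out when it is not NULL.
 */
enum txn_result apply_with_retry(apply_fn apply, void *ctx,
                                 int max_retries, int *attempts_out)
{
    assert(apply != NULL && max_retries >= 0);
    int attempts = 0;
    enum txn_result r;
    do {
        r = apply(ctx);
        attempts++;
    } while (r == TXN_TEMPORARY_ERROR && attempts <= max_retries);
    if (attempts_out != NULL)
        *attempts_out = attempts;
    return r;
}

/* Demo transaction that is "killed" a fixed number of times, then succeeds. */
struct flaky_txn {
    int failures_left;
};

enum txn_result flaky_apply(void *ctx)
{
    struct flaky_txn *t = ctx;
    if (t->failures_left > 0) {
        t->failures_left--;
        return TXN_TEMPORARY_ERROR;
    }
    return TXN_OK;
}
```

A fatal error ends the loop immediately, so only errors that a later attempt could plausibly avoid are retried.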
      
    • Fix test failure seen in buildbot on power8. · fd0abeca
      Kristian Nielsen authored
      GTID order in @@gtid_binlog_pos depends on internal hash order,
      so requires --replace_result for stable test output.
    • MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail. · e81ecc9c
      Kristian Nielsen authored
      
      Fix a bug discovered in Buildbot valgrind. The logic in checking for slave
      init thread completion was reversed, so, depending on thread scheduling,
      server startup could hang.

      Also add another variant of SSL valgrind suppression, needed for a different
      library version.
      
  6. 10 Jul, 2014 4 commits
  7. 09 Jul, 2014 1 commit
  8. 08 Jul, 2014 3 commits
    • Fix small merge errors after rebase · ba4e56d8
      Kristian Nielsen authored
    • MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail. · 92577cc0
      Kristian Nielsen authored
      
      Fix small (but nasty) typo.
    • MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail. · 98fc5b3a
      Kristian Nielsen authored
      
      After-review changes.
      
      For this patch in 10.0, we do not introduce a new public storage engine API,
      we just fix the InnoDB/XtraDB issues. In 10.1, we will make a better public
      API that can be used for all storage engines (MDEV-6429).
      
      Eliminate the background thread that did deadlock kills asynchronously.
      Instead, we ensure that the InnoDB/XtraDB code can handle doing the kill from
      inside the deadlock detection code (when thd_report_wait_for() needs to kill a
      later thread to resolve a deadlock).
      
      (We preserve the part of the original patch that introduces a dedicated mutex
      and condition for the slave init thread, to remove the abuse of
      LOCK_thread_count for start/stop synchronisation of the slave init thread.)
      
  9. 04 Jul, 2014 1 commit
  10. 25 Jun, 2014 1 commit
    • MDEV-4937: sql_slave_skip_counter does not work with GTID · 9150a0c7
      Kristian Nielsen authored
      The sql_slave_skip_counter is important to be able to recover replication from
      certain errors. Often, an appropriate solution is to set
      sql_slave_skip_counter to skip over a problem event. But setting
      sql_slave_skip_counter produced an error in GTID mode, with a suggestion to
      instead set @@gtid_slave_pos to point past the problem event. This however is
      not always possible; for example, in case of an INCIDENT event, that event
      does not have any GTID to assign to @@gtid_slave_pos.
      
      With this patch, sql_slave_skip_counter now works in GTID mode the same way as
      in non-GTID mode. When set, that many initial events are skipped when the SQL
      thread starts, plus as many extra events as are needed to completely skip any
      partially skipped event group. The GTID position is updated to point past the
      skipped event(s).
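
The skipping rule (skip N events, then finish the event group that the last skipped event belongs to) can be modelled in a few lines. This is a deliberately simplified sketch: `demo_event`, its `ends_group` flag and `first_event_to_apply` are hypothetical, and real binlog events and GTID bookkeeping carry far more state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event: only records whether it closes its event group. */
struct demo_event {
    int ends_group;
};

/*
 * Return the index of the first event that should be applied after
 * skipping `skip_counter` events from the start of `ev[0..n)`.
 * If the last skipped event sits in the middle of a group, the rest
 * of that group is skipped as well.
 */
size_t first_event_to_apply(const struct demo_event *ev, size_t n,
                            size_t skip_counter)
{
    assert(ev != NULL || n == 0);
    size_t i = 0;

    /* Skip the requested number of events. */
    while (i < n && skip_counter > 0) {
        i++;
        skip_counter--;
    }

    /* Finish any partially skipped group. */
    if (i > 0) {
        while (i < n && !ev[i - 1].ends_group)
            i++;
    }
    return i;
}
```

With three groups of 3, 2 and 1 events, a counter of 1 skips the whole first group, and a counter of 4 skips the first two groups.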
  11. 09 Jul, 2014 4 commits
  12. 08 Jul, 2014 8 commits
  13. 07 Jul, 2014 1 commit
    • MDEV-6120: When slave stops with error, error message should indicate the failing GTID · 2b4b857d
      Kristian Nielsen authored
      Follow-up patch. The original patch added an extra argument to the
      rli->report() function, but the calls were not adjusted accordingly
      in a few places.
      
      This patch updates the remaining calls as needed. In files log_event_old.cc
      and rpl_record_old.cc, it just adds NULL, since this is only for old event
      formats from ancient master servers, which would not have any GTID information
      to add to the error messages in any case.
  14. 04 Jul, 2014 2 commits
  15. 03 Jul, 2014 1 commit
  16. 30 Jun, 2014 2 commits
    • MDEV-6073 Merge gis test cases from 5.6. · 80a02037
      Alexey Botchkov authored
      Tests were merged.
      As the implementation is different, the 'internal debugging' part
      was not merged; only a stub for it was created.
    • Fix test failures in rpl.rpl_checksum and rpl.rpl_gtid_errorlog. · 439f75f8
      Kristian Nielsen authored
      These tests use search_pattern_in_file.inc to search the error log for
      expected output. However, search_pattern_in_file.inc by default searched only
      the first 50000 bytes, so if the error log grew too big the tests would fail.
      
      This patch extends search_pattern_in_file.inc with an option to specify how
      much of the file to search, and whether to search from the start of the file
      or from the end. Then the rpl.rpl_checksum and rpl.rpl_gtid_errorlog test
      cases are fixed to search the last 50000 bytes of the error log, which will
      work no matter how large prior tests have made it.
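
The idea of searching only the tail of a growing log can be sketched as follows. This is not search_pattern_in_file.inc itself (that is a mysqltest include file): the C function `search_file_tail` is a hypothetical stand-in that covers only the search-from-the-end case and does plain substring matching rather than pattern matching.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look for `pattern` in the last `max_bytes` bytes of the file at `path`.
 * Returns 1 if found, 0 if not found, -1 on I/O or allocation failure.
 */
int search_file_tail(const char *path, long max_bytes, const char *pattern)
{
    assert(path != NULL && pattern != NULL && max_bytes > 0);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    /* Find the file size, then the offset where the tail window starts. */
    if (fseek(f, 0, SEEK_END) != 0) {
        fclose(f);
        return -1;
    }
    long size = ftell(f);
    if (size < 0) {
        fclose(f);
        return -1;
    }
    long start = size > max_bytes ? size - max_bytes : 0;
    long len = size - start;

    char *buf = malloc((size_t)len + 1);
    if (buf == NULL || fseek(f, start, SEEK_SET) != 0) {
        free(buf);
        fclose(f);
        return -1;
    }

    size_t got = fread(buf, 1, (size_t)len, f);
    fclose(f);
    buf[got] = '\0'; /* terminate so strstr() can scan the window */

    int found = strstr(buf, pattern) != NULL;
    free(buf);
    return found;
}
```

Because the window is anchored at the end of the file, output from earlier tests can push old lines out of range without affecting the most recent ones.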
  17. 27 Jun, 2014 1 commit
    • MDEV-6386: Assertion `thd->transaction.stmt.is_empty() || thd->in_sub_stmt || (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)' fails with parallel replication · 370318f8
      Kristian Nielsen authored
      
      The direct cause of the assertion was missing error handling in
      record_gtid(). If ha_commit_trans() failed for the statement commit, there
      was no code to catch the error and call ha_rollback_trans(); this
      caused close_thread_tables() to assert.
      
      Normally, this error case is not hit, but in this case it was triggered due to
      another bug: When a transaction T1 fails during parallel replication, the code
      would signal following transactions that they could start to run without
      properly marking the error condition. This caused subsequent transactions to
      incorrectly start replicating, only to get an error later during their own
      commit step. This was particularly serious if the subsequent transactions were
      DDL or MyISAM updates, which cannot be rolled back and would leave replication
      in an inconsistent state.
      
      Fixed by 1) in case of error, only signal following transactions to continue
      once the error has been properly marked and those transactions will know not
      to start; and 2) implement proper error handling in record_gtid() in the case
      that statement commit fails.
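
The second part of the fix follows a common pattern: a failed commit must be followed by an explicit rollback so that no half-finished statement state is left behind. The sketch below only illustrates that pattern. `demo_trx`, `demo_commit`, `demo_rollback` and `demo_record_position` are hypothetical stand-ins and do not reflect the signatures of ha_commit_trans() or ha_rollback_trans().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical transaction state used only for this illustration. */
struct demo_trx {
    int commit_should_fail; /* test knob: force the commit to fail */
    int committed;
    int rolled_back;
};

/* Stand-in for the commit step: returns 0 on success, nonzero on error. */
static int demo_commit(struct demo_trx *trx)
{
    if (trx->commit_should_fail)
        return 1;
    trx->committed = 1;
    return 0;
}

/* Stand-in for the rollback step. */
static void demo_rollback(struct demo_trx *trx)
{
    trx->rolled_back = 1;
}

/*
 * Commit the statement; if that fails, roll back before returning the
 * error so the caller never sees a statement left open.
 */
int demo_record_position(struct demo_trx *trx)
{
    assert(trx != NULL);
    int err = demo_commit(trx);
    if (err)
        demo_rollback(trx);
    return err;
}
```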
      