1. 30 Sep, 2009 6 commits
    • Luis Soares's avatar
      Partial backport for BUG#41399, more precisely, the changes to · 266d53b5
      Luis Soares authored
      wait_until_disconnected.inc.
      266d53b5
    • Alfranio Correia's avatar
      BUG#43075 rpl.rpl_sync fails sporadically on pushbuild · 9682ff8a
      Alfranio Correia authored
      NOTE: Backporting the patch to next-mr.
            
      The slave was crashing while failing to execute the init_slave() function.
            
      The issue stems from two different reasons:
            
      1 - A failure while allocating the master info structure generated a
          segfault due to a NULL pointer.
            
      2 - A failure while recovering generated a segfault due to a non-initialized
          relay log file. In other words, the mi->init and rli->init were both set to true
          before executing the recovery process thus creating an inconsistent state as the
          relay log file was not initialized.
            
      To circumvent such problems, we refactored the recovery process, which is now executed
      while initializing the relay log. The master info structure is guaranteed to be
      created before it is accessed, and any error is propagated, so mi->init and
      rli->init are not set to true when, for instance, the relay log is not initialized
      or the relay info is not flushed.
            
      The changes related to the refactoring are described below:
            
      1 - Removed call to init_recovery from init_slave.
            
      2 - Changed the signature of the function init_recovery.
            
      3 - Removed flushes. They are called while initializing the relay log and master
          info.
            
      4 - Made sure that if the relay info is not flushed, mi->init and rli->init are not
          set to true.
            
      In this patch, we also replaced the exit(1) in the fault injection by DBUG_ABORT()
      to make it compliant with the code guidelines.
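
      The error propagation described above can be sketched as follows. This is a
      minimal illustration, not the patch itself: MasterInfo, RelayLogInfo, Faults and
      init_slave_sketch are hypothetical names, and the fault switches exist only so
      that the failure paths can be exercised.

```cpp
#include <memory>

// Hypothetical stand-ins for the master info and relay log info structures.
struct MasterInfo { bool inited = false; };
struct RelayLogInfo { bool inited = false; bool relay_log_open = false; };

// Hypothetical fault switches used to simulate each failure.
struct Faults {
  bool alloc_fails = false;
  bool relay_log_fails = false;
  bool flush_fails = false;
};

// Returns 0 on success and a non-zero code on error. The init flags are
// set only after every step, including the flush, has succeeded.
int init_slave_sketch(const Faults &f, std::unique_ptr<MasterInfo> &mi,
                      RelayLogInfo &rli)
{
  if (!f.alloc_fails)
    mi = std::make_unique<MasterInfo>();
  if (!mi)
    return 1;  // never dereference a missing master info structure

  if (f.relay_log_fails)
    return 2;  // relay log not initialized, so recovery must not run
  rli.relay_log_open = true;

  // Recovery would run here, as part of relay log initialization.

  if (f.flush_fails)
    return 3;  // relay info not flushed, so leave both flags false

  mi->inited = true;
  rli.inited = true;
  return 0;
}
```

      Each early return leaves both flags false, so a caller that checks only the
      return value never sees a half-initialized state.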
      9682ff8a
    • Luis Soares's avatar
      BUG#47749: rpl_slave_skip fails sporadically on PB2 (mysql-5.1-rep+2 tree). · 9581d628
      Luis Soares authored
      rpl_slave_skip fails randomly on PB2. This patch fixes the failure by
      setting an explicit wait for the SQL thread to stop, instead of using the
      wait_for_slave_to_stop mysqltest command, after a start until command
      is executed.
      9581d628
    • Alfranio Correia's avatar
      BUG#47741 rpl_ndb_extraCol fails in next-mr (mysql-5.1-rep+2) in RBR · 2108eea5
      Alfranio Correia authored
      This is a temporary fix.
      
      NOTE: Backporting the patch to next-mr.
      2108eea5
    • Alfranio Correia's avatar
      Post-fix for BUG#43789 · 47599b5d
      Alfranio Correia authored
      NOTE: Backporting the patch to next-mr.
      47599b5d
  2. 29 Sep, 2009 11 commits
    • Alfranio Correia's avatar
      BUG#40337 Fsyncing master and relay log to disk after every event is too slow · a48ff220
      Alfranio Correia authored
      NOTE: Backporting the patch to next-mr.
            
      The fix proposed in BUG#35542 and BUG#31665 introduces a performance issue
      when fsyncing master.info, relay-log.info and relay-log.bin* after every #th event.
      Although this solution was proposed to reduce the probability of corrupted
      files after a slave crash, the performance penalty it introduces has
      made the approach impractical for highly intensive workloads.
            
      In a nutshell, the option --sync-relay-log proposed in BUG#35542 and BUG#31665
      simultaneously fsyncs master.info, relay-log.info and relay-log.bin*, and
      this is the main source of performance issues.
            
      This patch introduces new options that give more control to the user on
      what should be fsynced and how often:
            
         1) (--sync-master-info, integer) which syncs master.info after every #th event;
         2) (--sync-relay-log, integer) which syncs relay-log.bin* after every #th
         event;
         3) (--sync-relay-log-info, integer) which syncs relay-log.info after every #th
         transaction.
            
         To provide both performance and increased reliability, we recommend the following
         setup:
            
         1) --sync-master-info = 0 eventually the operating system will fsync it;
         2) --sync-relay-log = 0 eventually the operating system will fsync it;
         3) --sync-relay-log-info = 1 fsyncs it after every transaction;
            
      Note that the previous setup does not reduce the probability of a
      corrupted master.info or relay-log.bin*. To overcome the issue, this patch also
      introduces a recovery mechanism that, right after restart, throws away the
      relay-log.bin* files retrieved from the master and updates master.info based on
      relay-log.info:
            
            
         4) (--relay-log-recovery, boolean) which enables a recovery mechanism that
         throws away relay-log.bin* after a crash.
            
      However, it can only recover an incorrect binlog file and position in master.info.
      If other information (host, port, password, etc.) is corrupted or incorrect,
      this recovery mechanism will fail to work.
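
      The "sync after every #th event" behaviour of the new options can be sketched
      with a small counter. SyncPolicy and should_sync are hypothetical names chosen
      for this illustration; the real server code is organized differently.

```cpp
#include <cstdint>

// Hypothetical helper: decides when a file should be fsynced, given a
// sync period such as the value passed to --sync-relay-log-info.
class SyncPolicy {
public:
  explicit SyncPolicy(std::uint32_t period) : period_(period), pending_(0) {}

  // Call once per event (or per transaction). Returns true when the
  // caller should fsync now. A period of 0 never requests a sync and
  // leaves flushing to the operating system.
  bool should_sync() {
    if (period_ == 0)
      return false;
    if (++pending_ < period_)
      return false;
    pending_ = 0;  // start counting towards the next sync
    return true;
  }

private:
  std::uint32_t period_;
  std::uint32_t pending_;
};
```

      With the recommended setup, the policies for master.info and relay-log.bin*
      would use a period of 0, and the policy for the relay info file a period of 1.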
      a48ff220
    • Alfranio Correia's avatar
      BUG#35542 Add option to sync master and relay log to disk after every event · 4e0cb6db
      Alfranio Correia authored
      BUG#31665 sync_binlog should cause relay logs to be synchronized
      
      NOTE: Backporting the patch to next-mr.
            
      Add the sync_relay_log option to the server. This option works for the relay
      log the same way the sync_binlog option works for the binlog. It also
      synchronizes master info to disk when set to a non-zero value.
                  
      Original patches from Sinisa and Mark, with some modifications
      4e0cb6db
    • Alfranio Correia's avatar
      BUG#43789 different master/slave table defs cause crash: text/varchar null · 63278c56
      Alfranio Correia authored
                vs not null
      
      NOTE: Backporting the patch to next-mr.
                              
      Replication was generating corrupted data, producing warning messages on Valgrind,
      and aborting in debug mode while replicating a "null" into a "not null" field.
      Specifically, the unpack_row routine considered the slave's table
      definition and tried to retrieve a field value where there was nothing to be
      retrieved, ignoring the fact that the value was defined as "null" by the master.
                              
      To fix the problem, we proceed as follows:
                              
      1 - If sql_mode is not STRICT, implicit default values are used, regardless
      of whether it is a multi-row or single-row statement.
                              
      2 - However, if sql_mode is STRICT, we do the following:
                              
      2.1 If it is a transactional engine, we do a rollback on the first NULL that is
      to be set into a NOT NULL column and return an error.
                              
      2.2 If it is a non-transactional engine and it is the first row to be inserted
      by a multi-row statement, we also return the error. Otherwise, we proceed with
      the execution, use implicit default values and print out warning messages.
                        
      Unfortunately, the current patch cannot mimic the behavior shown by the master
      for multi-table updates and multi-row inserts. This happens because such
      statements are unfolded into different row events. For instance, consider the
      following updates in strict mode:
                        
      (master)
      create table t1 (a int);
      create table t2 (a int not null);
      insert into t1 values (1);
      insert into t2 values (2);
      update t1, t2 SET t1.a=10, t2.a=NULL;
                        
      t1 would have (10) and t2 would have (0) as this would be handled as a
      multi-row update. On the other hand, if we had the following updates:
                        
      (master)
      create table t1 (a int);
      create table t2 (a int);
                        
      (slave)
      create table t1 (a int);
      create table t2 (a int not null);
                        
      (master)
      insert into t1 values (1);
      insert into t2 values (2);
      update t1, t2 SET t1.a=10, t2.a=NULL;
                        
      On the master, t1 would have (10) and t2 would have (NULL). On
      the slave, t1 would have (10) but the update on t2 would fail.
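
      Rules 1, 2.1 and 2.2 above amount to a small decision function. The sketch
      below only restates those rules; NullAction and on_null_into_not_null are
      hypothetical names and do not appear in the patch.

```cpp
// What the slave does when a NULL from the master targets a NOT NULL column.
enum class NullAction {
  UseImplicitDefault,  // rule 1
  RollbackAndError,    // rule 2.1
  Error,               // rule 2.2, first row
  UseDefaultAndWarn    // rule 2.2, later rows
};

NullAction on_null_into_not_null(bool strict_mode, bool transactional_engine,
                                 bool first_row)
{
  if (!strict_mode)
    return NullAction::UseImplicitDefault;
  if (transactional_engine)
    return NullAction::RollbackAndError;
  if (first_row)
    return NullAction::Error;
  return NullAction::UseDefaultAndWarn;
}
```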
      63278c56
    • Luis Soares's avatar
      BUG#42928: binlog-format setting prevents server from start if binary · a11e2137
      Luis Soares authored
      logging is disabled
            
      NOTE: this is the backport to next-mr.
                  
      If one sets binlog-format but does NOT enable the binary log, the server
      refuses to start up. The following messages appear in the error log:
                  
      090217 12:47:14 [ERROR] You need to use --log-bin to make
      --binlog-format work.  
      090217 12:47:14 [ERROR] Aborting
                  
      This patch addresses this by making the server not bail out if
      binlog-format is set without the log-bin option. Additionally, the
      specified binlog-format is stored in the global system variable
      "binlog_format", and a warning is printed instead of an error.
      a11e2137
    • Luis Soares's avatar
      BUG#40611: MySQL cannot make a binary log after sequential number · bdacc562
      Luis Soares authored
      beyond unsigned long.
      BUG#44779: binlog.binlog_max_extension may be causing failure on 
      next test in PB
            
      NOTE1: this is the backport to next-mr.
      NOTE2: already includes patch for BUG#44779.
            
      Binlog file extensions would turn into negative numbers once the
      variable used to hold the value reached the maximum for a signed
      long. Consequently, incrementing the value to the next (negative) number
      would lead to a .000000 extension, causing the server to fail.
                        
      This patch addresses this issue by not allowing negative extensions
      and by returning an error from find_uniq_filename when the limit is
      reached. Additionally, warnings are printed to the error log when the
      limit is approaching. FLUSH LOGS will also report warnings to the
      user if the extension number has reached the limit. The limit has been
      set to 0x7FFFFFFF.
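
      The limit check can be sketched as below. Only the 0x7FFFFFFF limit comes
      from the message above; next_extension, ExtStatus and the warning margin of
      1000 are assumptions made for this illustration, not the logic of
      find_uniq_filename.

```cpp
#include <cstdint>

constexpr std::uint32_t kMaxLogExtension = 0x7FFFFFFF;  // limit from the commit message
constexpr std::uint32_t kWarnMargin = 1000;             // assumed margin, not from the patch

enum class ExtStatus { Ok, NearLimit, LimitReached };

// Computes the extension that follows `current`. On LimitReached, `next`
// is left untouched and the caller is expected to report an error.
ExtStatus next_extension(std::uint32_t current, std::uint32_t &next)
{
  if (current >= kMaxLogExtension)
    return ExtStatus::LimitReached;
  next = current + 1;
  if (kMaxLogExtension - next < kWarnMargin)
    return ExtStatus::NearLimit;  // caller would print a warning
  return ExtStatus::Ok;
}
```

      Using an unsigned counter with an explicit upper bound means the value can
      never wrap into a negative extension.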
      bdacc562
    • Luis Soares's avatar
      Bug #30703 SHOW STATUS LIKE 'Slave_running' is not compatible with `SHOW SLAVE · ca151daf
      Luis Soares authored
      STATUS'
            
      NOTE: this is the backport to next-mr.
                  
      The SHOW STATUS LIKE 'Slave_running' command assumes that if
      active_mi->slave_running != 0, the I/O thread is running normally.
      This is not the case. When an error makes the I/O thread try to
      reconnect to the master, it enters a transitional state
      (MYSQL_SLAVE_RUN_NOT_CONNECT == 1), which is also not equal to 0.
      "SHOW SLAVE STATUS", in contrast, considers the I/O thread to be running
      only if active_mi->slave_running == MYSQL_SLAVE_RUN_CONNECT,
      so "SHOW SLAVE STATUS" gets the correct result.
                  
                  
      Fixed by making SHOW STATUS LIKE 'Slave_running' use the same check
      as "SHOW SLAVE STATUS". It now reports the I/O thread as running
      only when active_mi->slave_running == MYSQL_SLAVE_RUN_CONNECT.
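
      The difference between the two checks can be shown in a few lines. The
      message above only gives MYSQL_SLAVE_RUN_NOT_CONNECT == 1; the values 0 and 2
      for the other two states, and the function names, are assumptions of this
      sketch.

```cpp
// I/O thread states. Only the value 1 is stated in the commit message;
// the other two values are assumed here.
enum SlaveRunState {
  MYSQL_SLAVE_NOT_RUN = 0,
  MYSQL_SLAVE_RUN_NOT_CONNECT = 1,
  MYSQL_SLAVE_RUN_CONNECT = 2
};

// Old check: treats the transitional reconnecting state as "running".
bool io_running_old(int slave_running)
{
  return slave_running != 0;
}

// Fixed check: the same condition that SHOW SLAVE STATUS uses.
bool io_running_fixed(int slave_running)
{
  return slave_running == MYSQL_SLAVE_RUN_CONNECT;
}
```

      The two functions disagree only for MYSQL_SLAVE_RUN_NOT_CONNECT, which is
      exactly the case the bug report describes.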
      ca151daf
    • Luis Soares's avatar
      BUG#28796: CHANGE MASTER TO MASTER_HOST="" leads to invalid master.info · 19ac2627
      Luis Soares authored
                    
      NOTE: this is the backport to next-mr.
                      
      This patch addresses the reported bug by checking whether the
      host argument is an empty string. If it is empty, an error is
      reported to the client; otherwise execution continues normally.
                             
      This commit is based on the originally proposed patch and adds 
      a test case as requested during review as well as refines comments, 
      and makes test case result file less verbose (compared to previous patch).
      19ac2627
    • Luis Soares's avatar
      BUG#23300: Slow query log on slave does not log slow replicated statements · 7cf99622
      Luis Soares authored
            
      NOTE: this is the backport to next-mr.
            
      When using replication, the slave does not write queries replicated from
      the master to the slow query log, even if the option "--log-slow-slave-statements"
      is set and the queries take more than "long_query_time" to execute.
                    
      In order to log slow queries in the replication thread, one needs to set
      --log-slow-slave-statements, so that the SQL thread is initialized with the
      correct switch. Although setting this flag correctly configures the slave
      thread option to log slow queries, there is an issue with the condition that
      is used to check whether to log the slow query or not. When replaying binlog
      events, the statement contains the SET TIMESTAMP clause, which forces the
      slow logging condition check to fail. Consequently, the slow query logging
      does not take place.
                    
      This patch addresses this issue by removing the second condition from
      log_slow_statements, as it prevents slow queries from being logged and seems
      to be obsolete.
      7cf99622
    • Alfranio Correia's avatar
      BUG#38173 Field doesn't have a default value with row-based replication · bbc830f7
      Alfranio Correia authored
      NOTE: Backporting the patch to next-mr.
            
      The bug was caused by slave behaviour that was incompatible with the master's.
      On the master, an INSERT query is allowed to insert into a table without
      specifying values for DEFAULT-less fields if sql_mode is not strict.
                  
      Fixed by having the SQL thread check sql_mode to decide how to react.
      A non-strict sql_mode should allow a Write_rows event to complete.
                  
      todo: warnings can be shown via show slave status; still, how to show
      warnings for the slave threads is a separate, rather general issue.
      bbc830f7
    • Alfranio Correia's avatar
      WL#4828 and BUG#45747 · 5280dc82
      Alfranio Correia authored
      NOTE: Backporting the patch to next-mr.
      
      WL#4828 Augment DBUG_ENTER/DBUG_EXIT to crash MySQL in different functions
      -------
      
      Assessing the replication code in the presence of faults is extremely
      important for increasing reliability. In particular, one needs to know whether
      servers will either recover correctly or print out appropriate error messages,
      thus avoiding unexpected problems in a production environment.
      
      To accomplish this, the current patch refactors the debug macros
      already provided in the source code and introduces three new macros that
      allow injecting faults, specifically crashes, while entering or exiting a
      function or method. For instance, to crash a server while returning from
      the init_slave function (see module sql/slave.cc), one needs to do the
      following:
      
      1 - Modify the source replacing DBUG_RETURN by DBUG_CRASH_RETURN;
      
        DBUG_CRASH_RETURN(0);
      
      2 - Use the debug variable to activate dbug instructions:
      
        SET SESSION debug="+d,init_slave_crash_return";
      
      The new macros are briefly described below:
      
      DBUG_CRASH_ENTER (function) is equivalent to DBUG_ENTER, which registers the
      beginning of a function, but in addition it allows crashing the server
      while entering the function if the appropriate dbug instruction is activated.
      In this case, the dbug instruction should be "+d,function_crash_enter".
      
      DBUG_CRASH_RETURN (value) is equivalent to DBUG_RETURN, which notifies the
      end of a function, but in addition it allows crashing the server
      while returning from the function if the appropriate dbug instruction is
      activated. In this case, the dbug instruction should be
      "+d,function_crash_return". Note that "function" should be the same string
      used by either DBUG_ENTER or DBUG_CRASH_ENTER.
      
      DBUG_CRASH_VOID_RETURN (value) is equivalent to DBUG_VOID_RETURN, which
      notifies the end of a function, but in addition it allows crashing
      the server while returning from the function if the appropriate dbug
      instruction is activated. In this case, the dbug instruction should be
      "+d,function_crash_return". Note that "function" should be the same string
      used by either DBUG_ENTER or DBUG_CRASH_ENTER.
      
      To inject other faults, for instance, wrong return values, one should rely
      on the macros already available. The current patch also removes a set of
      macros that were either not being used or were redundant as other macros
      could be used to provide the same feature. In the future, we also consider
      dynamic instrumentation of the code.
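
      The keyword naming scheme described above ("function_crash_enter",
      "function_crash_return") can be illustrated with a self-contained sketch.
      None of this is the dbug package: the SKETCH_* macros, active_keywords and
      InjectedCrash are hypothetical, and an exception stands in for the real
      crash so that the behaviour can be observed safely.

```cpp
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical registry of active fault keywords, playing the role of
// the keywords enabled with "+d,...".
inline std::set<std::string> &active_keywords()
{
  static std::set<std::string> keywords;
  return keywords;
}

// Stand-in for a server crash.
struct InjectedCrash : std::runtime_error {
  using std::runtime_error::runtime_error;
};

inline void crash_if_active(const std::string &keyword)
{
  if (active_keywords().count(keyword))
    throw InjectedCrash(keyword);
}

// Records the function name and checks "<function>_crash_enter".
#define SKETCH_CRASH_ENTER(func)            \
  const char *sketch_func_name = (func);    \
  crash_if_active(std::string(sketch_func_name) + "_crash_enter")

// Checks "<function>_crash_return" before returning the value.
#define SKETCH_CRASH_RETURN(val)                                        \
  do {                                                                  \
    crash_if_active(std::string(sketch_func_name) + "_crash_return");   \
    return (val);                                                       \
  } while (0)

int init_slave_demo()
{
  SKETCH_CRASH_ENTER("init_slave");
  SKETCH_CRASH_RETURN(0);
}
```

      With no keyword active, init_slave_demo() returns 0. After inserting
      "init_slave_crash_return" into active_keywords(), the same call raises
      InjectedCrash on its way out.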
      
      
      BUG#45747 DBUG_CRASH_* is not setting the strict option
      ---------
            
      When combining DBUG_CRASH_* with "--debug=d:t:i:A,file", the server crashes
      due to a call to the abort function in the DBUG_CRASH_* macro, although the
      appropriate keyword has not been set.
      5280dc82
    • Alfranio Correia's avatar
      BUG#44663 Unused replication options prevent server from starting. · f940fe75
      Alfranio Correia authored
      NOTE: Backporting the patch to next-mr.
                  
      The use of option log_slave_updates without log_bin was preventing the server
      from starting. To fix the problem, we replaced the error message and the exit
      call by a warning message.
      f940fe75
  3. 28 Sep, 2009 3 commits
  4. 25 Sep, 2009 1 commit
    • Mats Kindahl's avatar
      Bug #47645: Segmentation fault when out of memory during handlerton initialization · 14cf09c1
      Mats Kindahl authored
      There is a missing check for memory allocation failure when allocating
      memory for the handlerton structure. If the handlerton init function
      tries to de-reference the pointer, it will cause a segmentation fault
      and crash the server.
      
      This patch fixes the problem by not calling the init function if memory
      allocation failed, and instead prints an informative error message and
      reports the error to the caller.
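
      The missing check can be sketched as follows. Handlerton, InitFn and
      initialize_plugin_sketch are hypothetical simplifications, and the
      simulate_oom flag exists only to exercise the failure path.

```cpp
#include <cstdio>
#include <new>

struct Handlerton { int state = 0; };  // hypothetical stand-in

using InitFn = int (*)(Handlerton *);

// Returns 0 on success and 1 on failure. On failure `out` is set to null.
int initialize_plugin_sketch(InitFn init, bool simulate_oom, Handlerton *&out)
{
  Handlerton *hton = simulate_oom ? nullptr : new (std::nothrow) Handlerton();
  if (hton == nullptr) {
    // Report the problem and return without calling init on a null pointer.
    std::fprintf(stderr, "Unable to allocate memory for the handlerton structure\n");
    out = nullptr;
    return 1;
  }
  if (init(hton) != 0) {
    delete hton;
    out = nullptr;
    return 1;
  }
  out = hton;
  return 0;
}

// Example init function that dereferences its argument, as a real one may.
int demo_init(Handlerton *h)
{
  h->state = 1;
  return 0;
}
```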
      14cf09c1
  5. 23 Sep, 2009 3 commits
    • Mats Kindahl's avatar
      WL#5016: Fix header file include guards · 4ad8ef06
      Mats Kindahl authored
                  
      Adding include guards to header files that are missing them.
      4ad8ef06
    • Mats Kindahl's avatar
      Bug #37221: SET AUTOCOMMIT=1 does not commit binary log · 8f35f7c9
      Mats Kindahl authored
      When setting AUTOCOMMIT=1 after starting a transaction, the binary log
      did not commit the outstanding transaction. The reason was that the binary
      log commit function saw the values of the new settings and decided that there
      was nothing to commit.
      
      Fixed the problem by moving the implicit commit to before the thread option
      flags were changed, so that the binary log sees the old values of the flags
      instead of the values they will take after the statement.
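
      The ordering problem can be illustrated with a deliberately simplified
      model. Session, binlog_commit and the two set_autocommit_on_* functions are
      hypothetical, and the single autocommit flag stands in for the thread
      option flags mentioned above.

```cpp
#include <string>
#include <vector>

struct Session {
  bool autocommit = true;
  std::vector<std::string> pending;  // statements of the open transaction
  std::vector<std::string> binlog;   // what reached the binary log
};

// Simplified commit: it consults the session flag to decide whether a
// multi-statement transaction is outstanding.
void binlog_commit(Session &s)
{
  if (s.autocommit)
    return;  // the flag says there is nothing to commit
  for (const std::string &stmt : s.pending)
    s.binlog.push_back(stmt);
  s.binlog.push_back("COMMIT");
  s.pending.clear();
}

// Old order: the flag changes first, so the commit sees the new value.
void set_autocommit_on_buggy(Session &s)
{
  s.autocommit = true;
  binlog_commit(s);
}

// Fixed order: the implicit commit runs while the old value is visible.
void set_autocommit_on_fixed(Session &s)
{
  binlog_commit(s);
  s.autocommit = true;
}
```

      In this model the buggy order leaves the pending statements out of the
      binary log, while the fixed order writes them followed by COMMIT.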
      8f35f7c9
    • Mats Kindahl's avatar
      BUG#29288: myisam transactions replicated to a transactional · 5661e726
      Mats Kindahl authored
      slave leaves slave unstable
      
      Problem: when replicating from non-transactional to
      transactional engine with autocommit off, no BEGIN/COMMIT
      is written to the binlog. When the slave replicates, it
      will start a transaction that never ends.
      
      Fix: Force autocommit=on on slave by always replicating
      autocommit=1 from the master.
      5661e726
  6. 03 Sep, 2009 6 commits
  7. 02 Sep, 2009 10 commits