    BUG#40337 Fsyncing master and relay log to disk after every event is too slow · a48ff220
    Alfranio Correia authored
    NOTE: Backporting the patch to next-mr.
          
    The fix proposed in BUG#35542 and BUG#31665 introduces a performance issue
    by fsyncing master.info, relay-log.info and relay-log.bin* after every Nth event.
    Although that solution was proposed to reduce the probability of corrupted
    files after a slave crash, the performance penalty it introduces has
    made the approach impractical for highly intensive workloads.
          
    In a nutshell, the option --sync-relay-log proposed in BUG#35542 and BUG#31665
    fsyncs master.info, relay-log.info and relay-log.bin* all at the same time, and
    this is the main source of the performance issue.
          
    This patch introduces new options that give the user more control over
    what is fsynced and how often:

       1) (--sync-master-info, integer) which syncs master.info after every Nth
       event;
       2) (--sync-relay-log, integer) which syncs relay-log.bin* after every Nth
       event;
       3) (--sync-relay-log-info, integer) which syncs relay-log.info after every
       Nth transaction.
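
    The "sync after every Nth event" behaviour of these options can be
    illustrated with a minimal sketch. It is not the server code: the class
    PeriodicSyncFile and the helper count_fsyncs are hypothetical names used
    only for illustration, and a period of 0 is assumed to mean "never fsync
    explicitly, leave it to the operating system".

```python
import os
import tempfile


class PeriodicSyncFile:
    """Hypothetical helper: append records, fsync after every `period` writes."""

    def __init__(self, path, period):
        self.period = period      # 0 disables explicit fsync (assumption)
        self.pending = 0          # writes since the last fsync
        self.fsync_calls = 0      # counted only so the behaviour can be observed
        self.fh = open(path, "ab")

    def append(self, record):
        self.fh.write(record)
        self.pending += 1
        if self.period > 0 and self.pending >= self.period:
            self.fh.flush()              # push Python's buffer to the OS
            os.fsync(self.fh.fileno())   # ask the OS to write it to disk
            self.fsync_calls += 1
            self.pending = 0

    def close(self):
        self.fh.close()


def count_fsyncs(period, events):
    """Write `events` records with the given period and return the fsync count."""
    with tempfile.TemporaryDirectory() as tmp:
        log = PeriodicSyncFile(os.path.join(tmp, "relay-log.bin.000001"), period)
        for i in range(events):
            log.append(b"event %d\n" % i)
        log.close()
        return log.fsync_calls
```

    With a period of 1 every write pays for an fsync, which is the cost this
    patch lets the user avoid per file.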
          
    To provide both performance and increased reliability, we recommend the
    following setup:

       1) --sync-master-info = 0: the operating system will eventually flush it;
       2) --sync-relay-log = 0: the operating system will eventually flush it;
       3) --sync-relay-log-info = 1: fsyncs it after every transaction.
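
    As a sketch, the recommended values could be written in an option file
    as follows. The [mysqld] group is an assumption about where the slave
    server reads its options; only the three option names and values come
    from this patch.

```ini
[mysqld]
# Let the operating system decide when to flush these two.
sync-master-info    = 0
sync-relay-log      = 0
# fsync the relay log info file after every transaction.
sync-relay-log-info = 1
```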
          
    Note that the setup above does not reduce the probability of a corrupted
    master.info or relay-log.bin*. To overcome this, the patch also introduces
    a recovery mechanism that, right after restart, throws away the relay-log.bin*
    files retrieved from the master and updates master.info based on relay-log.info:
          
          
       4) (--relay-log-recovery, boolean) which enables a recovery mechanism that
       throws away relay-log.bin* after a crash.
          
    However, it can only repair an incorrect binlog file name and position in
    master.info. If other information (host, port, password, etc.) is corrupted
    or incorrect, the recovery mechanism will fail.
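
    The recovery step described above can be sketched as follows. This is
    an illustration, not the patch itself: the function recover and the
    dictionary keys (log_file, log_pos, master_log_file, master_log_pos) are
    hypothetical stand-ins for the contents of master.info and the relay log
    info file.

```python
def recover(master_info, relay_log_info, relay_logs):
    """Return (new master_info, remaining relay logs) after a crash.

    Only the binlog file name and position are taken from the relay log
    info; every other field of master_info is copied unchanged, so a wrong
    host, port or password is not repaired here.
    """
    recovered = dict(master_info)
    recovered["log_file"] = relay_log_info["master_log_file"]
    recovered["log_pos"] = relay_log_info["master_log_pos"]
    # All relay logs fetched before the crash are discarded; the slave
    # fetches them again from the recovered position.
    return recovered, []


master_info = {"host": "master.example", "port": 3306,
               "log_file": "binlog.000007", "log_pos": 9999}
relay_log_info = {"master_log_file": "binlog.000007", "master_log_pos": 4120}

new_info, remaining = recover(master_info, relay_log_info,
                              ["relay-log.bin.000001", "relay-log.bin.000002"])
```

    The sketch makes the stated limitation visible: connection settings pass
    through untouched, so they must already be correct for recovery to work.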
rpl_sync.test 5.03 KB