    MDEV-19158: MariaDB 10.2.22 is writing duplicate entries into binary log · 43bbf88d
    Sujatha authored
    Problem:
    ========
    We have a Master/Master setup on two servers, but are only writing to one of
    those servers (so it is essentially Master/Slave). We upgraded from 10.1.* to
    10.2.22 last week and, starting with the upgrade, we are getting duplicate key
    errors on the slave. binlog_format=MIXED.
    
    Analysis:
    =========
    This issue happens with the combination of LOCK TABLES and binlog_format=MIXED.
    When an UNSAFE statement is encountered in 'MIXED' mode, it is logged in 'ROW'
    format, and table maps are written into the binary log for all the tables in
    the LOCK TABLES list. For each table in the list, a check is done to see
    whether the 'check_table_binlog_row_based_done' flag is set. If it is not set,
    a check is initiated to see whether the table qualifies for row based binary
    logging, and 'check_table_binlog_row_based_done' is then set. This flag is
    cleared when the thread's tables are closed.
    
    But there can be special cases where LOCK TABLES locks more tables than the
    unsafe query actually uses, i.e. the query uses only a subset of the tables in
    the LOCK TABLES list.
    
    For example: LOCK TABLES locks t1,t2,t3 but the unsafe statement makes use of
    only two tables, t1 and t3. In this case the 'check_table_binlog_row_based_done'
    flag is set for table 't2' while writing the table maps, but the
    'close_thread_tables' function call does not reset it. Since the flag is not
    cleared for table 't2', even a safe statement that later uses t2 is logged in
    row based format.
    
    This leads to an assert on debug builds and causes duplicate entries in release
    builds. In release builds the statement is logged in both ROW and STATEMENT
    format. This causes the slave to fail with a duplicate key error.
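
    The t1,t2,t3 example can be replayed with a toy model. This is only a sketch:
    'Table', 'used_by_statement', 'write_table_maps',
    'close_thread_tables_before_fix', 'stale_tables' and 'replay_example' are
    hypothetical names used for illustration, not the server's real data
    structures. Only the flag name is taken from the analysis.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a table held under LOCK TABLES.
// This is NOT MariaDB's real TABLE or handler class.
struct Table {
    std::string name;
    bool used_by_statement;                  // touched by the current statement
    bool check_table_binlog_row_based_done;  // flag name from the commit message
};

// Modeled behavior: writing table maps for an unsafe statement visits every
// locked table and marks the row based check as done on each of them.
void write_table_maps(std::vector<Table>& locked) {
    for (Table& t : locked)
        t.check_table_binlog_row_based_done = true;
}

// Modeled pre-fix cleanup: only tables used by the statement are reset.
void close_thread_tables_before_fix(std::vector<Table>& locked) {
    for (Table& t : locked) {
        if (t.used_by_statement) {
            t.check_table_binlog_row_based_done = false;  // effect of ha_reset
            t.used_by_statement = false;
        }
    }
}

// Names of the tables whose flag is still set after cleanup.
std::vector<std::string> stale_tables(const std::vector<Table>& locked) {
    std::vector<std::string> out;
    for (const Table& t : locked)
        if (t.check_table_binlog_row_based_done)
            out.push_back(t.name);
    return out;
}

// Replays the example: t1,t2,t3 are locked, the unsafe statement uses t1 and t3.
std::vector<std::string> replay_example() {
    std::vector<Table> locked{{"t1", false, false},
                              {"t2", false, false},
                              {"t3", false, false}};
    locked[0].used_by_statement = true;  // t1
    locked[2].used_by_statement = true;  // t3
    write_table_maps(locked);
    close_thread_tables_before_fix(locked);
    return stale_tables(locked);
}
```

    In this model 'replay_example()' reports only 't2' as stale, which is the
    table that a later safe statement would wrongly log in row based format.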
    
    Fix:
    ====
    During 'close_thread_tables', when LOCK TABLES mode is active, 'ha_reset' is
    called for all the tables that were part of the current statement. In the
    example above, 'ha_reset' is called for tables 't1' and 't3', which clears
    their 'check_table_binlog_row_based_done' flag. At this point, add a check for
    the rest of the locked tables to see whether
    'check_table_binlog_row_based_done' is set, and clear the flag if it is.
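
    The shape of the fix can be sketched in a simplified model. This is an
    assumption-laden illustration, not the actual patch: 'Table',
    'used_by_statement', 'close_thread_tables_after_fix' and
    'count_stale_after_fix' are hypothetical names, and the real server code
    works on its own table and handler objects.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a table held under LOCK TABLES.
// This is NOT MariaDB's real TABLE or handler class.
struct Table {
    std::string name;
    bool used_by_statement;                  // touched by the current statement
    bool check_table_binlog_row_based_done;  // flag name from the commit message
};

// Modeled post-fix cleanup while LOCK TABLES mode is active.
void close_thread_tables_after_fix(std::vector<Table>& locked) {
    for (Table& t : locked) {
        if (t.used_by_statement) {
            // Tables of the current statement: modeled effect of ha_reset.
            t.check_table_binlog_row_based_done = false;
            t.used_by_statement = false;
        } else if (t.check_table_binlog_row_based_done) {
            // The added check: remaining locked tables with the flag set.
            t.check_table_binlog_row_based_done = false;
        }
    }
}

// Replays the t1,t2,t3 example and counts flags left set after cleanup.
std::size_t count_stale_after_fix() {
    // State right after table maps were written: every flag is set,
    // and only t1 and t3 belong to the unsafe statement.
    std::vector<Table> locked{{"t1", true, true},
                              {"t2", false, true},
                              {"t3", true, true}};
    close_thread_tables_after_fix(locked);

    std::size_t stale = 0;
    for (const Table& t : locked)
        if (t.check_table_binlog_row_based_done)
            ++stale;
    return stale;
}
```

    With the extra branch no flag survives the cleanup in this model, so a later
    safe statement on 't2' would go through the row based check again.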
rpl_binlog_dup_entry.test 2.53 KB