    Bug#17638477 UNINSTALL AND INSTALL SEMI-SYNC PLUGIN CAUSES SLAVES TO BREAK · 66d624b7
    Venkatesh Duggirala authored
    Problem: Uninstalling the semisync plugin while semisync
    replication is in use causes replication to break.
    
    Analysis: Semisync replication is a mutual agreement between the
    master and the slave, made when the connection (I/O thread) is
    established. Once the I/O thread has started with semisync enabled
    on both master and slave, the master uses semisync plugin functions
    to prepend a special magic header to each event before sending it
    to the slave. The slave expects every event to carry that magic
    header and reads those bytes using semisync plugin functions.
    
    If the plugin is uninstalled on the master while semisync
    replication is in use, the slave misinterprets the content of the
    next event, because it still expects the special magic header at
    the beginning of the event. The slave SQL thread stops with a
    "Missing magic number in the header" error.
    
    A similar problem occurs if the plugin is uninstalled on the slave
    while semisync replication is in use. The master still sends events
    with the magic header, but the slave no longer knows about the
    added header and concludes that it has received a corrupted event.
    The slave SQL thread then stops with a "Found corrupted event" error.
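
    The two failure modes above can be illustrated with a small toy
    model. This is only a sketch and not the server's code: the byte
    values, the two-byte header length, the event marker and all
    function names are hypothetical and chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// All constants below are hypothetical, chosen only for this sketch.
constexpr std::uint8_t kMagic = 0xEF;       // first byte of the semisync header
constexpr std::size_t kHeaderLen = 2;       // magic byte + flags byte
constexpr std::uint8_t kEventStart = 0x01;  // toy marker for a well-formed event

using Packet = std::vector<std::uint8_t>;

// Master side: prepend the header only while semisync is active.
Packet master_send(const Packet &event, bool semisync_on) {
  Packet out;
  if (semisync_on) {
    out.push_back(kMagic);
    out.push_back(0x00);  // flags byte, unused in this sketch
  }
  out.insert(out.end(), event.begin(), event.end());
  return out;
}

enum class ReadResult { kOk, kMissingMagic, kCorruptedEvent };

// Slave side: strip the header only while semisync is active, then
// validate the event itself.
ReadResult slave_read(const Packet &pkt, bool semisync_on) {
  std::size_t offset = 0;
  if (semisync_on) {
    if (pkt.size() < kHeaderLen || pkt[0] != kMagic)
      return ReadResult::kMissingMagic;  // master stopped adding the header
    offset = kHeaderLen;
  }
  if (pkt.size() <= offset || pkt[offset] != kEventStart)
    return ReadResult::kCorruptedEvent;  // header bytes read as event bytes
  return ReadResult::kOk;
}
```

    In this model, replication works only when both sides agree on
    whether the header is present; a mismatch in either direction
    yields one of the two errors described above.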
    
    Fix: Uninstallation of a semisync plugin is now blocked while
    semisync replication is in use, and fails with an
    'ER_UNKNOWN_ERROR' error. To detect that semisync replication is
    in use, this patch checks the semisync status variable values.
     > On the master, it checks that 'Rpl_semi_sync_master_status' is OFF
        before allowing the uninstallation of the rpl_semi_sync_master plugin.
        >> Rpl_semi_sync_master_status is OFF when
            >>> there is no dump thread running
            >>> there are no semisync slaves
     > On the slave, it checks that 'Rpl_semi_sync_slave_status' is OFF
        before allowing the uninstallation of the rpl_semi_sync_slave plugin.
        >> Rpl_semi_sync_slave_status is OFF when
            >>> there is no I/O thread running
            >>> replication is running asynchronously
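
    The guard described above can be sketched as follows. This is a
    simplified illustration, not the code in sql_plugin.cc: the
    StatusVars map, the UninstallResult enum and both function names
    are hypothetical. Only the plugin names and status variable names
    come from this commit message. Blocking when the variable is
    missing is an assumption made for the sketch.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the server's status variables.
using StatusVars = std::map<std::string, std::string>;

enum class UninstallResult { kAllowed, kBlockedInUse };

// Maps a semisync plugin name to the status variable that guards it.
// Returns an empty string for any other plugin.
std::string guarding_status_var(const std::string &plugin) {
  if (plugin == "rpl_semi_sync_master") return "Rpl_semi_sync_master_status";
  if (plugin == "rpl_semi_sync_slave") return "Rpl_semi_sync_slave_status";
  return "";
}

// Allows uninstalling a semisync plugin only when its status variable
// is explicitly OFF. Other plugins are not affected by this check.
UninstallResult check_uninstall(const std::string &plugin,
                                const StatusVars &status) {
  const std::string var = guarding_status_var(plugin);
  if (var.empty()) return UninstallResult::kAllowed;

  const auto it = status.find(var);
  // Assumption: a missing variable is treated conservatively as "in use".
  if (it != status.end() && it->second == "OFF")
    return UninstallResult::kAllowed;
  return UninstallResult::kBlockedInUse;
}
```

    A caller would report the blocked case to the user as the
    'ER_UNKNOWN_ERROR' error mentioned in the fix description.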
sql_plugin.cc 112 KB