MDEV-23033: All slaves crash once in ~24 hours and loop restart with signal 11
Problem: ======= Upon deleting or updating a row in a parent table (with primary key), if the child table has virtual column and an associated key with ON UPDATE CASCADE/ON DELETE CASCADE, it will result in slave crash. Analysis: ======== Tables which are related through foreign key require prelocking similar to triggers. i.e If a table has triggers/foreign keys we should add all tables and routines used by them to the prelocking set. This prelocking happens during 'open_and_lock_tables' call. Each table being opened is checked for foreign key references. If foreign key reference exists then the child table is opened and it is linked to the table_list. Upon any modification to parent table its corresponding child tables are retried from table_list and they are updated accordingly. This prelocking work fine on master. On slave prelocking works for following cases. - Statement/mixed based replication - In row based replication when trigger execution is enabled through 'slave_run_triggers_for_rbr=YES/LOGGING/ENFORCE' Otherwise it results in an assert/crash, as the parent table will not find the corresponding child table and it will be NULL. Dereferencing NULL pointer leads to slave server exit. Fix: === Introduce a new 'slave_fk_event_map' flag similar to 'trg_event_map'. This flag will ensure that when foreign key is enabled in row based replication all the parent and child tables are prelocked, so that parent is able to locate the child table. Note: This issue is specific to slave, hence only slave needs to be upgraded.
Showing
Please register or sign in to comment