REQUIREMENTS
------------
- It should be possible to run two systems with replication using the
  same configuration file on both systems.

FEATURES TO IMPLEMENT
---------------------
- Fix so that execute and command use ExtSender.
  Neither should have its own signals; this should instead be
  abstracted to the RepStateRequest layer.
- Delete signals
    GSN_REP_INSERT_GCIBUFFER_CONF
    GSN_REP_INSERT_GCIBUFFER_REF
- Fix so that all ExtSenders are set at one point in the code only.
- Verify the following signals:
    GSN_REP_INSERT_GCIBUFFER_REQ
    GSN_REP_CLEAR_SS_GCIBUFFER_REQ
    GSN_REP_DROP_TABLE_REQ
- Fix all @todo's in the code
- Remove all #if 1, #if 0 etc.
- Fix the usage of the dbug package (as used in the MySQL source code).
- System table storing all info about channels
- Think about how channels, subscriptions etc. map to SUMA subscriptions
- TableInfoPS must be secured if SS REP is restarted and PS REP still
  has all log records needed to sync.  (This could be saved in a system
  table instead of using the struct.)

KNOWN BUGS AND LIMITATIONS
--------------------------
- REP#1: Non-consistency due to non-logging stop [LIMITATION]

  Problem:
  - Stopping replication in a state other than "Logging" can
    lead to a non-consistent state of the destination database

  Suggested solution:
  - Implement a cleanData flag (= false) that indicates that
    this has happened.

- REP#2: PS REP uses epochs from old subscription [BUG]

  The following scenario can lead to incorrect replication:
  - Start replication X
  - Wait until replication is in "Logging" state
  - Kill SS REP
  - Let PS REP be alive
  - Start new replication Y
  - Replication Y can use old PS REP epochs from replication X.

  Suggested solution:
  - Mark PS buffers with channel ids
  - Make sure that all epoch requests use the channel number in
    the requests.

- REP#3: When having two node groups, there is sometimes 626 [FIXED]

  Problem:
  - Sometimes (when doing updates) there is a 626 error code when
    using 2 node groups.
  - 626 = Tuple does not exist.
  - Current code in RepState.cpp is:

      if(s == Channel::App &&
         m_channel.getState() == Channel::DATASCAN_COMPLETED &&
         i.last() >= m_channel.getDataScanEpochs().last() &&
         i.last() >= m_channel.getMetaScanEpochs().last())
      {
        m_channel.setState(Channel::LOG);
        disableAutoStart();
      }

    When the system gets into LOG state, the force flag is turned off.

  Suggested solution:
  - During DATASCAN, force=true (i.e. updates are treated as writes,
    delete errors due to a non-existing tuple are ignored)
  - The code above must take ALL node groups into account.

- REP#4: User requests sometimes vanish when DB node is down [LIMITATION]

  Problem:
  - PS REP node does not always REF when no connection to GREP exists

  Suggested solution:
  - All REP->GREP signal sends should be checked.  If they return <0,
    then a REF signal should be returned.

- REP#5: User requests sometimes vanish when PS REP is down [BUG]

  Scenario:
  - Execute "Start" with PS REP node down

  Solution:
  - When start is executed, the connect flag should be checked

- REP#6: No warning if table exists [Lars, BUG!]

  Problem:
  - There is no warning if a replicated table already exists in the
    database.

  Suggested solution:
  - Print warning
  - Set cleanData = false

- REP#7: Starting 2nd subscription crashes DB node (Grep.cpp:994) [FIXED]

  Scenario:
  - Start replication
  - Wait until replication is in "Logging" state
  - Kill SS REP
  - Let PS REP be alive
  - Start new replication
  - Now GREP crashes in Grep.cpp:994.

  Suggested fix:
  - If a new subscription is requested with the same subscriberData as
    one that already exists, then SUMA (or GREP) sends a REF signal
    indicating that SUMA does not allow a new subscription to be
    created.  [Now no senderData is sent from REP.]

- REP#8: Dangling subscriptions in GREP/SUMA [Johan, LIMITATION]

  Problem:
  - If both REP nodes die, then there is no possibility to remove
    subscriptions from GREP/SUMA

  Suggested solution 1:
  - Fix so that GREP/SUMA can receive a subscription removal
    signal with subid 0.  This means that ALL subscriptions are removed.
    This meaning should be documented in the signaldata class.
  - A new user command "STOP ALL" is implemented that sends
    a request to delete all subscriptions.

  Suggested solution 2:
  - When GREP detects that ALL PS REP nodes associated with a
    subscription are killed, then that subscription should be deleted.
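
  Sketch for REP#8 (illustration only, not existing code):
  The two suggested solutions could be modelled roughly as below.
  SubscriptionRegistry, RemoveAll, add, remove and nodeFailed are
  hypothetical names invented for this sketch; they are not part of
  GREP, SUMA or any signaldata class, and the real bookkeeping may
  look quite different.  The sketch assumes that subscription ids and
  node ids are plain 32-bit integers and that each subscription knows
  which PS REP nodes use it.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>

// Hypothetical stand-in for the subscription bookkeeping in GREP/SUMA.
class SubscriptionRegistry {
public:
  // Reserved id: a removal request carrying subid 0 removes everything.
  static constexpr std::uint32_t RemoveAll = 0;

  // Register a subscription together with the PS REP nodes that use it.
  // Returns false if the id is reserved, already in use, or has no nodes.
  bool add(std::uint32_t subId, const std::set<std::uint32_t>& psRepNodes) {
    if (subId == RemoveAll || psRepNodes.empty()) return false;
    return m_subs.emplace(subId, psRepNodes).second;
  }

  // Solution 1: subid 0 means "remove ALL subscriptions".
  // Returns the number of subscriptions removed.
  std::size_t remove(std::uint32_t subId) {
    if (subId == RemoveAll) {
      const std::size_t n = m_subs.size();
      m_subs.clear();
      return n;
    }
    return m_subs.erase(subId);
  }

  // Solution 2: when a PS REP node is detected as dead, drop it from
  // every subscription and delete subscriptions left with no live node.
  // Returns the number of subscriptions deleted.
  std::size_t nodeFailed(std::uint32_t nodeId) {
    std::size_t removed = 0;
    for (auto it = m_subs.begin(); it != m_subs.end(); ) {
      it->second.erase(nodeId);
      if (it->second.empty()) {
        it = m_subs.erase(it);
        ++removed;
      } else {
        ++it;
      }
    }
    return removed;
  }

  std::size_t size() const { return m_subs.size(); }

private:
  // subscription id -> ids of the PS REP nodes still alive for it
  std::map<std::uint32_t, std::set<std::uint32_t>> m_subs;
};
```

  A "STOP ALL" user command would then map to remove(RemoveAll).
  Reserving id 0 only works if 0 is never handed out as a real
  subscription id, which is why add() rejects it in this sketch.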