• Tuong Lien's avatar
    tipc: fix changeover issues due to large packet · 2320bcda
    Tuong Lien authored
    In conjunction with changing the interfaces' MTU (e.g. especially in
    the case of a bonding) where the TIPC links are brought up and down
    in a short time, a couple of issues were detected with the current link
    changeover mechanism:
    
    1) When one link is up but immediately forced down again, the failover
    procedure will be carried out in order to failover all the messages in
    the link's transmq queue onto the other working link. The link and node
    state is also set to FAILINGOVER as part of the process. The message
    will be transmited in form of a FAILOVER_MSG, so its size is plus of 40
    bytes (= the message header size). There is no problem if the original
    message size is not larger than the link's MTU - 40, and indeed this is
    the max size of a normal payload messages. However, in the situation
    above, because the link has just been up, the messages in the link's
    transmq are almost SYNCH_MSGs which had been generated by the link
    synching procedure, then their size might reach the max value already!
    When the FAILOVER_MSG is built on the top of such a SYNCH_MSG, its size
    will exceed the link's MTU. As a result, the messages are dropped
    silently and the failover procedure will never end up, the link will
    not be able to exit the FAILINGOVER state, so cannot be re-established.
    
    2) The same scenario above can happen more easily in case the MTU of
    the links is set differently or when changing. In that case, as long as
    a large message in the failure link's transmq queue was built and
    fragmented with its link's MTU > the other link's one, the issue will
    happen (there is no need of a link synching in advance).
    
    3) The link synching procedure also faces with the same issue but since
    the link synching is only started upon receipt of a SYNCH_MSG, dropping
    the message will not result in a state deadlock, but it is not expected
    as design.
    
    The 1) & 3) issues are resolved by the last commit that only a dummy
    SYNCH_MSG (i.e. without data) is generated at the link synching, so the
    size of a FAILOVER_MSG if any then will never exceed the link's MTU.
    
    For the 2) issue, the only solution is trying to fragment the messages
    in the failure link's transmq queue according to the working link's MTU
    so they can be failovered then. A new function is made to accomplish
    this, it will still be a TUNNEL PROTOCOL/FAILOVER MSG but if the
    original message size is too large, it will be fragmented & reassembled
    at the receiving side.
    Acked-by: default avatarYing Xue <ying.xue@windriver.com>
    Acked-by: default avatarJon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: default avatarTuong Lien <tuong.t.lien@dektech.com.au>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    2320bcda
msg.h 26.4 KB