    tipc: fix stale link problem during synchronization · 2be80c2d
    Jon Paul Maloy authored
    Recent changes to the link synchronization mean that we can now just
    drop packets arriving on the synchronizing link before the synch point
    is reached. This has led to significant simplifications to the
    implementation, but it also turns out to have a flip side that we need
    to consider.
    
    Under unlucky circumstances, the two endpoints may end up
    repeatedly dropping each other's packets, while immediately
    asking for retransmission of the same packets, just to drop
    them once more. This pattern will eventually be broken when
    the synch point is reached on the other link, but before that,
    the endpoints may have arrived at the retransmission limit
    (stale counter) that indicates that the link should be broken.
    We see this happen on rare occasions.
    
    The fix for this is to not ask for retransmissions when a link is in
    state LINK_SYNCHING. The fact that the link has reached this state
    means that it has already received the first SYNCH packet, and that it
    knows the synch point. Hence, it doesn't need any more packets until the
    other link has reached the synch point, whereafter it can go ahead and
    ask for the missing packets.
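
The rule described above can be sketched in a few lines of C. This is a simplified illustration, not the kernel implementation: the type `toy_link`, the states `TOY_LINK_ESTABLISHED` and `TOY_LINK_SYNCHING`, and the helper `toy_link_should_nack()` are hypothetical names, and gap detection is reduced to a plain inequality test on the sequence number.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified link states; not the kernel's definitions. */
enum toy_link_state { TOY_LINK_ESTABLISHED, TOY_LINK_SYNCHING };

struct toy_link {
    enum toy_link_state state;
    uint16_t rcv_nxt;        /* next in-sequence packet expected */
    unsigned int nacks_sent; /* retransmission requests issued so far */
};

/*
 * Decide whether a retransmission request should be sent after a
 * packet with sequence number seqno has arrived on this link.
 */
static bool toy_link_should_nack(struct toy_link *l, uint16_t seqno)
{
    if (seqno == l->rcv_nxt)
        return false;   /* in sequence, nothing is missing */

    /*
     * While synching, the link already knows the synch point and
     * drops incoming packets anyway, so asking for them again would
     * only push the peer towards its retransmission limit.
     */
    if (l->state == TOY_LINK_SYNCHING)
        return false;

    l->nacks_sent++;
    return true;        /* gap detected on a normally operating link */
}
```

In this sketch a gap seen while the link is in the synching state produces no request at all; the same gap produces one as soon as the link is back in its normal state.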
    
    However, because of the reduced traffic on the synching link that
    follows this change, it may now take longer to discover that the
    synch point has been reached. We compensate for this by letting all
    packets, on any of the links, trigger a check for synchronization
    termination. This is possible because the packets themselves don't
    contain any information that is needed for discovering this condition.
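
A minimal sketch of such a check follows, assuming that "reaching the synch point" means the old link's next expected sequence number has caught up with a recorded 16-bit synch point. The struct `toy_node`, its fields, and `toy_node_check_sync_done()` are hypothetical names for illustration only. The point of the sketch is that the function takes no packet argument, so it can be called on arrival of any packet on either link.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-node synchronization state; not the kernel's layout. */
struct toy_node {
    bool synching;        /* true while the new link waits for the synch point */
    uint16_t sync_point;  /* sequence number the old link must reach */
    uint16_t old_rcv_nxt; /* next packet expected on the old link */
};

/*
 * Check for synchronization termination. Uses node state only, so the
 * caller may invoke it for every arriving packet, whichever link it
 * came in on. Returns true exactly once, when synching ends.
 */
static bool toy_node_check_sync_done(struct toy_node *n)
{
    uint16_t diff;

    if (!n->synching)
        return false;

    /* Wrap-safe "old_rcv_nxt is still before sync_point" test. */
    diff = (uint16_t)(n->old_rcv_nxt - n->sync_point);
    if (diff >= 0x8000)
        return false;

    n->synching = false;
    return true;
}
```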
    Reviewed-by: Ying Xue <ying.xue@windriver.com>
    Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
node.c 34.3 KB