• Tuong Lien's avatar
    tipc: fix unlimited bundling of small messages · ed9420dd
    Tuong Lien authored
    [ Upstream commit e95584a8 ]
    
    We have identified a problem with the "oversubscription" policy in the
    link transmission code.
    
    When small messages are transmitted, and the sending link has reached
    the transmit window limit, those messages will be bundled and put into
    the link backlog queue. However, bundles of data messages are counted
    at the 'CRITICAL' level, so that the counter for that level, instead of
    the counter for the real, bundled message's level is the one being
    increased.
    Subsequent, to-be-bundled data messages at non-CRITICAL levels continue
    to be tested against the unchanged counter for their own level, while
    contributing to an unrestrained increase at the CRITICAL backlog level.
    
    This leaves a gap in congestion control algorithm for small messages
    that can result in starvation for other users or a "real" CRITICAL
    user. Even that eventually can lead to buffer exhaustion & link reset.
    
    We fix this by keeping a 'target_bskb' buffer pointer at each levels,
    then when bundling, we only bundle messages at the same importance
    level only. This way, we know exactly how many slots a certain level
    have occupied in the queue, so can manage level congestion accurately.
    
    By bundling messages at the same level, we even have more benefits. Let
    consider this:
    - One socket sends 64-byte messages at the 'CRITICAL' level;
    - Another sends 4096-byte messages at the 'LOW' level;
    
    When a 64-byte message comes and is bundled the first time, we put the
    overhead of message bundle to it (+ 40-byte header, data copy, etc.)
    for later use, but the next message can be a 4096-byte one that cannot
    be bundled to the previous one. This means the last bundle carries only
    one payload message which is totally inefficient, as for the receiver
    also! Later on, another 64-byte message comes, now we make a new bundle
    and the same story repeats...
    
    With the new bundling algorithm, this will not happen, the 64-byte
    messages will be bundled together even when the 4096-byte message(s)
    comes in between. However, if the 4096-byte messages are sent at the
    same level i.e. 'CRITICAL', the bundling algorithm will again cause the
    same overhead.
    
    Also, the same will happen even with only one socket sending small
    messages at a rate close to the link transmit's one, so that, when one
    message is bundled, it's transmitted shortly. Then, another message
    comes, a new bundle is created and so on...
    
    We will solve this issue radically by another patch.
    
    Fixes: 365ad353 ("tipc: reduce risk of user starvation during link congestion")
    Reported-by: default avatarHoang Le <hoang.h.le@dektech.com.au>
    Acked-by: default avatarJon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: default avatarTuong Lien <tuong.t.lien@dektech.com.au>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    ed9420dd
msg.c 18.7 KB