    tcp: fix excessive TLP and RACK timeouts from HZ rounding · 1c2709cf
    Neal Cardwell authored
    We discovered from packet traces of slow loss recovery on kernels with
    the default HZ=250 setting (and min_rtt < 1ms) that after reordering,
    when receiving a SACKed sequence range, the RACK reordering timer was
    firing after about 16ms rather than the desired value of roughly
    min_rtt/4 + 2ms. The problem is largely due to the RACK reorder timer
    calculation adding in TCP_TIMEOUT_MIN, which is 2 jiffies. On kernels
    with HZ=250, this is 2*4ms = 8ms. The TLP timer calculation has the
    exact same issue.
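
    The arithmetic behind the pre-fix floor can be sketched in user space as
    follows. This is an illustration only, not the kernel code: the helper
    names are hypothetical, HZ is passed as a parameter, and the conversion
    helper assumes the kernel's usecs_to_jiffies() rounds up to a whole jiffy.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's usecs_to_jiffies().
 * Assumption: the conversion rounds up to a whole number of jiffies.
 */
static unsigned long usecs_to_jiffies_sketch(unsigned long usecs,
                                             unsigned long hz)
{
    assert(hz > 0);
    return (unsigned long)(((unsigned long long)usecs * hz + 999999ULL)
                           / 1000000ULL);
}

/* Mirrors TCP_TIMEOUT_MIN, which the commit message gives as 2 jiffies. */
#define TIMEOUT_MIN_JIFFIES_SKETCH 2UL

/* Pre-fix approach: convert the reordering window to jiffies first,
 * then add the 2-jiffy floor on top.
 */
static unsigned long old_timeout_jiffies(unsigned long reo_us,
                                         unsigned long hz)
{
    return usecs_to_jiffies_sketch(reo_us, hz) + TIMEOUT_MIN_JIFFIES_SKETCH;
}

/* Convert a jiffy count back to milliseconds for display. */
static unsigned long jiffies_to_ms_sketch(unsigned long jiffies,
                                          unsigned long hz)
{
    return jiffies * 1000UL / hz;
}
```

    With a reordering window of 250 us (min_rtt of 1 ms divided by 4), this
    sketch yields 3 jiffies in both configurations: 12 ms at HZ=250 but only
    3 ms at HZ=1000. The roughly 16 ms seen in traces presumably also
    includes timer granularity that the sketch does not model.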
    
    This commit fixes the TLP transmit timer and RACK reordering timer
    floor calculation to more closely match the intended 2ms floor even on
    kernels with HZ=250. It does this by adding in a new
    TCP_TIMEOUT_MIN_US floor of 2000 us and then converting to jiffies,
    instead of the current approach of converting to jiffies and then
    adding the TCP_TIMEOUT_MIN value of 2 jiffies.
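
    A minimal user-space sketch of the new ordering (add the microsecond
    floor, then convert) is shown here. It is not the patch itself: the
    function names are hypothetical, HZ is a parameter rather than a build
    constant, and the conversion is assumed to round up as the kernel's
    usecs_to_jiffies() does.

```c
#include <assert.h>

/* Mirrors the new TCP_TIMEOUT_MIN_US floor of 2000 us from this commit. */
#define TIMEOUT_MIN_US_SKETCH 2000UL

/* Hypothetical stand-in for usecs_to_jiffies(); assumed to round up. */
static unsigned long us_to_jiffies_roundup(unsigned long usecs,
                                           unsigned long hz)
{
    assert(hz > 0);
    return (unsigned long)(((unsigned long long)usecs * hz + 999999ULL)
                           / 1000000ULL);
}

/* Post-fix approach: add the 2000 us floor while still in microseconds,
 * then convert the sum to jiffies once.
 */
static unsigned long new_timeout_jiffies(unsigned long reo_us,
                                         unsigned long hz)
{
    return us_to_jiffies_roundup(reo_us + TIMEOUT_MIN_US_SKETCH, hz);
}

/* Same timeout expressed in milliseconds, for comparing HZ settings. */
static unsigned long new_timeout_ms(unsigned long reo_us, unsigned long hz)
{
    return new_timeout_jiffies(reo_us, hz) * 1000UL / hz;
}
```

    For a 250 us reordering window, the sketch gives 1 jiffy (4 ms) at
    HZ=250 and 3 jiffies (3 ms) at HZ=1000, so the HZ=1000 result is
    unchanged while the HZ=250 result drops from 3 jiffies to 1.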
    
    Our testing has verified that on kernels with HZ=1000, as expected,
    this does not produce significant changes in behavior, but on kernels
    with the default HZ=250 the latency improvement can be large. For
    example, our tests show that for HZ=250 kernels at low RTTs this fix
    roughly halves the latency for the RACK reorder timer: instead of
    mostly firing at 16ms it mostly fires at 8ms.
    Suggested-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Neal Cardwell <ncardwell@google.com>
    Signed-off-by: Yuchung Cheng <ycheng@google.com>
    Fixes: bb4d991a ("tcp: adjust tail loss probe timeout")
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20231015174700.2206872-1-ncardwell.sw@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
tcp_output.c 123 KB