    tcp: adjust window probe timers to safer values · 21c8fe99
    Eric Dumazet authored
    With the advent of small RTO timers in datacenter TCP
    (ip route ... rto_min x), the following can happen:
    
    1) Qdisc is full, transmit fails.
    
       TCP sets a timer based on icsk_rto to retry the transmit, without
       exponential backoff.
       With a low icsk_rto and a lot of sockets, all cpus end up servicing
       timer interrupts like crazy.
       The intent of the code was to retry with a timer between 200ms
       (TCP_RTO_MIN) and 500ms (TCP_RESOURCE_PROBE_INTERVAL).
    
    2) Receivers can send zero windows if they don't drain their receive queue.
    
       TCP sends zero window probes, based on the current icsk_rto value,
       with exponential backoff.
       With /proc/sys/net/ipv4/tcp_retries2 being 15 (or even smaller in
       some cases), the sender can abort in less than one or two minutes!
       If the receiver stops the sender, it obviously does not care about a
       very tight rto. The probability of dropping the ACK that reopens the
       window is not worth the risk.
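
To put rough numbers on case 2, here is a small user-space sketch (not kernel code) that sums the probe intervals under exponential backoff. The function name and the millisecond units are assumptions made for illustration; the 120 s cap is assumed to mirror TCP_RTO_MAX, and the kernel's exact abort accounting may differ from this simple sum.

```c
#include <stdint.h>

/* Assumed cap on a single probe interval, mirroring TCP_RTO_MAX (120 s). */
#define PROBE_MAX_MS 120000u

/*
 * Hypothetical helper: total time covered by the first `probes` intervals,
 * where interval i is base_ms * 2^i, capped at PROBE_MAX_MS.
 * Callers are expected to keep `probes` small (well below 32).
 */
static uint64_t total_probe_time_ms(uint32_t base_ms, unsigned int probes)
{
	uint64_t total = 0;

	for (unsigned int i = 0; i < probes; i++) {
		uint64_t when = (uint64_t)base_ms << i;

		if (when > PROBE_MAX_MS)
			when = PROBE_MAX_MS;
		total += when;
	}
	return total;
}
```

Under this simplified model, 15 probes starting from a 1 ms base cover about 33 seconds, while the same 15 probes starting from a 200 ms base cover roughly 13 minutes.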
    
    Let's change the base timer to be at least 200ms (TCP_RTO_MIN) for these
    events (but not for normal RTO based retransmits).
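
The idea of a floored base timer can be sketched in user-space C as follows. This is an illustration under stated assumptions, not the patch itself: the helper names, the millisecond units, and the `max_when_ms` parameter are hypothetical, and only the 200ms floor comes from the text above.

```c
#include <stdint.h>

/* Floor taken from the commit message: TCP_RTO_MIN is 200ms. */
#define PROBE_MIN_MS 200u

/* Hypothetical helper: base timer for probes, never below PROBE_MIN_MS. */
static uint32_t probe_base_ms(uint32_t rto_ms)
{
	return rto_ms > PROBE_MIN_MS ? rto_ms : PROBE_MIN_MS;
}

/*
 * Hypothetical helper: delay before the next probe. The floored base is
 * doubled `backoff` times (backoff is assumed to stay below 32) and the
 * result is capped at max_when_ms.
 */
static uint32_t probe_when_ms(uint32_t rto_ms, unsigned int backoff,
			      uint32_t max_when_ms)
{
	uint64_t when = (uint64_t)probe_base_ms(rto_ms) << backoff;

	return (uint32_t)(when < max_when_ms ? when : max_when_ms);
}
```

With a 5 ms rto, `probe_when_ms(5, 0, 500)` yields 200 rather than 5, so the "qdisc full" retry stays within the intended 200ms to 500ms range, and a zero window probe sequence starts from 200ms before backing off.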
    
    A followup patch adds a new SNMP counter, as it would have helped a lot
    in diagnosing this issue.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Yuchung Cheng <ycheng@google.com>
    Acked-by: Neal Cardwell <ncardwell@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_input.c 176 KB