    tcp: Tail loss probe (TLP) · 6ba8a3b1
    Nandita Dukkipati authored
    This patch series implements the Tail loss probe (TLP) algorithm described
    in http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01. The
    first patch implements the basic algorithm.
    
    TLP's goal is to reduce tail latency of short transactions. It achieves
    this by converting retransmission timeouts (RTOs) occurring due
    to tail losses (losses at the end of transactions) into fast recovery.
    TLP transmits one packet in two round-trips when a connection is in
    Open state and isn't receiving any ACKs. The transmitted packet, aka
    loss probe, can be either new or a retransmission. When there is tail
    loss, the ACK from a loss probe triggers FACK/early-retransmit based
    fast recovery, thus avoiding a costly RTO. In the absence of loss,
    there is no change in the connection state.
    
    PTO stands for probe timeout. It is a timer event indicating
    that an ACK is overdue and triggers a loss probe packet. The PTO value
    is set to max(2*SRTT, 10ms) and is adjusted to account for the delayed
    ACK timer when there is only one outstanding packet.
    
    TLP Algorithm
    
    On transmission of new data in Open state:
      -> packets_out > 1: schedule PTO in max(2*SRTT, 10ms).
      -> packets_out == 1: schedule PTO in max(2*SRTT, 1.5*SRTT + 200ms)
      -> PTO = min(PTO, RTO)
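
The PTO rules above can be sketched in plain user-space C. This is a
minimal illustration, not the kernel code: the function name, the
millisecond units, and the two constant names are assumptions made for
this example; only the 10ms floor, the 200ms delayed-ACK allowance, and
the min(PTO, RTO) clamp come from the text.

```c
#include <stdint.h>

/* Hypothetical names; the values are the ones quoted in the text. */
#define TLP_MIN_PTO_MS     10u   /* floor on the probe timeout */
#define TLP_DELACK_MS     200u   /* allowance for the delayed ACK timer */

static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }
static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }

/* Compute the probe timeout (PTO) in milliseconds. */
uint32_t tlp_pto_ms(uint32_t srtt_ms, uint32_t rto_ms, uint32_t packets_out)
{
    /* Base rule: max(2*SRTT, 10ms). */
    uint32_t pto = max_u32(2 * srtt_ms, TLP_MIN_PTO_MS);

    /* One packet outstanding: leave room for a delayed ACK,
     * i.e. at least 1.5*SRTT + 200ms. */
    if (packets_out == 1)
        pto = max_u32(pto, srtt_ms + srtt_ms / 2 + TLP_DELACK_MS);

    /* The probe must never fire later than the RTO would. */
    return min_u32(pto, rto_ms);
}
```

With an SRTT of 50ms and an RTO of 1000ms, this sketch yields a PTO of
100ms when several packets are outstanding and 275ms when only one is.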
    
    Conditions for scheduling PTO:
      -> Connection is in Open state.
      -> Connection is either cwnd limited or has no new data to send.
      -> Number of probes per tail loss episode is limited to one.
      -> Connection is SACK enabled.
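
The four conditions above amount to a single predicate. The sketch below
is a hedged illustration in user-space C: the struct, its field names,
and the enum are hypothetical stand-ins for connection state, not the
kernel's data structures.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the congestion-control state. */
enum conn_state { CONN_OPEN, CONN_DISORDER, CONN_RECOVERY, CONN_LOSS };

/* Hypothetical snapshot of the fields the conditions refer to. */
struct conn_view {
    enum conn_state state;
    bool sack_enabled;
    bool cwnd_limited;
    bool has_new_data;
    bool probe_outstanding;  /* a probe was already sent in this episode */
};

/* Return true when a PTO may be scheduled for this connection. */
bool tlp_may_schedule_pto(const struct conn_view *c)
{
    if (c->state != CONN_OPEN)
        return false;            /* only in Open state */
    if (!c->sack_enabled)
        return false;            /* requires SACK */
    if (c->probe_outstanding)
        return false;            /* one probe per tail loss episode */
    /* Either cwnd limited, or nothing new left to send. */
    return c->cwnd_limited || !c->has_new_data;
}
```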
    
    When PTO fires:
      new_segment_exists:
        -> transmit new segment.
        -> packets_out++. cwnd remains the same.
    
      no_new_packet:
        -> retransmit the last segment.
           Its ACK triggers FACK or early retransmit based recovery.
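
The choice made when the PTO fires can be sketched as follows. This is
an assumption-laden illustration: the struct, the enum, and the single
new_segment_available flag are hypothetical, and actual transmission is
left out so that only the decision and the counter update are shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Which kind of loss probe to send. */
enum probe_kind { PROBE_NEW_SEGMENT, PROBE_RETRANSMIT_LAST };

/* Hypothetical sender state relevant to the probe decision. */
struct probe_state {
    uint32_t packets_out;
    uint32_t cwnd;
    bool new_segment_available;
};

/* Decide what to send when the probe timeout fires. */
enum probe_kind tlp_on_pto(struct probe_state *s)
{
    if (s->new_segment_available) {
        /* A new segment goes out as the probe; it counts as
         * outstanding, but cwnd is deliberately left unchanged. */
        s->packets_out++;
        return PROBE_NEW_SEGMENT;
    }
    /* Nothing new to send: probe by retransmitting the last segment. */
    return PROBE_RETRANSMIT_LAST;
}
```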
    
    ACK path:
      -> rearm RTO at start of ACK processing.
      -> reschedule PTO if need be.
    
    In addition, the patch includes a small variation to the Early Retransmit
    (ER) algorithm, such that ER and TLP together can in principle recover any
    degree of tail loss through fast recovery. TLP is controlled by the same
    sysctl as ER, tcp_early_retrans:
    tcp_early_retrans == 0: disables TLP and ER.
                      == 1: enables RFC5827 ER.
                      == 2: delayed ER.
                      == 3: TLP and delayed ER. [DEFAULT]
                      == 4: TLP only.
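
The sysctl values listed above map onto three independent features. The
decoder below is a sketch for illustration only; the struct and function
names are hypothetical and do not correspond to kernel symbols.

```c
#include <stdbool.h>

/* Hypothetical feature flags derived from tcp_early_retrans. */
struct er_tlp_features {
    bool er;          /* Early Retransmit enabled */
    bool er_delayed;  /* ER uses the delayed variant */
    bool tlp;         /* Tail loss probe enabled */
};

/* Decode a tcp_early_retrans value; returns false if it is not 0..4. */
bool decode_tcp_early_retrans(int value, struct er_tlp_features *out)
{
    if (value < 0 || value > 4)
        return false;

    out->er         = (value >= 1 && value <= 3);  /* 1, 2, 3 */
    out->er_delayed = (value == 2 || value == 3);
    out->tlp        = (value == 3 || value == 4);
    return true;
}
```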
    
    The TLP patch series has been extensively tested on Google Web servers.
    It is most effective for short Web transactions, where it reduced RTOs by 15%
    and improved HTTP response time (average by 6%, 99th percentile by 10%).
    The transmitted probes account for <0.5% of the overall transmissions.
    
    Signed-off-by: Nandita Dukkipati <nanditad@google.com>
    Acked-by: Neal Cardwell <ncardwell@google.com>
    Acked-by: Yuchung Cheng <ycheng@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>