• Lawrence Brakmo's avatar
    tcp: ack immediately when a cwr packet arrives · 9aee4000
    Lawrence Brakmo authored
    We observed high 99 and 99.9% latencies when doing RPCs with DCTCP. The
    problem is triggered when the last packet of a request arrives CE
    marked. The reply will carry the ECE mark causing TCP to shrink its cwnd
    to 1 (because there are no packets in flight). When the 1st packet of
    the next request arrives, the ACK was sometimes delayed even though it
    is CWR marked, adding up to 40ms to the RPC latency.
    
    This patch insures that CWR marked data packets arriving will be acked
    immediately.
    
    Packetdrill script to reproduce the problem:
    
    0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
    0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
    0.000 setsockopt(3, SOL_TCP, TCP_CONGESTION, "dctcp", 5) = 0
    0.000 bind(3, ..., ...) = 0
    0.000 listen(3, 1) = 0
    
    0.100 < [ect0] SEW 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
    0.100 > SE. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
    0.110 < [ect0] . 1:1(0) ack 1 win 257
    0.200 accept(3, ..., ...) = 4
    
    0.200 < [ect0] . 1:1001(1000) ack 1 win 257
    0.200 > [ect01] . 1:1(0) ack 1001
    
    0.200 write(4, ..., 1) = 1
    0.200 > [ect01] P. 1:2(1) ack 1001
    
    0.200 < [ect0] . 1001:2001(1000) ack 2 win 257
    0.200 write(4, ..., 1) = 1
    0.200 > [ect01] P. 2:3(1) ack 2001
    
    0.200 < [ect0] . 2001:3001(1000) ack 3 win 257
    0.200 < [ect0] . 3001:4001(1000) ack 3 win 257
    0.200 > [ect01] . 3:3(0) ack 4001
    
    0.210 < [ce] P. 4001:4501(500) ack 3 win 257
    
    +0.001 read(4, ..., 4500) = 4500
    +0 write(4, ..., 1) = 1
    +0 > [ect01] PE. 3:4(1) ack 4501
    
    +0.010 < [ect0] W. 4501:5501(1000) ack 4 win 257
    // Previously the ACK sequence below would be 4501, causing a long RTO
    +0.040~+0.045 > [ect01] . 4:4(0) ack 5501   // delayed ack
    
    +0.311 < [ect0] . 5501:6501(1000) ack 4 win 257  // More data
    +0 > [ect01] . 4:4(0) ack 6501     // now acks everything
    
    +0.500 < F. 9501:9501(0) ack 4 win 257
    
    Modified based on comments by Neal Cardwell <ncardwell@google.com>
    Signed-off-by: default avatarLawrence Brakmo <brakmo@fb.com>
    Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
    Acked-by: default avatarYuchung Cheng <ycheng@google.com>
    Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    9aee4000
tcp_input.c 185 KB