• Lawrence Brakmo's avatar
    bpf: Add BPF_SOCKET_OPS_BASE_RTT support to tcp_nv · 85cce215
    Lawrence Brakmo authored
    TCP_NV will try to get the base RTT from a socket_ops BPF program if one
    is loaded. NV will then use the base RTT to bound its min RTT (its
    notion of the base RTT). It uses the base RTT as an upper bound and 80%
    of the base RTT as its lower bound.
    
    In other words, NV will consider filtered RTTs larger than base RTT as a
    sign of congestion. As a result, there is no minRTT inflation when there
    is a lot of congestion. For example, in a DC where the RTTs are less
    than 40us when there is no congestion, a base RTT value of 80us improves
    the performance of NV. The difference between the uncongested RTT and
    the base RTT provided represents how much queueing we are willing to
    have (in practice it can be higher).
    
    NV has been tunned to reduce congestion when there are many flows at the
    cost of one flow not achieving full bandwith utilization. When a
    reasonable base RTT is provided, one NV flow can now fully utilize the
    full bandwidth. In addition, the performance is also improved when there
    are many flows.
    
    In the following examples the NV results are using a kernel with this
    patch set (i.e. both NV results are using the new nv_loss_dec_factor).
    
    With one host sending to another host and only one flow the
    goodputs are:
      Cubic: 9.3 Gbps, NV: 5.5 Gbps, NV (baseRTT=80us): 9.2 Gbps
    
    With 2 hosts sending to one host (1 flow per host, the goodput per flow
    is:
      Cubic: 4.6 Gbps, NV: 4.5 Gbps, NV (baseRTT=80us)L 4.6 Gbps
    
    But the RTTs seen by a ping process in the sender is:
      Cubic: 3.3ms  NV: 97us,  NV (baseRTT=80us): 146us
    
    With a lot of flows things look even better for NV with baseRTT. Here we
    have 3 hosts sending to one host. Each sending host has 6 flows: 1
    stream, 4x1MB RPC, 1x10KB RPC. Cubic, NV and NV with baseRTT all fully
    utilize the full available bandwidth. However, the distribution of
    bandwidth among the flows is very different. For the 10KB RPC flow:
      Cubic: 27Mbps, NV: 111Mbps, NV (baseRTT=80us): 222Mbps
    
    The 99% latencies for the 10KB flows are:
      Cubic: 26ms,  NV: 1ms,  NV (baseRTT=80us): 500us
    
    The RTT seen by a ping process at the senders:
      Cubic: 3.2ms  NV: 720us,  NV (baseRTT=80us): 330us
    Signed-off-by: default avatarLawrence Brakmo <brakmo@fb.com>
    Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    85cce215
tcp_nv.c 15.6 KB