1. 13 Aug, 2023 5 commits
    • Merge branch 'tcp-oom-probe' · 86f03776
      David S. Miller authored
      Menglong Dong says:
      
      ====================
      net: tcp: support probing OOM
      
      In this series, we make some small changes so that TCP
      retransmissions become zero-window probes if the receiver drops the skb
      because of memory pressure.

      In the 1st patch, we reply with a zero-window ACK if the skb is dropped
      because the receiver is out of memory, instead of dropping the skb
      silently.
      
      In the 2nd patch, we allow a zero-window ACK to update the window.
      
      In the 3rd patch, we fix an unexpected socket death when snd_wnd is 0 in
      tcp_retransmit_timer().

      In the 4th patch, we refactor the debug message in
      tcp_retransmit_timer() to make it more accurate.

      After these changes, TCP can keep probing an out-of-memory receiver
      indefinitely.
      
      Changes since v3:
      - make the timeout "2 * TCP_RTO_MAX" in the 3rd patch
      - tp->retrans_stamp is not based on jiffies and can't be compared with
        icsk->icsk_timeout in the 3rd patch. Fix it.
      - introduce the 4th patch
      
      Changes since v2:
      - refactor the code to avoid code duplication in the 1st patch
      - use after() instead of max() in tcp_rtx_probe0_timed_out()
      
      Changes since v1:
      - send 0 rwin ACK for the receive queue empty case when necessary in the
        1st patch
      - send the ACK immediately by using the ICSK_ACK_NOW flag in the 1st
        patch
      - consider the case of the connection restarting from idle, as Neal
        commented, in the 3rd patch
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: tcp: refactor the dbg message in tcp_retransmit_timer() · 031c44b7
      Menglong Dong authored
      The debug message in tcp_retransmit_timer() is slightly wrong, because
      it can be printed even if we did not receive a new ACK packet from
      the remote peer.

      Change it to "Probing zero-window", as it is an expected case now. The
      description may not be correct.

      Also add the duration since the last ACK was received and the duration
      of the retransmission, which are useful for debugging.

      The message now looks like this:
      
      Probing zero-window on 127.0.0.1:9999/46946, seq=3737778959:3737791503, recv 209ms ago, lasting 209ms
      Probing zero-window on 127.0.0.1:9999/46946, seq=3737778959:3737791503, recv 404ms ago, lasting 408ms
      Probing zero-window on 127.0.0.1:9999/46946, seq=3737778959:3737791503, recv 812ms ago, lasting 1224ms
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: tcp: fix unexcepted socket die when snd_wnd is 0 · e89688e3
      Menglong Dong authored
      In tcp_retransmit_timer(), a connection whose window has shrunk is
      regarded as timed out if 'tcp_jiffies32 - tp->rcv_tstamp > TCP_RTO_MAX'.
      This is not always right.

      The retransmits become zero-window probes in tcp_retransmit_timer()
      if 'snd_wnd==0'. Therefore, icsk->icsk_rto will reach TCP_RTO_MAX
      sooner or later.

      However, the timer can be delayed and fire after 122877ms rather than
      TCP_RTO_MAX, as I tested.

      Therefore, 'tcp_jiffies32 - tp->rcv_tstamp > TCP_RTO_MAX' is always true
      once the RTO reaches TCP_RTO_MAX, and the socket will die.

      Fix this by replacing 'tcp_jiffies32' with '(u32)icsk->icsk_timeout',
      which is exactly the timestamp of the timeout.

      However, the connection can restart from idle, in which case
      tp->rcv_tstamp could already be a long time (minutes or hours) in the
      past even on the first RTO. So we double-check the timeout against the
      duration of the retransmission.

      Meanwhile, use "2 * TCP_RTO_MAX" as the timeout to avoid the socket
      dying too soon.
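
The two-step check described above can be illustrated with a small user-space sketch. It is not the kernel code: the struct, its field names, and the helper names are hypothetical, all times are plain milliseconds rather than jiffies, and TCP_RTO_MAX is assumed to be 120 seconds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: TCP_RTO_MAX is 120 seconds, expressed here in milliseconds. */
#define RTO_MAX_MS 120000u

/* Hypothetical, simplified view of the state the check needs (all in ms). */
struct probe_state {
    uint32_t timer_deadline_ms; /* when the timer was due (role of icsk_timeout) */
    uint32_t last_ack_ms;       /* when the last ACK arrived (role of rcv_tstamp) */
    uint32_t now_ms;            /* when the timer handler actually runs */
    uint32_t rtx_start_ms;      /* when the current retransmission began */
};

/* Old check: measured from "now", so a late timer counts against the peer. */
static bool old_timed_out(const struct probe_state *s)
{
    return (uint32_t)(s->now_ms - s->last_ack_ms) > RTO_MAX_MS;
}

/* New check as described in the message: measure from the timer deadline,
 * allow 2 * RTO_MAX, and confirm with the duration of the retransmission. */
static bool probe0_timed_out(const struct probe_state *s)
{
    const uint32_t timeout = 2u * RTO_MAX_MS;
    uint32_t rcv_delta = (uint32_t)(s->timer_deadline_ms - s->last_ack_ms);
    uint32_t rtx_delta;

    if (rcv_delta <= timeout)
        return false;

    /* The last ACK may be old only because the connection was idle, so
     * also require that the retransmission itself has lasted too long. */
    rtx_delta = (uint32_t)(s->now_ms - s->rtx_start_ms);
    return rtx_delta > timeout;
}
```

With a timer armed for 120000ms that fires 122877ms after the last ACK, old_timed_out() reports a timeout while probe0_timed_out() does not. A connection that was idle for an hour and has only just started retransmitting also survives the new check.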
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Link: https://lore.kernel.org/netdev/CADxym3YyMiO+zMD4zj03YPM3FBi-1LHi6gSD2XT8pyAMM096pg@mail.gmail.com/
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: tcp: allow zero-window ACK update the window · 800a6661
      Menglong Dong authored
      For now, an ACK can update the window in the following cases, according
      to tcp_may_update_window():

      1. the ACK acknowledges new data
      2. the ACK carries new data
      3. the ACK expands the window and its seq is valid

      Now, we also allow the ACK to update the window if the window is 0 and
      its seq/ack is valid. This is for the case where the receiver replies
      with a zero-window ACK when it is under memory stress and can't queue
      the new data.
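
The rules above can be sketched as a single predicate. This is a stand-alone illustration, not the kernel function: the struct and function names are hypothetical, and mapping the three cases onto snd_una, snd_wl1 and snd_wnd is an assumption based on the usual TCP sender state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "seq1 is after seq2" on 32-bit sequence numbers. */
static bool seq_after(uint32_t seq1, uint32_t seq2)
{
    return (int32_t)(seq2 - seq1) < 0;
}

/* Hypothetical subset of the sender state used by the check. */
struct snd_state {
    uint32_t snd_una; /* oldest unacknowledged sequence number */
    uint32_t snd_wl1; /* seq of the segment used for the last window update */
    uint32_t snd_wnd; /* current send window */
};

static bool may_update_window(const struct snd_state *tp, uint32_t ack,
                              uint32_t ack_seq, uint32_t nwin)
{
    return seq_after(ack, tp->snd_una) ||      /* 1. acknowledges new data */
           seq_after(ack_seq, tp->snd_wl1) ||  /* 2. carries new data */
           (ack_seq == tp->snd_wl1 &&
            (nwin > tp->snd_wnd ||             /* 3. expands the window */
             nwin == 0));                      /* new: zero-window ACK */
}
```

In this sketch, a duplicate ACK that shrinks the window to a non-zero value is still ignored, while one that advertises a window of 0 is accepted.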
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: tcp: send zero-window ACK when no memory · e2142825
      Menglong Dong authored
      For now, the skb is dropped when there is no memory, which makes the
      client keep retransmitting until it times out. This is not friendly to
      users.

      In this patch, we reply with a zero-window ACK in this case to update
      the snd_wnd of the sender to 0. Therefore, the sender won't time out the
      connection and will probe the zero window with the retransmits.
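
The receiver-side behaviour can be pictured with a toy model. It is only a sketch under simplifying assumptions: segments arrive in order, memory is a single byte budget, and the rx_state and ack_reply types and the rx_segment() helper are hypothetical names, not kernel structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical receiver state: next expected seq and remaining memory. */
struct rx_state {
    uint32_t rcv_nxt;
    uint32_t free_bytes;
};

/* What the receiver sends back for one incoming segment. */
struct ack_reply {
    uint32_t ack;    /* cumulative ACK number */
    uint32_t window; /* advertised receive window */
    bool queued;     /* whether the payload was accepted */
};

/* Handle one in-order segment of `len` bytes starting at rx->rcv_nxt. */
static struct ack_reply rx_segment(struct rx_state *rx, uint32_t len)
{
    struct ack_reply r;

    if (len <= rx->free_bytes) {
        /* Enough memory: queue the data and advertise what is left. */
        rx->rcv_nxt += len;
        rx->free_bytes -= len;
        r.queued = true;
        r.window = rx->free_bytes;
    } else {
        /* No memory: drop the data, but still answer with a zero-window
         * ACK so the sender switches to probing instead of timing out. */
        r.queued = false;
        r.window = 0;
    }
    r.ack = rx->rcv_nxt;
    return r;
}
```

When memory is freed later, the same retransmitted segment is queued and a non-zero window is advertised again, which is the point at which the sender's probes succeed.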
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 12 Aug, 2023 3 commits
  3. 11 Aug, 2023 32 commits