    nvme-tcp: Fix possible race of io_work and direct send · 5c11f7d9
    Sagi Grimberg authored
    We may send a request (with or without its data) from two paths:
    
      1. From our I/O context nvme_tcp_io_work which is triggered from:
        - queue_rq
        - r2t reception
        - socket data_ready and write_space callbacks
      2. Directly from queue_rq if the send_list is empty (because we want to
         save the context switch associated with scheduling our io_work).
    
    However, now that we have the send_mutex, we may run into a race
    condition where none of these contexts sends the pending payload to
    the controller. Both the io_work send path and the queue_rq send path
    opportunistically attempt to acquire the send_mutex. However, queue_rq
    only attempts to send a single request, and if the io_work context
    fails to acquire the send_mutex it completes without rescheduling
    itself.
    
    The race can trigger with the following sequence:
    
      1. queue_rq sends a request (no in-capsule data) and blocks
      2. RX path receives r2t - prepares data PDU to send, adds h2cdata PDU
         to the send_list and schedules io_work
      3. io_work triggers and cannot acquire the send_mutex - because of (1),
         and ends without rescheduling itself
      4. queue_rq completes the send and returns
    
    ==> no context will send the h2cdata - timeout.
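
    The sequence above can be replayed in a small user-space model. This
    is only a sketch: the struct, its fields and the function names are
    hypothetical stand-ins, not the driver's code, and the mutex is
    modelled as a plain flag so the interleaving is deterministic.

```c
#include <stdbool.h>

/* Hypothetical model of a queue; not the driver's data structure. */
struct race_queue {
    bool send_mutex_held;   /* stands in for the send_mutex */
    int  send_list_len;     /* PDUs waiting on the send_list */
    int  sent;              /* PDUs handed to the socket */
};

/* io_work send path: trylock, and give up without rescheduling on failure. */
static bool race_io_work(struct race_queue *q)
{
    if (q->send_mutex_held)
        return false;       /* step 3: ends without rescheduling itself */
    while (q->send_list_len > 0) {
        q->send_list_len--;
        q->sent++;
    }
    return true;
}

/* Replays steps 1-4 and returns the number of PDUs left stranded. */
static int race_replay(void)
{
    struct race_queue q = { false, 1, 0 };

    /* step 1: queue_rq takes the mutex and sends exactly one request */
    q.send_mutex_held = true;
    q.send_list_len--;
    q.sent++;

    /* step 2: RX path receives r2t and queues an h2cdata PDU */
    q.send_list_len++;

    /* step 3: io_work runs while queue_rq still holds the mutex */
    (void)race_io_work(&q);

    /* step 4: queue_rq finishes its single send and drops the mutex */
    q.send_mutex_held = false;

    /* no context is scheduled any more, yet the h2cdata PDU is pending */
    return q.send_list_len;
}
```

    In this model race_replay() leaves one PDU on the send_list with no
    context scheduled to send it, which corresponds to the timeout.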
    
    Fix this by having queue_rq send as much as it can from the send_list,
    so that if anything is still left, it is because the socket buffer is
    full and the socket write_space callback will trigger, thus guaranteeing
    that a context will be scheduled to send the h2cdata PDU.
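
    The drain-until-full idea can be sketched in user space as follows.
    The names and the integer "socket space" counter are assumptions made
    for illustration; the actual change in tcp.c may be structured
    differently.

```c
#include <stdbool.h>

/* Hypothetical model of a queue; not the driver's data structure. */
struct drain_queue {
    int send_list_len;   /* PDUs waiting on the send_list */
    int sock_space;      /* free slots in the modelled socket buffer */
    int sent;            /* PDUs handed to the socket */
};

/* Sends one PDU. Returns 1 on progress, 0 if idle or the socket is full. */
static int drain_try_send(struct drain_queue *q)
{
    if (q->send_list_len == 0 || q->sock_space == 0)
        return 0;
    q->send_list_len--;
    q->sock_space--;
    q->sent++;
    return 1;
}

/* Loop run by the queue_rq path while it holds the send_mutex. */
static void drain_send_all(struct drain_queue *q)
{
    int ret;

    /* keep sending for as long as progress is made */
    do {
        ret = drain_try_send(q);
    } while (ret > 0);
}

/* After draining, leftovers can exist only if the socket buffer is full. */
static bool drain_leftover_implies_full(const struct drain_queue *q)
{
    return q->send_list_len == 0 || q->sock_space == 0;
}
```

    With three queued PDUs and room for two, drain_send_all() sends two
    and leaves one behind with the modelled socket full, which is the
    state in which write_space is expected to fire later.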
    
    Fixes: db5ad6b7 ("nvme-tcp: try to send request in queue_rq context")
    Reported-by: Potnuri Bharat Teja <bharat@chelsio.com>
    Reported-by: Samuel Jones <sjones@kalrayinc.com>
    Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
    Tested-by: Potnuri Bharat Teja <bharat@chelsio.com>
    Signed-off-by: Christoph Hellwig <hch@lst.de>