    nvme-rdma: fix controller reset hang during traffic · 9f98772b
    Sagi Grimberg authored
    commit fe35ec58 ("block: update hctx map when use multiple maps")
    exposed an issue where we may hang waiting for a queue freeze
    during I/O. We call blk_mq_update_nr_hw_queues, which with multiple
    queue maps (which we now have for default/read/poll) attempts to
    freeze the queue. However, we never started a queue freeze when
    starting the reset, which means that inflight requests that already
    entered the queue will not complete once the queue is quiesced.
    
    So start a freeze before we quiesce the queue, and unfreeze the queue
    after we successfully connected the I/O queues (and make sure to call
    blk_mq_update_nr_hw_queues only after we are sure that the queue was
    already frozen).
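
    The ordering described above can be illustrated with a small userspace
    model. This is only a sketch under stated assumptions, not the patch
    itself: struct toy_queue and every toy_* function are hypothetical
    stand-ins, and the model assumes only that requests which entered the
    queue cannot drain while dispatch is quiesced.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a request queue; not a kernel structure. */
struct toy_queue {
    bool freezing;  /* a freeze has started: new submitters are refused */
    bool quiesced;  /* dispatch is stopped: entered requests cannot finish */
    int  entered;   /* requests that entered the queue and are not done */
};

/* A submitter enters the queue unless a freeze has already started. */
static bool toy_submit(struct toy_queue *q)
{
    if (q->freezing)
        return false;
    q->entered++;
    return true;
}

/*
 * Wait for entered requests to drain. Returns false in the situation
 * where real code would block forever: requests are stuck behind a
 * quiesced queue.
 */
static bool toy_wait_freeze(struct toy_queue *q)
{
    if (q->quiesced && q->entered > 0)
        return false;
    q->entered = 0;
    return true;
}

/* Models an update step that freezes the queue internally. */
static bool toy_update_nr_hw_queues(struct toy_queue *q)
{
    q->freezing = true;
    return toy_wait_freeze(q);
}

/* Old ordering: quiesce, then update while requests are still stuck. */
static bool toy_reset_old(struct toy_queue *q)
{
    q->quiesced = true;
    (void)toy_submit(q);          /* I/O keeps arriving during the reset */
    return toy_update_nr_hw_queues(q);
}

/* Fixed ordering: freeze, quiesce, reconnect, restart, wait, update. */
static bool toy_reset_fixed(struct toy_queue *q)
{
    bool accepted, ok;

    q->freezing = true;           /* start the freeze before quiescing */
    q->quiesced = true;
    accepted = toy_submit(q);     /* refused, so nothing new gets stuck */
    assert(!accepted);
    (void)accepted;
    /* ... I/O queues would be reconnected here ... */
    q->quiesced = false;          /* restart dispatch so requests drain */
    if (!toy_wait_freeze(q))
        return false;
    ok = toy_update_nr_hw_queues(q);  /* queue is known to be frozen */
    q->freezing = false;          /* unfreeze only after success */
    return ok;
}
```

    Starting from a queue with one entered request, toy_reset_old()
    reports the stuck state while toy_reset_fixed() drains and completes.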
    
    This follows how the pci driver handles resets.
    
    Fixes: fe35ec58 ("block: update hctx map when use multiple maps")
    Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
rdma.c 63.5 KB