• James Smart's avatar
    nvme: if_ready checks to fail io to deleting controller · 6cdefc6e
    James Smart authored
    The revised if_ready checks skipped over the case of returning error when
    the controller is being deleted.  Instead it was returning BUSY, which
    caused the ios to retry, which caused the ns delete to hang waiting for
    the ios to drain.
    
    Stack trace of hang looks like:
     kworker/u64:2   D    0    74      2 0x80000000
     Workqueue: nvme-delete-wq nvme_delete_ctrl_work [nvme_core]
     Call Trace:
      ? __schedule+0x26d/0x820
      schedule+0x32/0x80
      blk_mq_freeze_queue_wait+0x36/0x80
      ? remove_wait_queue+0x60/0x60
      blk_cleanup_queue+0x72/0x160
      nvme_ns_remove+0x106/0x140 [nvme_core]
      nvme_remove_namespaces+0x7e/0xa0 [nvme_core]
      nvme_delete_ctrl_work+0x4d/0x80 [nvme_core]
      process_one_work+0x160/0x350
      worker_thread+0x1c3/0x3d0
      kthread+0xf5/0x130
      ? process_one_work+0x350/0x350
      ? kthread_bind+0x10/0x10
      ret_from_fork+0x1f/0x30
    
    Extend nvmf_fail_nonready_command() to supply the controller pointer so
    that the controller state can be looked at. Fail any io to a controller
    that is deleting.
    
    Fixes: 3bc32bb1 ("nvme-fabrics: refactor queue ready check")
    Fixes: 35897b92 ("nvme-fabrics: fix and refine state checks in __nvmf_check_ready")
    Signed-off-by: default avatarJames Smart <james.smart@broadcom.com>
    Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
    Tested-by: default avatarEwan D. Milne <emilne@redhat.com>
    Reviewed-by: default avatarEwan D. Milne <emilne@redhat.com>
    6cdefc6e
fabrics.c 28 KB