    nvme-pci: fix controller reset hang when racing with nvme_timeout · d4060d2b
    Tao Chiu authored
    reset_work() in nvme-pci may hang forever in the following scenario:
    1) A command timeout triggers a reset because the controller is
       temporarily unresponsive.
    2) nvme_reset_work() restarts the admin queue in
       nvme_alloc_admin_tags(). At the same time, a user-submitted admin
       command is queued and waits for completion. Then, reset_work()
       changes the controller state to CONNECTING and submits an identify
       command.
    3) However, the controller still does not respond to any command, so
       the user-submitted command times out. nvme_timeout() does not find
       a completion on the cq, and any timeout that occurs in the
       CONNECTING state causes a controller shutdown.
    4) Normally, the identify command in reset_work() would be canceled
       with SC_HOST_ABORTED by nvme_dev_disable(), and reset_work() could
       then tear down the controller accordingly. But the controller
       happens to come back online and respond to the identify command
       before nvme_dev_disable() has reaped it.
    5) reset_work() continues to setup_io_queues() as it observes no error
       in init_identify(). However, the admin queue has already been
       quiesced in dev_disable(). Thus, any following commands would be
       blocked forever in blk_execute_rq().
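
    The sequence in steps 2) to 5) can be replayed in a small userspace
    model. This is only a sketch under simplifying assumptions: every
    toy_* name is hypothetical and is not a kernel API, the race is
    replayed sequentially, and "would block forever" is modeled as a
    false return value instead of an actual hang.

    ```c
    #include <stdbool.h>

    /* Hypothetical toy model of an admin queue; not kernel code. */
    struct toy_queue {
        bool quiesced;   /* set by the modeled nvme_dev_disable() */
        int  queued;     /* requests waiting for dispatch */
        int  dispatched; /* requests handed to the modeled hardware */
    };

    /* Models run_hw_queue: a quiesced queue dispatches nothing. */
    static int toy_run_hw_queue(struct toy_queue *q)
    {
        int n;

        if (q->quiesced)
            return 0;
        n = q->queued;
        q->dispatched += n;
        q->queued = 0;
        return n;
    }

    /*
     * Models a synchronous submit. Returns true if the request was
     * dispatched, false if the caller would wait forever.
     */
    static bool toy_submit_sync(struct toy_queue *q)
    {
        q->queued++;
        return toy_run_hw_queue(q) > 0;
    }

    /* Replays steps 2) to 5); returns true if reset work ends up stuck. */
    static bool toy_replay_race(void)
    {
        struct toy_queue admin = { false, 0, 0 };
        bool user_sent, identify_sent, identify_ok, setup_sent;

        /* Step 2: the user command and the identify command are sent. */
        user_sent = toy_submit_sync(&admin);
        identify_sent = toy_submit_sync(&admin);

        /* Step 3: the user command times out; the queue is quiesced. */
        admin.quiesced = true;

        /* Step 4: identify completes before it could be canceled. */
        identify_ok = identify_sent;

        /* Step 5: the next admin command is never dispatched. */
        setup_sent = toy_submit_sync(&admin);

        return user_sent && identify_ok && !setup_sent;
    }
    ```

    In this model toy_replay_race() returns true: the last submit stays
    queued because nothing unquiesces the admin queue afterwards.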
    
    This can be fixed by restricting user-submitted commands in
    nvme_queue_rq() when the controller is not in the LIVE state, as has
    already been done in fabrics.
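
    The idea of such a gate can be sketched as follows. This is a
    simplified userspace illustration, not the code of this patch: the
    model_* names, the reduced state list and the status values are all
    assumptions made for the example, and "restricting" is modeled here
    as rejecting the command outright.

    ```c
    #include <stdbool.h>

    /* Hypothetical, reduced set of controller states; not the kernel's. */
    enum model_ctrl_state {
        MODEL_LIVE,
        MODEL_RESETTING,
        MODEL_CONNECTING,
    };

    /* Hypothetical result of a queueing attempt. */
    enum model_status {
        MODEL_STS_OK,     /* command is passed on to the queue */
        MODEL_STS_REJECT, /* command is refused before queueing */
    };

    /* Hypothetical request: only its origin matters for the gate. */
    struct model_req {
        bool from_user;
    };

    /*
     * A LIVE controller accepts everything. In any other state only
     * driver-internal commands (such as the identify sent by reset
     * work) are let through.
     */
    static bool model_check_ready(enum model_ctrl_state state,
                                  const struct model_req *req)
    {
        if (state == MODEL_LIVE)
            return true;
        return !req->from_user;
    }

    /* Models the check placed at the top of the queueing path. */
    static enum model_status model_queue_rq(enum model_ctrl_state state,
                                            const struct model_req *req)
    {
        if (!model_check_ready(state, req))
            return MODEL_STS_REJECT;
        return MODEL_STS_OK;
    }
    ```

    With this gate, the user command from step 2) is refused while the
    controller is not LIVE, so it cannot time out under CONNECTING and
    trigger the shutdown described in step 3).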
    
    ```
    nvme_reset_work():                     |
        nvme_alloc_admin_tags()            |
                                           | nvme_submit_user_cmd():
        nvme_init_identify():              |     ...
            __nvme_submit_sync_cmd():      |
                ...                        |     ...
    ---------------------------------------> nvme_timeout():
    (Controller starts responding)         |     nvme_dev_disable(, true):
        nvme_setup_io_queues():            |
            __nvme_submit_sync_cmd():      |
                (hung in blk_execute_rq    |
                 since run_hw_queue sees   |
                 queue quiesced)           |
    
    ```
    Signed-off-by: Tao Chiu <taochiu@synology.com>
    Signed-off-by: Cody Wong <codywong@synology.com>
    Reviewed-by: Leon Chien <leonchien@synology.com>
    Reviewed-by: Keith Busch <kbusch@kernel.org>
    Signed-off-by: Christoph Hellwig <hch@lst.de>