    nvme: fix possible io failures when removing multipathed ns · 942c09be
    Anton Eidelman authored
    [ Upstream commit 2181e455 ]
    
    When a shared namespace is removed, we call blk_cleanup_queue()
    when the device can still be accessed as the current path and this can
    result in submission to a dying queue. Hence, direct_make_request()
    called by our mpath device may fail (propagating the failure to userspace).
    Instead, we want to failover this I/O to a different path if one exists.
    Thus, before we clean up the request queue, we make sure that the device is
    cleared from the current path and cannot be selected again as such.
    
    Fix this by:
    - clearing the ns from the head->list and synchronizing RCU to make sure
      there is no concurrent path search that restores it as the current path
    - clearing the mpath current path in order to trigger a subsequent path
      search, and synchronizing SRCU to wait for any ongoing request submissions
    - safely continuing to namespace removal and blk_cleanup_queue()
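The ordering described above can be illustrated with a small single-threaded userspace model. This is only a sketch under stated assumptions, not the kernel code touched by this commit: `struct path`, `struct head`, `find_path()`, `submit_io()` and `remove_path()` are hypothetical stand-ins, and the RCU/SRCU grace periods are represented only by comments, since nothing runs concurrently here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PATHS 4

/* Hypothetical stand-in for one path (namespace) to the shared device. */
struct path {
    bool queue_dying;  /* models the queue state after blk_cleanup_queue() */
    int  id;
};

/* Hypothetical stand-in for the multipath head. */
struct head {
    struct path *list[MAX_PATHS]; /* models head->list */
    struct path *current;         /* models the cached current path */
};

/* Path search: reuse the cached path, else cache the first listed one. */
struct path *find_path(struct head *h)
{
    if (h->current)
        return h->current;
    for (size_t i = 0; i < MAX_PATHS; i++) {
        if (h->list[i]) {
            h->current = h->list[i];
            return h->current;
        }
    }
    return NULL;
}

/* Returns the id of the path that took the I/O, or -1 if the I/O fails. */
int submit_io(struct head *h)
{
    struct path *p = find_path(h);

    if (!p || p->queue_dying)
        return -1; /* no path, or submission to a dying queue */
    return p->id;
}

/* Removal in the order the commit message describes. */
void remove_path(struct head *h, struct path *p)
{
    /* Step 1: unlink from the list so a path search cannot pick it again.
     * (In the kernel, an RCU grace period would be awaited here.) */
    for (size_t i = 0; i < MAX_PATHS; i++)
        if (h->list[i] == p)
            h->list[i] = NULL;

    /* Step 2: clear the cached current path to force a new path search.
     * (In the kernel, an SRCU grace period would be awaited here.) */
    if (h->current == p)
        h->current = NULL;

    /* Step 3: only now is it safe to tear down the queue. */
    assert(h->current != p);
    p->queue_dying = true;
}
```

In this model, an I/O submitted after `remove_path()` triggers a fresh path search and lands on a remaining path if one exists. If step 3 ran first, `submit_io()` could still find the removed path cached as current and return a failure.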
    
    Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
    Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
core.c 98.8 KB