    SCSI: fix queue cleanup race before queue initialization is done · 8dc765d4
    Ming Lei authored
    Commit c2856ae2 ("blk-mq: quiesce queue before freeing queue") has
    already fixed this race. However, the implied synchronize_rcu()
    in blk_mq_quiesce_queue() can slow down LUN probing considerably,
    which caused a performance regression.
    
    Then 1311326c ("blk-mq: avoid to synchronize rcu inside blk_cleanup_queue()")
    tried to avoid the unnecessary synchronize_rcu() by quiescing the
    queue only when queue initialization is done, because it is common
    to probe many nonexistent LUNs.
    
    However, it turns out that it isn't safe to quiesce the queue only
    when queue initialization is done. When a SCSI command completes,
    the submitter of the command can be woken up immediately and the
    SCSI device may then be removed while the queue run in
    scsi_end_request() is still in progress, which can cause a kernel
    panic.
    
    In the Red Hat QE lab, there are several reports of this kind of
    kernel panic being triggered during boot.
    
    This patch tries to address the issue by grabbing a queue usage
    counter reference while freeing the request and running the queue
    afterwards.
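
The idea can be illustrated with a minimal userspace sketch. It is not the kernel change itself: struct demo_queue and every demo_* function are hypothetical stand-ins, and a plain C11 atomic models the queue usage counter. The sketch only shows why holding a reference across request completion and the following queue run keeps the queue alive even if the device is removed in between.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a request queue (not the kernel struct). */
struct demo_queue {
    atomic_int usage;  /* models the queue usage counter */
    bool released;     /* set once the last reference is dropped */
    int runs;          /* number of times the queue was run */
};

static void demo_queue_init(struct demo_queue *q)
{
    atomic_init(&q->usage, 1);  /* initial reference owned by the device */
    q->released = false;
    q->runs = 0;
}

static void demo_queue_get(struct demo_queue *q)
{
    atomic_fetch_add(&q->usage, 1);
}

static void demo_queue_put(struct demo_queue *q)
{
    /* atomic_fetch_sub returns the previous value; 1 means we held the last ref */
    if (atomic_fetch_sub(&q->usage, 1) == 1)
        q->released = true;  /* real code would tear the queue down here */
}

/* Models device removal by the woken-up submitter: drops the device's reference. */
static void demo_remove_device(struct demo_queue *q)
{
    demo_queue_put(q);
}

static void demo_run_queue(struct demo_queue *q)
{
    assert(!q->released);  /* running a released queue is the bug being avoided */
    q->runs++;
}

/* Models request completion followed by a queue run, guarded by a reference. */
static void demo_end_request(struct demo_queue *q, bool removed_on_completion)
{
    demo_queue_get(q);               /* pin the queue first */
    if (removed_on_completion)
        demo_remove_device(q);       /* completion may lead to device removal */
    demo_run_queue(q);               /* still safe: our reference is held */
    demo_queue_put(q);               /* release; may drop the last reference */
}
```

Without the get/put pair in demo_end_request(), the removal would drop the only reference and the assertion in demo_run_queue() would fire, which mirrors the race described above.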
    
    Fixes: 1311326c ("blk-mq: avoid to synchronize rcu inside blk_cleanup_queue()")
    Cc: Andrew Jones <drjones@redhat.com>
    Cc: Bart Van Assche <bart.vanassche@wdc.com>
    Cc: linux-scsi@vger.kernel.org
    Cc: Martin K. Petersen <martin.petersen@oracle.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
    Cc: stable <stable@vger.kernel.org>
    Cc: jianchao.wang <jianchao.w.wang@oracle.com>
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk-core.c 98.7 KB