    blk-mq: avoid double ->queue_rq() because of early timeout · 82c22947
    David Jeffery authored
    David Jeffery found a double ->queue_rq() issue. So far it has been
    triggered in VM use cases, where long vmexit latency, preemption
    latency of the vCPU pthread, or a long page fault in the vCPU pthread
    can let a block IO request time out after blk_mq_start_request() has
    been called in ->queue_rq() but before the request is queued to the
    hardware. The timeout handler may then handle the request by requeueing
    it, which causes a second ->queue_rq() call and a kernel panic.
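
    The ordering described above can be replayed in a small single-threaded
    userspace model. This is only a sketch: every name in it (model_rq,
    model_start_request(), model_submit_to_hw(), model_timeout_requeue(),
    model_race()) is hypothetical and none of it is kernel code. The
    interleaving of the dispatching CPU and the timeout handler is written
    out by hand, and time is a plain tick counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of one request; not a kernel structure. */
enum model_state { MODEL_IDLE, MODEL_IN_FLIGHT };

struct model_rq {
    enum model_state state;
    long deadline;       /* tick at which the request counts as timed out */
    int hw_submissions;  /* how often the request reached the "hardware" */
};

/* First half of a modelled ->queue_rq(): mark started, arm the timeout. */
static void model_start_request(struct model_rq *rq, long now, long timeout)
{
    rq->state = MODEL_IN_FLIGHT;
    rq->deadline = now + timeout;
}

/* Second half of a modelled ->queue_rq(): hand the request to hardware. */
static void model_submit_to_hw(struct model_rq *rq)
{
    rq->hw_submissions++;
}

/* Modelled timeout handler that resolves an expired request by requeueing,
 * which runs a complete second ->queue_rq() for the same request. */
static bool model_timeout_requeue(struct model_rq *rq, long now, long timeout)
{
    if (rq->state != MODEL_IN_FLIGHT || now < rq->deadline)
        return false;                 /* not expired, nothing to do */
    rq->state = MODEL_IDLE;
    model_start_request(rq, now, timeout);
    model_submit_to_hw(rq);
    return true;
}

/* Replays the interleaving: the dispatcher stalls for stall_ticks between
 * starting the request and submitting it, and the timeout handler runs
 * during that stall. Returns the number of hardware submissions. */
int model_race(long stall_ticks)
{
    const long timeout = 30;
    struct model_rq rq = { MODEL_IDLE, 0, 0 };
    long now = 0;

    model_start_request(&rq, now, timeout);    /* ->queue_rq() begins */
    now += stall_ticks;                        /* vCPU thread is stalled */
    model_timeout_requeue(&rq, now, timeout);  /* timer fires elsewhere */
    model_submit_to_hw(&rq);                   /* original call resumes */
    return rq.hw_submissions;
}
```

    With a stall shorter than the 30-tick timeout the model submits the
    request once; with a longer stall the same request is submitted twice,
    which is the double ->queue_rq() of the report.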
    
    So far it has been the driver's responsibility to cover the race between
    timeout and completion, so in theory this should be solved in the
    driver, given that the driver has enough knowledge.
    
    But this is really a common problem: many drivers could have a similar
    issue, it could be hard to fix all affected drivers, and the race is
    not easy for a driver to handle anyway. So David suggests this patch,
    which solves the issue by draining in-progress ->queue_rq() calls.
    
    Cc: Stefan Hajnoczi <stefanha@redhat.com>
    Cc: Keith Busch <kbusch@kernel.org>
    Cc: virtualization@lists.linux-foundation.org
    Cc: Bart Van Assche <bvanassche@acm.org>
    Signed-off-by: David Jeffery <djeffery@redhat.com>
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Reviewed-by: Bart Van Assche <bvanassche@acm.org>
    Link: https://lore.kernel.org/r/20221026051957.358818-1-ming.lei@redhat.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk-mq.c 122 KB