    block: treat poll queue enter similarly to timeouts · 33391eec
    Jens Axboe authored
    We ran into an issue where a production workload would randomly grind to
    a halt and not continue until the pending IO had timed out. This turned
    out to be a complicated interaction between queue freezing and polled
    IO:
    
    1) You have an application that does polled IO. At any point in time,
       there may be polled IO pending.
    
    2) You have a monitoring application that issues a passthrough command,
       which is marked with side effects such that it needs to freeze the
       queue.
    
    3) The passthrough command is started, which calls
       blk_freeze_queue_start() on the device. At this point the queue is
       marked frozen, and any attempt to enter the queue will fail (for
       non-blocking callers) or block.
    
    4) Now the driver calls blk_mq_freeze_queue_wait(), which will return
       when the queue is quiesced and pending IO has completed.
    
    5) The pending IO is polled IO, but any attempt to poll IO through the
       normal iocb_bio_iopoll() -> bio_poll() will fail when it gets to
       bio_queue_enter() as the queue is frozen. Rather than poll and
       complete IO, the polling threads will sit in a tight loop attempting
       to poll, but failing to enter the queue to do so.
    
    The end result is that progress for either application will be stalled
    until all pending polled IO has timed out. This causes obvious huge
    latency issues for the application doing polled IO, but also long delays
    for the passthrough command.
    
    Fix this by treating queue enter for polled IO just like we do for
    timeouts. This allows quick quiesce of the queue as we still poll and
    complete this IO, while still disallowing queueing up new IO.
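    
    To illustrate the difference between the two ways of entering the
    queue, here is a small userspace model. It is a sketch only and not
    the kernel change itself: struct model_queue and every model_*()
    helper are hypothetical names invented for this example, and a plain
    counter stands in for the queue usage reference. The assumption is
    that timeout-style entry only needs the usage count to still be
    nonzero, while normal entry is refused once a freeze has started.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a request queue; not kernel code. */
struct model_queue {
	long usage;		/* stands in for the queue usage reference */
	bool frozen;		/* set once a freeze has been started */
	int pending_polled;	/* polled IO issued but not yet completed */
};

/* Model of starting a freeze: new entries are refused from now on. */
static void model_freeze_start(struct model_queue *q)
{
	q->frozen = true;
}

/* Normal entry path: fails once the queue is marked frozen. */
static bool model_queue_enter(struct model_queue *q)
{
	if (q->frozen)
		return false;
	q->usage++;
	return true;
}

/* Timeout-style entry: succeeds as long as references are still held. */
static bool model_queue_tryget(struct model_queue *q)
{
	if (q->usage == 0)
		return false;
	q->usage++;
	return true;
}

static void model_queue_exit(struct model_queue *q)
{
	assert(q->usage > 0);
	q->usage--;
}

/* Issue one polled IO; it holds a queue reference until it completes. */
static bool model_submit_polled(struct model_queue *q)
{
	if (!model_queue_enter(q))
		return false;
	q->pending_polled++;
	return true;
}

/*
 * One poll attempt. Returns the number of IOs completed (0 or 1).
 * enter_like_timeout selects which of the two entry paths is used.
 */
static int model_poll(struct model_queue *q, bool enter_like_timeout)
{
	bool entered = enter_like_timeout ? model_queue_tryget(q)
					  : model_queue_enter(q);
	int done = 0;

	if (!entered)
		return 0;	/* the poller can only spin and retry */

	if (q->pending_polled > 0) {
		q->pending_polled--;
		model_queue_exit(q);	/* reference held by the completed IO */
		done = 1;
	}
	model_queue_exit(q);		/* the poller's own reference */
	return done;
}

/* The freeze wait can return once no references remain. */
static bool model_freeze_done(const struct model_queue *q)
{
	return q->frozen && q->usage == 0;
}
```

    In this model, with one polled IO pending when model_freeze_start()
    is called, model_poll(&q, false) completes nothing and
    model_freeze_done() stays false, which mirrors the stall described
    above. model_poll(&q, true) completes the IO and lets the freeze
    finish, while model_submit_polled() is still refused.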
    Reviewed-by: Keith Busch <kbusch@kernel.org>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk-core.c 33.1 KB