Commits · fda0b5ba9d5a9f6bfab9bc195f7a8fce13aedf61 · Kirill Smelkov / linux

16 Jun, 2021 5 commits

docs: block/bfq: describe per-device weight · fda0b5ba

Kir Kolyshkin authored Jun 14, 2021

The functionality of setting per-device weight for BFQ was added
in v5.4 (commit 795fe54c), but the documentation was never
updated.

While at it, improve formatting a bit.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Link: https://lore.kernel.org/r/20210614214109.207430-1-kolyshkin@gmail.comAcked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

fda0b5ba

block: mark queue init done at the end of blk_register_queue · a72c374f

Ming Lei authored Jun 09, 2021

Mark queue init done when everything is done well in blk_register_queue(),
so that wbt_enable_default() can be run quickly without any RCU period
involved since adding rq qos requires to freeze queue.

Also no any side effect by delaying to mark queue init done.
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Link: https://lore.kernel.org/r/20210609015822.103433-3-ming.lei@redhat.comSigned-off-by: Jens Axboe <axboe@kernel.dk>

a72c374f

block: fix race between adding/removing rq qos and normal IO · 2cafe29a

Ming Lei authored Jun 09, 2021

Yi reported several kernel panics on:

[16687.001777] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
...
[16687.163549] pc : __rq_qos_track+0x38/0x60

or

[  997.690455] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
...
[  997.850347] pc : __rq_qos_done+0x2c/0x50

Turns out it is caused by race between adding rq qos(wbt) and normal IO
because rq_qos_add can be run when IO is being submitted, fix this issue
by freezing queue before adding/deleting rq qos to queue.

rq_qos_exit() needn't to freeze queue because it is called after queue
has been frozen.

iolatency calls rq_qos_add() during allocating queue, so freezing won't
add delay because queue usage refcount works at atomic mode at that
time.

iocost calls rq_qos_add() when writing cgroup attribute file, that is
fine to freeze queue at that time since we usually freeze queue when
storing to queue sysfs attribute, meantime iocost only exists on the
root cgroup.

wbt_init calls it in blk_register_queue() and queue sysfs attribute
store(queue_wb_lat_store() when write it 1st time in case of !BLK_WBT_MQ),
the following patch will speedup the queue freezing in wbt_init.
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Link: https://lore.kernel.org/r/20210609015822.103433-2-ming.lei@redhat.comSigned-off-by: Jens Axboe <axboe@kernel.dk>

2cafe29a

loop: fix order of cleaning up the queue and freeing the tagset · 6a03cd98

Christoph Hellwig authored Jun 16, 2021

We must release the queue before freeing the tagset.

Fixes: 1c99502f ("loop: use blk_mq_alloc_disk and blk_cleanup_disk")
Reported-by: Bruno Goncalves <bgoncalv@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6a03cd98

mtd_blkdevs: initialze new->rq in add_mtd_blktrans_dev · 07a719f8

Christoph Hellwig authored Jun 16, 2021

Various places expect the request_queue in ->rq.  Initialize it to
avoid NULL pointer derefences.

Fixes: 6966bb92 ("mtd_blkdevs: use blk_mq_alloc_disk")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

07a719f8

11 Jun, 2021 30 commits

z2ram: use blk_mq_alloc_disk and blk_cleanup_disk · ec06c989

Christoph Hellwig authored Jun 02, 2021

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-31-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

ec06c989

ataflop: use blk_mq_alloc_disk and blk_cleanup_disk · fd71c8a8

Christoph Hellwig authored Jun 02, 2021

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-30-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

fd71c8a8

amiflop: use blk_mq_alloc_disk and blk_cleanup_disk · f6d82974

Christoph Hellwig authored Jun 02, 2021

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-29-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

f6d82974

scm_blk: use blk_mq_alloc_disk and blk_cleanup_disk · c06cf063

Christoph Hellwig authored Jun 02, 2021

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-28-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

c06cf063

ubi: use blk_mq_alloc_disk and blk_cleanup_disk · 77567b25

Christoph Hellwig authored Jun 02, 2021

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-27-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

77567b25

xen-blkfront: use blk_mq_alloc_disk and blk_cleanup_disk · 3b62c140

Christoph Hellwig authored Jun 02, 2021

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-26-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

3b62c140

sx8: use blk_mq_alloc_disk and blk_cleanup_disk · 69387403

Christoph Hellwig authored Jun 02, 2021

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-25-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

69387403

rnbd: use blk_mq_alloc_disk and blk_cleanup_disk · 2c6ee0ae

Christoph Hellwig authored Jun 02, 2021

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-24-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

2c6ee0ae

rbd: use blk_mq_alloc_disk and blk_cleanup_disk · 195b1956

Christoph Hellwig authored Jun 02, 2021

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-23-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

195b1956

pd: use blk_mq_alloc_disk and blk_cleanup_disk · 262d431f

Christoph Hellwig authored Jun 02, 2021

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-22-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

262d431f

nullb: use blk_mq_alloc_disk · 6759b1a2

Christoph Hellwig authored Jun 02, 2021

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-21-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

6759b1a2

nbd: use blk_mq_alloc_disk and blk_cleanup_disk · 4af5f2e0

Christoph Hellwig authored Jun 02, 2021

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-20-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

4af5f2e0

loop: use blk_mq_alloc_disk and blk_cleanup_disk · 1c99502f

Christoph Hellwig authored Jun 02, 2021

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-19-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

1c99502f

floppy: use blk_mq_alloc_disk and blk_cleanup_disk · 34f84aef

Christoph Hellwig authored Jun 02, 2021

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-18-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

34f84aef

aoe: use blk_mq_alloc_disk and blk_cleanup_disk · 6560ec96

Christoph Hellwig authored Jun 02, 2021

Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-17-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

6560ec96

blk-mq: remove blk_mq_init_sq_queue · 08c1d480

Christoph Hellwig authored Jun 02, 2021

All users are gone now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-16-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

08c1d480

gdrom: use blk_mq_alloc_disk · 0592c3d1

Christoph Hellwig authored Jun 02, 2021

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-15-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

0592c3d1

sunvdc: use blk_mq_alloc_disk · afea05a1

Christoph Hellwig authored Jun 02, 2021

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-14-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

afea05a1

swim: use blk_mq_alloc_disk · 51fbfedf

Christoph Hellwig authored Jun 02, 2021

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-13-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

51fbfedf

swim3: use blk_mq_alloc_disk · 9c8463e8

Christoph Hellwig authored Jun 02, 2021

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-12-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

9c8463e8

ps3disk: use blk_mq_alloc_disk · 89662ac5

Christoph Hellwig authored Jun 02, 2021

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Geoff Levand <geoff@infradead.org>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-11-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

89662ac5

mtd_blkdevs: use blk_mq_alloc_disk · 6966bb92

Christoph Hellwig authored Jun 02, 2021

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-10-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

6966bb92

mspro: use blk_mq_alloc_disk · 51ed5bd5

Christoph Hellwig authored Jun 02, 2021

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-9-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

51ed5bd5

ms_block: use blk_mq_alloc_disk · f368b7d7

Christoph Hellwig authored Jun 02, 2021

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-8-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

f368b7d7

pf: use blk_mq_alloc_disk · c684b577

Christoph Hellwig authored Jun 02, 2021

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-7-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

c684b577

pcd: use blk_mq_alloc_disk · 9c4f8971

Christoph Hellwig authored Jun 02, 2021

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-6-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

9c4f8971

virtio-blk: use blk_mq_alloc_disk · 89a5f065

Christoph Hellwig authored Jun 02, 2021

Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue
allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-5-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

89a5f065

blk-mq: add the blk_mq_alloc_disk APIs · b461dfc4

Christoph Hellwig authored Jun 02, 2021

Add a new API to allocate a gendisk including the request_queue for use
with blk-mq based drivers. This is to avoid boilerplate code in drivers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-4-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

b461dfc4

blk-mq: improve the blk_mq_init_allocated_queue interface · 26a9750a

Christoph Hellwig authored Jun 02, 2021

Don't return the passed in request_queue but a normal error code, and
drop the elevator_init argument in favor of just calling elevator_init_mq
directly from dm-rq.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-3-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

26a9750a

blk-mq: factor out a blk_mq_alloc_sq_tag_set helper · cdb14e0f

Christoph Hellwig authored Jun 02, 2021

Factour out a helper to initialize a simple single hw queue tag_set from
blk_mq_init_sq_queue. This will allow to phase out blk_mq_init_sq_queue
in favor of a more symmetric and general API.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210602065345.355274-2-hch@lst.deSigned-off-by: Jens Axboe <axboe@kernel.dk>

cdb14e0f

09 Jun, 2021 1 commit

libnvdimm/pmem: Fix blk_cleanup_disk() usage · a624eb52

Dan Williams authored Jun 07, 2021

The queue_to_disk() helper can not be used after del_gendisk()
communicate @disk via the pgmap->owner.

Otherwise, queue_to_disk() returns NULL resulting in the splat below.

 Kernel attempted to read user page (330) - exploit attempt? (uid: 0)
 BUG: Kernel NULL pointer dereference on read at 0x00000330
 Faulting instruction address: 0xc000000000906344
 Oops: Kernel access of bad area, sig: 11 [#1]
 [..]
 NIP [c000000000906344] pmem_pagemap_cleanup+0x24/0x40
 LR [c0000000004701d4] memunmap_pages+0x1b4/0x4b0
 Call Trace:
 [c000000022cbb9c0] [c0000000009063c8] pmem_pagemap_kill+0x28/0x40 (unreliable)
 [c000000022cbb9e0] [c0000000004701d4] memunmap_pages+0x1b4/0x4b0
 [c000000022cbba90] [c0000000008b28a0] devm_action_release+0x30/0x50
 [c000000022cbbab0] [c0000000008b39c8] release_nodes+0x2f8/0x3e0
 [c000000022cbbb60] [c0000000008ac440] device_release_driver_internal+0x190/0x2b0
 [c000000022cbbba0] [c0000000008a8450] unbind_store+0x130/0x170
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Fixes: 87eb73b2 ("nvdimm-pmem: convert to blk_alloc_disk/blk_cleanup_disk")
Link: http://lore.kernel.org/r/DFB75BA8-603F-4A35-880B-C5B23EF8FA7D@linux.vnet.ibm.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/162310994435.1571616.334551212901820961.stgit@dwillia2-desk3.amr.corp.intel.com
[axboe: fold in compile warning fix]
Signed-off-by: Jens Axboe <axboe@kernel.dk>

a624eb52

08 Jun, 2021 2 commits

rq-qos: fix missed wake-ups in rq_qos_throttle try two · 11c7aa0d

Jan Kara authored Jun 07, 2021

Commit 545fbd07 ("rq-qos: fix missed wake-ups in rq_qos_throttle")
tried to fix a problem that a process could be sleeping in rq_qos_wait()
without anyone to wake it up. However the fix is not complete and the
following can still happen:

CPU1 (waiter1)		CPU2 (waiter2)		CPU3 (waker)
rq_qos_wait()		rq_qos_wait()
  acquire_inflight_cb() -> fails
			  acquire_inflight_cb() -> fails

						completes IOs, inflight
						  decreased
  prepare_to_wait_exclusive()
			  prepare_to_wait_exclusive()
  has_sleeper = !wq_has_single_sleeper() -> true as there are two sleepers
			  has_sleeper = !wq_has_single_sleeper() -> true
  io_schedule()		  io_schedule()

Deadlock as now there's nobody to wakeup the two waiters. The logic
automatically blocking when there are already sleepers is really subtle
and the only way to make it work reliably is that we check whether there
are some waiters in the queue when adding ourselves there. That way, we
are guaranteed that at least the first process to enter the wait queue
will recheck the waiting condition before going to sleep and thus
guarantee forward progress.

Fixes: 545fbd07 ("rq-qos: fix missed wake-ups in rq_qos_throttle")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210607112613.25344-1-jack@suse.czSigned-off-by: Jens Axboe <axboe@kernel.dk>

11c7aa0d

block: return the correct bvec when checking for gaps · c9c9762d

Long Li authored Jun 07, 2021

After commit 07173c3e ("block: enable multipage bvecs"), a bvec can
have multiple pages. But bio_will_gap() still assumes one page bvec while
checking for merging. If the pages in the bvec go across the
seg_boundary_mask, this check for merging can potentially succeed if only
the 1st page is tested, and can fail if all the pages are tested.

Later, when SCSI builds the SG list the same check for merging is done in
__blk_segment_map_sg_merge() with all the pages in the bvec tested. This
time the check may fail if the pages in bvec go across the
seg_boundary_mask (but tested okay in bio_will_gap() earlier, so those
BIOs were merged). If this check fails, we end up with a broken SG list
for drivers assuming the SG list not having offsets in intermediate pages.
This results in incorrect pages written to the disk.

Fix this by returning the multi-page bvec when testing gaps for merging.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Jeffle Xu <jefflexu@linux.alibaba.com>
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: 07173c3e ("block: enable multipage bvecs")
Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/1623094445-22332-1-git-send-email-longli@linuxonhyperv.comSigned-off-by: Jens Axboe <axboe@kernel.dk>

c9c9762d

03 Jun, 2021 2 commits

block: Update blk_update_request() documentation · 7cc2623d

Bart Van Assche authored May 19, 2021

Although the original intent was to use blk_update_request() in stacking
block drivers only, it is used much more widely today. Reflect this in the
documentation block above this function. See also:
* commit 32fab448 ("block: add request update interface").
* commit 2e60e022 ("block: clean up request completion API").
* commit ed6565e7 ("block: handle partial completions for special
  payload requests").

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20210519175226.8853-1-bvanassche@acm.orgSigned-off-by: Jens Axboe <axboe@kernel.dk>

7cc2623d

block: Do not pull requests from the scheduler when we cannot dispatch them · 61347154

Jan Kara authored Jun 03, 2021

Provided the device driver does not implement dispatch budget accounting
(which only SCSI does) the loop in __blk_mq_do_dispatch_sched() pulls
requests from the IO scheduler as long as it is willing to give out any.
That defeats scheduling heuristics inside the scheduler by creating
false impression that the device can take more IO when it in fact
cannot.

For example with BFQ IO scheduler on top of virtio-blk device setting
blkio cgroup weight has barely any impact on observed throughput of
async IO because __blk_mq_do_dispatch_sched() always sucks out all the
IO queued in BFQ. BFQ first submits IO from higher weight cgroups but
when that is all dispatched, it will give out IO of lower weight cgroups
as well. And then we have to wait for all this IO to be dispatched to
the disk (which means lot of it actually has to complete) before the
IO scheduler is queried again for dispatching more requests. This
completely destroys any service differentiation.

So grab request tag for a request pulled out of the IO scheduler already
in __blk_mq_do_dispatch_sched() and do not pull any more requests if we
cannot get it because we are unlikely to be able to dispatch it. That
way only single request is going to wait in the dispatch list for some
tag to free.
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210603104721.6309-1-jack@suse.czSigned-off-by: Jens Axboe <axboe@kernel.dk>

61347154