Commits · 385411ffba0c3305491346b98ba4d2cd8063f002 · Kirill Smelkov / linux

02 Mar, 2022 2 commits

Christoph Hellwig authored Mar 01, 2022

Just use the %pg format specifier instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

385411ff

dm-zoned: remove the ->name field in struct dmz_dev · 977ff73e

Christoph Hellwig authored Mar 01, 2022

Just use the %pg format specifier to print the block device name
directly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

977ff73e

22 Feb, 2022 11 commits

dm: remove unnecessary local variables in __bind · f5b4aee1

Mike Snitzer authored Feb 22, 2022

Also remove empty newline before 'out:' label at end of __bind.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

f5b4aee1

dm: requeue IO if mapping table not yet available · fa247089

Mike Snitzer authored Feb 22, 2022

Update both bio-based and request-based DM to requeue IO if the
mapping table is not available.

The race of IO being submitted before the DM device is ready is very
narrow, yet it is possible during the initial table load because the DM
device's request_queue is created beforehand, so it is best to requeue
IO to handle this unlikely case.
Reported-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

fa247089

dm io: remove stale comment block for dm_io() · a6a4901a

Barry Song authored Feb 19, 2022

Commit 7eaceacc ("block: remove per-queue plugging") dropped
unplug_delay and blk_unplug(). Plus, the current kernel has no
fundamental difference between sync_io() and async_io() except
sync_io() uses sync_io_complete() as the notify.fn and explicitly
calls wait_for_completion_io() to sync. The comment isn't valid
any more.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

a6a4901a

dm thin metadata: remove unused dm_thin_remove_block and __remove · 75274a4b

Zhiqiang Liu authored Feb 16, 2022

Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

75274a4b

dm thin: use time_is_before_jiffies instead of open coding it · 8ca8b1e1

Wang Qing authored Feb 14, 2022

Use time_is_before_jiffies() to improve code readability.
Signed-off-by: Wang Qing <wangqing@vivo.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

8ca8b1e1
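
The readability gain from a named helper is easier to see next to the open-coded form. The following is a minimal user-space sketch, not the kernel macro itself: `ticks_t`, `ticks_after()` and `deadline_is_before()` are hypothetical names, and the signed-difference trick assumes the usual two's-complement conversion that GCC and Clang provide.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a free-running tick counter such as jiffies. */
typedef uint32_t ticks_t;

/*
 * True if 'a' is strictly later than 'b', even when the counter has
 * wrapped around. The comparison is done on the signed difference,
 * which is the idea behind wraparound-safe time comparisons.
 */
static bool ticks_after(ticks_t a, ticks_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Named helper: true if 'deadline' already lies in the past at 'now'. */
static bool deadline_is_before(ticks_t deadline, ticks_t now)
{
    return ticks_after(now, deadline);
}
```

A caller then writes `deadline_is_before(deadline, now)` instead of repeating the cast-and-subtract expression at every call site.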

dm crypt: fix get_key_size compiler warning if !CONFIG_KEYS · 6fc51504

Aashish Sharma authored Feb 11, 2022

Explicitly convert the unsigned int on the right-hand side of the
conditional expression to int to match the left-hand operand and the
return type, fixing the following compiler warning:

drivers/md/dm-crypt.c:2593:43: warning: signed and unsigned
type in conditional expression [-Wsign-compare]

Fixes: c538f6ec ("dm crypt: add ability to use keys from the kernel key retention service")
Signed-off-by: Aashish Sharma <shraash@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

6fc51504
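
The class of warning described above can be reproduced with a small stand-alone function. This is a hedged sketch rather than the dm-crypt code: `hex_key_size()` is a hypothetical helper, and `-1` stands in for a negative error code.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper returning int: a negative value on error,
 * otherwise half the string length.
 *
 * Without the (int) cast, one arm of ?: is a signed int and the other
 * is unsigned, so the usual arithmetic conversions make the whole
 * expression unsigned and -Wsign-compare complains.
 */
static int hex_key_size(const char *key_string)
{
    assert(key_string != NULL);
    return (key_string[0] == ':') ? -1 : (int)(strlen(key_string) >> 1);
}
```

Casting the unsigned arm keeps both arms and the return type signed, so the error value is never silently converted.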

dm: fix use-after-free in dm_cleanup_zoned_dev() · 588b7f5d

Kirill Tkhai authored Feb 01, 2022

dm_cleanup_zoned_dev() uses queue, so it must be called
before blk_cleanup_disk() starts its killing:

blk_cleanup_disk->blk_cleanup_queue()->kobject_put()->blk_release_queue()->
->...RCU...->blk_free_queue_rcu()->kmem_cache_free()

Otherwise, RCU callback may be executed first and
dm_cleanup_zoned_dev() will touch free'd memory:

 BUG: KASAN: use-after-free in dm_cleanup_zoned_dev+0x33/0xd0
 Read of size 8 at addr ffff88805ac6e430 by task dmsetup/681

 CPU: 4 PID: 681 Comm: dmsetup Not tainted 5.17.0-rc2+ #6
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
 Call Trace:
  <TASK>
  dump_stack_lvl+0x57/0x7d
  print_address_description.constprop.0+0x1f/0x150
  ? dm_cleanup_zoned_dev+0x33/0xd0
  kasan_report.cold+0x7f/0x11b
  ? dm_cleanup_zoned_dev+0x33/0xd0
  dm_cleanup_zoned_dev+0x33/0xd0
  __dm_destroy+0x26a/0x400
  ? dm_blk_ioctl+0x230/0x230
  ? up_write+0xd8/0x270
  dev_remove+0x156/0x1d0
  ctl_ioctl+0x269/0x530
  ? table_clear+0x140/0x140
  ? lock_release+0xb2/0x750
  ? remove_all+0x40/0x40
  ? rcu_read_lock_sched_held+0x12/0x70
  ? lock_downgrade+0x3c0/0x3c0
  ? rcu_read_lock_sched_held+0x12/0x70
  dm_ctl_ioctl+0xa/0x10
  __x64_sys_ioctl+0xb9/0xf0
  do_syscall_64+0x3b/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7fb6dfa95c27

Fixes: bb37d772 ("dm: introduce zone append emulation")
Cc: stable@vger.kernel.org
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

588b7f5d

dm ioctl: prevent potential spectre v1 gadget · cd9c88da

Jordy Zomer authored Jan 29, 2022

It appears that cmd could be a Spectre v1 gadget, as it is supplied by a
user and used as an array index. Prevent the contents of kernel memory
from being leaked to userspace via speculative execution by using
array_index_nospec.
Signed-off-by: Jordy Zomer <jordy@pwning.systems>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

cd9c88da
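
The arithmetic behind index clamping can be sketched in user space. This only illustrates the branch-free mask and is not a working Spectre mitigation: `index_mask()` and `clamp_index()` are hypothetical names, the code assumes an arithmetic right shift of negative signed values (as GCC and Clang implement it), and an ordinary compiler is free to turn it back into a branch.

```c
#include <assert.h>
#include <limits.h>

/*
 * All ones if index < size, zero otherwise, computed without a branch.
 * If index >= size, either 'index' or 'size - 1 - index' has its top
 * bit set, so the inverted value is non-negative and shifts down to 0.
 */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
    assert(size <= (unsigned long)LONG_MAX);
    return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
                           (sizeof(long) * CHAR_BIT - 1));
}

/* In-range indexes pass through unchanged; out-of-range ones become 0. */
static unsigned long clamp_index(unsigned long index, unsigned long size)
{
    return index & index_mask(index, size);
}
```

The usual pattern is to keep the normal bounds check and then use the clamped index for the array access, so that even a mispredicted check cannot read outside the table.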

dm: cleanup double word in comment · a8b9d116

Tom Rix authored Jan 26, 2022

Remove the second 'a'.
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

a8b9d116

dm ima: fix wrong length calculation for no_data string · 118f31b4

Thore Sommer authored Jan 25, 2022

All entries measured by dm ima are prefixed by a version string
(dm_version=N.N.N). When there is no data to measure, the entire buffer is
overwritten with a string containing the version string again and the
length of that string is added to the length of the version string.
The new length is now wrong because it contains the version string twice.

This caused entries like this:
dm_version=4.45.0;name=test,uuid=test;table_clear=no_data; \
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 \
current_device_capacity=204808;
Signed-off-by: Thore Sommer <public@thson.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

118f31b4
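
The length error described above can be reconstructed in a few lines. This is a hypothetical sketch of the bug class, not the dm ima code: the function names and the fixed-size buffer are assumptions, and the entry text is taken from the example in the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Length reported before the fix (hypothetical reconstruction). */
static size_t toy_no_data_len_buggy(const char *version)
{
    char buf[256];
    /* Step 1: the entry starts with the version prefix. */
    int len = snprintf(buf, sizeof(buf), "%s", version);
    /* Step 2: no data, so the buffer is rewritten, prefix included. */
    int n = snprintf(buf, sizeof(buf),
                     "%sname=test,uuid=test;table_clear=no_data;", version);
    assert(len >= 0 && n >= 0 && (size_t)n < sizeof(buf));
    return (size_t)len + (size_t)n;  /* prefix counted twice */
}

/* Length after the fix: the rewritten buffer is the whole entry. */
static size_t toy_no_data_len_fixed(const char *version)
{
    char buf[256];
    int n = snprintf(buf, sizeof(buf),
                     "%sname=test,uuid=test;table_clear=no_data;", version);
    assert(n >= 0 && (size_t)n < sizeof(buf));
    return (size_t)n;
}
```

The buggy length exceeds the real one by exactly the length of the version string, which is why extra NUL bytes follow `table_clear=no_data;` in the measured entry.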

dm cache policy smq: make static read-only array table const · 302f0351

Colin Ian King authored Jan 23, 2022

The 'table' static array is read-only, so it makes sense to make
it const. Add in the int type to clean up a checkpatch warning.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

302f0351

21 Feb, 2022 19 commits

dm delay: use dm_submit_bio_remap · c3573421

Mike Snitzer authored Feb 17, 2022

Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

c3573421

dm crypt: use dm_submit_bio_remap · e5524e12

Mike Snitzer authored Feb 17, 2022

Care was taken to support kcryptd_io_read being called from crypt_map
or workqueue.  Use of an intermediate CRYPT_MAP_READ_GFP gfp_t
(defined as GFP_NOWAIT) should protect from maintenance burden if that
flag were to change for some reason.
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

e5524e12

dm: add dm_submit_bio_remap interface · 0fbb4d93

Mike Snitzer authored Feb 17, 2022

Where possible, switch from early bio-based IO accounting (at the time
DM clones each incoming bio) to late IO accounting just before each
remapped bio is issued to underlying device via submit_bio_noacct().

Allows more precise bio-based IO accounting for DM targets that use
their own workqueues to perform additional processing of each bio in
conjunction with their DM_MAPIO_SUBMITTED return from their map
function. When a target is updated to use dm_submit_bio_remap() they
must also set ti->accounts_remapped_io to true.

Use xchg() in start_io_acct(), as suggested by Mikulas, to ensure each
IO is only started once.  The xchg race only happens if
__send_duplicate_bios() sends multiple bios -- that case is reflected
via tio->is_duplicate_bio.  Given the niche nature of this race, it is
best to avoid any xchg performance penalty for normal IO.

For IO that was never submitted with dm_submit_bio_remap(), but whose
clone the target completes with bio_endio, accounting is started and then
ended and the pending_io counter is decremented.
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

0fbb4d93

dm: flag clones created by __send_duplicate_bios · e6fc9f62

Mike Snitzer authored Feb 17, 2022

Formally disallow dm_accept_partial_bio() on clones created by
__send_duplicate_bios() because their len_ptr points to a shared
unsigned int.  __send_duplicate_bios() is only used for flush bios
and other "abnormal" bios (discards, writezeroes, etc). And
dm_accept_partial_bio() already didn't support flush bios.

Also refactor __send_changing_extent_only() to reflect it cannot fail.
As such __send_changing_extent_only() can update the clone_info before
__send_duplicate_bios() is called to fan-out __map_bio() calls.
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

e6fc9f62

dm: reduce dm_io and dm_target_io struct sizes · 300432f5

Mike Snitzer authored Feb 17, 2022

Remove one 4 byte hole in dm_io struct.
Remove two 4 byte holes in dm_target_io struct.
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

300432f5

dm: move duplicate code from callers of alloc_tio into alloc_tio · 018b05eb

Mike Snitzer authored Feb 17, 2022

Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

018b05eb

dm: record old_sector in dm_target_io before calling map function · 743598f0

Mike Snitzer authored Feb 17, 2022

Prep for being able to defer trace_block_bio_remap() until when the
bio is remapped and submitted by the DM target.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

743598f0

dm: remove legacy code only needed before submit_bio recursion · 77c11720

Mike Snitzer authored Feb 17, 2022

Commit 8615cb65 ("dm: remove useless loop in
__split_and_process_bio") showcased that we no longer loop.

Remove the bio_advance() in __split_and_process_bio() that was only
needed when looping was possible.

Similarly there is no need to advance the bio, using ci->sector
cursor, in __send_duplicate_bios().
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

77c11720

dm: remove unused mapped_device argument from free_tio · 0119ab14

Mike Snitzer authored Feb 17, 2022

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

0119ab14

dm: remove impossible BUG_ON in __send_empty_flush · 5b27b8dd

Mike Snitzer authored Feb 17, 2022

The flush_bio in question was just initialized to be empty, so there
is no way bio_has_data() will return true.  So remove stale BUG_ON().
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

5b27b8dd

dm: reduce code duplication in __map_bio · 90a2326e

Mike Snitzer authored Feb 17, 2022

Error path code (for handling DM_MAPIO_REQUEUE and DM_MAPIO_KILL) is
effectively identical.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

90a2326e

dm: refactor dm_split_and_process_bio a bit · d41e077a

Mike Snitzer authored Feb 17, 2022

Remove needless branching and indentation. Leaves code to catch
malformed op_is_zone_mgmt bios (they shouldn't have a payload).
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

d41e077a

dm: fold __clone_and_map_data_bio into __split_and_process_bio · 66bdaa43

Mike Snitzer authored Feb 17, 2022

Fold __clone_and_map_data_bio into its only caller.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

66bdaa43

dm: rename split functions · 96c9865c

Mike Snitzer authored Feb 17, 2022

Rename __split_and_process_bio to dm_split_and_process_bio.
Rename __split_and_process_non_flush to __split_and_process_bio.

Also fix a stale comment and whitespace.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

96c9865c

dm: reorder members in mapped_device struct · 205649d8

Mike Snitzer authored Feb 17, 2022

Improves alignment and groups related members relative to cachelines.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

205649d8

dm: eliminate copying of dm_io fields in dm_io_dec_pending · 0ab30b40

Mike Snitzer authored Feb 20, 2022

There is no need for dm_io_dec_pending() to copy dm_io fields
anymore now that DM provides its own pending_io counters again.

The race documented in commit d208b894 ("dm: fix mempool NULL
pointer race when completing IO") no longer exists now that block
core's in_flight counters aren't used to signal all dm_io is
complete.

Also, rename {start,end}_io_acct to dm_{start,end}_io_acct.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

0ab30b40

dm stats: fix too short end duration_ns when using precise_timestamps · 0cdb90f0

Mike Snitzer authored Feb 17, 2022

dm_stats_account_io()'s STAT_PRECISE_TIMESTAMPS support doesn't handle
the fact that with commit b879f915 ("dm: properly fix redundant
bio-based IO accounting") io->start_time _may_ be in the past (meaning
the start_io_acct() was deferred until later).

Add a new dm_stats_recalc_precise_timestamps() helper that will
set/clear a new 'precise_timestamps' flag in the dm_stats struct based
on whether any configured stats enable STAT_PRECISE_TIMESTAMPS.
And update DM core's alloc_io() to use dm_stats_record_start() to set
stats_aux.duration_ns if stats->precise_timestamps is true.

Also, remove unused 'last_sector' and 'last_rw' members from the
dm_stats struct.

Fixes: b879f915 ("dm: properly fix redundant bio-based IO accounting")
Cc: stable@vger.kernel.org
Co-developed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

0cdb90f0

dm: fix double accounting of flush with data · 8d394bc4

Mike Snitzer authored Feb 17, 2022

DM handles a flush with data by first issuing an empty flush and then
once it completes the REQ_PREFLUSH flag is removed and the payload is
issued.  The problem fixed by this commit is that both the empty flush
bio and the data payload will account the full extent of the data
payload.

Fix this by factoring out dm_io_acct() and having it wrap all IO
accounting to set the size of a bio with REQ_PREFLUSH to 0, account the
IO, and then restore the original size.

Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

8d394bc4
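
The "zero the size, account, restore" wrapper described above can be modelled with a toy structure. This is a sketch under stated assumptions: `struct toy_bio`, `TOY_PREFLUSH` and all function names are hypothetical and do not mirror the real bio or DM accounting APIs.

```c
#define TOY_PREFLUSH 0x1u

/* Hypothetical, heavily simplified stand-in for a bio. */
struct toy_bio {
    unsigned int flags;
    unsigned int size;   /* payload size in bytes */
};

/* Raw accounting: adds whatever size the bio currently reports. */
static void toy_account(const struct toy_bio *bio, unsigned long *total)
{
    *total += bio->size;
}

/* Wrapper: a flush that carries data is accounted as size 0. */
static void toy_io_acct(struct toy_bio *bio, unsigned long *total)
{
    unsigned int saved = bio->size;
    int flush_with_data = (bio->flags & TOY_PREFLUSH) && saved != 0;

    if (flush_with_data)
        bio->size = 0;
    toy_account(bio, total);
    if (flush_with_data)
        bio->size = saved;   /* restore the original size */
}

/* Runs both phases of a flush with data and returns the bytes accounted. */
static unsigned long toy_flush_with_data_total(unsigned int payload)
{
    struct toy_bio bio = { TOY_PREFLUSH, payload };
    unsigned long total = 0;

    toy_io_acct(&bio, &total);     /* phase 1: the empty flush */
    bio.flags &= ~TOY_PREFLUSH;    /* flush completed, flag removed */
    toy_io_acct(&bio, &total);     /* phase 2: the data payload */
    return total;
}
```

With the wrapper in place the payload is counted once; calling `toy_account()` directly in both phases would count it twice.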

dm: interlock pending dm_io and dm_wait_for_bios_completion · 9f6dc633

Mike Snitzer authored Feb 17, 2022

Commit d208b894 ("dm: fix mempool NULL pointer race when
completing IO") didn't go far enough.

When bio_end_io_acct ends, the count of in-flight I/Os may reach zero
and the DM device may be suspended. There is a possibility that the
suspend races with dm_stats_account_io.

Fix this by adding percpu "pending_io" counters to track outstanding
dm_io. Move kicking of suspend queue to dm_io_dec_pending(). Also,
rename md_in_flight_bios() to dm_in_flight_bios() and update it to
iterate all pending_io counters.

Fixes: d208b894 ("dm: fix mempool NULL pointer race when completing IO")
Cc: stable@vger.kernel.org
Co-developed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

9f6dc633

17 Feb, 2022 8 commits

block/bfq_wf2q: correct weight to ioprio · bcd2be76

Yahu Gao authored Jan 07, 2022

The return value is ioprio * BFQ_WEIGHT_CONVERSION_COEFF or 0.
What we want is ioprio or 0.
Correct this by changing the calculation.
Signed-off-by: Yahu Gao <gaoyahu19@gmail.com>
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Link: https://lore.kernel.org/r/20220107065859.25689-1-gaoyahu19@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bcd2be76
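
The arithmetic described above can be checked with a small round-trip example. This is a sketch under assumptions not stated in the commit message: the forward mapping is taken to be `(levels - ioprio) * coeff`, and the constants 8 and 10 as well as all `toy_` names are illustrative only.

```c
/* Assumed constants for illustration only. */
#define TOY_NR_LEVELS 8
#define TOY_COEFF     10

/* Assumed forward mapping from ioprio to weight. */
static int toy_ioprio_to_weight(int ioprio)
{
    return (TOY_NR_LEVELS - ioprio) * TOY_COEFF;
}

/* Old inverse: yields ioprio * TOY_COEFF (or 0), not ioprio. */
static int toy_weight_to_ioprio_old(int weight)
{
    int v = TOY_NR_LEVELS * TOY_COEFF - weight;
    return v > 0 ? v : 0;
}

/* Corrected inverse: scale the weight down first, yielding ioprio or 0. */
static int toy_weight_to_ioprio(int weight)
{
    int v = TOY_NR_LEVELS - weight / TOY_COEFF;
    return v > 0 ? v : 0;
}
```

Under these assumptions, ioprio 3 maps to weight 50; the old inverse maps 50 to 30, while the corrected one maps it back to 3.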

blk-mq: avoid extending delays of active hctx from blk_mq_delay_run_hw_queues · 8f5fea65

David Jeffery authored Jan 31, 2022

When blk_mq_delay_run_hw_queues sets an hctx to run in the future, it can
reset the delay length for an already pending delayed work run_work. This
creates a scenario where multiple hctx may have their queues set to run,
but if one runs first and finds nothing to do, it can reset the delay of
another hctx and stall the other hctx's ability to run requests.

To avoid this I/O stall when an hctx's run_work is already pending,
leave it untouched to run at its current designated time rather than
extending its delay. The work will still run, which keeps closed the race
that calling blk_mq_delay_run_hw_queues is meant to address, while also
avoiding the I/O stall.
Signed-off-by: David Jeffery <djeffery@redhat.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20220131203337.GA17666@redhat
Signed-off-by: Jens Axboe <axboe@kernel.dk>

8f5fea65
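
The difference between re-arming a delayed run and leaving a pending one alone can be shown with a toy model. This is a sketch only: `struct toy_delayed_work` and the `toy_` functions are hypothetical and do not use the kernel workqueue API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a delayed work item on one hardware queue. */
struct toy_delayed_work {
    bool pending;
    unsigned long run_at;
};

/* Unconditional re-arm: an existing deadline is pushed further out. */
static void toy_mod_delayed(struct toy_delayed_work *w,
                            unsigned long now, unsigned long delay)
{
    w->pending = true;
    w->run_at = now + delay;
}

/* Guarded variant: work that is already pending keeps its deadline. */
static void toy_delay_run(struct toy_delayed_work *w,
                          unsigned long now, unsigned long delay)
{
    if (w->pending)
        return;
    toy_mod_delayed(w, now, delay);
}

/*
 * Schedules a run at time 0, then a second request arrives at
 * 'second_at'. Returns the resulting deadline for either policy.
 */
static unsigned long toy_deadline(bool skip_if_pending,
                                  unsigned long delay,
                                  unsigned long second_at)
{
    struct toy_delayed_work w = { false, 0 };

    toy_mod_delayed(&w, 0, delay);
    if (skip_if_pending)
        toy_delay_run(&w, second_at, delay);
    else
        toy_mod_delayed(&w, second_at, delay);
    return w.run_at;
}
```

With a delay of 10 and a second request at time 7, the guarded variant still runs at 10, whereas the unconditional re-arm slips to 17; repeated requests could keep postponing the run.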

virtio_blk: simplify refcounting · 24b45e6c

Christoph Hellwig authored Feb 15, 2022

Implement the ->free_disk method to free the virtio_blk structure only
once the last gendisk reference goes away instead of keeping a local
refcount.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20220215094514.3828912-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

24b45e6c

memstick/mspro_block: simplify refcounting · 185ed423

Christoph Hellwig authored Feb 15, 2022

Implement the ->free_disk method to free the msb_data structure only once
the last gendisk reference goes away instead of keeping a local
refcount.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220215094514.3828912-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

185ed423

memstick/mspro_block: fix handling of read-only devices · 6dab421b

Christoph Hellwig authored Feb 15, 2022

Use set_disk_ro to propagate the read-only state to the block layer
instead of checking for it in ->open and leaking a reference in case
of a read-only device.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220215094514.3828912-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6dab421b

memstick/ms_block: simplify refcounting · e2efa079

Christoph Hellwig authored Feb 15, 2022

Implement the ->free_disk method to free the msb_data structure only once
the last gendisk reference goes away instead of keeping a local refcount.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220215094514.3828912-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

e2efa079

block: add a ->free_disk method · 76792055

Christoph Hellwig authored Feb 15, 2022

Add a method to notify the driver that the gendisk is about to be freed.
This allows drivers to tie the lifetime of their private data to that of
the gendisk and thus deal with device removal races without expensive
synchronization and boilerplate code.

A new flag is added so that ->free_disk is only called after a successful
call to add_disk, which significantly simplifies the error handling path
during probing.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220215094514.3828912-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

76792055
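
The lifetime rule described above (private data freed only when the last disk reference goes away, and only if the disk was successfully added) can be modelled in user space. This is a toy sketch, not the block layer API: every `toy_` name is hypothetical, and the reference counting is deliberately not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_disk;

struct toy_disk_ops {
    void (*free_disk)(struct toy_disk *disk);
};

struct toy_disk {
    int refs;
    bool added;                      /* set once "add" has succeeded */
    const struct toy_disk_ops *ops;
    void *private_data;
};

static int toy_free_calls;           /* counts callback invocations */

/* Driver callback: releases the driver's private data. */
static void toy_driver_free(struct toy_disk *disk)
{
    free(disk->private_data);
    toy_free_calls++;
}

static const struct toy_disk_ops toy_ops = { toy_driver_free };

/* Drops one reference; the last put frees the disk itself. */
static void toy_put_disk(struct toy_disk *disk)
{
    assert(disk->refs > 0);
    if (--disk->refs > 0)
        return;
    if (disk->added && disk->ops && disk->ops->free_disk)
        disk->ops->free_disk(disk);
    free(disk);
}

/* Returns how many times the callback ran for one disk lifetime. */
static int toy_demo(bool added)
{
    struct toy_disk *disk = calloc(1, sizeof(*disk));
    int *priv = malloc(sizeof(*priv));
    int before = toy_free_calls;

    if (!disk || !priv) {
        free(disk);
        free(priv);
        return -1;
    }
    disk->refs = 2;                  /* driver plus one opener */
    disk->added = added;
    disk->ops = &toy_ops;
    disk->private_data = priv;

    toy_put_disk(disk);              /* driver drops its reference */
    if (!added)
        free(priv);                  /* probe failure: driver cleans up */
    toy_put_disk(disk);              /* last opener goes away */
    return toy_free_calls - before;
}
```

In this model the driver needs no reference count of its own: an opener that outlives device removal keeps the private data alive until its final put, and a failed probe never triggers the callback.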

block: revert ("blk-throtl: optimize IOPS throttle for large IO scenarios") · 34841e6f

Ming Lei authored Feb 16, 2022

Revert commit 4f1e9630 ("blk-throtl: optimize IOPS throttle for large
IO scenarios") since we have another, easier way to address this issue and
get a better IOPS throttling result.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20220216044514.2903784-9-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

34841e6f