  1. 09 May, 2024 3 commits
  2. 07 May, 2024 8 commits
  3. 06 May, 2024 1 commit
  4. 03 May, 2024 4 commits
    • block: fix and simplify blkdevparts= cmdline parsing · bc2e07df
      INAGAKI Hiroshi authored
      Fix the cmdline parsing of the "blkdevparts=" parameter using strsep(),
      which makes the code simpler.
      
      Before commit 146afeb2 ("block: use strscpy() to instead of
      strncpy()"), we used a strncpy() to copy a block device name and partition
      names. The commit simply replaced a strncpy() and NULL termination with
      a strscpy(). It did not update calculations of length passed to strscpy().
      While the length passed to strncpy() is just a length of valid characters
      without NULL termination ('\0'), strscpy() takes it as a length of the
      destination buffer, including a NULL termination.
      
      Since the source buffer is not necessarily NULL terminated, the current
      code copies "length - 1" characters and puts a NULL character in the
      destination buffer. It replaces the last character with NULL and breaks
      the parsing.
      
      As an example, that buffer will be passed to parse_parts() and breaks
      parsing sub-partitions due to the missing ')' at the end, like the
      following.
      
      example (Check Point V-80 & OpenWrt):
      
      - Linux Kernel 6.6
      
        [    0.000000] Kernel command line: console=ttyS0,115200 earlycon=uart8250,mmio32,0xf0512000 crashkernel=30M mvpp2x.queue_mode=1 blkdevparts=mmcblk1:48M@10M(kernel-1),1M(dtb-1),720M(rootfs-1),48M(kernel-2),1M(dtb-2),720M(rootfs-2),300M(default_sw),650M(logs),1M(preset_cfg),1M(adsl),-(storage) maxcpus=4
        ...
        [    0.884016] mmc1: new HS200 MMC card at address 0001
        [    0.889951] mmcblk1: mmc1:0001 004GA0 3.69 GiB
        [    0.895043] cmdline partition format is invalid.
        [    0.895704]  mmcblk1: p1
        [    0.903447] mmcblk1boot0: mmc1:0001 004GA0 2.00 MiB
        [    0.908667] mmcblk1boot1: mmc1:0001 004GA0 2.00 MiB
        [    0.913765] mmcblk1rpmb: mmc1:0001 004GA0 512 KiB, chardev (248:0)
      
        1. "48M@10M(kernel-1),..." is passed to strscpy() with length=17
           from parse_parts()
        2. strscpy() returns -E2BIG and the destination buffer has
           "48M@10M(kernel-1\0"
        3. "48M@10M(kernel-1\0" is passed to parse_subpart()
        4. parse_subpart() fails to find ')' when parsing a partition name,
           and returns error
      
      - Linux Kernel 6.1
      
        [    0.000000] Kernel command line: console=ttyS0,115200 earlycon=uart8250,mmio32,0xf0512000 crashkernel=30M mvpp2x.queue_mode=1 blkdevparts=mmcblk1:48M@10M(kernel-1),1M(dtb-1),720M(rootfs-1),48M(kernel-2),1M(dtb-2),720M(rootfs-2),300M(default_sw),650M(logs),1M(preset_cfg),1M(adsl),-(storage) maxcpus=4
        ...
        [    0.953142] mmc1: new HS200 MMC card at address 0001
        [    0.959114] mmcblk1: mmc1:0001 004GA0 3.69 GiB
        [    0.964259]  mmcblk1: p1(kernel-1) p2(dtb-1) p3(rootfs-1) p4(kernel-2) p5(dtb-2) 6(rootfs-2) p7(default_sw) p8(logs) p9(preset_cfg) p10(adsl) p11(storage)
        [    0.979174] mmcblk1boot0: mmc1:0001 004GA0 2.00 MiB
        [    0.984674] mmcblk1boot1: mmc1:0001 004GA0 2.00 MiB
        [    0.989926] mmcblk1rpmb: mmc1:0001 004GA0 512 KiB, chardev (248:0)
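
      Steps 1 to 4 of the 6.6 case can be reproduced in userspace with a
      small model. This is only a sketch: model_strscpy() is a hypothetical
      stand-in that approximates strscpy() (it returns -1 where the kernel
      returns -E2BIG), and copy_first_subpart() is an illustrative helper,
      not the kernel's parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace stand-in for the kernel's strscpy() (an approximation written
 * for this sketch): copy at most size - 1 bytes, always NUL-terminate, and
 * return the copied length, or -1 where the kernel would return -E2BIG.
 */
static long model_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	assert(size > 0);
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return src[i] == '\0' ? (long)i : -1;
}

/*
 * Copy the first comma-separated field the way the broken code did: the
 * number of valid characters is passed where a buffer size is expected,
 * so the last character of the field is lost.
 */
static long copy_first_subpart(char *dst, const char *parts)
{
	const char *comma = strchr(parts, ',');
	size_t len = comma ? (size_t)(comma - parts) : strlen(parts);

	if (len == 0)
		return -1;
	return model_strscpy(dst, parts, len);
}
```

      With "48M@10M(kernel-1),1M(dtb-1)" as input, len is 17, so only 16
      characters are copied and the closing ')' is dropped.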
      
      By the way, strscpy() takes a length of destination buffer and it is
      often confusing when copying characters with a specified length. Using
      strsep() helps to separate the string by the specified character. Then,
      we can use strscpy() naturally with the size of the destination buffer.
      
      Separating the string on the fly also makes the redundant string copy
      unnecessary, which reduces memory usage and improves code readability.
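
      A minimal userspace sketch of the strsep() approach, assuming a
      writable copy of the partition list. split_subparts() and NAME_LEN are
      hypothetical names for illustration, and snprintf() stands in for
      strscpy() as the bounded copy that takes the destination buffer size.

```c
#define _DEFAULT_SOURCE	/* strsep() is a BSD/glibc extension in userspace */
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN 32	/* illustrative size, not the kernel's constant */

/*
 * Split a writable "size@offset(name),..." list in place and copy each
 * partition name into names[]. An entry without "(name)" yields an empty
 * name. Returns the number of entries stored, or -1 if a '(' has no
 * matching ')'.
 */
static int split_subparts(char *parts, char names[][NAME_LEN], int max)
{
	char *entry;
	int n = 0;

	while (n < max && (entry = strsep(&parts, ",")) != NULL) {
		char *open = strchr(entry, '(');
		const char *name = "";

		if (open) {
			char *close = strchr(open, ')');

			if (!close)
				return -1;
			*close = '\0';	/* terminate the name in place */
			name = open + 1;
		}
		/* Bounded copy: the size passed is that of the destination. */
		snprintf(names[n++], NAME_LEN, "%s", name);
	}
	return n;
}
```

      Because strsep() replaces each ',' with a terminator, every entry is
      already a complete string and no length arithmetic is needed.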
      
      Fixes: 146afeb2 ("block: use strscpy() to instead of strncpy()")
      Suggested-by: Naohiro Aota <naota@elisp.net>
      Signed-off-by: INAGAKI Hiroshi <musashino.open@gmail.com>
      Reviewed-by: Daniel Golle <daniel@makrotopia.org>
      Link: https://lore.kernel.org/r/20240421074005.565-1-musashino.open@gmail.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: refine the EOF check in blkdev_iomap_begin · 0c12028a
      Christoph Hellwig authored
      blkdev_iomap_begin rounds down the offset to the logical block size
      before stashing it in iomap->offset and checking that it still is
      inside the inode size.
      
      Change the i_size check to use the raw pos value so that we don't try
      a zero size write if iter->pos is unaligned.
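
      One way the two checks can differ is sketched below as a userspace
      model. It assumes a device size that is not a multiple of the logical
      block size and a -EIO return on failure; the function names are
      hypothetical and none of this is the kernel code itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Round pos down to a multiple of block_size (must be a power of two). */
static int64_t align_down(int64_t pos, int64_t block_size)
{
	return pos & ~(block_size - 1);
}

/* Old behaviour: compare the rounded-down offset against the device size. */
static int eof_check_aligned(int64_t pos, int64_t isize, int64_t block_size)
{
	return align_down(pos, block_size) >= isize ? -EIO : 0;
}

/* New behaviour: compare the raw position against the device size. */
static int eof_check_raw(int64_t pos, int64_t isize)
{
	return pos >= isize ? -EIO : 0;
}

/* Bytes between pos and the end of the device that a write could cover. */
static int64_t writable_bytes(int64_t pos, int64_t isize)
{
	return isize > pos ? isize - pos : 0;
}
```

      For pos = 4300, isize = 4200 and a 512-byte block, the rounded-down
      offset is 4096, so the old check passes even though nothing at or
      after pos can be written.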
      
      Fixes: 487c607d ("block: use iomap for writes to block devices")
      Reported-by: syzbot+0a3683a0a6fecf909244@syzkaller.appspotmail.com
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Tested-by: syzbot+0a3683a0a6fecf909244@syzkaller.appspotmail.com
      Link: https://lore.kernel.org/r/20240503081042.2078062-1-hch@lst.de
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: add a partscan sysfs attribute for disks · a4217c67
      Christoph Hellwig authored
      Userspace had been unknowingly relying on a non-stable interface of
      kernel internals to determine if partition scanning is enabled for a
      given disk. Provide a stable interface for this purpose instead.
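
      A userspace consumer might read the new attribute roughly as follows.
      This is a sketch under assumptions: the commit message does not spell
      out the path or value format, so "/sys/block/<disk>/partscan" holding
      "0" or "1" is assumed here, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read a sysfs-style boolean attribute file. Returns 1 or 0 for the value,
 * or -1 if the file cannot be opened or does not contain 0 or 1.
 */
static int read_bool_attr(const char *path)
{
	FILE *f = fopen(path, "r");
	int val;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &val) != 1)
		val = -1;
	fclose(f);
	return (val == 0 || val == 1) ? val : -1;
}

/* Build the assumed attribute path for a disk name such as "sda". */
static int disk_partscan_enabled(const char *disk)
{
	char path[256];
	int n = snprintf(path, sizeof(path), "/sys/block/%s/partscan", disk);

	if (n < 0 || n >= (int)sizeof(path))
		return -1;
	return read_bool_attr(path);
}
```

      A return value of -1 lets callers fall back to other heuristics on
      kernels that do not provide the attribute.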
      
      Cc: stable@vger.kernel.org # 6.3+
      Depends-on: 140ce28d ("block: add a disk_has_partscan helper")
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/linux-block/ZhQJf8mzq_wipkBH@gardel-login/
      Link: https://lore.kernel.org/r/20240502130033.1958492-3-hch@lst.de
      [axboe: add links and commit message from Keith]
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: add a disk_has_partscan helper · 140ce28d
      Christoph Hellwig authored
      Add a helper to check if partition scanning is enabled instead of
      open coding the check in a few places.  This now always checks for
      the hidden flag even if all but one of the callers are never reachable
      for hidden gendisks.
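
      The shape of such a helper can be modelled in userspace as below. The
      struct, the flag names and their values are stand-ins invented for
      this sketch. Only the hidden-flag check is stated in the message
      above; the no-partition flag and the suppress bit are assumptions
      about what the open-coded checks covered.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits; names and values are stand-ins for this sketch. */
#define DISK_FL_NO_PART	(1u << 0)
#define DISK_FL_HIDDEN	(1u << 1)

struct disk_model {
	unsigned int flags;
	bool suppress_part_scan;	/* assumed per-disk state bit */
};

/* One place for the check instead of open coding it at every call site. */
static bool disk_model_has_partscan(const struct disk_model *disk)
{
	return !(disk->flags & (DISK_FL_NO_PART | DISK_FL_HIDDEN)) &&
	       !disk->suppress_part_scan;
}
```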

      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20240502130033.1958492-2-hch@lst.de
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  5. 02 May, 2024 2 commits
  6. 01 May, 2024 14 commits
  7. 26 Apr, 2024 1 commit
  8. 25 Apr, 2024 3 commits
  9. 23 Apr, 2024 1 commit
    • block: use a per disk workqueue for zone write plugging · a8f59e5a
      Damien Le Moal authored
      A zone write plug BIO work function blk_zone_wplug_bio_work() calls
      submit_bio_noacct_nocheck() to execute the next unplugged BIO. This
      function may block. So executing zone plugs BIO work using the block
      layer global kblockd workqueue can potentially lead to performance or
      latency issues as the number of concurrent work items for a workqueue
      is limited to WQ_DFL_ACTIVE (256).
      1) For a system with a large number of zoned disks, issuing write
         requests to otherwise unused zones may be delayed waiting for a work
         thread to become available.
      2) Requeue operations which use kblockd but are independent of zone
         write plugging may also end up being delayed.
      
      To avoid these potential performance issues, create a workqueue per
      zoned device to execute zone plugs BIO work. The workqueue max active
      parameter is set to the maximum number of zone write plugs allocated
      with the zone write plug mempool. This limit is equal to the maximum
      number of open zones of the disk and defaults to 128 for disks that do
      not have a limit on the number of open zones.
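
      The max active rule described above reduces to a small function. This
      is a sketch only: zwplug_wq_max_active() and the constant name are
      hypothetical, and a max_open_zones value of 0 is assumed to mean "no
      limit".

```c
#include <assert.h>

#define DEFAULT_ZWPLUG_POOL_SIZE 128u	/* default stated in the message */

/*
 * Pick the per-disk workqueue concurrency limit: the disk's maximum number
 * of open zones, or the default when the disk reports no limit (0).
 */
static unsigned int zwplug_wq_max_active(unsigned int max_open_zones)
{
	return max_open_zones ? max_open_zones : DEFAULT_ZWPLUG_POOL_SIZE;
}
```

      Each disk then has its own limit, so work queued for one disk no
      longer counts against the shared WQ_DFL_ACTIVE budget of kblockd.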
      
      Fixes: dd291d77 ("block: Introduce zone write plugging")
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20240420075811.1276893-3-dlemoal@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  10. 19 Apr, 2024 1 commit
  11. 17 Apr, 2024 2 commits
    • null_blk: Simplify null_zone_write() · e994ff5b
      Damien Le Moal authored
      In null_zone_write(), we do not need to first check if the target zone
      condition is FULL, READONLY or OFFLINE: for these conditions, the check
      of the command sector against the zone write pointer will always result
      in the command failing. Remove these checks.
      
      We still however need to check that the target zone write pointer is not
      invalid for zone append operations. To do so, add the macro
      NULL_ZONE_INVALID_WP and use it in null_set_zone_cond() when changing a
      zone to READONLY or OFFLINE condition.
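
      Why the write pointer check is sufficient can be seen in a small
      userspace model. It is a sketch under assumptions: a FULL zone keeps
      its write pointer at the end of the zone, READONLY and OFFLINE zones
      carry an invalid write pointer, and struct zone_model,
      zone_model_write() and MODEL_INVALID_WP are hypothetical stand-ins for
      the driver's own definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_INVALID_WP UINT64_MAX	/* stand-in for NULL_ZONE_INVALID_WP */

struct zone_model {
	uint64_t start;	/* first sector of the zone */
	uint64_t len;	/* zone size in sectors */
	uint64_t wp;	/* write pointer, or MODEL_INVALID_WP */
};

/* Returns 0 on success and -1 on failure; nr_sectors must be at least 1. */
static int zone_model_write(struct zone_model *z, uint64_t sector,
			    uint64_t nr_sectors, bool append)
{
	if (append) {
		/* An append has no sector of its own, so test the wp first. */
		if (z->wp == MODEL_INVALID_WP)
			return -1;
		sector = z->wp;
	}
	/*
	 * No zone condition test: an in-zone sector never equals the write
	 * pointer of a FULL, READONLY or OFFLINE zone, and an append to a
	 * FULL zone has no room left.
	 */
	if (sector != z->wp || nr_sectors > z->start + z->len - z->wp)
		return -1;
	z->wp += nr_sectors;
	return 0;
}
```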

      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Link: https://lore.kernel.org/r/20240411085502.728558-4-dlemoal@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • null_blk: Do zone resource management only if necessary · 3bdde070
      Damien Le Moal authored
      For zoned null_blk devices set up without any limit on the maximum
      number of open and active zones, there is no need to count the number
      of zones that are implicitly open, explicitly open and closed. This is
      indicated by the boolean field need_zone_res_mgmt of struct nullb_device.
      
      Modify the zone management functions null_reset_zone(),
      null_finish_zone(), null_open_zone() and null_close_zone() to manage
      the zone condition counters only if the device need_zone_res_mgmt field
      is true. With this change, the function __null_close_zone() is removed
      and integrated into the 2 caller sites directly, with the
      null_close_imp_open_zone() call site greatly simplified as this function
      closes zones that are known to be in the implicit open condition.
      
      null_zone_write() is modified in a similar manner to do zone condition
      accounting only when the device need_zone_res_mgmt field is true.
      
      With these changes, the inline helpers null_lock_zone_res() and
      null_unlock_zone_res() are removed and replaced with direct calls to
      spin_lock()/spin_unlock().
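
      The pattern of gating the counters on need_zone_res_mgmt can be
      modelled as follows. This is a userspace sketch, not the driver code:
      locking is omitted, and struct dev_model, the enum and
      dev_model_close_imp_open() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

enum zone_cond_model { ZC_IMP_OPEN, ZC_CLOSED };

struct dev_model {
	bool need_zone_res_mgmt;	/* true only if open/active limits exist */
	unsigned int nr_zones_imp_open;
	unsigned int nr_zones_closed;
};

/*
 * Close a zone known to be implicitly open. The condition always changes;
 * the counters are touched only when resource management is needed.
 */
static void dev_model_close_imp_open(struct dev_model *dev,
				     enum zone_cond_model *cond)
{
	if (dev->need_zone_res_mgmt) {
		dev->nr_zones_imp_open--;
		dev->nr_zones_closed++;
	}
	*cond = ZC_CLOSED;
}
```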

      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Link: https://lore.kernel.org/r/20240411085502.728558-3-dlemoal@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>