1. 12 Jan, 2023 2 commits
    • md: fix incorrect declaration about claim_rdev in md_import_device · b0907cad
      Adrian Huang authored
      Commit fb541ca4 ("md: remove lock_bdev / unlock_bdev") removes
      wrappers for blkdev_get/blkdev_put. However, the uninitialized local
      static variable of pointer type 'claim_rdev' in md_import_device()
      is NULL, which leads to the following warning call trace:
      
        WARNING: CPU: 22 PID: 1037 at block/bdev.c:577 bd_prepare_to_claim+0x131/0x150
        CPU: 22 PID: 1037 Comm: mdadm Not tainted 6.2.0-rc3+ #69
        ..
        RIP: 0010:bd_prepare_to_claim+0x131/0x150
        ..
        Call Trace:
         <TASK>
         ? _raw_spin_unlock+0x15/0x30
         ? iput+0x6a/0x220
         blkdev_get_by_dev.part.0+0x4b/0x300
         md_import_device+0x126/0x1d0
         new_dev_store+0x184/0x240
         md_attr_store+0x80/0xf0
         kernfs_fop_write_iter+0x128/0x1c0
         vfs_write+0x2be/0x3c0
         ksys_write+0x5f/0xe0
         do_syscall_64+0x38/0x90
         entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      As a result, the md device cannot be used:
      
        md: could not open device unknown-block(259,0).
        md: md127 stopped.
      
      Fix the issue by declaring the local static variable as a struct
      and passing its address to blkdev_get_by_dev().
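
The difference between the two declarations can be illustrated with a small userspace model. This is only a sketch: `struct rdev_model` and `claim_model()` are hypothetical stand-ins (not kernel APIs), and the model assumes the claim path simply rejects a NULL holder.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct md_rdev; only its address matters here. */
struct rdev_model { int unused; };

/*
 * Simplified model of a claim check (assumption): a NULL holder is
 * rejected, any non-NULL address is accepted as the holder token.
 */
static int claim_model(const void *holder)
{
	return holder ? 0 : -EINVAL;
}

/* Before the fix: a static pointer is zero-initialized, so the holder is NULL. */
static int import_with_static_pointer(void)
{
	static struct rdev_model *claim_rdev; /* implicitly NULL */

	return claim_model(claim_rdev);
}

/* After the fix: a static object always has a valid, stable address. */
static int import_with_static_struct(void)
{
	static struct rdev_model claim_rdev;

	return claim_model(&claim_rdev);
}
```

In this model, `import_with_static_pointer()` fails with `-EINVAL` while `import_with_static_struct()` succeeds, because only the second passes a non-NULL holder.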
      
      Fixes: fb541ca4 ("md: remove lock_bdev / unlock_bdev")
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Adrian Huang <ahuang12@lenovo.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Song Liu <song@kernel.org>
    • Merge tag 'nvme-6.2-2023-01-12' of git://git.infradead.org/nvme into block-6.2 · 3d25b1e8
      Jens Axboe authored
      Pull NVMe fixes from Christoph:
      
      "nvme fixes for Linux 6.2
      
       - Identify quirks for Apple controllers (Hector Martin)
       - fix error handling in nvme_pci_enable (Tong Zhang)
       - refuse unprivileged passthrough on partitions (Christoph Hellwig)
       - fix MAINTAINERS to not match nvmem subsystem headers (Russell King)"
      
      * tag 'nvme-6.2-2023-01-12' of git://git.infradead.org/nvme:
        MAINTAINERS: stop nvme matching for nvmem files
        nvme: don't allow unprivileged passthrough on partitions
        nvme: replace the "bool vec" arguments with flags in the ioctl path
        nvme: remove __nvme_ioctl
        nvme-pci: fix error handling in nvme_pci_enable()
        nvme-pci: add NVME_QUIRK_IDENTIFY_CNS quirk to Apple T2 controllers
        nvme-apple: add NVME_QUIRK_IDENTIFY_CNS quirk to fix regression
  2. 10 Jan, 2023 7 commits
  3. 09 Jan, 2023 1 commit
  4. 05 Jan, 2023 1 commit
  5. 04 Jan, 2023 6 commits
  6. 29 Dec, 2022 1 commit
    • Merge tag 'nvme-6.2-2022-12-29' of git://git.infradead.org/nvme into block-6.2 · 1551ed5a
      Jens Axboe authored
      Pull NVMe fixes from Christoph:
      
      "nvme fixes for Linux 6.2
      
       - fix various problems in handling the Command Supported and Effects log
         (Christoph Hellwig)
       - don't allow unprivileged passthrough of commands that don't transfer
         data but modify logical block content (Christoph Hellwig)
       - add a features and quirks policy document (Christoph Hellwig)
       - fix some really nasty code that was correct but made smatch complain
         (Sagi Grimberg)"
      
      * tag 'nvme-6.2-2022-12-29' of git://git.infradead.org/nvme:
        nvme-auth: fix smatch warning complaints
        nvme: consult the CSE log page for unprivileged passthrough
        nvme: also return I/O command effects from nvme_command_effects
        nvmet: don't defer passthrough commands with trivial effects to the workqueue
        nvmet: set the LBCC bit for commands that modify data
        nvmet: use NVME_CMD_EFFECTS_CSUPP instead of open coding it
        nvme: fix the NVME_CMD_EFFECTS_CSE_MASK definition
        docs, nvme: add a feature and quirk policy document
  7. 28 Dec, 2022 8 commits
  8. 26 Dec, 2022 3 commits
  9. 22 Dec, 2022 2 commits
    • Merge tag 'nvme-6.2-2022-12-22' of git://git.infradead.org/nvme into block-6.2 · fb857b0b
      Jens Axboe authored
      Pull NVMe fixes from Christoph:
      
      "nvme fixes for Linux 6.2
      
       - fix doorbell buffer value endianness (Klaus Jensen)
       - fix Linux vs NVMe page size mismatch (Keith Busch)
       - fix a potential memory access beyond the allocation limit
         (Keith Busch)
       - fix a multipath vs blktrace NULL pointer dereference
         (Yanjun Zhang)"
      
      * tag 'nvme-6.2-2022-12-22' of git://git.infradead.org/nvme:
        nvme: fix multipath crash caused by flush request when blktrace is enabled
        nvme-pci: fix page size checks
        nvme-pci: fix mempool alloc size
        nvme-pci: fix doorbell buffer value endianness
    • nvme: fix multipath crash caused by flush request when blktrace is enabled · 3659fb5a
      Yanjun Zhang authored
      The flush request initialized by blk_kick_flush has a NULL bio,
      and it may be handled by nvme_end_req during I/O completion.
      When blktrace is enabled and multipath is active,
      nvme_trace_bio_complete tries to access the NULL bio pointer of the
      flush request, which results in the following crash:
      
      [ 2517.831677] BUG: kernel NULL pointer dereference, address: 000000000000001a
      [ 2517.835213] #PF: supervisor read access in kernel mode
      [ 2517.838724] #PF: error_code(0x0000) - not-present page
      [ 2517.842222] PGD 7b2d51067 P4D 0
      [ 2517.845684] Oops: 0000 [#1] SMP NOPTI
      [ 2517.849125] CPU: 2 PID: 732 Comm: kworker/2:1H Kdump: loaded Tainted: G S                5.15.67-0.cl9.x86_64 #1
      [ 2517.852723] Hardware name: XFUSION 2288H V6/BC13MBSBC, BIOS 1.13 07/27/2022
      [ 2517.856358] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
      [ 2517.859993] RIP: 0010:blk_add_trace_bio_complete+0x6/0x30
      [ 2517.863628] Code: 1f 44 00 00 48 8b 46 08 31 c9 ba 04 00 10 00 48 8b 80 50 03 00 00 48 8b 78 50 e9 e5 fe ff ff 0f 1f 44 00 00 41 54 49 89 f4 55 <0f> b6 7a 1a 48 89 d5 e8 3e 1c 2b 00 48 89 ee 4c 89 e7 5d 89 c1 ba
      [ 2517.871269] RSP: 0018:ff7f6a008d9dbcd0 EFLAGS: 00010286
      [ 2517.875081] RAX: ff3d5b4be00b1d50 RBX: 0000000002040002 RCX: ff3d5b0a270f2000
      [ 2517.878966] RDX: 0000000000000000 RSI: ff3d5b0b021fb9f8 RDI: 0000000000000000
      [ 2517.882849] RBP: ff3d5b0b96a6fa00 R08: 0000000000000001 R09: 0000000000000000
      [ 2517.886718] R10: 000000000000000c R11: 000000000000000c R12: ff3d5b0b021fb9f8
      [ 2517.890575] R13: 0000000002000000 R14: ff3d5b0b021fb1b0 R15: 0000000000000018
      [ 2517.894434] FS:  0000000000000000(0000) GS:ff3d5b42bfc80000(0000) knlGS:0000000000000000
      [ 2517.898299] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 2517.902157] CR2: 000000000000001a CR3: 00000004f023e005 CR4: 0000000000771ee0
      [ 2517.906053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 2517.909930] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 2517.913761] PKRU: 55555554
      [ 2517.917558] Call Trace:
      [ 2517.921294]  <TASK>
      [ 2517.924982]  nvme_complete_rq+0x1c3/0x1e0 [nvme_core]
      [ 2517.928715]  nvme_tcp_recv_pdu+0x4d7/0x540 [nvme_tcp]
      [ 2517.932442]  nvme_tcp_recv_skb+0x4f/0x240 [nvme_tcp]
      [ 2517.936137]  ? nvme_tcp_recv_pdu+0x540/0x540 [nvme_tcp]
      [ 2517.939830]  tcp_read_sock+0x9c/0x260
      [ 2517.943486]  nvme_tcp_try_recv+0x65/0xa0 [nvme_tcp]
      [ 2517.947173]  nvme_tcp_io_work+0x64/0x90 [nvme_tcp]
      [ 2517.950834]  process_one_work+0x1e8/0x390
      [ 2517.954473]  worker_thread+0x53/0x3c0
      [ 2517.958069]  ? process_one_work+0x390/0x390
      [ 2517.961655]  kthread+0x10c/0x130
      [ 2517.965211]  ? set_kthread_struct+0x40/0x40
      [ 2517.968760]  ret_from_fork+0x1f/0x30
      [ 2517.972285]  </TASK>
      
      To avoid this situation, add a NULL check for req->bio before
      calling trace_block_bio_complete.
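
The guard described above can be sketched in userspace as follows. The types and functions (`struct bio_model`, `struct request_model`, `trace_bio_complete_model()`, `end_req_model()`) are hypothetical simplifications, not the kernel's definitions; the sketch only shows the shape of the NULL check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for struct bio and struct request. */
struct bio_model { unsigned char status; };
struct request_model { struct bio_model *bio; };

/* Counts how many completions were traced. */
static int traced_completions;

/* Models a tracepoint that dereferences the bio it is given. */
static void trace_bio_complete_model(const struct bio_model *bio)
{
	(void)bio->status; /* would fault if bio were NULL */
	traced_completions++;
}

/* Models the completion path: trace only when the request carries a bio. */
static void end_req_model(const struct request_model *req)
{
	if (req->bio)
		trace_bio_complete_model(req->bio);
}
```

A flush-like request with `bio == NULL` passes through `end_req_model()` without being traced, while a request that carries a bio is traced as before.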
      Signed-off-by: Yanjun Zhang <zhangyanjun@cestc.cn>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  10. 21 Dec, 2022 3 commits
  11. 16 Dec, 2022 1 commit
  12. 15 Dec, 2022 2 commits
  13. 14 Dec, 2022 3 commits
    • blk-iolatency: Fix memory leak on add_disk() failures · 813e6930
      Tejun Heo authored
      When a gendisk is successfully initialized but add_disk() fails, such as
      when a loop device has an invalid number of minor device numbers specified,
      blkcg_init_disk() is called during init and then blkcg_exit_disk() during
      error handling. Unfortunately, iolatency gets initialized in the former but
      doesn't get cleaned up in the latter.
      
      This is because, in non-error cases, the cleanup is performed by
      del_gendisk() calling rq_qos_exit(), the assumption being that rq_qos
      policies, iolatency being one of them, can only be activated once the disk
      is fully registered and visible. That assumption is true for wbt and iocost,
      but not so for iolatency as it gets initialized before add_disk() is called.
      
      It is desirable to lazy-init rq_qos policies because they are optional
      features and add to hot path overhead once initialized - each IO has to walk
      all the registered rq_qos policies. So, we want to switch iolatency to lazy
      init too. However, that's a bigger change. As a fix for the immediate
      problem, let's just add an extra call to rq_qos_exit() in blkcg_exit_disk().
      This is safe because duplicate calls to rq_qos_exit() become no-ops.
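
Why a second exit call is harmless can be seen in a minimal userspace model. This is a sketch under assumptions: `struct qos_model`, `struct queue_model`, `qos_add_model()` and `qos_exit_model()` are hypothetical names, and the model assumes the exit routine pops and frees every registered policy until the list is empty.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of a singly linked list of QoS policies on a queue. */
struct qos_model { struct qos_model *next; };
struct queue_model { struct qos_model *rq_qos; };

/* Counts how many policies have been torn down. */
static int qos_freed;

/* Registers one policy at the head of the list; returns 0 on success. */
static int qos_add_model(struct queue_model *q)
{
	struct qos_model *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	p->next = q->rq_qos;
	q->rq_qos = p;
	return 0;
}

/* Tears down every policy; with an empty list the loop body never runs. */
static void qos_exit_model(struct queue_model *q)
{
	while (q->rq_qos) {
		struct qos_model *p = q->rq_qos;

		q->rq_qos = p->next;
		free(p);
		qos_freed++;
	}
}
```

After the first `qos_exit_model()` call the list head is NULL, so a second call from an error path does nothing and cannot double-free.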
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: darklight2357@icloud.com
      Cc: Josef Bacik <josef@toxicpanda.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Fixes: d7067512 ("block: introduce blk-iolatency io controller")
      Cc: stable@vger.kernel.org # v4.19+
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/Y5TQ5gm3O4HXrXR3@slm.duckdns.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • loop: Fix the max_loop commandline argument treatment when it is set to 0 · 85c50197
      Isaac J. Manjarres authored
      Currently, the max_loop commandline argument can be used to specify how
      many loop block devices are created at init time. If it is not
      specified on the commandline, CONFIG_BLK_DEV_LOOP_MIN_COUNT loop block
      devices will be created.
      
      The max_loop commandline argument can be used to override the value of
      CONFIG_BLK_DEV_LOOP_MIN_COUNT. However, when max_loop is set to 0
      through the commandline, the current logic treats it as if it had not
      been set, and creates CONFIG_BLK_DEV_LOOP_MIN_COUNT devices anyway.
      
      Fix this by starting max_loop off as set to CONFIG_BLK_DEV_LOOP_MIN_COUNT.
      This preserves the intended behavior of creating
      CONFIG_BLK_DEV_LOOP_MIN_COUNT loop block devices if the max_loop
      commandline parameter is not specified, and allowing max_loop to
      be respected for all values, including 0.
      
      This allows environments that can create all of their required loop
      block devices on demand to not have to unnecessarily preallocate loop
      block devices.
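
The change in default handling can be modelled as below. This is only an illustration: `MIN_COUNT_MODEL` is a hypothetical stand-in for CONFIG_BLK_DEV_LOOP_MIN_COUNT (the value 8 is an arbitrary assumption), and `devices_old()` / `devices_new()` are invented helpers, not the driver's functions.

```c
#include <assert.h>

/* Hypothetical stand-in for CONFIG_BLK_DEV_LOOP_MIN_COUNT (value assumed). */
#define MIN_COUNT_MODEL 8

/*
 * Old behavior (model): max_loop starts at 0, so an explicit 0 is
 * indistinguishable from "not specified" and the default is used.
 */
static int devices_old(int max_loop)
{
	return max_loop ? max_loop : MIN_COUNT_MODEL;
}

/*
 * New behavior (model): max_loop starts at the configured default and
 * is overwritten only if the parameter was actually given, so 0 is kept.
 */
static int devices_new(int param_given, int param_value)
{
	int max_loop = MIN_COUNT_MODEL;

	if (param_given)
		max_loop = param_value;
	return max_loop;
}
```

With the old logic, passing 0 still yields the default count; with the new logic, an explicit 0 preallocates no devices, while omitting the parameter keeps the default.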
      
      Fixes: 73285082 ("remove artificial software max_loop limit")
      Cc: stable@vger.kernel.org
      Cc: Ken Chen <kenchen@google.com>
      Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
      Link: https://lore.kernel.org/r/20221208212902.765781-1-isaacmanjarres@google.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block/blk-iocost (gcc13): keep large values in a new enum · ff1cc97b
      Jiri Slaby (SUSE) authored
      Since gcc13, each member of an enum has the same type as the enum [1],
      and that type is derived from the enum's members. Given:
        VTIME_PER_SEC_SHIFT     = 37,
        VTIME_PER_SEC           = 1LLU << VTIME_PER_SEC_SHIFT,
        ...
        AUTOP_CYCLE_NSEC        = 10LLU * NSEC_PER_SEC,
      the resulting type is unsigned long.
      
      This generates warnings with gcc-13:
        block/blk-iocost.c: In function 'ioc_weight_prfill':
        block/blk-iocost.c:3037:37: error: format '%u' expects argument of type 'unsigned int', but argument 4 has type 'long unsigned int'
      
        block/blk-iocost.c: In function 'ioc_weight_show':
        block/blk-iocost.c:3047:34: error: format '%u' expects argument of type 'unsigned int', but argument 3 has type 'long unsigned int'
      
      So split the anonymous enum with large values to a separate enum, so
      that they don't affect other members.
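
The split can be sketched as follows. `WEIGHT_MIN_MODEL` and `WEIGHT_DFL_MODEL` are hypothetical names chosen for illustration; the large enumerator relies on the GCC/Clang behavior of accepting enum values wider than int (an assumption about the compiler, and standard only as of C23).

```c
#include <assert.h>

/*
 * Small constants live in their own enum. Every value fits in int,
 * so these enumerators keep type int regardless of compiler version.
 */
enum {
	WEIGHT_MIN_MODEL	= 1,	/* hypothetical name */
	WEIGHT_DFL_MODEL	= 100,	/* hypothetical name */
	VTIME_PER_SEC_SHIFT	= 37,
};

/*
 * Large constants are kept separately, so the wider type they force
 * does not spread to the small enumerators above.
 */
enum {
	VTIME_PER_SEC		= 1LLU << VTIME_PER_SEC_SHIFT,
};

/* A "%u"-style consumer expects int-sized arguments for the small values. */
_Static_assert(sizeof(WEIGHT_DFL_MODEL) == sizeof(int),
	       "small enumerators should remain int-sized");
```

Because the small enumerators no longer share an enum with a 64-bit value, passing them to a `%u` format does not trigger the type-mismatch diagnostic in this model.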
      
      [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36113
      
      Cc: Martin Liska <mliska@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: cgroups@vger.kernel.org
      Cc: linux-block@vger.kernel.org
      Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
      Link: https://lore.kernel.org/r/20221213120826.17446-1-jirislaby@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>