1. 07 Mar, 2022 10 commits
  2. 05 Mar, 2022 1 commit
    • Zhang Wensheng's avatar
      bfq: fix use-after-free in bfq_dispatch_request · ab552fcb
      Zhang Wensheng authored
      KASAN reports a use-after-free report when doing normal scsi-mq test
      
      [69832.239032] ==================================================================
      [69832.241810] BUG: KASAN: use-after-free in bfq_dispatch_request+0x1045/0x44b0
      [69832.243267] Read of size 8 at addr ffff88802622ba88 by task kworker/3:1H/155
      [69832.244656]
      [69832.245007] CPU: 3 PID: 155 Comm: kworker/3:1H Not tainted 5.10.0-10295-g576c6382529e #8
      [69832.246626] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
      [69832.249069] Workqueue: kblockd blk_mq_run_work_fn
      [69832.250022] Call Trace:
      [69832.250541]  dump_stack+0x9b/0xce
      [69832.251232]  ? bfq_dispatch_request+0x1045/0x44b0
      [69832.252243]  print_address_description.constprop.6+0x3e/0x60
      [69832.253381]  ? __cpuidle_text_end+0x5/0x5
      [69832.254211]  ? vprintk_func+0x6b/0x120
      [69832.254994]  ? bfq_dispatch_request+0x1045/0x44b0
      [69832.255952]  ? bfq_dispatch_request+0x1045/0x44b0
      [69832.256914]  kasan_report.cold.9+0x22/0x3a
      [69832.257753]  ? bfq_dispatch_request+0x1045/0x44b0
      [69832.258755]  check_memory_region+0x1c1/0x1e0
      [69832.260248]  bfq_dispatch_request+0x1045/0x44b0
      [69832.261181]  ? bfq_bfqq_expire+0x2440/0x2440
      [69832.262032]  ? blk_mq_delay_run_hw_queues+0xf9/0x170
      [69832.263022]  __blk_mq_do_dispatch_sched+0x52f/0x830
      [69832.264011]  ? blk_mq_sched_request_inserted+0x100/0x100
      [69832.265101]  __blk_mq_sched_dispatch_requests+0x398/0x4f0
      [69832.266206]  ? blk_mq_do_dispatch_ctx+0x570/0x570
      [69832.267147]  ? __switch_to+0x5f4/0xee0
      [69832.267898]  blk_mq_sched_dispatch_requests+0xdf/0x140
      [69832.268946]  __blk_mq_run_hw_queue+0xc0/0x270
      [69832.269840]  blk_mq_run_work_fn+0x51/0x60
      [69832.278170]  process_one_work+0x6d4/0xfe0
      [69832.278984]  worker_thread+0x91/0xc80
      [69832.279726]  ? __kthread_parkme+0xb0/0x110
      [69832.280554]  ? process_one_work+0xfe0/0xfe0
      [69832.281414]  kthread+0x32d/0x3f0
      [69832.282082]  ? kthread_park+0x170/0x170
      [69832.282849]  ret_from_fork+0x1f/0x30
      [69832.283573]
      [69832.283886] Allocated by task 7725:
      [69832.284599]  kasan_save_stack+0x19/0x40
      [69832.285385]  __kasan_kmalloc.constprop.2+0xc1/0xd0
      [69832.286350]  kmem_cache_alloc_node+0x13f/0x460
      [69832.287237]  bfq_get_queue+0x3d4/0x1140
      [69832.287993]  bfq_get_bfqq_handle_split+0x103/0x510
      [69832.289015]  bfq_init_rq+0x337/0x2d50
      [69832.289749]  bfq_insert_requests+0x304/0x4e10
      [69832.290634]  blk_mq_sched_insert_requests+0x13e/0x390
      [69832.291629]  blk_mq_flush_plug_list+0x4b4/0x760
      [69832.292538]  blk_flush_plug_list+0x2c5/0x480
      [69832.293392]  io_schedule_prepare+0xb2/0xd0
      [69832.294209]  io_schedule_timeout+0x13/0x80
      [69832.295014]  wait_for_common_io.constprop.1+0x13c/0x270
      [69832.296137]  submit_bio_wait+0x103/0x1a0
      [69832.296932]  blkdev_issue_discard+0xe6/0x160
      [69832.297794]  blk_ioctl_discard+0x219/0x290
      [69832.298614]  blkdev_common_ioctl+0x50a/0x1750
      [69832.304715]  blkdev_ioctl+0x470/0x600
      [69832.305474]  block_ioctl+0xde/0x120
      [69832.306232]  vfs_ioctl+0x6c/0xc0
      [69832.306877]  __se_sys_ioctl+0x90/0xa0
      [69832.307629]  do_syscall_64+0x2d/0x40
      [69832.308362]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [69832.309382]
      [69832.309701] Freed by task 155:
      [69832.310328]  kasan_save_stack+0x19/0x40
      [69832.311121]  kasan_set_track+0x1c/0x30
      [69832.311868]  kasan_set_free_info+0x1b/0x30
      [69832.312699]  __kasan_slab_free+0x111/0x160
      [69832.313524]  kmem_cache_free+0x94/0x460
      [69832.314367]  bfq_put_queue+0x582/0x940
      [69832.315112]  __bfq_bfqd_reset_in_service+0x166/0x1d0
      [69832.317275]  bfq_bfqq_expire+0xb27/0x2440
      [69832.318084]  bfq_dispatch_request+0x697/0x44b0
      [69832.318991]  __blk_mq_do_dispatch_sched+0x52f/0x830
      [69832.319984]  __blk_mq_sched_dispatch_requests+0x398/0x4f0
      [69832.321087]  blk_mq_sched_dispatch_requests+0xdf/0x140
      [69832.322225]  __blk_mq_run_hw_queue+0xc0/0x270
      [69832.323114]  blk_mq_run_work_fn+0x51/0x60
      [69832.323942]  process_one_work+0x6d4/0xfe0
      [69832.324772]  worker_thread+0x91/0xc80
      [69832.325518]  kthread+0x32d/0x3f0
      [69832.326205]  ret_from_fork+0x1f/0x30
      [69832.326932]
      [69832.338297] The buggy address belongs to the object at ffff88802622b968
      [69832.338297]  which belongs to the cache bfq_queue of size 512
      [69832.340766] The buggy address is located 288 bytes inside of
      [69832.340766]  512-byte region [ffff88802622b968, ffff88802622bb68)
      [69832.343091] The buggy address belongs to the page:
      [69832.344097] page:ffffea0000988a00 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88802622a528 pfn:0x26228
      [69832.346214] head:ffffea0000988a00 order:2 compound_mapcount:0 compound_pincount:0
      [69832.347719] flags: 0x1fffff80010200(slab|head)
      [69832.348625] raw: 001fffff80010200 ffffea0000dbac08 ffff888017a57650 ffff8880179fe840
      [69832.354972] raw: ffff88802622a528 0000000000120008 00000001ffffffff 0000000000000000
      [69832.356547] page dumped because: kasan: bad access detected
      [69832.357652]
      [69832.357970] Memory state around the buggy address:
      [69832.358926]  ffff88802622b980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [69832.360358]  ffff88802622ba00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [69832.361810] >ffff88802622ba80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [69832.363273]                       ^
      [69832.363975]  ffff88802622bb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
      [69832.375960]  ffff88802622bb80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [69832.377405] ==================================================================
      
      In bfq_dispatch_requestfunction, it may have function call:
      
      bfq_dispatch_request
      	__bfq_dispatch_request
      		bfq_select_queue
      			bfq_bfqq_expire
      				__bfq_bfqd_reset_in_service
      					bfq_put_queue
      						kmem_cache_free
      In this function call, in_serv_queue has beed expired and meet the
      conditions to free. In the function bfq_dispatch_request, the address
      of in_serv_queue pointing to has been released. For getting the value
      of idle_timer_disabled, it will get flags value from the address which
      in_serv_queue pointing to, then the problem of use-after-free happens;
      
      Fix the problem by check in_serv_queue == bfqd->in_service_queue, to
      get the value of idle_timer_disabled if in_serve_queue is equel to
      bfqd->in_service_queue. If the space of in_serv_queue pointing has
      been released, this judge will aviod use-after-free problem.
      And if in_serv_queue may be expired or finished, the idle_timer_disabled
      will be false which would not give effects to bfq_update_dispatch_stats.
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Signed-off-by: default avatarZhang Wensheng <zhangwensheng5@huawei.com>
      Link: https://lore.kernel.org/r/20220303070334.3020168-1-zhangwensheng5@huawei.comSigned-off-by: default avatarJens Axboe <axboe@kernel.dk>
      ab552fcb
  3. 28 Feb, 2022 3 commits
    • Eric Biggers's avatar
      blk-crypto: show crypto capabilities in sysfs · 20f01f16
      Eric Biggers authored
      Add sysfs files that expose the inline encryption capabilities of
      request queues:
      
      	/sys/block/$disk/queue/crypto/max_dun_bits
      	/sys/block/$disk/queue/crypto/modes/$mode
      	/sys/block/$disk/queue/crypto/num_keyslots
      
      Userspace can use these new files to decide what encryption settings to
      use, or whether to use inline encryption at all.  This also brings the
      crypto capabilities in line with the other queue properties, which are
      already discoverable via the queue directory in sysfs.
      
      Design notes:
      
        - Place the new files in a new subdirectory "crypto" to group them
          together and to avoid complicating the main "queue" directory.  This
          also makes it possible to replace "crypto" with a symlink later if
          we ever make the blk_crypto_profiles into real kobjects (see below).
      
        - It was necessary to define a new kobject that corresponds to the
          crypto subdirectory.  For now, this kobject just contains a pointer
          to the blk_crypto_profile.  Note that multiple queues (and hence
          multiple such kobjects) may refer to the same blk_crypto_profile.
      
          An alternative design would more closely match the current kernel
          data structures: the blk_crypto_profile could be a kobject itself,
          located directly under the host controller device's kobject, while
          /sys/block/$disk/queue/crypto would be a symlink to it.
      
          I decided not to do that for now because it would require a lot more
          changes, such as no longer embedding blk_crypto_profile in other
          structures, and also because I'm not sure we can rule out moving the
          crypto capabilities into 'struct queue_limits' in the future.  (Even
          if multiple queues share the same crypto engine, maybe the supported
          data unit sizes could differ due to other queue properties.)  It
          would also still be possible to switch to that design later without
          breaking userspace, by replacing the directory with a symlink.
      
        - Use "max_dun_bits" instead of "max_dun_bytes".  Currently, the
          kernel internally stores this value in bytes, but that's an
          implementation detail.  It probably makes more sense to talk about
          this value in bits, and choosing bits is more future-proof.
      
        - "modes" is a sub-subdirectory, since there may be multiple supported
          crypto modes, sysfs is supposed to have one value per file, and it
          makes sense to group all the mode files together.
      
        - Each mode had to be named.  The crypto API names like "xts(aes)" are
          not appropriate because they don't specify the key size.  Therefore,
          I assigned new names.  The exact names chosen are arbitrary, but
          they happen to match the names used in log messages in fs/crypto/.
      
        - The "num_keyslots" file is a bit different from the others in that
          it is only useful to know for performance reasons.  However, it's
          included as it can still be useful.  For example, a user might not
          want to use inline encryption if there aren't very many keyslots.
      Reviewed-by: default avatarHannes Reinecke <hare@suse.de>
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Link: https://lore.kernel.org/r/20220124215938.2769-4-ebiggers@kernel.orgSigned-off-by: default avatarJens Axboe <axboe@kernel.dk>
      20f01f16
    • Eric Biggers's avatar
      block: don't delete queue kobject before its children · 0f692882
      Eric Biggers authored
      kobjects aren't supposed to be deleted before their child kobjects are
      deleted.  Apparently this is usually benign; however, a WARN will be
      triggered if one of the child kobjects has a named attribute group:
      
          sysfs group 'modes' not found for kobject 'crypto'
          WARNING: CPU: 0 PID: 1 at fs/sysfs/group.c:278 sysfs_remove_group+0x72/0x80
          ...
          Call Trace:
            sysfs_remove_groups+0x29/0x40 fs/sysfs/group.c:312
            __kobject_del+0x20/0x80 lib/kobject.c:611
            kobject_cleanup+0xa4/0x140 lib/kobject.c:696
            kobject_release lib/kobject.c:736 [inline]
            kref_put include/linux/kref.h:65 [inline]
            kobject_put+0x53/0x70 lib/kobject.c:753
            blk_crypto_sysfs_unregister+0x10/0x20 block/blk-crypto-sysfs.c:159
            blk_unregister_queue+0xb0/0x110 block/blk-sysfs.c:962
            del_gendisk+0x117/0x250 block/genhd.c:610
      
      Fix this by moving the kobject_del() and the corresponding
      kobject_uevent() to the correct place.
      
      Fixes: 2c2086af ("block: Protect less code with sysfs_lock in blk_{un,}register_queue()")
      Reviewed-by: default avatarHannes Reinecke <hare@suse.de>
      Reviewed-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: default avatarBart Van Assche <bvanassche@acm.org>
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20220124215938.2769-3-ebiggers@kernel.orgSigned-off-by: default avatarJens Axboe <axboe@kernel.dk>
      0f692882
    • Eric Biggers's avatar
      block: simplify calling convention of elv_unregister_queue() · f5ec592d
      Eric Biggers authored
      Make elv_unregister_queue() a no-op if q->elevator is NULL or is not
      registered.
      
      This simplifies the existing callers, as well as the future caller in
      the error path of blk_register_queue().
      
      Also don't bother checking whether q is NULL, since it never is.
      Reviewed-by: default avatarHannes Reinecke <hare@suse.de>
      Reviewed-by: default avatarBart Van Assche <bvanassche@acm.org>
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20220124215938.2769-2-ebiggers@kernel.orgSigned-off-by: default avatarJens Axboe <axboe@kernel.dk>
      f5ec592d
  4. 27 Feb, 2022 2 commits
  5. 22 Feb, 2022 1 commit
  6. 18 Feb, 2022 3 commits
    • Yu Kuai's avatar
      block, bfq: don't move oom_bfqq · 8410f709
      Yu Kuai authored
      Our test report a UAF:
      
      [ 2073.019181] ==================================================================
      [ 2073.019188] BUG: KASAN: use-after-free in __bfq_put_async_bfqq+0xa0/0x168
      [ 2073.019191] Write of size 8 at addr ffff8000ccf64128 by task rmmod/72584
      [ 2073.019192]
      [ 2073.019196] CPU: 0 PID: 72584 Comm: rmmod Kdump: loaded Not tainted 4.19.90-yk #5
      [ 2073.019198] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
      [ 2073.019200] Call trace:
      [ 2073.019203]  dump_backtrace+0x0/0x310
      [ 2073.019206]  show_stack+0x28/0x38
      [ 2073.019210]  dump_stack+0xec/0x15c
      [ 2073.019216]  print_address_description+0x68/0x2d0
      [ 2073.019220]  kasan_report+0x238/0x2f0
      [ 2073.019224]  __asan_store8+0x88/0xb0
      [ 2073.019229]  __bfq_put_async_bfqq+0xa0/0x168
      [ 2073.019233]  bfq_put_async_queues+0xbc/0x208
      [ 2073.019236]  bfq_pd_offline+0x178/0x238
      [ 2073.019240]  blkcg_deactivate_policy+0x1f0/0x420
      [ 2073.019244]  bfq_exit_queue+0x128/0x178
      [ 2073.019249]  blk_mq_exit_sched+0x12c/0x160
      [ 2073.019252]  elevator_exit+0xc8/0xd0
      [ 2073.019256]  blk_exit_queue+0x50/0x88
      [ 2073.019259]  blk_cleanup_queue+0x228/0x3d8
      [ 2073.019267]  null_del_dev+0xfc/0x1e0 [null_blk]
      [ 2073.019274]  null_exit+0x90/0x114 [null_blk]
      [ 2073.019278]  __arm64_sys_delete_module+0x358/0x5a0
      [ 2073.019282]  el0_svc_common+0xc8/0x320
      [ 2073.019287]  el0_svc_handler+0xf8/0x160
      [ 2073.019290]  el0_svc+0x10/0x218
      [ 2073.019291]
      [ 2073.019294] Allocated by task 14163:
      [ 2073.019301]  kasan_kmalloc+0xe0/0x190
      [ 2073.019305]  kmem_cache_alloc_node_trace+0x1cc/0x418
      [ 2073.019308]  bfq_pd_alloc+0x54/0x118
      [ 2073.019313]  blkcg_activate_policy+0x250/0x460
      [ 2073.019317]  bfq_create_group_hierarchy+0x38/0x110
      [ 2073.019321]  bfq_init_queue+0x6d0/0x948
      [ 2073.019325]  blk_mq_init_sched+0x1d8/0x390
      [ 2073.019330]  elevator_switch_mq+0x88/0x170
      [ 2073.019334]  elevator_switch+0x140/0x270
      [ 2073.019338]  elv_iosched_store+0x1a4/0x2a0
      [ 2073.019342]  queue_attr_store+0x90/0xe0
      [ 2073.019348]  sysfs_kf_write+0xa8/0xe8
      [ 2073.019351]  kernfs_fop_write+0x1f8/0x378
      [ 2073.019359]  __vfs_write+0xe0/0x360
      [ 2073.019363]  vfs_write+0xf0/0x270
      [ 2073.019367]  ksys_write+0xdc/0x1b8
      [ 2073.019371]  __arm64_sys_write+0x50/0x60
      [ 2073.019375]  el0_svc_common+0xc8/0x320
      [ 2073.019380]  el0_svc_handler+0xf8/0x160
      [ 2073.019383]  el0_svc+0x10/0x218
      [ 2073.019385]
      [ 2073.019387] Freed by task 72584:
      [ 2073.019391]  __kasan_slab_free+0x120/0x228
      [ 2073.019394]  kasan_slab_free+0x10/0x18
      [ 2073.019397]  kfree+0x94/0x368
      [ 2073.019400]  bfqg_put+0x64/0xb0
      [ 2073.019404]  bfqg_and_blkg_put+0x90/0xb0
      [ 2073.019408]  bfq_put_queue+0x220/0x228
      [ 2073.019413]  __bfq_put_async_bfqq+0x98/0x168
      [ 2073.019416]  bfq_put_async_queues+0xbc/0x208
      [ 2073.019420]  bfq_pd_offline+0x178/0x238
      [ 2073.019424]  blkcg_deactivate_policy+0x1f0/0x420
      [ 2073.019429]  bfq_exit_queue+0x128/0x178
      [ 2073.019433]  blk_mq_exit_sched+0x12c/0x160
      [ 2073.019437]  elevator_exit+0xc8/0xd0
      [ 2073.019440]  blk_exit_queue+0x50/0x88
      [ 2073.019443]  blk_cleanup_queue+0x228/0x3d8
      [ 2073.019451]  null_del_dev+0xfc/0x1e0 [null_blk]
      [ 2073.019459]  null_exit+0x90/0x114 [null_blk]
      [ 2073.019462]  __arm64_sys_delete_module+0x358/0x5a0
      [ 2073.019467]  el0_svc_common+0xc8/0x320
      [ 2073.019471]  el0_svc_handler+0xf8/0x160
      [ 2073.019474]  el0_svc+0x10/0x218
      [ 2073.019475]
      [ 2073.019479] The buggy address belongs to the object at ffff8000ccf63f00
       which belongs to the cache kmalloc-1024 of size 1024
      [ 2073.019484] The buggy address is located 552 bytes inside of
       1024-byte region [ffff8000ccf63f00, ffff8000ccf64300)
      [ 2073.019486] The buggy address belongs to the page:
      [ 2073.019492] page:ffff7e000333d800 count:1 mapcount:0 mapping:ffff8000c0003a00 index:0x0 compound_mapcount: 0
      [ 2073.020123] flags: 0x7ffff0000008100(slab|head)
      [ 2073.020403] raw: 07ffff0000008100 ffff7e0003334c08 ffff7e00001f5a08 ffff8000c0003a00
      [ 2073.020409] raw: 0000000000000000 00000000001c001c 00000001ffffffff 0000000000000000
      [ 2073.020411] page dumped because: kasan: bad access detected
      [ 2073.020412]
      [ 2073.020414] Memory state around the buggy address:
      [ 2073.020420]  ffff8000ccf64000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 2073.020424]  ffff8000ccf64080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 2073.020428] >ffff8000ccf64100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 2073.020430]                                   ^
      [ 2073.020434]  ffff8000ccf64180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 2073.020438]  ffff8000ccf64200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 2073.020439] ==================================================================
      
      The same problem exist in mainline as well.
      
      This is because oom_bfqq is moved to a non-root group, thus root_group
      is freed earlier.
      
      Thus fix the problem by don't move oom_bfqq.
      Signed-off-by: default avatarYu Kuai <yukuai3@huawei.com>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Acked-by: default avatarPaolo Valente <paolo.valente@linaro.org>
      Link: https://lore.kernel.org/r/20220129015924.3958918-4-yukuai3@huawei.comSigned-off-by: default avatarJens Axboe <axboe@kernel.dk>
      8410f709
    • Yu Kuai's avatar
      block, bfq: avoid moving bfqq to it's parent bfqg · c5e4cb0f
      Yu Kuai authored
      Moving bfqq to it's parent bfqg is pointless.
      Signed-off-by: default avatarYu Kuai <yukuai3@huawei.com>
      Link: https://lore.kernel.org/r/20220129015924.3958918-3-yukuai3@huawei.comSigned-off-by: default avatarJens Axboe <axboe@kernel.dk>
      c5e4cb0f
    • Yu Kuai's avatar
      block, bfq: cleanup bfq_bfqq_to_bfqg() · 43a4b1fe
      Yu Kuai authored
      Use bfq_group() instead, which do the same thing.
      Signed-off-by: default avatarYu Kuai <yukuai3@huawei.com>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Acked-by: default avatarPaolo Valente <paolo.valente@linaro.org>
      Link: https://lore.kernel.org/r/20220129015924.3958918-2-yukuai3@huawei.comSigned-off-by: default avatarJens Axboe <axboe@kernel.dk>
      43a4b1fe
  7. 17 Feb, 2022 20 commits