1. 14 Nov, 2018 2 commits
    • Ming Lei's avatar
      SCSI: fix queue cleanup race before queue initialization is done · 8dc765d4
      Ming Lei authored
      c2856ae2 ("blk-mq: quiesce queue before freeing queue") has
      already fixed this race, however the implied synchronize_rcu()
      in blk_mq_quiesce_queue() can slow down LUN probe a lot, so caused
      performance regression.
      
      Then 1311326c ("blk-mq: avoid to synchronize rcu inside blk_cleanup_queue()")
      tried to quiesce queue for avoiding unnecessary synchronize_rcu()
      only when queue initialization is done, because it is usual to see
      lots of inexistent LUNs which need to be probed.
      
      However, turns out it isn't safe to quiesce queue only when queue
      initialization is done. Because when one SCSI command is completed,
      the user of sending command can be waken up immediately, then the
      scsi device may be removed, meantime the run queue in scsi_end_request()
      is still in-progress, so kernel panic can be caused.
      
      In Red Hat QE lab, there are several reports about this kind of kernel
      panic triggered during kernel booting.
      
      This patch tries to address the issue by grabing one queue usage
      counter during freeing one request and the following run queue.
      
      Fixes: 1311326c ("blk-mq: avoid to synchronize rcu inside blk_cleanup_queue()")
      Cc: Andrew Jones <drjones@redhat.com>
      Cc: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: linux-scsi@vger.kernel.org
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
      Cc: stable <stable@vger.kernel.org>
      Cc: jianchao.wang <jianchao.w.wang@oracle.com>
      Signed-off-by: default avatarMing Lei <ming.lei@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      8dc765d4
    • Dave Chinner's avatar
      block: fix 32 bit overflow in __blkdev_issue_discard() · 4800bf7b
      Dave Chinner authored
      A discard cleanup merged into 4.20-rc2 causes fstests xfs/259 to
      fall into an endless loop in the discard code. The test is creating
      a device that is exactly 2^32 sectors in size to test mkfs boundary
      conditions around the 32 bit sector overflow region.
      
      mkfs issues a discard for the entire device size by default, and
      hence this throws a sector count of 2^32 into
      blkdev_issue_discard(). It takes the number of sectors to discard as
      a sector_t - a 64 bit value.
      
      The commit ba5d7385 ("block: cleanup __blkdev_issue_discard")
      takes this sector count and casts it to a 32 bit value before
      comapring it against the maximum allowed discard size the device
      has. This truncates away the upper 32 bits, and so if the lower 32
      bits of the sector count is zero, it starts issuing discards of
      length 0. This causes the code to fall into an endless loop, issuing
      a zero length discards over and over again on the same sector.
      
      Fixes: ba5d7385 ("block: cleanup __blkdev_issue_discard")
      Tested-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      
      Killed pointless WARN_ON().
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      4800bf7b
  2. 12 Nov, 2018 3 commits
  3. 10 Nov, 2018 1 commit
    • Jens Axboe's avatar
      floppy: fix race condition in __floppy_read_block_0() · de7b75d8
      Jens Axboe authored
      LKP recently reported a hang at bootup in the floppy code:
      
      [  245.678853] INFO: task mount:580 blocked for more than 120 seconds.
      [  245.679906]       Tainted: G                T 4.19.0-rc6-00172-ga9f38e1d #1
      [  245.680959] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  245.682181] mount           D 6372   580      1 0x00000004
      [  245.683023] Call Trace:
      [  245.683425]  __schedule+0x2df/0x570
      [  245.683975]  schedule+0x2d/0x80
      [  245.684476]  schedule_timeout+0x19d/0x330
      [  245.685090]  ? wait_for_common+0xa5/0x170
      [  245.685735]  wait_for_common+0xac/0x170
      [  245.686339]  ? do_sched_yield+0x90/0x90
      [  245.686935]  wait_for_completion+0x12/0x20
      [  245.687571]  __floppy_read_block_0+0xfb/0x150
      [  245.688244]  ? floppy_resume+0x40/0x40
      [  245.688844]  floppy_revalidate+0x20f/0x240
      [  245.689486]  check_disk_change+0x43/0x60
      [  245.690087]  floppy_open+0x1ea/0x360
      [  245.690653]  __blkdev_get+0xb4/0x4d0
      [  245.691212]  ? blkdev_get+0x1db/0x370
      [  245.691777]  blkdev_get+0x1f3/0x370
      [  245.692351]  ? path_put+0x15/0x20
      [  245.692871]  ? lookup_bdev+0x4b/0x90
      [  245.693539]  blkdev_get_by_path+0x3d/0x80
      [  245.694165]  mount_bdev+0x2a/0x190
      [  245.694695]  squashfs_mount+0x10/0x20
      [  245.695271]  ? squashfs_alloc_inode+0x30/0x30
      [  245.695960]  mount_fs+0xf/0x90
      [  245.696451]  vfs_kern_mount+0x43/0x130
      [  245.697036]  do_mount+0x187/0xc40
      [  245.697563]  ? memdup_user+0x28/0x50
      [  245.698124]  ksys_mount+0x60/0xc0
      [  245.698639]  sys_mount+0x19/0x20
      [  245.699167]  do_int80_syscall_32+0x61/0x130
      [  245.699813]  entry_INT80_32+0xc7/0xc7
      
      showing that we never complete that read request. The reason is that
      the completion setup is racy - it initializes the completion event
      AFTER submitting the IO, which means that the IO could complete
      before/during the init. If it does, we are passing garbage to
      complete() and we may sleep forever waiting for the event to
      occur.
      
      Fixes: 7b7b68bb ("floppy: bail out in open() if drive is not responding to block0 read")
      Reviewed-by: default avatarOmar Sandoval <osandov@fb.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      de7b75d8
  4. 09 Nov, 2018 10 commits
  5. 08 Nov, 2018 12 commits
  6. 07 Nov, 2018 11 commits
  7. 06 Nov, 2018 1 commit