1. 09 Mar, 2018 3 commits
    • Maurizio Lombardi's avatar
      cdrom: do not call check_disk_change() inside cdrom_open() · 2bbea6e1
      Maurizio Lombardi authored
      when mounting an ISO filesystem sometimes (very rarely)
      the system hangs because of a race condition between two tasks.
      
      PID: 6766   TASK: ffff88007b2a6dd0  CPU: 0   COMMAND: "mount"
       #0 [ffff880078447ae0] __schedule at ffffffff8168d605
       #1 [ffff880078447b48] schedule_preempt_disabled at ffffffff8168ed49
       #2 [ffff880078447b58] __mutex_lock_slowpath at ffffffff8168c995
       #3 [ffff880078447bb8] mutex_lock at ffffffff8168bdef
       #4 [ffff880078447bd0] sr_block_ioctl at ffffffffa00b6818 [sr_mod]
       #5 [ffff880078447c10] blkdev_ioctl at ffffffff812fea50
       #6 [ffff880078447c70] ioctl_by_bdev at ffffffff8123a8b3
       #7 [ffff880078447c90] isofs_fill_super at ffffffffa04fb1e1 [isofs]
       #8 [ffff880078447da8] mount_bdev at ffffffff81202570
       #9 [ffff880078447e18] isofs_mount at ffffffffa04f9828 [isofs]
      #10 [ffff880078447e28] mount_fs at ffffffff81202d09
      #11 [ffff880078447e70] vfs_kern_mount at ffffffff8121ea8f
      #12 [ffff880078447ea8] do_mount at ffffffff81220fee
      #13 [ffff880078447f28] sys_mount at ffffffff812218d6
      #14 [ffff880078447f80] system_call_fastpath at ffffffff81698c49
          RIP: 00007fd9ea914e9a  RSP: 00007ffd5d9bf648  RFLAGS: 00010246
          RAX: 00000000000000a5  RBX: ffffffff81698c49  RCX: 0000000000000010
          RDX: 00007fd9ec2bc210  RSI: 00007fd9ec2bc290  RDI: 00007fd9ec2bcf30
          RBP: 0000000000000000   R8: 0000000000000000   R9: 0000000000000010
          R10: 00000000c0ed0001  R11: 0000000000000206  R12: 00007fd9ec2bc040
          R13: 00007fd9eb6b2380  R14: 00007fd9ec2bc210  R15: 00007fd9ec2bcf30
          ORIG_RAX: 00000000000000a5  CS: 0033  SS: 002b
      
      This task was trying to mount the cdrom.  It allocated and configured a
      super_block struct and owned the write-lock for the super_block->s_umount
      rwsem. While exclusively owning the s_umount lock, it called
      sr_block_ioctl and waited to acquire the global sr_mutex lock.
      
      PID: 6785   TASK: ffff880078720fb0  CPU: 0   COMMAND: "systemd-udevd"
       #0 [ffff880078417898] __schedule at ffffffff8168d605
       #1 [ffff880078417900] schedule at ffffffff8168dc59
       #2 [ffff880078417910] rwsem_down_read_failed at ffffffff8168f605
       #3 [ffff880078417980] call_rwsem_down_read_failed at ffffffff81328838
       #4 [ffff8800784179d0] down_read at ffffffff8168cde0
       #5 [ffff8800784179e8] get_super at ffffffff81201cc7
       #6 [ffff880078417a10] __invalidate_device at ffffffff8123a8de
       #7 [ffff880078417a40] flush_disk at ffffffff8123a94b
       #8 [ffff880078417a88] check_disk_change at ffffffff8123ab50
       #9 [ffff880078417ab0] cdrom_open at ffffffffa00a29e1 [cdrom]
      #10 [ffff880078417b68] sr_block_open at ffffffffa00b6f9b [sr_mod]
      #11 [ffff880078417b98] __blkdev_get at ffffffff8123ba86
      #12 [ffff880078417bf0] blkdev_get at ffffffff8123bd65
      #13 [ffff880078417c78] blkdev_open at ffffffff8123bf9b
      #14 [ffff880078417c90] do_dentry_open at ffffffff811fc7f7
      #15 [ffff880078417cd8] vfs_open at ffffffff811fc9cf
      #16 [ffff880078417d00] do_last at ffffffff8120d53d
      #17 [ffff880078417db0] path_openat at ffffffff8120e6b2
      #18 [ffff880078417e48] do_filp_open at ffffffff8121082b
      #19 [ffff880078417f18] do_sys_open at ffffffff811fdd33
      #20 [ffff880078417f70] sys_open at ffffffff811fde4e
      #21 [ffff880078417f80] system_call_fastpath at ffffffff81698c49
          RIP: 00007f29438b0c20  RSP: 00007ffc76624b78  RFLAGS: 00010246
          RAX: 0000000000000002  RBX: ffffffff81698c49  RCX: 0000000000000000
          RDX: 00007f2944a5fa70  RSI: 00000000000a0800  RDI: 00007f2944a5fa70
          RBP: 00007f2944a5f540   R8: 0000000000000000   R9: 0000000000000020
          R10: 00007f2943614c40  R11: 0000000000000246  R12: ffffffff811fde4e
          R13: ffff880078417f78  R14: 000000000000000c  R15: 00007f2944a4b010
          ORIG_RAX: 0000000000000002  CS: 0033  SS: 002b
      
      This task tried to open the cdrom device, the sr_block_open function
      acquired the global sr_mutex lock. The call to check_disk_change()
      then saw an event flag indicating a possible media change and tried
      to flush any cached data for the device.
      As part of the flush, it tried to acquire the super_block->s_umount
      lock associated with the cdrom device.
      This was the same super_block as created and locked by the previous task.
      
      The first task acquires the s_umount lock and then the sr_mutex_lock;
      the second task acquires the sr_mutex_lock and then the s_umount lock.
      
      This patch fixes the issue by moving check_disk_change() out of
      cdrom_open() and let the caller take care of it.
      Signed-off-by: default avatarMaurizio Lombardi <mlombard@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      2bbea6e1
    • Randy Dunlap's avatar
      Documentation/cdrom: fix German sharp s in LaTex · da5ff37c
      Randy Dunlap authored
      Apparently the LaTex abbreviation for the German "sharp s" (ß)
      (Unicode U+00DF) has changed from {\sz} to {\ss}.  With {\sz},
      I get this error at line 1016 (line number after another patch):
      
      ! Undefined control sequence.
      l.1016 ...nel~2.0.  Further thanks to Heiko Ei{\sz
                                                        }feldt,
      
      This is fixed by changing the {\sz} to {\ss}.
      Signed-off-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      da5ff37c
    • Randy Dunlap's avatar
      Documentation/cdrom: update cdrom-standard.tex for kernel changes · 0a703c1f
      Randy Dunlap authored
      Documentation updates for Documentation/cdrom/cdrom-standard.tex:
      
      cdrom_device_ops:
      - add check_events() and generic_packet()
      
      cdrom_device_info:
      - add one 'const' modifier
      - correct some field descriptions
      - add some missing fields
      - drop 'kdev_t dev;' field
      
      Also drop <n_discs> sentence from documentation because it is not
      referenced anywhere in the kernel header or C files.
      Signed-off-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      0a703c1f
  2. 08 Mar, 2018 11 commits
  3. 07 Mar, 2018 1 commit
  4. 06 Mar, 2018 1 commit
  5. 01 Mar, 2018 2 commits
    • Arnd Bergmann's avatar
      staging: rts5208: rename SG_END macro · b5d013bc
      Arnd Bergmann authored
      A change to the generic scatterlist code caused a conflict with
      the rtsx card reader driver:
      
      In file included from drivers/staging/rts5208/rtsx.h:180,
                       from drivers/staging/rts5208/rtsx.c:28:
      drivers/staging/rts5208/rtsx_chip.h:343: error: "SG_END" redefined [-Werror]
      
      This changes one instance of the driver to prefix SG_END and
      related constants.
      
      Fixes: 723fbf56 ("lib/scatterlist: Add SG_CHAIN and SG_END macros for LSB encodings")
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Reviewed-by: default avatarAndy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      b5d013bc
    • Arnd Bergmann's avatar
      misc: rtsx: rename SG_END macro · f16ee7c7
      Arnd Bergmann authored
      A change to the generic scatterlist code caused a conflict with
      the rtsx card reader driver:
      
      In file included from drivers/misc/cardreader/rtsx_pcr.c:32:
      include/linux/rtsx_pci.h:40: error: "SG_END" redefined [-Werror]
      
      This changes one instance of the driver to prefix SG_END and
      related constants.
      
      Fixes: 723fbf56 ("lib/scatterlist: Add SG_CHAIN and SG_END macros for LSB encodings")
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      f16ee7c7
  6. 28 Feb, 2018 16 commits
  7. 27 Feb, 2018 3 commits
    • Gustavo A. R. Silva's avatar
      nbd: fix return value in error handling path · 0979962f
      Gustavo A. R. Silva authored
      It seems that the proper value to return in this particular case is the
      one contained into variable new_index instead of ret.
      
      Addresses-Coverity-ID: 1465148 ("Copy-paste error")
      Fixes: e46c7287 ("nbd: add a basic netlink interface")
      Reviewed-by: default avatarOmar Sandoval <osandov@fb.com>
      Signed-off-by: default avatarGustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      0979962f
    • Tang Junhui's avatar
      bcache: fix kcrashes with fio in RAID5 backend dev · 60eb34ec
      Tang Junhui authored
      Kernel crashed when run fio in a RAID5 backend bcache device, the call
      trace is bellow:
      [  440.012034] kernel BUG at block/blk-ioc.c:146!
      [  440.012696] invalid opcode: 0000 [#1] SMP NOPTI
      [  440.026537] CPU: 2 PID: 2205 Comm: md127_raid5 Not tainted 4.15.0 #8
      [  440.027441] Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 07/16
      /2015
      [  440.028615] RIP: 0010:put_io_context+0x8b/0x90
      [  440.029246] RSP: 0018:ffffa8c882b43af8 EFLAGS: 00010246
      [  440.029990] RAX: 0000000000000000 RBX: ffffa8c88294fca0 RCX: 0000000000
      0f4240
      [  440.031006] RDX: 0000000000000004 RSI: 0000000000000286 RDI: ffffa8c882
      94fca0
      [  440.032030] RBP: ffffa8c882b43b10 R08: 0000000000000003 R09: ffff949cb8
      0c1700
      [  440.033206] R10: 0000000000000104 R11: 000000000000b71c R12: 00000000000
      01000
      [  440.034222] R13: 0000000000000000 R14: ffff949cad84db70 R15: ffff949cb11
      bd1e0
      [  440.035239] FS:  0000000000000000(0000) GS:ffff949cba280000(0000) knlGS:
      0000000000000000
      [  440.060190] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  440.084967] CR2: 00007ff0493ef000 CR3: 00000002f1e0a002 CR4: 00000000001
      606e0
      [  440.110498] Call Trace:
      [  440.135443]  bio_disassociate_task+0x1b/0x60
      [  440.160355]  bio_free+0x1b/0x60
      [  440.184666]  bio_put+0x23/0x30
      [  440.208272]  search_free+0x23/0x40 [bcache]
      [  440.231448]  cached_dev_write_complete+0x31/0x70 [bcache]
      [  440.254468]  closure_put+0xb6/0xd0 [bcache]
      [  440.277087]  request_endio+0x30/0x40 [bcache]
      [  440.298703]  bio_endio+0xa1/0x120
      [  440.319644]  handle_stripe+0x418/0x2270 [raid456]
      [  440.340614]  ? load_balance+0x17b/0x9c0
      [  440.360506]  handle_active_stripes.isra.58+0x387/0x5a0 [raid456]
      [  440.380675]  ? __release_stripe+0x15/0x20 [raid456]
      [  440.400132]  raid5d+0x3ed/0x5d0 [raid456]
      [  440.419193]  ? schedule+0x36/0x80
      [  440.437932]  ? schedule_timeout+0x1d2/0x2f0
      [  440.456136]  md_thread+0x122/0x150
      [  440.473687]  ? wait_woken+0x80/0x80
      [  440.491411]  kthread+0x102/0x140
      [  440.508636]  ? find_pers+0x70/0x70
      [  440.524927]  ? kthread_associate_blkcg+0xa0/0xa0
      [  440.541791]  ret_from_fork+0x35/0x40
      [  440.558020] Code: c2 48 00 5b 41 5c 41 5d 5d c3 48 89 c6 4c 89 e7 e8 bb c2
      48 00 48 8b 3d bc 36 4b 01 48 89 de e8 7c f7 e0 ff 5b 41 5c 41 5d 5d c3 <0f> 0b
      0f 1f 00 0f 1f 44 00 00 55 48 8d 47 b8 48 89 e5 41 57 41
      [  440.610020] RIP: put_io_context+0x8b/0x90 RSP: ffffa8c882b43af8
      [  440.628575] ---[ end trace a1fd79d85643a73e ]--
      
      All the crash issue happened when a bypass IO coming, in such scenario
      s->iop.bio is pointed to the s->orig_bio. In search_free(), it finishes the
      s->orig_bio by calling bio_complete(), and after that, s->iop.bio became
      invalid, then kernel would crash when calling bio_put(). Maybe its upper
      layer's faulty, since bio should not be freed before we calling bio_put(),
      but we'd better calling bio_put() first before calling bio_complete() to
      notify upper layer ending this bio.
      
      This patch moves bio_complete() under bio_put() to avoid kernel crash.
      
      [mlyle: fixed commit subject for character limits]
      Reported-by: default avatarMatthias Ferdinand <bcache@mfedv.net>
      Tested-by: default avatarMatthias Ferdinand <bcache@mfedv.net>
      Signed-off-by: default avatarTang Junhui <tang.junhui@zte.com.cn>
      Reviewed-by: default avatarMichael Lyle <mlyle@lyle.org>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      60eb34ec
    • Coly Li's avatar
      bcache: correct flash only vols (check all uuids) · 02aa8a8b
      Coly Li authored
      Commit 2831231d ("bcache: reduce cache_set devices iteration by
      devices_max_used") adds c->devices_max_used to reduce iteration of
      c->uuids elements, this value is updated in bcache_device_attach().
      
      But for flash only volume, when calling flash_devs_run(), the function
      bcache_device_attach() is not called yet and c->devices_max_used is not
      updated. The unexpected result is, the flash only volume won't be run
      by flash_devs_run().
      
      This patch fixes the issue by iterate all c->uuids elements in
      flash_devs_run(). c->devices_max_used will be updated properly when
      bcache_device_attach() gets called.
      
      [mlyle: commit subject edited for character limit]
      
      Fixes: 2831231d ("bcache: reduce cache_set devices iteration by devices_max_used")
      Reported-by: default avatarTang Junhui <tang.junhui@zte.com.cn>
      Signed-off-by: default avatarColy Li <colyli@suse.de>
      Reviewed-by: default avatarMichael Lyle <mlyle@lyle.org>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      02aa8a8b
  8. 26 Feb, 2018 3 commits
    • Eric Biggers's avatar
      blktrace_api.h: fix comment for struct blk_user_trace_setup · 9c722588
      Eric Biggers authored
      'struct blk_user_trace_setup' is passed to BLKTRACESETUP, not
      BLKTRACESTART.
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      9c722588
    • Jan Kara's avatar
      blockdev: Avoid two active bdev inodes for one device · 560e7cb2
      Jan Kara authored
      When blkdev_open() races with device removal and creation it can happen
      that unhashed bdev inode gets associated with newly created gendisk
      like:
      
      CPU0					CPU1
      blkdev_open()
        bdev = bd_acquire()
      					del_gendisk()
      					  bdev_unhash_inode(bdev);
      					remove device
      					create new device with the same number
        __blkdev_get()
          disk = get_gendisk()
            - gets reference to gendisk of the new device
      
      Now another blkdev_open() will not find original 'bdev' as it got
      unhashed, create a new one and associate it with the same 'disk' at
      which point problems start as we have two independent page caches for
      one device.
      
      Fix the problem by verifying that the bdev inode didn't get unhashed
      before we acquired gendisk reference. That way we make sure gendisk can
      get associated only with visible bdev inodes.
      Tested-by: default avatarHou Tao <houtao1@huawei.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      560e7cb2
    • Jan Kara's avatar
      genhd: Fix BUG in blkdev_open() · 56c0908c
      Jan Kara authored
      When two blkdev_open() calls for a partition race with device removal
      and recreation, we can hit BUG_ON(!bd_may_claim(bdev, whole, holder)) in
      blkdev_open(). The race can happen as follows:
      
      CPU0				CPU1			CPU2
      							del_gendisk()
      							  bdev_unhash_inode(part1);
      
      blkdev_open(part1, O_EXCL)	blkdev_open(part1, O_EXCL)
        bdev = bd_acquire()		  bdev = bd_acquire()
        blkdev_get(bdev)
          bd_start_claiming(bdev)
            - finds old inode 'whole'
            bd_prepare_to_claim() -> 0
      							  bdev_unhash_inode(whole);
      							<device removed>
      							<new device under same
      							 number created>
      				  blkdev_get(bdev);
      				    bd_start_claiming(bdev)
      				      - finds new inode 'whole'
      				      bd_prepare_to_claim()
      					- this also succeeds as we have
      					  different 'whole' here...
      					- bad things happen now as we
      					  have two exclusive openers of
      					  the same bdev
      
      The problem here is that block device opens can see various intermediate
      states while gendisk is shutting down and then being recreated.
      
      We fix the problem by introducing new lookup_sem in gendisk that
      synchronizes gendisk deletion with get_gendisk() and furthermore by
      making sure that get_gendisk() does not return gendisk that is being (or
      has been) deleted. This makes sure that once we ever manage to look up
      newly created bdev inode, we are also guaranteed that following
      get_gendisk() will either return failure (and we fail open) or it
      returns gendisk for the new device and following bdget_disk() will
      return new bdev inode (i.e., blkdev_open() follows the path as if it is
      completely run after new device is created).
      Reported-and-analyzed-by: default avatarHou Tao <houtao1@huawei.com>
      Tested-by: default avatarHou Tao <houtao1@huawei.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      56c0908c