1. 14 Mar, 2012 1 commit
    • Xiaotian Feng's avatar
      block: fix ioc leak in put_io_context · ff8c1474
      Xiaotian Feng authored
      When put_io_context is called, if ioc->icq_list is empty and refcount
      is 1, kernel will not free the ioc.
      
      This is caught by following kmemleak:
      
      unreferenced object 0xffff880036349fe0 (size 216):
        comm "sh", pid 2137, jiffies 4294931140 (age 290579.412s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          01 00 01 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
        backtrace:
          [<ffffffff8169f926>] kmemleak_alloc+0x26/0x50
          [<ffffffff81195a9c>] kmem_cache_alloc_node+0x1cc/0x2a0
          [<ffffffff81356b67>] create_io_context_slowpath+0x27/0x130
          [<ffffffff81356d2b>] get_task_io_context+0xbb/0xf0
          [<ffffffff81055f0e>] copy_process+0x188e/0x18b0
          [<ffffffff8105609b>] do_fork+0x11b/0x420
          [<ffffffff810247f8>] sys_clone+0x28/0x30
          [<ffffffff816d3373>] stub_clone+0x13/0x20
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      ioc should be freed if ioc->icq_list is empty.
      Signed-off-by: default avatarXiaotian Feng <dannyfeng@tencent.com>
      Acked-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      ff8c1474
  2. 03 Mar, 2012 1 commit
  3. 02 Mar, 2012 6 commits
    • Alan Stern's avatar
      Block: use a freezable workqueue for disk-event polling · 62d3c543
      Alan Stern authored
      This patch (as1519) fixes a bug in the block layer's disk-events
      polling.  The polling is done by a work routine queued on the
      system_nrt_wq workqueue.  Since that workqueue isn't freezable, the
      polling continues even in the middle of a system sleep transition.
      
      Obviously, polling a suspended drive for media changes and such isn't
      a good thing to do; in the case of USB mass-storage devices it can
      lead to real problems requiring device resets and even re-enumeration.
      
      The patch fixes things by creating a new system-wide, non-reentrant,
      freezable workqueue and using it for disk-events polling.
      Signed-off-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      CC: <stable@kernel.org>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarRafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      62d3c543
    • Danny Kukawka's avatar
      drivers/block/DAC960: fix -Wuninitialized warning · cecd353a
      Danny Kukawka authored
      Set CommandMailbox with memset before use it. Fix for:
      
      drivers/block/DAC960.c: In function ‘DAC960_V1_EnableMemoryMailboxInterface’:
      arch/x86/include/asm/io.h:61:1: warning: ‘CommandMailbox.Bytes[12]’
       may be used uninitialized in this function [-Wuninitialized]
      drivers/block/DAC960.c:1175:30: note: ‘CommandMailbox.Bytes[12]’
       was declared here
      Signed-off-by: default avatarDanny Kukawka <danny.kukawka@bisect.de>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      cecd353a
    • Danny Kukawka's avatar
      drivers/block/DAC960: fix DAC960_V2_IOCTL_Opcode_T -Wenum-compare warning · bca505f1
      Danny Kukawka authored
      Fixed compiler warning:
      
      comparison between ‘DAC960_V2_IOCTL_Opcode_T’ and ‘enum <anonymous>’
      
      Renamed enum, added a new enum for SCSI_10.CommandOpcode in
      DAC960_V2_ProcessCompletedCommand().
      Signed-off-by: default avatarDanny Kukawka <danny.kukawka@bisect.de>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      bca505f1
    • Stanislaw Gruszka's avatar
      block: fix __blkdev_get and add_disk race condition · 9f53d2fe
      Stanislaw Gruszka authored
      The following situation might occur:
      
      __blkdev_get:			add_disk:
      
      				register_disk()
      get_gendisk()
      
      disk_block_events()
      	disk->ev == NULL
      
      				disk_add_events()
      
      __disk_unblock_events()
      	disk->ev != NULL
      	--ev->block
      
      Then we unblock events, when they are suppose to be blocked. This can
      trigger events related block/genhd.c warnings, but also can crash in
      sd_check_events() or other places.
      
      I'm able to reproduce crashes with the following scripts (with
      connected usb dongle as sdb disk).
      
      <snip>
      DEV=/dev/sdb
      ENABLE=/sys/bus/usb/devices/1-2/bConfigurationValue
      
      function stop_me()
      {
      	for i in `jobs -p` ; do kill $i 2> /dev/null ; done
      	exit
      }
      
      trap stop_me SIGHUP SIGINT SIGTERM
      
      for ((i = 0; i < 10; i++)) ; do
      	while true; do fdisk -l $DEV  2>&1 > /dev/null ; done &
      done
      
      while true ; do
      echo 1 > $ENABLE
      sleep 1
      echo 0 > $ENABLE
      done
      </snip>
      
      I use the script to verify patch fixing oops in sd_revalidate_disk
      http://marc.info/?l=linux-scsi&m=132935572512352&w=2
      Without Jun'ichi Nomura patch titled "Fix NULL pointer dereference in
      sd_revalidate_disk" or this one, script easily crash kernel within
      a few seconds. With both patches applied I do not observe crash.
      Unfortunately after some time (dozen of minutes), script will hung in:
      
      [ 1563.906432]  [<c08354f5>] schedule_timeout_uninterruptible+0x15/0x20
      [ 1563.906437]  [<c04532d5>] msleep+0x15/0x20
      [ 1563.906443]  [<c05d60b2>] blk_drain_queue+0x32/0xd0
      [ 1563.906447]  [<c05d6e00>] blk_cleanup_queue+0xd0/0x170
      [ 1563.906454]  [<c06d278f>] scsi_free_queue+0x3f/0x60
      [ 1563.906459]  [<c06d7e6e>] __scsi_remove_device+0x6e/0xb0
      [ 1563.906463]  [<c06d4aff>] scsi_forget_host+0x4f/0x60
      [ 1563.906468]  [<c06cd84a>] scsi_remove_host+0x5a/0xf0
      [ 1563.906482]  [<f7f030fb>] quiesce_and_remove_host+0x5b/0xa0 [usb_storage]
      [ 1563.906490]  [<f7f03203>] usb_stor_disconnect+0x13/0x20 [usb_storage]
      
      Anyway I think this patch is some step forward.
      
      As drawback, I do not teardown on sysfs file create error, because I do
      not know how to nullify disk->ev (since it can be used). However add_disk
      error handling practically does not exist too, and things will work
      without this sysfs file, except events will not be exported to user
      space.
      Signed-off-by: default avatarStanislaw Gruszka <sgruszka@redhat.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Cc: stable@kernel.org
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      9f53d2fe
    • Muthukumar R's avatar
      block: Fix setting bio flags in drivers (sd_dif/floppy) · 12ebffd1
      Muthukumar R authored
      Fix setting bio flags in drivers (sd_dif/floppy).
      Signed-off-by: default avatarMuthukumar R <muthur@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      12ebffd1
    • Jun'ichi Nomura's avatar
      block: Fix NULL pointer dereference in sd_revalidate_disk · fe316bf2
      Jun'ichi Nomura authored
      Since 2.6.39 (1196f8b8), when a driver returns -ENOMEDIUM for open(),
      __blkdev_get() calls rescan_partitions() to remove
      in-kernel partition structures and raise KOBJ_CHANGE uevent.
      
      However it ends up calling driver's revalidate_disk without open
      and could cause oops.
      
      In the case of SCSI:
      
        process A                  process B
        ----------------------------------------------
        sys_open
          __blkdev_get
            sd_open
              returns -ENOMEDIUM
                                   scsi_remove_device
                                     <scsi_device torn down>
            rescan_partitions
              sd_revalidate_disk
                <oops>
      Oopses are reported here:
      http://marc.info/?l=linux-scsi&m=132388619710052
      
      This patch separates the partition invalidation from rescan_partitions()
      and use it for -ENOMEDIUM case.
      Reported-by: default avatarHuajun Li <huajun.li.lee@gmail.com>
      Signed-off-by: default avatarJun'ichi Nomura <j-nomura@ce.jp.nec.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Cc: stable@kernel.org
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      fe316bf2
  4. 15 Feb, 2012 3 commits
    • Tejun Heo's avatar
      block: exit_io_context() should call elevator_exit_icq_fn() · 621032ad
      Tejun Heo authored
      While updating locking, b2efa052 "block, cfq: unlink
      cfq_io_context's immediately" moved elevator_exit_icq_fn() invocation
      from exit_io_context() to the final ioc put.  While this doesn't cause
      catastrophic failure, it effectively removes task exit notification to
      elevator and cause noticeable IO performance degradation with CFQ.
      
      On task exit, CFQ used to immediately expire the slice if it was being
      used by the exiting task as no more IO would be issued by the task;
      however, after b2efa052, the notification is lost and disk could sit
      idle needlessly, leading to noticeable IO performance degradation for
      certain workloads.
      
      This patch renames ioc_exit_icq() to ioc_destroy_icq(), separates
      elevator_exit_icq_fn() invocation into ioc_exit_icq() and invokes it
      from exit_io_context().  ICQ_EXITED flag is added to avoid invoking
      the callback more than once for the same icq.
      
      Walking icq_list from ioc side and invoking elevator callback requires
      reverse double locking.  This may be better implemented using RCU;
      unfortunately, using RCU isn't trivial.  e.g. RCU protection would
      need to cover request_queue and queue_lock switch on cleanup makes
      grabbing queue_lock from RCU unsafe.  Reverse double locking should
      do, at least for now.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-and-bisected-by: default avatarShaohua Li <shli@kernel.org>
      LKML-Reference: <CANejiEVzs=pUhQSTvUppkDcc2TNZyfohBRLygW5zFmXyk5A-xQ@mail.gmail.com>
      Tested-by: default avatarShaohua Li <shaohua.li@intel.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      621032ad
    • Tejun Heo's avatar
      block: simplify ioc_release_fn() · 2274b029
      Tejun Heo authored
      Reverse double lock dancing in ioc_release_fn() can be simplified by
      just using trylock on the queue_lock and back out from ioc lock on
      trylock failure.  Simplify it.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Tested-by: default avatarShaohua Li <shaohua.li@intel.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      2274b029
    • Tejun Heo's avatar
      block: replace icq->changed with icq->flags · d705ae6b
      Tejun Heo authored
      icq->changed was used for ICQ_*_CHANGED bits.  Rename it to flags and
      access it under ioc->lock instead of using atomic bitops.
      ioc_get_changed() is added so that the changed part can be fetched and
      cleared as before.
      
      icq->flags will be used to carry other flags.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Tested-by: default avatarShaohua Li <shaohua.li@intel.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      d705ae6b
  5. 14 Feb, 2012 29 commits