1. 16 Nov, 2004 1 commit
    • [PATCH] sg: Fix oops of sg_cmd_done and sg_release race · fad44d87
      Brian King authored
      This patch fixes a race in sg between sg_cmd_done and sg_release.
      I have hit this bug several times on test machines, and the patch
      fixes it. The race: if srp->done is set and the waiting thread gets
      a spurious wakeup immediately afterwards, the waiting thread can run
      to completion and the device can then be closed, freeing sfp before
      wake_up_interruptible is called, which results in an oops. The fix is
      to take a lock around setting srp->done to 1 and the wake_up, and also
      around the check of srp->done, which guarantees that
      wake_up_interruptible always happens before the sleeping thread gets
      a chance to run.
      Signed-off-by: Brian King <brking@us.ibm.com>
      Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
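The locking rule described in the sg commit message above can be sketched in user space. This is only an analogy under stated assumptions, not the kernel patch: `struct request`, `complete_request` and `submit_and_wait` are hypothetical names, a pthread mutex stands in for the kernel lock, and a condition variable stands in for the wait queue. The point it illustrates is that the flag is set and the wakeup issued under one lock, and the waiter only inspects the flag while holding that same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-request state; not the real sg structures. */
struct request {
    pthread_mutex_t lock;  /* plays the role of the lock taken by the fix */
    pthread_cond_t  wait;  /* plays the role of the wait queue */
    int done;              /* plays the role of srp->done */
    int result;
};

/* Completion side: set the flag and issue the wakeup under the lock. */
static void *complete_request(void *arg)
{
    struct request *req = arg;

    pthread_mutex_lock(&req->lock);
    req->result = 42;
    req->done = 1;
    pthread_cond_signal(&req->wait);
    pthread_mutex_unlock(&req->lock);
    return NULL;
}

/* Waiter side: look at the flag only while holding the same lock. */
static int submit_and_wait(void)
{
    struct request *req = malloc(sizeof(*req));
    pthread_t completer;
    int result;

    assert(req != NULL);
    pthread_mutex_init(&req->lock, NULL);
    pthread_cond_init(&req->wait, NULL);
    req->done = 0;
    req->result = 0;

    pthread_create(&completer, NULL, complete_request, req);

    pthread_mutex_lock(&req->lock);
    while (!req->done)  /* the loop tolerates spurious wakeups */
        pthread_cond_wait(&req->wait, &req->lock);
    result = req->result;
    pthread_mutex_unlock(&req->lock);

    /* The sketch joins the completer before freeing, to keep cleanup simple. */
    pthread_join(completer, NULL);
    pthread_cond_destroy(&req->wait);
    pthread_mutex_destroy(&req->lock);
    free(req);
    return result;
}
```

Because the waiter cannot observe `done == 1` until it acquires the lock, and the completer holds that lock across the signal, the wakeup has always been issued by the time the waiter proceeds.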
  2. 15 Nov, 2004 1 commit
    • Fix badness in scsi_lib.c · 0212fe0b
      James Bottomley authored
      From: Mike Christie <mikenc@us.ibm.com>
      
      > Oct 26 23:32:55 mina kernel: Unable to handle kernel paging request at 
      > virtual address 6b6b6c7b
      > Oct 26 23:32:55 mina kernel:  printing eip:
      > Oct 26 23:32:55 mina kernel: f882b8ce
      > Oct 26 23:32:55 mina kernel: *pde = 00000000
      > Oct 26 23:32:55 mina kernel: Oops: 0000 [#1]
      > Oct 26 23:32:55 mina kernel: PREEMPT
      > Oct 26 23:32:55 mina kernel: Modules linked in: sd_mod usb_storage 
      > ide_cd cdrom sg scsi_mod rd
      > Oct 26 23:32:55 mina kernel: CPU:    0
      > Oct 26 23:32:55 mina kernel: EIP:    0060:[<f882b8ce>]    Not tainted VLI
      > Oct 26 23:32:55 mina kernel: EFLAGS: 00010296   (2.6.10-rc1-mm1y)
      > Oct 26 23:32:55 mina kernel: EIP is at 
      > scsi_block_when_processing_errors+0xe/0xe0 [scsi_mod]
      > Oct 26 23:32:55 mina kernel: eax: 00000000   ebx: 6b6b6b6b   ecx: 
      > f88ef640   edx: ec5b6578
      > Oct 26 23:32:55 mina kernel: esi: e9baa4b8   edi: c17e9268   ebp: 
      > e9677f0c   esp: e9677eb4
      > Oct 26 23:32:55 mina kernel: ds: 007b   es: 007b   ss: 0068
      > Oct 26 23:32:55 mina kernel: Process fdisk (pid: 2891, 
      > threadinfo=e9676000 task=ea0c61f0)
      > Oct 26 23:32:55 mina kernel: Stack: 00000000 00000001 00000000 00000000 
      > 00000000 00000000 00000000 00000000
      > Oct 26 23:32:55 mina kernel:        00000000 00000000 00000003 e9677ef0 
      > c17e9268 0000006b c17e9268 e9677f0c
      > Oct 26 23:32:55 mina kernel:        c0159761 c17e93e4 00000000 e9b98780 
      > e9baa4b8 c17e9268 e9677f24 f88ef6a8
      > Oct 26 23:32:55 mina kernel: Call Trace:
      > Oct 26 23:32:55 mina kernel:  [<c01056cf>] show_stack+0x7f/0xa0
      > Oct 26 23:32:55 mina kernel:  [<c0105876>] show_registers+0x156/0x1c0
      > Oct 26 23:32:55 mina kernel:  [<c0105af6>] die+0x156/0x2e0
      > Oct 26 23:32:55 mina kernel:  [<c011628d>] do_page_fault+0x36d/0x69c
      > Oct 26 23:32:55 mina kernel:  [<c01051ed>] error_code+0x2d/0x38
      > Oct 26 23:32:55 mina kernel:  [<f88ef6a8>] sd_release+0x68/0xa0 [sd_mod]
      > Oct 26 23:32:55 mina kernel:  [<c0184c03>] blkdev_put+0x183/0x1b0
      > Oct 26 23:32:55 mina kernel:  [<c01784ad>] __fput+0x14d/0x160
      > Oct 26 23:32:55 mina kernel:  [<c0176797>] filp_close+0x57/0x90
      > Oct 26 23:32:55 mina kernel:  [<c01768e3>] sys_close+0x113/0x240
      > Oct 26 23:32:55 mina kernel:  [<c0104ff1>] sysenter_past_esp+0x52/0x71
      > Oct 26 23:32:55 mina kernel: Code: 0c 8b 43 04 8b 00 89 5c 24 04 c7 04 
      > 24 e4 d1 83 f8 89 44 24 08 e8 d3 21 8f c7 8d 76 00 55 89 e5 57 56 53 83 
      > ec 4c 8b 75 08 8b 1e <8b> 83 10 01 00 00 a8 08 74 7c fc 31 c0 8d 7d b4 
      > b9 05 00 00 00
      
      
      The problem with using shost_for_each_device, with respect to the
      above oops, is that scsi_forget_host sets the state to SDEV_CANCEL,
      so when scsi_host_cancel iterates over the devices using
      shost_for_each_device it cannot get a handle to the sdev
      (scsi_device_get fails because the state is SDEV_CANCEL).
      __scsi_iterate_devices does not clear the next pointer when this
      happens, so I think this change is needed to fix just the refcount
      bug in shost_for_each_device.
      Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
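The iterator problem described in the scsi_lib.c commit message above can be modelled with a small reference-counted list. This is a simplified sketch and not the kernel code: `struct dev`, `dev_get`, `dev_put`, `iterate_devices` and `count_visited` are hypothetical names, and the `cancelled` flag only stands in for the SDEV_CANCEL state. It shows the behaviour the message asks for: when a reference cannot be taken, the iterator must not hand that device back, and it returns NULL once nothing usable remains.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a device on a host's list. */
struct dev {
    int refcount;
    int cancelled;      /* stands in for the SDEV_CANCEL state */
    struct dev *next;
};

/* Taking a reference fails for a cancelled device. */
static int dev_get(struct dev *d)
{
    if (d->cancelled)
        return -1;
    d->refcount++;
    return 0;
}

static void dev_put(struct dev *d)
{
    assert(d->refcount > 0);  /* an unbalanced put is the refcount bug */
    d->refcount--;
}

/*
 * Return the next device after prev (or the first, if prev is NULL) on
 * which a reference could be taken, dropping the reference held on prev.
 * Devices that cannot be pinned are skipped; if none is left, the result
 * is NULL rather than a pointer to a device we hold no reference on.
 */
static struct dev *iterate_devices(struct dev *head, struct dev *prev)
{
    struct dev *next = prev ? prev->next : head;

    while (next != NULL && dev_get(next) != 0)
        next = next->next;
    if (prev != NULL)
        dev_put(prev);
    return next;
}

/* Walk the whole list and count the devices actually handed out. */
static int count_visited(struct dev *head)
{
    int n = 0;

    for (struct dev *d = iterate_devices(head, NULL); d != NULL;
         d = iterate_devices(head, d))
        n++;
    return n;
}
```

If the iterator instead returned a stale pointer after a failed get, the caller's next step would drop a reference it never took, which is the kind of imbalance the commit message calls the refcount bug.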
  3. 14 Nov, 2004 3 commits
  4. 15 Nov, 2004 1 commit
  5. 14 Nov, 2004 28 commits
  6. 13 Nov, 2004 6 commits