1. 03 Nov, 2011 2 commits
  2. 31 Oct, 2011 18 commits
    • [SCSI] mv_sas: OCZ RevoDrive3 & zDrive R4 support · 99a700bc
      Robin H. Johnson authored
      In the OCZ RevoDrive3/zDrive R4 series, the "OCZ SuperScale Storage
      Controller" with "Virtualized Controller Architecture 2.0" really seems
      to be a Marvell 88SE9485 part, with OCZ firmware/BIOS.
      
      Developed and tested on OCZ RevoDrive3 120GB [PCI 1b85:1021]
      
      Should work on:
      - OCZ RevoDrive3 (2x SandForce 2281)
      - OCZ RevoDrive3 X2 (4x SandForce 2281)
      - OCZ zDrive R4 CM84 (4x SandForce 2281)
      - OCZ zDrive R4 CM88 (8x SandForce 2281)
      - OCZ zDrive R4 RM84 (4x SandForce 2582)
      - OCZ zDrive R4 RM88 (8x SandForce 2582)
      
      All of this because a friend recently bought an OCZ RevoDrive3 and was
      bitten by the lack of Linux support.
      
      Notes from testing:
      -------------------
      - SMART works.
      - VPD Device Identification is "OCZ-REVODRIVE3"
      - Thin provisioning/TRIM seems to be implemented as WRITE SAME UNMAP,
        with deterministic (non-zero) read after TRIM, but I'm not sure if it
        works 100% in my testing.
      - Some of the tuning in the firmware seems to ensure much better
        performance when in a RAID0 setup than using the two devices
        separately.
      
      I have not tested booting from the SSD, because all of this was
      developed and tested remotely from the actual hardware.
      Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
      Thanks-To: Gordon Pritchard <gordp@sfu.ca>
      Acked-by: Xiangliang Yu <yuxiangl@marvell.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] libfc: improve flogi retries to avoid lport stuck · 907c07d4
      Vasu Dev authored
      Add more cases in which flogi is retried: also retry on a bad
      response, caused either by a missing ELS response or by a flogi
      response payload that is too short.
      Previously flogi was not retried in those cases, which
      left the lport offline.
      Signed-off-by: Vasu Dev <vasu.dev@intel.com>
      Tested-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
      Signed-off-by: Yi Zou <yi.zou@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] libfc: avoid exchanges collision during lport reset · b6e3c840
      Vasu Dev authored
      Currently a large msleep-based timer delay is used to
      avoid exchange collisions across lport reset. Instead,
      avoid them by also initializing the exches pool indexes
      during reset.
      Signed-off-by: Vasu Dev <vasu.dev@intel.com>
      Tested-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
      Signed-off-by: Yi Zou <yi.zou@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] libfc: fix checking FC_TYPE_BLS · 14fc315f
      Vasu Dev authored
      fh_type is checked after the skb has been freed, so instead
      cache fh_type and then check FC_TYPE_BLS against the cached
      fh_type value.
      
      This wrong check was causing double exch locking, as
      reported by Bhanu at
      https://lists.open-fcoe.org/pipermail/devel/2011-October/011793.html
      Signed-off-by: Vasu Dev <vasu.dev@intel.com>
      Tested-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
      Signed-off-by: Yi Zou <yi.zou@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
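The libfc fix above follows a common pattern: copy the field you still need out of a buffer before releasing it. The following is a minimal userspace sketch of that pattern, not the libfc code itself. `struct frame`, `consume_frame_is_bls()` and `make_frame()` are hypothetical names, `free()` stands in for freeing the skb, and the constant values are assumed for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative constants; the names mirror FC frame types, the values are assumed. */
#define FC_TYPE_BLS 0x00
#define FC_TYPE_ELS 0x01

/* Hypothetical stand-in for a received frame buffer (the skb in the commit). */
struct frame {
    uint8_t fh_type;
};

/*
 * Consume a frame and report whether it was a BLS frame.
 * fh_type is copied into a local before the buffer is released, so the
 * later comparison never touches freed memory.
 */
static int consume_frame_is_bls(struct frame *fp)
{
    uint8_t fh_type;

    assert(fp != NULL);
    fh_type = fp->fh_type;   /* cache while fp is still valid */
    free(fp);                /* fp must not be dereferenced after this */

    return fh_type == FC_TYPE_BLS;
}

/* Helper for the usage example: allocate a frame of the given type. */
static struct frame *make_frame(uint8_t type)
{
    struct frame *fp = malloc(sizeof(*fp));

    assert(fp != NULL);
    fp->fh_type = type;
    return fp;
}
```

Calling `consume_frame_is_bls(make_frame(FC_TYPE_BLS))` returns nonzero without reading the frame after it has been freed, which is the property the commit restores.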
    • [SCSI] edd: Treat "XPRS" host bus type the same as "PCI" · 044aceef
      Michael Chan authored
      PCI Express devices will return the "XPRS" host bus type during the
      BIOS EDD call.  "XPRS" should be treated just like "PCI" so that the
      proper pci_dev symlink will be created.  Scripts such as fcoe_edd.sh
      will then work correctly.
      Signed-off-by: Michael Chan <mchan@broadcom.com>
      Reviewed-by: Matt Domsch <Matt_Domsch@dell.com>
      Signed-off-by: Yi Zou <yi.zou@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
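The edd change above amounts to accepting a second host bus type string wherever "PCI" was accepted. A rough sketch of such a check follows, assuming the host bus type is a 4-byte field that is not necessarily NUL-terminated. `bus_type_is()` and `bus_is_pci_like()` are hypothetical helpers for illustration, not the driver's functions.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: does the 4-byte host bus type field start with `type`?
 * The comparison is limited to strlen(type) bytes, so it never reads past
 * the 4-byte field as long as `type` is at most 4 characters long.
 */
static int bus_type_is(const char host_bus_type[4], const char *type)
{
    assert(host_bus_type != NULL && type != NULL);
    assert(strlen(type) <= 4);
    return strncmp(host_bus_type, type, strlen(type)) == 0;
}

/* Treat "XPRS" (PCI Express) exactly like "PCI". */
static int bus_is_pci_like(const char host_bus_type[4])
{
    return bus_type_is(host_bus_type, "PCI") ||
           bus_type_is(host_bus_type, "XPRS");
}
```

Any code path that previously created the pci_dev symlink only for "PCI" would, under this sketch, gate on `bus_is_pci_like()` instead.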
    • [SCSI] isci: overriding max_concurr_spinup oem parameter by max(oem, user) · 7000f7c7
      Andrzej Jakowski authored
      Fixes a bug where the max_concurr_spinup oem parameter should be
      overridden by the max_concurr_spinup user parameter. The override should
      happen only when the max_concurr_spinup user parameter is specified
      on the command line (greater than 0). This fix also shortens the variables
      representing max_concurr_spinup for the oem and user parameters.
      Signed-off-by: Andrzej Jakowski <andrzej.jakowski@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] isci: revert bcn filtering · 52d74634
      Dan Williams authored
      The initial bcn filtering implementation was validated on a kernel
      baseline that predated the switch to new libata error handling.  Also,
      prior to that conversion we borrowed the mvsas MVS_DEV_EH approach to
      prevent the unwanted extra ap->ops->phy_reset(ap) that occurred in the
      ata_bus_probe() path.
      
      After the conversion to new libata eh, resets at discovery are more
      frequent and get filtered prematurely by IDEV_EH.  The result is that
      our bcn filtering has been blocked from running at discovery, and it
      appears to stall discovery completion to the point of triggering hung
      task timeouts.  So, revert the implementation for now.  When it returns
      it will go into libsas proper.
      
      The domain rediscovery that takes place due to ->lldd_I_T_nexus_reset()
      events should now be properly waited for by the ata_port_wait_eh() call
      in ata_port_probe().  So the hard coded delay in the isci
      ->lldd_I_T_nexus_reset() and other libsas drivers should help debounce
      the libsas thread from seeing temporary device removals.
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] isci: Fix hard reset timeout conditions. · 8e35a139
      Jeff Skirvin authored
      A hard reset can time out before or after the last phy in the
      port goes away.  If after, then notify the OS that the last
      phy has failed.
      
      The recovery for the failed hard reset has been removed.
      This recovery code was unnecessary in that the link would
      recover from the failure normally by a new link reset sequence
      or hotplug of the remote device.
      Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] isci: No need to manage the pending reset bit on pending requests. · 5412e25c
      Jeff Skirvin authored
      The lldd does not need to look at or manage the pending device
      reset bit in pending sas_tasks.
      Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] isci: Remove redundant isci_request.ttype field. · 3b34c169
      Jeff Skirvin authored
      Use the existing IREQ_TMF flag as a request type indicator.
      Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] isci: Fix task management for SMP, SATA and on dev remove. · 98145cb7
      Jeff Skirvin authored
      libsas uses the LLDD abort task interface to handle I/O timeouts
      in the SATA/STP and SMP discovery paths, so this change will terminate
      STP/SMP requests. Also, if the device is gone, the lldd will prevent
      libsas from further escalations in the error handler.
      Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] isci: No task_done callbacks in error handler paths. · db49c2d0
      Jeff Skirvin authored
      libsas will clean up pending sas_tasks after the error handler
      path functions are called; do not call task_done callbacks.
      Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] isci: Handle task request timeouts correctly. · b343dff1
      Jeff Skirvin authored
      In the case where "task" requests time out (note that this class of
      requests can also include SATA/STP soft reset FIS transmissions),
      handle the case where the task was being managed by some call to
      terminate the task request by completing both the tmf and the aborting
      process.
      Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] isci: Fix tag leak in tasks and terminated requests. · d6891682
      Jeff Skirvin authored
      Make sure terminated requests and completed task tags are freed.
      Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] isci: Immediately fail I/O to removed devices. · c2cb8a5f
      Jeff Skirvin authored
      In the case where an I/O fails to start in isci_request_execute,
      only allow retries if the device is not already gone.
      Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] isci: Lookup device references through requests in completions. · 0e2e2799
      Jeff Skirvin authored
      The LLDD needs to obtain a reference to the device through the request
      itself and not through the domain_device, because the
      domain_device.lldd_dev is set to NULL early in the lldd_dev_gone call.
      This relies on the fact that the isci_remote_device object is keeping a
      separate reference count of outstanding requests.  TODO: unify the
      request count tracking with the isci_remote_device kref.
      
      The failure signature of this condition looks like the following
      log, where the important bits are the call to lldd_dev_gone followed
      by a crash in isci_terminate_request_core:
      
      [  229.151541] isci 0000:0b:00.0: isci_remote_device_gone: domain_device = ffff8801492d4800, isci_device = ffff880143c657d0, isci_port = ffff880143c63658
      [  229.166007] isci 0000:0b:00.0: isci_remote_device_stop: isci_device = ffff880143c657d0
      [  229.175317] isci 0000:0b:00.0: isci_terminate_pending_requests: idev=ffff880143c657d0 request=ffff88014741f000; task=ffff8801470f46c0 old_state=2
      [  229.189702] isci 0000:0b:00.0: isci_terminate_request_core: device = ffff880143c657d0; request = ffff88014741f000
      [  229.201339] isci 0000:0b:00.0: isci_terminate_request_core: before completion wait (ffff88014741f000/ffff880149715ad0)
      [  229.213414] isci 0000:0b:00.0: sci_controller_process_completions: completion queue entry:0x8000a0e9
      [  229.214401] BUG: unable to handle kernel NULL pointer dereference at 0000000000000228
      [  229.214401] IP:jdskirvi-testlbo [<ffffffffa00a58be>] sci_request_completed_state_enter+0x50/0xafb [isci]
      [  229.214401] PGD 13d19e067 PUD 13d104067 PMD 0
      [  229.214401] Oops: 0000 [#1] SMP
      [  229.214401] CPU 0 x kernel: [  226
      [  229.214401] Modules linked in: ipv6 dm_multipath uinput nouveau snd_hda_codec_realtek snd_hda_intel ttm drm_kms_helper drm snd_hda_codec snd_hwdep snd_pcm snd_timer i2c_algo_bit isci snd libsas ioatdma mxm_wmi iTCO_wdt soundcore snd_page_alloc scsi_transport_sas iTCO_vendor_support wmi dca video i2c_i801 i2c_core [last unloaded: speedstep_lib]
      [  229.214401]
      [  229.214401] Pid: 5, comm: kworker/u:0 Not tainted 3.0.0-isci-11.7.29+ #30.353196] Buffer  Intel Corporation Stoakley/Pearlcity Workstation
      [  229.214401] RIP: 0010:[<ffffffffa00a58be>] I/O error on dev [<ffffffffa00a58be>] sci_request_completed_state_enter+0x50/0xafb [isci]
      [  229.214401] RSP: 0018:ffff88014fc03d20  EFLAGS: 00010046
      [  229.214401] RAX: 0000000000000000 RBX: ffff88014741f000 RCX: 0000000000000000
      [  229.214401] RDX: ffffffffa00b2c90 RSI: 0000000000000017 RDI: ffff88014741f0a0
      [  229.214401] RBP: ffff88014fc03d90 R08: 0000000000000018 R09: 0000000000000000
      [  229.214401] R10: 0000000000000000 R11: ffffffff81a17d98 R12: 000000000000001d
      [  229.214401] R13: ffff8801470f46c0 R14: 0000000000000000 R15: 0000000000008000
      [  229.214401] FS:  0000000000000000(0000) GS:ffff88014fc00000(0000) knlGS:0000000000000000
      [  229.214401] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  229.214401] CR2: 0000000000000228 CR3: 000000013ceaa000 CR4: 00000000000406f0
      [  229.214401] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  229.214401] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  229.214401] Process kworker/u:0 (pid: 5, threadinfo ffff880149714000, task ffff880149718000)
      [  229.214401] Call Trace:
      [  229.214401]  <IRQ>
      [  229.214401]  [<ffffffffa00aa6ce>] sci_change_state+0x4a/0x4f [isci]
      [  229.214401]  [<ffffffffa00a4ca6>] sci_io_request_tc_completion+0x79c/0x7a0 [isci]
      [  229.214401]  [<ffffffffa00acf35>] sci_controller_process_completions+0x14f/0x396 [isci]
      [  229.214401]  [<ffffffffa00abbda>] ? spin_lock_irq+0xe/0x10 [isci]
      [  229.214401]  [<ffffffffa00ad2cf>] isci_host_completion_routine+0x71/0x2be [isci]
      [  229.214401]  [<ffffffff8107c6b3>] ? mark_held_locks+0x52/0x70
      [  229.214401]  [<ffffffff810538e8>] tasklet_action+0x90/0xf1
      [  229.214401]  [<ffffffff81054050>] __do_softirq+0xe5/0x1bf
      [  229.214401]  [<ffffffff8106d9d1>] ? hrtimer_interrupt+0x129/0x1bb
      [  229.214401]  [<ffffffff814ff69c>] call_softirq+0x1c/0x30
      [  229.214401]  [<ffffffff8100bb67>] do_softirq+0x4b/0xa3
      [  229.214401]  [<ffffffff81053d84>] irq_exit+0x53/0xb4
      [  229.214401]  [<ffffffff814fffe7>] smp_apic_timer_interrupt+0x83/0x91
      [  229.214401]  [<ffffffff814fee53>] apic_timer_interrupt+0x13/0x20
      [  229.214401]  <EOI>
      [  229.214401]  [<ffffffff814f7ad4>] ? retint_restore_args+0x13/0x13
      [  229.214401]  [<ffffffff8107af29>] ? trace_hardirqs_off+0xd/0xf
      [  229.214401]  [<ffffffff8104ea71>] ? vprintk+0x40b/0x452
      [  229.214401]  [<ffffffff814f4b5a>] printk+0x41/0x47
      [  229.214401]  [<ffffffff81314484>] __dev_printk+0x78/0x7a
      [  229.214401]  [<ffffffff8131471e>] dev_printk+0x45/0x47
      [  229.214401]  [<ffffffffa00ae2a3>] isci_terminate_request_core+0x15d/0x317 [isci]
      [  229.214401]  [<ffffffffa00af1ad>] isci_terminate_pending_requests+0x1a4/0x204 [isci]
      [  229.214401]  [<ffffffffa00229f6>] ? sas_phye_oob_error+0xc3/0xc3 [libsas]
      [  229.214401]  [<ffffffffa00a7d9e>] isci_remote_device_nuke_requests+0xa6/0xff [isci]
      [  229.214401]  [<ffffffffa00a811a>] isci_remote_device_stop+0x7c/0x166 [isci]
      [  229.214401]  [<ffffffffa00229f6>] ? sas_phye_oob_error+0xc3/0xc3 [libsas]
      [  229.214401]  [<ffffffffa00a827a>] isci_remote_device_gone+0x76/0x7e [isci]
      [  229.214401]  [<ffffffffa002363e>] sas_notify_lldd_dev_gone+0x34/0x36 [libsas]
      [  229.214401]  [<ffffffffa0023945>] sas_unregister_dev+0x57/0x9c [libsas]
      [  229.214401]  [<ffffffffa00239c0>] sas_unregister_domain_devices+0x36/0x65 [libsas]
      [  229.214401]  [<ffffffffa0022cb8>] sas_deform_port+0x72/0x1ac [libsas]
      [  229.214401]  [<ffffffffa00229f6>] ? sas_phye_oob_error+0xc3/0xc3 [libsas]
      [  229.214401]  [<ffffffffa0022a34>] sas_phye_loss_of_signal+0x3e/0x42 [libsas]
      Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
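The crash above comes from a completion path looking up the device through a pointer that has already been cleared. As a rough illustration of the approach this commit describes, the sketch below has each request remember its own device pointer at start time, so that completion keeps working after the domain device's `lldd_dev` has been set to NULL. All struct and function names here are simplified, hypothetical stand-ins and do not reproduce the isci driver's data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver objects. */
struct remote_device {
    int outstanding_requests;   /* separate count of in-flight requests */
};

struct domain_device {
    struct remote_device *lldd_dev;   /* cleared early when the device goes away */
};

struct request {
    struct domain_device *dev;
    struct remote_device *target_device;   /* captured when the request starts */
};

/* Start a request: remember the device on the request itself. */
static void start_request(struct request *req, struct domain_device *dev)
{
    assert(req != NULL && dev != NULL && dev->lldd_dev != NULL);
    req->dev = dev;
    req->target_device = dev->lldd_dev;
    req->target_device->outstanding_requests++;
}

/* Device removal: the back-pointer is cleared before requests finish. */
static void device_gone(struct domain_device *dev)
{
    dev->lldd_dev = NULL;
}

/*
 * Completion: use the pointer stored in the request, not
 * req->dev->lldd_dev, which may already be NULL at this point.
 */
static struct remote_device *complete_request(struct request *req)
{
    struct remote_device *idev = req->target_device;

    assert(idev != NULL);
    idev->outstanding_requests--;
    return idev;
}
```

The sketch assumes the device object outlives its outstanding requests, which is the role the commit message assigns to the separate request count.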
    • [SCSI] ipr: add definitions for additional adapter · 5a918353
      Wayne Boyer authored
      Add the appropriate definition and table entry for an additional adapter.
      Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
      Acked-by: Brian King <brking@linux.vnet.ibm.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
    • [SCSI] scsi_dh: check queuedata pointer before proceeding further · a18a920c
      Moger, Babu authored
      This patch validates sdev pointer in scsi_dh_activate before proceeding further.
      
      Without this check we might see the panic shown below. I have seen this
      panic multiple times.
      
      Call trace:
      
       #0 [ffff88007d647b50] machine_kexec at ffffffff81020902
       #1 [ffff88007d647ba0] crash_kexec at ffffffff810875b0
       #2 [ffff88007d647c70] oops_end at ffffffff8139c650
       #3 [ffff88007d647c90] __bad_area_nosemaphore at ffffffff8102dd15
       #4 [ffff88007d647d50] page_fault at ffffffff8139b8cf
          [exception RIP: scsi_dh_activate+0x82]
          RIP: ffffffffa0041922  RSP: ffff88007d647e00  RFLAGS: 00010046
          RAX: 0000000000000000  RBX: 0000000000000000  RCX: 00000000000093c5
          RDX: 00000000000093c5  RSI: ffffffffa02e6640  RDI: ffff88007cc88988
          RBP: 000000000000000f   R8: ffff88007d646000   R9: 0000000000000000
          R10: ffff880082293790  R11: 00000000ffffffff  R12: ffff88007cc88988
          R13: 0000000000000000  R14: 0000000000000286  R15: ffff880037b845e0
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
       #5 [ffff88007d647e38] run_workqueue at ffffffff81060268
       #6 [ffff88007d647e78] worker_thread at ffffffff81060386
       #7 [ffff88007d647ee8] kthread at ffffffff81064436
       #8 [ffff88007d647f48] kernel_thread at ffffffff81003fba
      Signed-off-by: Babu Moger <babu.moger@netapp.com>
      Cc: stable@kernel.org
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
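The scsi_dh fix above is a guard against dereferencing a queue's private data pointer when it is NULL. The sketch below shows the general shape of such a guard in plain userspace C. The `_stub` types, the `DH_OK`/`DH_NOSYS` codes, `dh_activate()` and `record_result()` are hypothetical names chosen for illustration, and locking is omitted; this is not the kernel function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes for the sketch. */
enum { DH_OK = 0, DH_NOSYS = 1 };

/* Hypothetical stand-ins for the device and its request queue. */
struct scsi_device_stub {
    int activated;
};

struct request_queue_stub {
    void *queuedata;   /* points at the device, or NULL once it is gone */
};

/* Completion callback type: receives caller data and the result code. */
typedef void (*activate_complete)(void *data, int err);

/* Example callback that stores the result code in an int. */
static void record_result(void *data, int err)
{
    *(int *)data = err;
}

/*
 * Activate the device behind a queue. The queuedata pointer is validated
 * first; if it is NULL, the callback is told about the failure and the
 * function returns without touching the (missing) device.
 */
static int dh_activate(struct request_queue_stub *q,
                       activate_complete fn, void *data)
{
    struct scsi_device_stub *sdev;

    assert(q != NULL);
    sdev = q->queuedata;
    if (!sdev) {
        if (fn)
            fn(data, DH_NOSYS);
        return DH_NOSYS;
    }

    sdev->activated = 1;
    if (fn)
        fn(data, DH_OK);
    return DH_OK;
}
```

Reporting the failure through the callback as well as the return value is a design choice of this sketch, so that a caller waiting on the callback is not left hanging.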
  3. 30 Oct, 2011 20 commits