1. 06 Jan, 2010 8 commits
    • clockevents: Prevent clockevent_devices list corruption on cpu hotplug · fa3f5a5c
      Thomas Gleixner authored
      commit bb6eddf7 upstream.
      
      Xiaotian Feng triggered a list corruption in the clock events list on
      CPU hotplug and debugged the root cause.
      
      If a CPU registers more than one per cpu clock event device, then only
      the active clock event device is removed on CPU_DEAD. The unused
      devices are kept in the clock events device list.
      
      On CPU up the clock event devices are registered again, which means
      that we list_add an already enqueued list_head. That results in list
      corruption.
      
      Resolve this by removing all devices which are associated to the dead
      CPU on CPU_DEAD.
      Reported-by: Xiaotian Feng <dfeng@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Xiaotian Feng <dfeng@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
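The clockevents fix above can be illustrated with a small userspace sketch. It is not the kernel code: `struct clock_dev`, `remove_devices_for_cpu()` and the "unlinked entries have `next == NULL`" convention are assumptions made for the illustration, and the list helpers only imitate the kernel's `list_head` API. The point is that every device owned by the dead CPU is unlinked, so re-registering it on CPU up never adds an entry that is still on the list.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's circular doubly linked list. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Sketch convention: an unlinked entry has next == NULL. Adding an entry
 * that is still linked is the corruption the commit describes, so trap it. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    assert(entry->next == NULL);
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

static void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = NULL;
    entry->prev = NULL;
}

/* Hypothetical device record; `list` is the first member so a list pointer
 * can be cast back to the device. */
struct clock_dev {
    struct list_head list;
    int cpu;
};

/* Unlink every device owned by dead_cpu, active or not. Returns the count. */
static int remove_devices_for_cpu(struct list_head *head, int dead_cpu)
{
    struct list_head *pos = head->next;
    int removed = 0;

    while (pos != head) {
        struct list_head *next = pos->next; /* read before unlinking */
        struct clock_dev *dev = (struct clock_dev *)pos;

        if (dev->cpu == dead_cpu) {
            list_del(pos);
            removed++;
        }
        pos = next;
    }
    return removed;
}

static int list_length(const struct list_head *head)
{
    int n = 0;

    for (const struct list_head *pos = head->next; pos != head; pos = pos->next)
        n++;
    return n;
}
```

With two devices registered for the same CPU, removing only one of them and then calling `list_add()` on both again would trip the assertion in this sketch; removing both first makes the re-registration safe.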
    • sched: Select_task_rq_fair() must honour SD_LOAD_BALANCE · 8e04c81a
      Peter Zijlstra authored
      commit e4f42888 upstream.
      
      We should skip !SD_LOAD_BALANCE domains.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <20091216170517.653578430@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • x86, cpuid: Add "volatile" to asm in native_cpuid() · c9ac6a9e
      Suresh Siddha authored
      commit 45a94d7c upstream.
      
      xsave_cntxt_init() does something like:
      
      	cpuid(0xd, ..);	// find out what features FP/SSE/.. etc are supported
      
      	xsetbv();	// enable the features known to OS
      
      	cpuid(0xd, ..);	// find out the size of the context for features enabled
      
      Depending on what features get enabled in xsetbv(), the value of
      cpuid.eax=0xd.ecx=0.ebx changes correspondingly (representing the
      size of the context that is enabled).
      
      Because the asm in native_cpuid() is not marked volatile, gcc 4.1.2
      optimizes away the second cpuid and the kernel continues to use
      the cpuid information obtained before xsetbv(), ultimately leading to a
      kernel crash on processors supporting more state than the legacy FP/SSE.
      
      Add "volatile" for native_cpuid().
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1261009542.2745.55.camel@sbs-t61.sc.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
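As a rough illustration of the cpuid fix, the wrapper below shows where the `volatile` qualifier goes in GCC-style extended asm. It is a sketch, not the kernel's `native_cpuid()`: the name `cpuid_count()` and its signature are assumptions, and the code is only compiled on x86 targets with a GCC-compatible compiler.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
/* "volatile" tells the compiler the instruction has effects beyond its
 * listed outputs, so two calls with identical inputs are both emitted
 * instead of being merged into one. */
static inline void cpuid_count(uint32_t leaf, uint32_t subleaf,
                               uint32_t *eax, uint32_t *ebx,
                               uint32_t *ecx, uint32_t *edx)
{
    assert(eax && ebx && ecx && edx);
    __asm__ volatile("cpuid"
                     : "=a"(*eax), "=b"(*ebx), "=c"(*ecx), "=d"(*edx)
                     : "0"(leaf), "2"(subleaf));
}
#endif
```

Without the qualifier, a compiler is allowed to treat the asm as a pure function of its inputs, which is how the second `cpuid(0xd, ..)` after `xsetbv()` could be dropped.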
    • sched: Fix task_hot() test order · 14ae0820
      Peter Zijlstra authored
      commit e6c8fba7 upstream.
      
      Make sure not to access sched_fair fields before verifying it is
      indeed a sched_fair task.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <20091216170517.577998058@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
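The test-order rule in the task_hot() fix can be sketched in plain C. The types here (`struct task`, `struct fair_info`, `enum sched_kind`) are hypothetical and far simpler than the scheduler's; the sketch only shows the ordering: confirm the task's class first, then touch class-specific data.

```c
#include <assert.h>
#include <stddef.h>

enum sched_kind { KIND_FAIR, KIND_RT };

/* Hypothetical per-class data, only valid for fair tasks. */
struct fair_info {
    int cache_hot_buddy;
};

struct task {
    enum sched_kind kind;
    struct fair_info *fair; /* NULL unless kind == KIND_FAIR */
};

/* Check the class before dereferencing anything that belongs to it. */
static int task_is_hot(const struct task *p)
{
    assert(p != NULL);
    if (p->kind != KIND_FAIR)
        return 0;
    return p->fair->cache_hot_buddy;
}
```

Swapping the two checks would dereference `fair` for a non-fair task, which in this sketch is a NULL pointer.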
    • SCSI: fc class: fix fc_transport_init error handling · fdf26751
      Mike Christie authored
      commit 48de68a4 upstream.
      
      If transport_class_register fails we should unregister any
      registered classes, or we will leak memory or other
      resources.
      
      I did a quick modprobe of scsi_transport_fc to test the
      patch.
      Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
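The unwinding idea behind the fc_transport_init fix can be sketched as follows. This is a loop-based variation rather than the patch itself, and `class_register()`, `class_unregister()`, `fail_at` and `NCLASSES` are hypothetical stand-ins, not the transport class API: when registration fails partway through, everything registered so far is unregistered before the error is returned.

```c
#include <assert.h>
#include <stdbool.h>

#define NCLASSES 4

/* Hypothetical stand-ins for the register/unregister calls. */
static bool registered[NCLASSES];
static int fail_at = -1; /* index that should fail, -1 for none */

static int class_register(int id)
{
    assert(id >= 0 && id < NCLASSES);
    if (id == fail_at)
        return -1;
    registered[id] = true;
    return 0;
}

static void class_unregister(int id)
{
    registered[id] = false;
}

/* Register all classes; on failure undo the ones already registered. */
static int transport_init(void)
{
    int err = 0;
    int i;

    for (i = 0; i < NCLASSES; i++) {
        err = class_register(i);
        if (err)
            goto unwind;
    }
    return 0;

unwind:
    while (--i >= 0)
        class_unregister(i);
    return err;
}

static int count_registered(void)
{
    int n = 0;

    for (int i = 0; i < NCLASSES; i++)
        n += registered[i] ? 1 : 0;
    return n;
}
```

After a failed `transport_init()` nothing stays registered, so a later retry starts from a clean state.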
    • SCSI: st: fix mdata->page_order handling · 1ab0714d
      FUJITA Tomonori authored
      commit c982c368 upstream.
      
      dio transfer always resets mdata->page_order to zero. It breaks
      high-order pages previously allocated for non-dio transfer.
      
      This patch adds reserved_page_order to the st_buffer structure to save
      the page order for non-dio transfer.
      
      http://bugzilla.kernel.org/show_bug.cgi?id=14563
      
      When enlarge_buffer() allocates 524288 from 0, st uses order-6 page
      allocations. So mdata->page_order is 6 and frp_seg is 2.
      
      After that, if st uses dio, sgl_map_user_pages() sets
      mdata->page_order to 0 for st_do_scsi(). After that, when we call
      normalize_buffer(), it frees only frp_seg * PAGE_SIZE (2 * 4096)
      though we should free frp_seg * PAGE_SIZE << 6 (2 * 4096 << 6). So we
      see buffer_size is set to 516096 (524288 - 8192).
      Reported-by: Joachim Breuer <linux-kernel@jmbreuer.net>
      Tested-by: Joachim Breuer <linux-kernel@jmbreuer.net>
      Acked-by: Kai Makisara <kai.makisara@kolumbus.fi>
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
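The arithmetic in the st commit message can be checked with a few lines of C. The helper name `segs_to_bytes()` is an assumption for this sketch, and the 4096-byte page size is taken from the figures in the message rather than from any particular system.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u /* assumed page size, as in the commit's arithmetic */

/* Bytes covered by `segs` segments, each an allocation of 2^order pages. */
static size_t segs_to_bytes(unsigned int segs, unsigned int order)
{
    assert(order < 20); /* keep the shift well inside size_t */
    return ((size_t)segs * PAGE_SIZE) << order;
}
```

With two segments, order 6 gives the full 524288 bytes, while order 0 gives only 8192; subtracting the latter from the former leaves the 516096 reported in the bug.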
    • SCSI: qla2xxx: dpc thread can execute before scsi host has been added · 9f63d27c
      Michael Reed authored
      commit 1486400f upstream.
      
      Fix crash in qla2x00_fdmi_register() due to the dpc
      thread executing before the scsi host has been fully
      added.
      
      Unable to handle kernel NULL pointer dereference (address 00000000000001d0)
      qla2xxx_7_dpc[4140]: Oops 8813272891392 [1]
      
      Call Trace:
       [<a000000100016910>] show_stack+0x50/0xa0
                                      sp=e00000b07c59f930 bsp=e00000b07c591400
       [<a000000100017180>] show_regs+0x820/0x860
                                      sp=e00000b07c59fb00 bsp=e00000b07c5913a0
       [<a00000010003bd60>] die+0x1a0/0x2e0
                                      sp=e00000b07c59fb00 bsp=e00000b07c591360
       [<a0000001000681a0>] ia64_do_page_fault+0x8c0/0x9e0
                                      sp=e00000b07c59fb00 bsp=e00000b07c591310
       [<a00000010000c8e0>] ia64_native_leave_kernel+0x0/0x270
                                      sp=e00000b07c59fb90 bsp=e00000b07c591310
       [<a000000207197350>] qla2x00_fdmi_register+0x850/0xbe0 [qla2xxx]
                                      sp=e00000b07c59fd60 bsp=e00000b07c591290
       [<a000000207171570>] qla2x00_configure_loop+0x1930/0x34c0 [qla2xxx]
                                      sp=e00000b07c59fd60 bsp=e00000b07c591128
       [<a0000002071732b0>] qla2x00_loop_resync+0x1b0/0x2e0 [qla2xxx]
                                      sp=e00000b07c59fdf0 bsp=e00000b07c5910c0
       [<a000000207166d40>] qla2x00_do_dpc+0x9a0/0xce0 [qla2xxx]
                                      sp=e00000b07c59fdf0 bsp=e00000b07c590fa0
       [<a0000001000d5bb0>] kthread+0x110/0x140
                                      sp=e00000b07c59fe00 bsp=e00000b07c590f68
       [<a000000100014a30>] kernel_thread_helper+0xd0/0x100
                                      sp=e00000b07c59fe30 bsp=e00000b07c590f40
       [<a00000010000a4c0>] start_kernel_thread+0x20/0x40
                                      sp=e00000b07c59fe30 bsp=e00000b07c590f40
      
      crash> dis a000000207197350
      0xa000000207197350 <qla2x00_fdmi_register+2128>:        [MMI]       ld1 r45=[r14];;
      crash> scsi_qla_host.host 0xe00000b058c73ff8
        host = 0xe00000b058c73be0,
      crash> Scsi_Host.shost_data 0xe00000b058c73be0
        shost_data = 0x0,  <<<<<<<<<<<
      
      The fc_transport fc_* workqueue threads have yet to be created.
      
      crash> ps | grep _7
         3891      2   2  e00000b075c80000  IN   0.0       0      0  [scsi_eh_7]
         4140      2   3  e00000b07c590000  RU   0.0       0      0  [qla2xxx_7_dpc]
      
      The thread adding the Scsi_Host is blocked due to other
      activity in sysfs.
      
      crash> bt 3762
      PID: 3762   TASK: e00000b071e70000  CPU: 3   COMMAND: "modprobe"
       #0 [BSP:e00000b071e71548] schedule at a000000100727e00
       #1 [BSP:e00000b071e714c8] __mutex_lock_slowpath at a0000001007295a0
       #2 [BSP:e00000b071e714a8] mutex_lock at a000000100729830
       #3 [BSP:e00000b071e71478] sysfs_addrm_start at a0000001002584f0
       #4 [BSP:e00000b071e71440] create_dir at a000000100259350
       #5 [BSP:e00000b071e71410] sysfs_create_subdir at a000000100259510
       #6 [BSP:e00000b071e713b0] internal_create_group at a00000010025c880
       #7 [BSP:e00000b071e71388] sysfs_create_group at a00000010025cc50
       #8 [BSP:e00000b071e71368] dpm_sysfs_add at a000000100425050
       #9 [BSP:e00000b071e71310] device_add at a000000100417d90
      #10 [BSP:e00000b071e712d8] scsi_add_host at a00000010045a380
      #11 [BSP:e00000b071e71268] qla2x00_probe_one at a0000002071be950
      #12 [BSP:e00000b071e71248] local_pci_probe at a00000010032e490
      #13 [BSP:e00000b071e71218] pci_device_probe at a00000010032ecd0
      #14 [BSP:e00000b071e711d8] driver_probe_device at a00000010041d480
      #15 [BSP:e00000b071e711a8] __driver_attach at a00000010041d6e0
      #16 [BSP:e00000b071e71170] bus_for_each_dev at a00000010041c240
      #17 [BSP:e00000b071e71150] driver_attach at a00000010041d0a0
      #18 [BSP:e00000b071e71108] bus_add_driver at a00000010041b080
      #19 [BSP:e00000b071e710c0] driver_register at a00000010041dea0
      #20 [BSP:e00000b071e71088] __pci_register_driver at a00000010032f610
      #21 [BSP:e00000b071e71058] (unknown) at a000000207200270
      #22 [BSP:e00000b071e71018] do_one_initcall at a00000010000a9c0
      #23 [BSP:e00000b071e70f98] sys_init_module at a0000001000fef00
      #24 [BSP:e00000b071e70f98] ia64_ret_from_syscall at a00000010000c740
      
      So, it appears that the qla2xxx dpc thread is moving forward before the
      scsi host has been completely added.
      
      This patch moves the setting of the init_done (and online) flag to
      after the call to scsi_add_host() to hold off the dpc thread.
      
      Found via large lun count testing using 2.6.31.
      Signed-off-by: Michael Reed <mdr@sgi.com>
      Acked-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    • SCSI: ipr: fix EEH recovery · c1d17da3
      Kleber Sacilotto de Souza authored
      commit 99c965dd upstream.
      
      After commits c82f63e4 (PCI: check saved
      state before restore) and 4b77b0a2 (PCI:
      Clear saved_state after the state has been restored) PCI drivers are
      prevented from restoring the device standard configuration registers
      twice in a row. These changes introduced a regression on ipr EEH
      recovery.
      
      The ipr device driver saves the PCI state only during the device probe
      and restores it on ipr_reset_restore_cfg_space() during IOA resets. This
      behavior is causing the EEH recovery to fail after the second error
      detected, since the registers are not being restored.
      
      One possible solution would be saving the registers after restoring
      them. The problem with this approach is that while recovering from an
      EEH error if pci_save_state() results in an EEH error, the adapter/slot
      will be reset, and end up back in ipr_reset_restore_cfg_space(), but it
      won't have a valid saved state to restore, so pci_restore_state() will
      fail.
      
      The following patch introduces a workaround for this problem, hacking
      around the PCI API by setting pdev->state_saved = true before we do the
      restore. It fixes the EEH regression and prevents us from hitting another
      EEH error during EEH recovery.
      
      
      [jejb: fix is a hack ... Jesse and Rafael will fix properly]
      Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
      Acked-by: Brian King <brking@linux.vnet.ibm.com>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: James Bottomley <James.Bottomley@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
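The save-once, restore-many problem in the ipr commit can be modelled in a few lines. This is a toy model under stated assumptions, not the PCI API: `struct fake_pdev`, `fake_save_state()`, `fake_restore_state()` and `reset_restore_cfg_space()` are hypothetical names, and the restore helper simply reports whether it did anything.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy device: one "register" plus a saved copy and a validity flag. */
struct fake_pdev {
    bool state_saved;
    int cfg;
    int saved_cfg;
};

static void fake_save_state(struct fake_pdev *dev)
{
    dev->saved_cfg = dev->cfg;
    dev->state_saved = true;
}

/* Restores only if a saved state is marked valid, then clears the mark,
 * so a second restore in a row does nothing. */
static bool fake_restore_state(struct fake_pdev *dev)
{
    assert(dev);
    if (!dev->state_saved)
        return false;
    dev->cfg = dev->saved_cfg;
    dev->state_saved = false;
    return true;
}

/* Workaround from the commit message: mark the state as saved again
 * before every restore, since the driver saved it only once at probe. */
static bool reset_restore_cfg_space(struct fake_pdev *dev)
{
    dev->state_saved = true;
    return fake_restore_state(dev);
}
```

In this model a plain second restore is skipped and leaves the register clobbered, while the workaround path restores it on every reset.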
  2. 18 Dec, 2009 32 commits