1. 14 Sep, 2009 5 commits
    • [IA64] kdump: Mask INIT first in panic-kdump path · 1726b088
      Hidetoshi Seto authored
      Summary:
      
        Asserting INIT might block kdump if the system is already going to
        start kdump via panic.
      
      Description:
      
        INIT can interrupt anywhere in the panic path, so it can interrupt
        in the middle of a kdump kicked off by panic.  There is therefore a
        race if kdump is kicked off concurrently via panic and via INIT.
      
        INIT could fail to invoke kdump if the system is already going to
        start kdump via panic.  It cannot restart kdump from the INIT handler
        if some of the cpus are already playing dead with INIT masked.  It
        also means that INIT could block kdump's progress if no monarch
        enters the INIT rendezvous.
      
        Panic+INIT is a rare but possible situation, since the kernel or an
        internal agent may decide to panic the unstable system while an
        external agent decides to send an INIT to the system at the same
        time.
      
      How to reproduce:
      
        Assert INIT just after panic, before all other cpus have frozen
      
      Expected results:
      
        continue kdump invoked by panic, or restart kdump from INIT
      
      Actual results:
      
        might hang, crashdump not retrieved
      
      Proposed Fix:
      
        This patch masks INIT first in the panic path to take the initiative
        on kdump, and reuses the atomic value kdump_in_progress to make sure
        there is only one initiator of kdump.  All INITs asserted later are
        used only for freezing the other cpus.
      
        This mask is removed soon afterwards by the rfi in relocate_kernel.S,
        before the jump into the kdump kernel, after all cpus are frozen and
        the no-op INIT handler is registered.  So if an INIT arrives while it
        is masked, it will pend on the system, be received just after the
        rfi, and be handled by the no-op handler.
      
        If an MCA event occurs while psr.mc is 1, in theory the event will
        pend on the system and be received just after the rfi, as above.
        The MCA handler is already unregistered at that time, so the received
        MCA will not reach OS_MCA and will result in a warmboot by SAL.
      
        Note that the code in this masked interval is simpler than the code
        in the MCA/INIT handlers, which also execute with the mask set.  So
        the probability of an error in this interval should be no higher
        than in the MCA/INIT handlers.
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Haren Myneni <hbabu@us.ibm.com>
      Cc: kexec@lists.infradead.org
      Acked-by: Fenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
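
The "only one initiator" rule in commit 1726b088 can be illustrated with a small user-space sketch. It is not the kernel code: it uses C11 atomics in place of the kernel's atomic_t, and the function and enum names (try_become_kdump_initiator, on_panic_or_init, START_KDUMP, FREEZE_CPU) are hypothetical. Only the counter name kdump_in_progress comes from the commit message.

```c
#include <assert.h>
#include <stdatomic.h>

/* User-space stand-in for the kernel's kdump_in_progress counter. */
static atomic_int kdump_in_progress;

enum cpu_action { START_KDUMP, FREEZE_CPU };

/*
 * Returns 1 for the first caller only. atomic_fetch_add() returns the
 * value held before the increment, so exactly one caller observes 0.
 */
static int try_become_kdump_initiator(void)
{
    return atomic_fetch_add(&kdump_in_progress, 1) == 0;
}

/* Both the panic path and the INIT path would funnel through this. */
static enum cpu_action on_panic_or_init(void)
{
    return try_become_kdump_initiator() ? START_KDUMP : FREEZE_CPU;
}

/* Panic arrives first, then an INIT: only the panic path starts kdump. */
static void example_race(void)
{
    atomic_store(&kdump_in_progress, 0);
    assert(on_panic_or_init() == START_KDUMP); /* panic path wins */
    assert(on_panic_or_init() == FREEZE_CPU);  /* later INIT only freezes */
}
```

Because the increment and the read are a single atomic operation, the outcome does not depend on which path reaches the counter first; whichever does becomes the initiator.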
    • [IA64] kdump: Don't return APs to SAL from kdump · 68cb14c7
      Hidetoshi Seto authored
      Summary:
      
        Asserting INIT on a cpu that is going offline results in unexpected
        behavior.  This is a real problem for kdump, where INIT might be
        asserted on unstable APs that are going offline by returning to SAL.
      
      Description:
      
        Since psr.mc is cleared when the bits in psr are set to
        SAL_PSR_BITS_TO_SET in ia64_jump_to_sal(), there is a small window
        (~few msecs) in which the cpu can receive INIT even if the cpu
        entered there via the INIT handler.  In this window we restore
        registers for SAL, so an INIT asserted here will not work properly.
      
        It is hard to remove this window by masking INIT (i.e. setting psr.mc)
        because we would have to unmask it later in the OS: the OS_BOOT_RENDEZ
        to SAL return convention requires a branch instruction (br.ret, not
        rfi) to return to SAL.
      
        I suppose this window will not be a real problem on cpu offline if we
        can educate people not to push the INIT button during a hotplug
        operation.  The only exception is a race between kdump and INIT.
        Currently kdump returns APs to SAL before processing the dump, but the
        kernel might receive INIT at that point in time.  Such an INIT might
        be asserted by kdump itself, if an AP does not react to the IPI soon
        enough and kdump decides to use INIT to stop the AP.  Or it might be
        asserted by an operator or an external agent to start a dump on the
        unstable system.
      
        Such panic+INIT or INIT+INIT cases should be rare, but it would be
        good to retrieve the crashdump even in such cases.
      
      How to reproduce:
      
        panic+INIT or INIT+INIT, with kdump configured
      
      Expected results:
      
        crashdump is retrieved anyway
      
      Actual results:
      
        panic, hang etc. (unexpected)
      
      Proposed fix:
      
        To avoid the window on the way to SAL, this patch stops returning APs
        to SAL in case of kdump.  In other words, this patch makes APs spin
        in the OS instead of spinning in SAL.
      
        (* Note: What is the impact?  If a cpu is spinning in SAL,
         the cpu is in the BOOT_RENDEZ loop, the same as an offlined cpu.
         In theory, if an INIT is asserted there, cpus in the BOOT_RENDEZ loop
         should not invoke OS_INIT.  So either way, no matter where the cpu
         is actually spinning, once a cpu starts to spin and acts as
         "frozen," INIT on that cpu has no effect.
         From another point of view, all debug information on the cpu should
         have been stored to memory before the cpu is frozen.  So no further
         action on the cpu is required.)
      
        I confirmed that kdump sometimes hangs with concurrent INITs (another
        INIT after an INIT), and that it does not hang after applying this
        patch.
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Haren Myneni <hbabu@us.ibm.com>
      Cc: kexec@lists.infradead.org
      Acked-by: Fenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • [IA64] kexec: Unregister MCA handler before kexec · 6cc3efcd
      Hidetoshi Seto authored
      Summary:
      
        An MCA at the beginning of the kdump/kexec kernel results in
        unexpected behavior, because the MCA handler of the previous kernel
        is invoked on the kdump kernel.
      
      Description:
      
        Once a cpu is passed to the new kernel, no resources of the previous
        kernel should be used from that cpu.  The resources of the MCA handler
        are no exception.  So we cannot handle MCAs and their machine check
        errors during the kernel transition, until a new handler for the new
        kernel is registered with new resources ready for handling the MCA.
      
      How to reproduce:
      
        Assert MCA while kdump kernel is booting, before new MCA handler for
        kdump kernel is registered.
      
      Expected(Desirable) results:
      
        No recovery, cancel kdump and reboot the system.
      
      Actual results:
      
        MCA handler for previous kernel is invoked on the kdump kernel.
        => panic, hang etc. (unexpected)
      
      Proposed fix:
      
        To avoid entering the MCA handler from the early stage of the new
        kernel, unregister the entry point from SAL before leaving the
        current kernel.  SAL will then turn every MCA into a safe warmboot,
        without invoking OS_MCA.
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Haren Myneni <hbabu@us.ibm.com>
      Cc: kexec@lists.infradead.org
      Acked-by: Fenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
    • [IA64] kexec: Make INIT safe while transition to kdump/kexec kernel · 07a6a4ae
      Hidetoshi Seto authored
      
      Summary:
      
        Asserting INIT at the beginning of the kdump/kexec kernel results
        in unexpected behavior, because the INIT handler of the previous
        kernel is invoked on the new kernel.
      
      Description:
      
        In a panic situation, we can receive INIT during the kernel
        transition, i.e. from the beginning of panic to the bootstrap of the
        kdump kernel.  Since we initialize registers on leaving the current
        kernel, the monarch/slave handlers of the current kernel, which run
        in virtual mode, can no longer be called safely.  (In fact the system
        hangs, as far as I confirmed.)
      
      How to Reproduce:
      
        Start kdump
          # echo c > /proc/sysrq-trigger
        Then assert INIT while kdump kernel is booting, before new INIT
        handler for kdump kernel is registered.
      
      Expected(Desirable) result:
      
        kdump kernel boots without any problem, crashdump retrieved
      
      Actual result:
      
        INIT handler for previous kernel is invoked on kdump kernel
        => panic, hang etc. (unexpected)
      
      Proposed fix:
      
        We could unregister these INIT handlers from SAL before jumping into
        the new kernel, but then INIT would fall back to the default
        behavior, which is a warmboot by SAL (according to the SAL
        specification), and we could not retrieve the crashdump.
      
        Therefore this patch introduces a NOP INIT handler and registers it
        with SAL before leaving the current kernel, so that kdump starts
        safely: INITs are prevented both from entering virtual mode and from
        resulting in a warmboot.
      
        On the other hand, a kexec that is not for kdump has the same
        problem with INIT during the kernel transition.
        This patch handles that case differently: for kexec, unregistering
        the handlers is preferred over registering the NOP handler, since
        "no handlers registered" is the usual state at a kernel's entry.
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Haren Myneni <hbabu@us.ibm.com>
      Cc: kexec@lists.infradead.org
      Acked-by: Fenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
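
The choice made in commit 07a6a4ae (a NOP handler for kdump, no handler for plain kexec) can be modelled roughly as follows. This is a toy model, not SAL or kernel code: the single function pointer standing in for the firmware's registered entry point, and every name in the block, are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum init_outcome { HANDLED_BY_OS, WARMBOOT_BY_SAL };

typedef void (*init_handler_t)(void);

/* Hypothetical model of the one INIT entry point the firmware remembers. */
static init_handler_t registered_init_handler;
static int nop_calls;

/* Stand-in for the old kernel's handler, which must not run any more. */
static void old_kernel_init_handler(void) { }

/* Does nothing except record that it ran. */
static void nop_init_handler(void) { nop_calls++; }

/* Called just before leaving the current kernel. */
static void prepare_transition(int is_kdump)
{
    assert(is_kdump == 0 || is_kdump == 1);
    /* kdump: swallow INITs; plain kexec: fall back to firmware default. */
    registered_init_handler = is_kdump ? nop_init_handler : NULL;
}

/* What the model firmware does when INIT is asserted. */
static enum init_outcome deliver_init(void)
{
    if (registered_init_handler == NULL)
        return WARMBOOT_BY_SAL;   /* default behavior with no handler */
    registered_init_handler();
    return HANDLED_BY_OS;
}
```

In the kdump case the INIT is absorbed and the dump can proceed; in the kexec case the machine reboots, which loses nothing because there is no crashdump to retrieve.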
    • [IA64] kdump: Mask MCA/INIT on frozen cpus · 4295ab34
      Hidetoshi Seto authored
      Summary:
      
        An INIT asserted on the kdump kernel invokes the INIT handler not
        only on the cpu that is running the kdump kernel, but also on the BSP
        of the panicked kernel, because the (badly) frozen BSP can be thawed
        by INIT.
      
      Description:
      
        kdump_cpu_freeze() is called on every cpu except the one that
        initiates panic and/or kdump, to stop/offline the cpu (on ia64, this
        means we pass control of the cpus to SAL, or put them in a spinloop).
        Note that CPU0 (BSP) always goes to the spinloop, so if the panic
        happened on an AP, there are at least 2 cpus (= the AP and the BSP)
        which do not go back to SAL.
      
        On the spinning cpus, interrupts are disabled (rsm psr.i), but INIT
        can still interrupt them, because psr.mc, which masks INIT, is not
        set unless kdump_cpu_freeze() is called from MCA/INIT context.
      
        Therefore, assume that a panic happened on an AP, kdump was invoked,
        new INIT handlers for the kdump kernel were registered, and then an
        INIT is asserted.  From the viewpoint of SAL there are 2 online cpus,
        so INIT will be delivered to both of them.  This likely means that
        not only the AP (= the cpu executing kdump) enters the newly
        registered INIT handler, but also the BSP (= the other cpu, spinning
        in the panicked kernel) enters the same INIT handler.  Of course the
        register settings on the BSP are still the old ones (for the panicked
        kernel), so what happens when the handler runs with the wrong
        settings is unpredictable.  I believe this is not desirable behavior.
      
      How to Reproduce:
      
        Start kdump on one of APs (e.g. cpu1)
          # taskset 0x2 echo c > /proc/sysrq-trigger
        Then assert INIT after kdump kernel is booted, after new INIT handler
        for kdump kernel is registered.
      
      Expected results:
      
        An INIT handler is invoked only on the AP.
      
      Actual results:
      
        An INIT handler is invoked on the AP and BSP.
      
      Sample of results:
      
        I got the following console log by asserting INIT after the prompt
        "root:/>".  It seems that two monarchs appeared from one INIT, and
        one of them panicked in the end.  It also seems that the panicked one
        assumed there were 4 online cpus and that none of them did
        rendezvous:
      
          :
          [  0 %]dropping to initramfs shell
          exiting this shell will reboot your system
          root:/> Entered OS INIT handler. PSP=fff301a0 cpu=0 monarch=0
          ia64_init_handler: Promoting cpu 0 to monarch.
          Delaying for 5 seconds...
          All OS INIT slaves have reached rendezvous
          Processes interrupted by INIT - 0 (cpu 0 task 0xa000000100af0000)
          :
          <<snip>>
          :
          Entered OS INIT handler. PSP=fff301a0 cpu=0 monarch=1
          Delaying for 5 seconds...
          mlogbuf_finish: printing switched to urgent mode, MCA/INIT might be dodgy or fail.
          OS INIT slave did not rendezvous on cpu 1 2 3
          INIT swapper 0[0]: bugcheck! 0 [1]
          :
          <<snip>>
          :
          Kernel panic - not syncing: Attempted to kill the idle task!
      
      Proposed fix:
      
        To avoid this problem, this patch inserts ia64_set_psr_mc() to mask
        INIT on cpus that are going to be frozen.  This masking has no effect
        if kdump_cpu_freeze() is called from the INIT handler when
        kdump_on_init == 1, because psr.mc is already set to 1 before
        entering OS_INIT.  I confirmed that weird logs like the one above
        disappear after applying this patch.
      Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Haren Myneni <hbabu@us.ibm.com>
      Cc: kexec@lists.infradead.org
      Acked-by: Fenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
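
The scenario in commit 4295ab34 (panic on an AP, BSP left spinning, then one INIT) can be reproduced in a very small model. The struct, its fields and both functions are hypothetical; psr_mc only mimics the effect the commit attributes to psr.mc, namely that an INIT is held pending instead of entering the handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one cpu as the firmware sees it. */
struct cpu_model {
    bool psr_mc;            /* true: MCA/INIT are masked on this cpu */
    bool init_pending;      /* an INIT arrived while masked */
    int  init_handler_runs; /* how often the INIT handler was entered */
};

/* Freeze a cpu; mask_init = true models the fix described above. */
static void freeze_cpu(struct cpu_model *cpu, bool mask_init)
{
    if (mask_init)
        cpu->psr_mc = true;
    /* a real cpu would now spin forever with interrupts disabled */
}

/* One INIT is delivered to every cpu that did not go back to firmware. */
static void deliver_init_to_all(struct cpu_model cpus[], int n)
{
    assert(n > 0);
    for (int i = 0; i < n; i++) {
        if (cpus[i].psr_mc)
            cpus[i].init_pending = true;   /* held back by the mask */
        else
            cpus[i].init_handler_runs++;   /* enters the INIT handler */
    }
}
```

With cpus[0] as the frozen BSP and cpus[1] as the AP running the kdump kernel, freezing without the mask lets both cpus enter the handler, which matches the "two monarchs" log; freezing with the mask leaves only the AP entering it.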
  2. 09 Sep, 2009 3 commits
    • Linux 2.6.31 · 74fca6a4
      Linus Torvalds authored
    • aoe: allocate unused request_queue for sysfs · 7135a71b
      Ed Cashin authored
      Andy Whitcroft reported an oops in aoe triggered by use of an
      incorrectly initialised request_queue object:
      
        [ 2645.959090] kobject '<NULL>' (ffff880059ca22c0): tried to add
      		an uninitialized object, something is seriously wrong.
        [ 2645.959104] Pid: 6, comm: events/0 Not tainted 2.6.31-5-generic #24-Ubuntu
        [ 2645.959107] Call Trace:
        [ 2645.959139] [<ffffffff8126ca2f>] kobject_add+0x5f/0x70
        [ 2645.959151] [<ffffffff8125b4ab>] blk_register_queue+0x8b/0xf0
        [ 2645.959155] [<ffffffff8126043f>] add_disk+0x8f/0x160
        [ 2645.959161] [<ffffffffa01673c4>] aoeblk_gdalloc+0x164/0x1c0 [aoe]
      
      The request queue of an aoe device is not used but can be allocated in
      code that does not sleep.
      
      Bruno bisected this regression down to
      
        cd43e26f
      
        block: Expose stacked device queues in sysfs
      
      "This seems to generate /sys/block/$device/queue and its contents for
       everyone who is using queues, not just for those queues that have a
       non-NULL queue->request_fn."
      
      Addresses http://bugs.launchpad.net/bugs/410198
      Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13942
      
      Note that embedding a queue inside another object has always been
      an illegal construct, since the queues are reference counted and
      must persist until the last reference is dropped. So aoe was
      always buggy in this respect (Jens).
      Signed-off-by: Ed Cashin <ecashin@coraid.com>
      Cc: Andy Whitcroft <apw@canonical.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Bruno Premont <bonbons@linux-vserver.org>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
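
The closing note in commit 7135a71b, that a reference-counted queue must not be embedded in another object, can be shown with a user-space analogy. This is not the block layer API: struct queue, struct device and all functions below are hypothetical, and plain malloc/free stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>

/* A reference-counted object that must outlive its last reference. */
struct queue { int refcount; };

static int queues_alive;   /* bookkeeping for the example only */

static struct queue *queue_alloc(void)
{
    struct queue *q = malloc(sizeof(*q));
    if (q) {
        q->refcount = 1;   /* the creator holds the first reference */
        queues_alive++;
    }
    return q;
}

static void queue_get(struct queue *q) { q->refcount++; }

static void queue_put(struct queue *q)
{
    assert(q->refcount > 0);
    if (--q->refcount == 0) {   /* last reference dropped */
        free(q);
        queues_alive--;
    }
}

/* The owner keeps a pointer; embedding would tie the two lifetimes. */
struct device { struct queue *queue; };

static struct device *device_alloc(void)
{
    struct device *d = malloc(sizeof(*d));
    if (!d)
        return NULL;
    d->queue = queue_alloc();
    if (!d->queue) {
        free(d);
        return NULL;
    }
    return d;
}

static void device_free(struct device *d)
{
    queue_put(d->queue);   /* drop only the device's own reference */
    free(d);
}
```

If the queue were a member of struct device, freeing the device would free the queue's memory while another holder (sysfs, in the commit's case) still had a reference to it.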
    • i915: disable interrupts before tearing down GEM state · e6890f6f
      Linus Torvalds authored
      Reinette Chatre reports a frozen system (with blinking keyboard LEDs)
      when switching from graphics mode to the text console, or when
      suspending (which does the same thing). With netconsole, the oops
      turned out to be
      
      	BUG: unable to handle kernel NULL pointer dereference at 0000000000000084
      	IP: [<ffffffffa03ecaab>] i915_driver_irq_handler+0x26b/0xd20 [i915]
      
      and it's due to the i915_gem.c code doing drm_irq_uninstall() after
      having done i915_gem_idle(). And the i915_gem_idle() path will do
      
        i915_gem_idle() ->
          i915_gem_cleanup_ringbuffer() ->
            i915_gem_cleanup_hws() ->
              dev_priv->hw_status_page = NULL;
      
      but if an i915 interrupt comes in after this stage, it may want to
      access that hw_status_page, and gets the above NULL pointer dereference.
      
      And since the NULL pointer dereference happens from within an interrupt,
      and with the screen still in graphics mode, the common end result is
      simply a silently hung machine.
      
      Fix it by simply uninstalling the irq handler before idling rather than
      after. Fixes
      
          http://bugzilla.kernel.org/show_bug.cgi?id=13819
      
      Reported-and-tested-by: Reinette Chatre <reinette.chatre@intel.com>
      Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
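
The ordering problem fixed in commit e6890f6f can be sketched as a toy model. Nothing here is driver code: struct gpu_model and the functions are hypothetical stand-ins for the steps named in the commit message (drm_irq_uninstall(), i915_gem_idle(), the irq handler), and a counter replaces the actual NULL pointer dereference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct gpu_model {
    bool irq_installed;
    int *hw_status_page;  /* read by the interrupt handler */
    int  null_derefs;     /* counts what would have been an oops */
};

static void irq_uninstall(struct gpu_model *g) { g->irq_installed = false; }

/* Idling tears down the status page, as in the call chain above. */
static void gem_idle(struct gpu_model *g) { g->hw_status_page = NULL; }

/* An interrupt that may arrive at any point during teardown. */
static void interrupt_arrives(struct gpu_model *g)
{
    assert(g != NULL);
    if (!g->irq_installed)
        return;                       /* no handler, nothing runs */
    if (g->hw_status_page == NULL) {
        g->null_derefs++;             /* the handler would crash here */
        return;
    }
    (void)*g->hw_status_page;         /* normal case: read the page */
}

/* Old order: idle first, leaving a window with a live handler. */
static void teardown_buggy(struct gpu_model *g)
{
    gem_idle(g);
    interrupt_arrives(g);
    irq_uninstall(g);
}

/* Fixed order: remove the handler before tearing down its state. */
static void teardown_fixed(struct gpu_model *g)
{
    irq_uninstall(g);
    interrupt_arrives(g);
    gem_idle(g);
}
```

The same interrupt arrives in the same place in both functions; only the fixed order guarantees that no handler is left to observe the torn-down state.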
  3. 08 Sep, 2009 1 commit
  4. 07 Sep, 2009 7 commits
  5. 06 Sep, 2009 1 commit
    • gianfar: Fix build. · d9d8e041
      David S. Miller authored
      Reported by Michael Guntsche <mike@it-loops.com>
      
      --------------------
      Commit
      38bddf04 gianfar: gfar_remove needs to call unregister_netdev()
      
      breaks the build of the gianfar driver because "dev" is undefined in
      this function. To quickly test rc9 I changed this to priv->ndev but I do
      not know if this is the correct one.
      --------------------
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 05 Sep, 2009 23 commits