1. 26 Nov, 2004 1 commit
    • [PATCH] ppc64: fix hang on legacy iSeries · 5bc58b21
      Paul Mackerras authored
      Recently we have uncovered a bug in the kernel exception exit path
      which can cause iSeries machines to hang with interrupts disabled,
      typically when unloading a module.  This patch fixes the bug and
      should go in 2.6.10.  Here is the detailed explanation:
      
      There are a couple of places in the exception exit path in entry.S
      where we disable interrupts and then later re-enable them.  We
      hard-disable interrupts rather than soft-disabling them, even on
      legacy iSeries, because the final part of the exception exit path
      needs interrupts hard-disabled: otherwise an incoming interrupt
      could trash SRR0 and SRR1 and cause us to lose state.
      
      The intention was that each path that hard-disabled interrupts would
      hard-enable them again, either explicitly or by executing an rfid
      instruction (return from interrupt, doubleword).  However, there was
      one path where we didn't correctly hard-enable interrupts.  This meant
      we could end up calling schedule() with interrupts hard-disabled and
      then switch to the stopmachine thread (used in removing a module),
      which spins polling a variable until another CPU changes it.  Since
      local_irq_enable() etc. on legacy iSeries only soft-enable interrupts,
      we got into the stopmachine thread with interrupts hard-disabled, and
      the machine hung at that point.
      
      This patch fixes it by making sure that when we go to re-enable
      interrupts, the MSR value we are loading up actually does have the
      MSR.EE (external interrupt enable) bit set.  Stephen Rothwell has
      verified that this actually does fix the bug on iSeries.  The bug
      also potentially exists on pSeries (and this patch fixes it), but
      there it doesn't really matter, because schedule() will enable
      interrupts (and on pSeries that means hard-enabling them), and because
      the hypervisor doesn't mind you having interrupts hard-disabled for
      extended periods on pSeries.  Note that all these comments about
      pSeries also apply to POWER5 iSeries (i5) machines.
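
      The failure and the fix described above can be illustrated with a
      small user-space model in C.  This is only a sketch, not the kernel
      code: struct cpu_model and every function name in it are
      hypothetical, and the real change lives in the assembly exit path
      in entry.S.  The one architectural fact it relies on is that
      MSR.EE is bit 15 of the machine state register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MSR.EE (external interrupt enable) is bit 15 of the PowerPC MSR. */
#define MSR_EE (UINT64_C(1) << 15)

/* Hypothetical model of one CPU; not a kernel data structure. */
struct cpu_model {
    uint64_t msr;       /* modelled MSR; only the EE bit matters here */
    bool soft_enabled;  /* modelled legacy iSeries soft-enable flag */
};

/* Hard-disable: clear MSR.EE so no external interrupt can be delivered. */
static void hard_disable(struct cpu_model *cpu)
{
    cpu->msr &= ~MSR_EE;
}

/* Models local_irq_enable() on legacy iSeries: soft-enable only. */
static void soft_enable(struct cpu_model *cpu)
{
    cpu->soft_enabled = true;
}

/* Buggy re-enable: loads the MSR value as given, whether EE is set or not. */
static void reenable_buggy(struct cpu_model *cpu, uint64_t msr_value)
{
    cpu->msr = msr_value;
}

/* Fixed re-enable: makes sure EE is set in the value being loaded. */
static void reenable_fixed(struct cpu_model *cpu, uint64_t msr_value)
{
    cpu->msr = msr_value | MSR_EE;
}

/* An interrupt can be taken only if both hard and soft enables are on. */
static bool interrupts_deliverable(const struct cpu_model *cpu)
{
    return (cpu->msr & MSR_EE) != 0 && cpu->soft_enabled;
}

/*
 * Walks the modelled exit path and reports whether the CPU ends up
 * unable to take interrupts even after a later soft enable.
 */
static bool stuck_after_exit(bool use_fix)
{
    struct cpu_model cpu = { .msr = MSR_EE, .soft_enabled = true };
    uint64_t msr_without_ee;

    hard_disable(&cpu);
    assert((cpu.msr & MSR_EE) == 0);
    msr_without_ee = cpu.msr;  /* value lacking EE, as on the faulty path */

    if (use_fix)
        reenable_fixed(&cpu, msr_without_ee);
    else
        reenable_buggy(&cpu, msr_without_ee);

    soft_enable(&cpu);  /* a later local_irq_enable() on legacy iSeries */
    return !interrupts_deliverable(&cpu);
}
```

      In this model stuck_after_exit(false) reports a CPU that stays
      hard-disabled despite the later soft enable, which is the state
      the stopmachine thread was entered in, while stuck_after_exit(true)
      does not.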
      
      While I was there I noticed that we were jumping to ret_from_except
      after calling do_IRQ on iSeries, rather than ret_from_except_lite,
      meaning that we will restore registers 14-31 twice, unnecessarily.  I
      changed it to jump to ret_from_except_lite instead, and Stephen
      checked that this change doesn't cause any breakage.
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  2. 25 Nov, 2004 15 commits
  3. 24 Nov, 2004 9 commits
  4. 23 Nov, 2004 9 commits
  5. 22 Nov, 2004 6 commits