1. 19 Jun, 2008 10 commits
    • x86, 32-bit: fix boot failure on TSC-less processors · df17b1d9
      Mikael Pettersson authored
      Booting 2.6.26-rc6 on my 486 DX/4 fails with a "BUG: Int 6"
      (invalid opcode) and a kernel halt immediately after the
      kernel has been uncompressed. The BUG shows EIP pointing
      to an rdtsc instruction in native_read_tsc(), invoked from
      native_sched_clock().
      
      (This error occurs so early that not even the serial console
      can capture it.)
      
      A bisection showed that this bug first occurs in 2.6.26-rc3-git7,
      via commit 9ccc906c:
      
      >x86: distangle user disabled TSC from unstable
      >
      >tsc_enabled is set to 0 from the command line switch "notsc" and from
      >the mark_tsc_unstable code. Seperate those functionalities and replace
      >tsc_enable with tsc_disable. This makes also the native_sched_clock()
      >decision when to use TSC understandable.
      >
      >Preparatory patch to solve the sched_clock() issue on 32 bit.
      >
      >Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      
      The core reason for this bug is that native_sched_clock() gets
      called before tsc_init().
      
      Before the commit above, tsc_32.c used a "tsc_enabled" variable
      which defaulted to 0 == disabled, and which only got enabled late
      in tsc_init(). Thus early calls to native_sched_clock() would skip
      the TSC and use jiffies instead.
      
      After the commit above, tsc_32.c uses a "tsc_disabled" variable
      which defaults to 0, meaning that the TSC is Ok to use. Early calls
      to native_sched_clock() now erroneously try to use the TSC on
      !cpu_has_tsc processors, leading to invalid opcode exceptions.
      
      My proposed fix is to initialise tsc_disabled to a "soft disabled"
      state distinct from the hard disabled state set up by the "notsc"
      kernel option. This fixes the native_sched_clock() problem. It also
      allows tsc_init() to be simplified: instead of setting tsc_disabled = 1
      on every error return, we just set tsc_disabled = 0 once when all
      checks have succeeded.
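      
      The tri-state idea described above can be sketched roughly as follows.
      This is a minimal user-space model, not the actual patch: the enum
      names, the numeric values, and the *_sketch functions are assumptions
      made for illustration, and the two clock inputs stand in for the
      jiffies-based and TSC-based time sources.
      
      ```c
      #include <stdbool.h>
      #include <stdint.h>
      
      /* Hypothetical state values; the real patch may encode them differently. */
      enum tsc_state_sketch {
          TSC_USABLE        = 0,   /* tsc_init() passed every check */
          TSC_HARD_DISABLED = 1,   /* user booted with "notsc" */
          TSC_SOFT_DISABLED = -1   /* default until tsc_init() succeeds */
      };
      
      /* Starts out soft disabled, so early callers never touch the TSC. */
      int tsc_disabled = TSC_SOFT_DISABLED;
      
      /* Models the "notsc" command line handler. */
      void notsc_setup_sketch(void)
      {
          tsc_disabled = TSC_HARD_DISABLED;
      }
      
      /*
       * Models tsc_init(): error paths simply return and leave the state
       * alone; only the single success path enables the TSC.
       */
      bool tsc_init_sketch(bool cpu_has_tsc, bool calibration_ok)
      {
          if (!cpu_has_tsc || tsc_disabled == TSC_HARD_DISABLED)
              return false;
          if (!calibration_ok)
              return false;
          tsc_disabled = TSC_USABLE;
          return true;
      }
      
      /* Models native_sched_clock(): any nonzero state falls back to jiffies. */
      uint64_t sched_clock_sketch(uint64_t jiffies_ns, uint64_t tsc_ns)
      {
          return tsc_disabled ? jiffies_ns : tsc_ns;
      }
      ```
      
      In this model a call to sched_clock_sketch() made before
      tsc_init_sketch() returns the jiffies value, which corresponds to the
      behaviour the 486 needs.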
      
      I've verified that this lets my 486 boot again. I've also verified
      that a Core2 machine still uses the TSC as clocksource after the patch.
      Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: fix NULL pointer deref in __switch_to · 75118a82
      Suresh Siddha authored
      Patrick McHardy reported a crash:
      
      > > I get this oops once a day, its apparently triggered by something
      > > run by cron, but the process is a different one each time.
      > >
      > > Kernel is -git from yesterday shortly before the -rc6 release
      > > (last commit is the usb-2.6 merge, the x86 patches are missing),
      > > .config is attached.
      > >
      > > I'll retry with current -git, but the patches that have gone in
      > > since I last updated don't look related.
      > >
      > > [62060.043009] BUG: unable to handle kernel NULL pointer dereference at
      > > 000001ff
      > > [62060.043009] IP: [<c0102a9b>] __switch_to+0x2f/0x118
      > > [62060.043009] *pde = 00000000
      > > [62060.043009] Oops: 0002 [#1] PREEMPT
      
      Vegard Nossum analyzed it:
      
      > This decodes to
      >
      >    0:   0f ae 00                fxsave (%eax)
      >
      > so it's related to the floating-point context. This is the exact
      > location of the crash:
      >
      > $ addr2line -e arch/x86/kernel/process_32.o -i ab0
      > include/asm/i387.h:232
      > include/asm/i387.h:262
      > arch/x86/kernel/process_32.c:595
      >
      > ...so it looks like prev_task->thread.xstate->fxsave has become NULL.
      > Or maybe it never had any other value.
      
      Somehow (as described below) TS_USEDFPU is set but the fpu is not
      allocated or freed.
      
      This is another possible FPU preemption issue with the sleazy FPU
      optimization. It was benign before, but no longer is now that the
      dynamic FPU allocation patch is in.
      
      A new task is being exec'd and is preempted at the point marked below.
      
      flush_thread() {
      	...
      	/*
      	* Forget coprocessor state..
      	*/
      	clear_fpu(tsk);
      		<----- Preemption point
      	clear_used_math();
      	...
      }
      
      Now when it context switches in again, as the used_math() is still set
      and fpu_counter can be > 5, we will do a math_state_restore() which sets
      the task's TS_USEDFPU. After it continues from the above preemption point
      it does clear_used_math() and much later free_thread_xstate().
      
      Now, at the next context switch, it is quite possible that xstate is
      null, used_math() is not set and TS_USEDFPU is still set. This will
      trigger unlazy_fpu() causing kernel oops.
      
      Fix this by clearing tsk's fpu_counter before clearing the task's fpu.
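      
      The race and the effect of the reordering can be modelled in user
      space as below. This is only a sketch under stated assumptions: the
      struct, its fields, and every function name are hypothetical stand-ins
      for the kernel state named above, and the preemption is simulated by
      calling the switch-in path directly at the marked point.
      
      ```c
      #include <stdbool.h>
      #include <stddef.h>
      
      /* Hypothetical stand-in for the per-task FPU bookkeeping. */
      struct task_sketch {
          bool used_math;        /* models used_math() */
          bool ts_usedfpu;       /* models TS_USEDFPU */
          unsigned fpu_counter;  /* consecutive FPU-using time slices */
          void *xstate;          /* models thread.xstate */
      };
      
      /* Switch-in path: eager restore when the task has been using the FPU. */
      void switch_in_sketch(struct task_sketch *t)
      {
          if (t->used_math && t->fpu_counter > 5)
              t->ts_usedfpu = true;          /* math_state_restore() */
      }
      
      /* Models flush_thread() with a preemption at the marked point. */
      void flush_thread_sketch(struct task_sketch *t, bool with_fix)
      {
          if (with_fix)
              t->fpu_counter = 0;            /* the fix: clear the counter first */
          t->ts_usedfpu = false;             /* clear_fpu(tsk) */
          switch_in_sketch(t);               /* preempted, then switched back in */
          t->used_math = false;              /* clear_used_math() */
          t->xstate = NULL;                  /* free_thread_xstate(), much later */
      }
      
      /* True if the next switch-out would fxsave through a NULL xstate. */
      bool would_oops_sketch(const struct task_sketch *t)
      {
          return t->ts_usedfpu && t->xstate == NULL;
      }
      ```
      
      With with_fix set, the simulated switch-in sees fpu_counter == 0 and
      skips the restore, so TS_USEDFPU stays clear when xstate is freed.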
      Reported-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: set PAE PHYSICAL_MASK_SHIFT to 44 bits. · ad524d46
      Jeremy Fitzhardinge authored
      When a 64-bit x86 processor runs in 32-bit PAE mode, a pte can
      potentially have the same number of physical address bits as the
      64-bit host ("Enhanced Legacy PAE Paging").  This means, in theory,
      we could have up to 52 bits of physical address in a pte.
      
      The 32-bit kernel uses a 32-bit unsigned long to represent a pfn.
      This means that it can only represent physical addresses up to 32+12=44
      bits wide.  Rather than widening pfns everywhere, just set 2^44 as the
      Linux x86_32-PAE architectural limit for physical address size.
      
      This is a bugfix for two cases:
      1. running a 32-bit PAE kernel on a machine with
        more than 64GB RAM.
      2. running a 32-bit PAE Xen guest on a host machine with
        more than 64GB RAM.
      
      In both cases, a pte could need more than 36 bits of physical
      address, and masking it to 36 bits will cause fairly severe havoc.
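      
      The arithmetic behind the 36-bit problem and the 44-bit limit can be
      checked with a small user-space sketch. The macro and function names
      below are hypothetical and only mirror the idea of a physical mask
      shift; they are not the kernel's definitions.
      
      ```c
      #include <stdint.h>
      
      #define PAGE_SHIFT_SKETCH 12
      
      /* Mask covering the low `shift` bits of a physical address. */
      #define PHYS_MASK_SKETCH(shift) ((((uint64_t)1) << (shift)) - 1)
      
      /* Extract the physical page address from a 64-bit PAE pte. */
      uint64_t pte_to_phys_sketch(uint64_t pte, unsigned shift)
      {
          uint64_t page_mask = ~PHYS_MASK_SKETCH(PAGE_SHIFT_SKETCH);
          return pte & PHYS_MASK_SKETCH(shift) & page_mask;
      }
      
      /* A pfn is stored in a 32-bit unsigned long on a 32-bit kernel. */
      uint32_t phys_to_pfn_sketch(uint64_t phys)
      {
          return (uint32_t)(phys >> PAGE_SHIFT_SKETCH);
      }
      ```
      
      For a page at the 128GB mark (bit 37 set), a 36-bit mask drops the
      address to 0, while a 44-bit mask preserves it, and the largest 44-bit
      page address still yields a pfn that fits in 32 bits.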
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Jan Beulich <jbeulich@novell.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • agp: brown paper bag patch - put back two lines that got lost · 9bedbcb2
      Dave Airlie authored
      Commit 62c96b9d ("agp/intel: cleanup some serious whitespace badness")
      didn't just fix whitespace.  It also lost two lines.
      
      Noticed by Linus. No more whitespace diffs for me.
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6 · 3506ba7b
      Linus Torvalds authored
      * 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
        agp/intel: cleanup some serious whitespace badness
        [AGP] intel_agp: Add support for Intel 4 series chipsets
        [AGP] intel_agp: extra stolen mem size available for IGD_GM chipset
        agp: more boolean conversions.
        drivers/char/agp - use bool
        agp: two-stage page destruction issue
        agp/via: fixup pci ids
    • agp/intel: cleanup some serious whitespace badness · 62c96b9d
      Dave Airlie authored
    • [AGP] intel_agp: Add support for Intel 4 series chipsets
      Zhenyu Wang authored
    • [AGP] intel_agp: extra stolen mem size available for IGD_GM chipset · 598d1448
      Zhenyu Wang authored
      This adds the missing stolen memory size detection for IGD_GM, so
      that the kernel detects the same size that the current X intel
      driver (2.3.2) already works out.
      Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • agp: more boolean conversions. · 9516b030
      Dave Airlie authored
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • drivers/char/agp - use bool · c7258012
      Joe Perches authored
      Use boolean in AGP instead of having own TRUE/FALSE
      
      --
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
  2. 18 Jun, 2008 30 commits