1. 26 Feb, 2008 24 commits
    • Thomas Gleixner's avatar
      x86: no robust/pi futex for real i386 CPUs · f18edc95
      Thomas Gleixner authored
      Real i386 CPUs do not have cmpxchg instructions. Catch it before
      crashing on an invalid opcode.
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      f18edc95
    • Mikael Pettersson's avatar
      x86: fix boot failure on 486 due to TSC breakage · 12c247a6
      Mikael Pettersson authored
       > Diffing dmesg between git7 and git8 doesn't sched any light since
       > git8 also removed the printouts of the x86 caps as they were being
       > initialised and updated. I'm currently adding those printouts back
       > in the hope of seeing where and when the caps get broken.
      
      That turned out to be very illuminating:
      
       --- dmesg-2.6.24-git7	2008-02-24 18:01:25.295851000 +0100
       +++ dmesg-2.6.24-git8	2008-02-24 18:01:25.530358000 +0100
       ...
       CPU: After generic identify, caps: 00000003 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      
       CPU: After all inits, caps: 00000003 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      +CPU: After applying cleared_cpu_caps, caps: 00000013 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      
      Notice how the TSC cap bit goes from Off to On.
      
      (The first two lines are printout loops from -git7 forward-ported
      to -git8, the third line is the same printout loop added just after
      the xor-with-cleared_cpu_caps[] loop.)
      
      Here's how the breakage occurs:
      1. arch/x86/kernel/tsc_32.c:tsc_init() sees !cpu_has_tsc,
         so bails and calls setup_clear_cpu_cap(X86_FEATURE_TSC).
      2. include/asm-x86/cpufeature.h:setup_clear_cpu_cap(bit) clears
         the bit in boot_cpu_data and sets it in cleared_cpu_caps
      3. arch/x86/kernel/cpu/common.c:identify_cpu() XORs all caps
         in with cleared_cpu_caps
         HOWEVER, at this point c->x86_capability correctly has TSC
         Off, cleared_cpu_caps has TSC On, so the XOR incorrectly
         sets TSC to On in c->x86_capability, with disastrous results.
      
      The real bug is that clearing bits with XOR only works if the
      bits are known to be 1 prior to the XOR, and that's not true here.
      
      A simple fix is to convert the XOR to AND-NOT instead. The following
      patch does that, and allows my 486 to boot 2.6.25-rc kernels again.
      
      [ mingo@elte.hu: fixed a similar bug in setup_64.c as well. ]
      
      The breakage was introduced via commit 7d851c8d.
      Signed-off-by: default avatarMikael Pettersson <mikpe@it.uu.se>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      12c247a6
    • Priit Laes's avatar
      x86: fix build on non-C locales. · 03994f01
      Priit Laes authored
      For some locales regex range [a-zA-Z] does not work as it is supposed to.
      so we have to use [:alnum:] and [:xdigit:] to make it work as intended.
      
      [1] http://en.wikipedia.org/wiki/Estonian_alphabetSigned-off-by: default avatarIngo Molnar <mingo@elte.hu>
      03994f01
    • Glauber Costa's avatar
      x86: make c_idle.work have a static address. · 2b775a27
      Glauber Costa authored
      Currently, c_idle is declared in the stack, and thus, have no static address.
      
      Peter Zijlstra points out this simple solution, in which c_idle.work
      is initializated separatedly. Note that the INIT_WORK macro has a static
      declaration of a key inside.
      Signed-off-by: default avatarGlauber Costa <gcosta@redhat.com>
      Acked-by: default avatarPeter Zijlstra <pzijlstr@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      2b775a27
    • Vegard Nossum's avatar
      x86: don't save unreliable stack trace entries · 1650743c
      Vegard Nossum authored
      Currently, there is no way for print_stack_trace() to determine whether
      a given stack trace entry was deemed reliable or not, simply because
      save_stack_trace() does not record this information. (Perhaps needless
      to say, this makes the saved stack traces A LOT harder to read, and
      probably with no other benefits, since debugging features that use
      save_stack_trace() most likely also require frame pointers, etc.)
      
      This patch reverts to the old behaviour of only recording the reliable trace
      entries for saved stack traces.
      Signed-off-by: default avatarVegard Nossum <vegardno@ifi.uio.no>
      Acked-by: default avatarArjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      1650743c
    • Adrian Bunk's avatar
      x86: don't make swapper_pg_pmd global · ed2b7e2b
      Adrian Bunk authored
      There doesn't seem to be any reason for swapper_pg_pmd being global.
      Signed-off-by: default avatarAdrian Bunk <bunk@kernel.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      ed2b7e2b
    • Joerg Roedel's avatar
      x86: don't print a warning when MTRR are blank and running in KVM · 4147c874
      Joerg Roedel authored
      Inside a KVM virtual machine the MTRRs are usually blank. This confuses Linux
      and causes a warning message at boot. This patch removes that warning message
      when running Linux as a KVM guest.
      Signed-off-by: default avatarJoerg Roedel <joerg.roedel@amd.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      4147c874
    • Ingo Molnar's avatar
      x86: fix execve with -fstack-protect · 5d119b2c
      Ingo Molnar authored
      pointed out by pageexec@freemail.hu:
      
      > what happens here is that gcc treats the argument area as owned by the
      > callee, not the caller and is allowed to do certain tricks. for ssp it
      > will make a copy of the struct passed by value into the local variable
      > area and pass *its* address down, and it won't copy it back into the
      > original instance stored in the argument area.
      >
      > so once sys_execve returns, the pt_regs passed by value hasn't at all
      > changed and its default content will cause a nice double fault (FWIW,
      > this part took me the longest to debug, being down with cold didn't
      > help it either ;).
      
      To fix this we pass in pt_regs by pointer.
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      5d119b2c
    • Thomas Gleixner's avatar
      x86: fix vsyscall wreckage · ce28b986
      Thomas Gleixner authored
      based on a report from Arne Georg Gleditsch about user-space apps
      misbehaving after toggling /proc/sys/kernel/vsyscall64, a review
      of the code revealed that the "NOP patching" done there is
      fundamentally unsafe for a number of reasons:
      
      1) the patching code runs without synchronizing other CPUs
      
      2) it inserts NOPs even if there is no clock source which provides vread
      
      3) when the clock source changes to one without vread we run in
         exactly the same problem as in #2
      
      4) if nobody toggles the proc entry from 1 to 0 and to 1 again, then
         the syscall is not patched out
      
      as a result it is possible to break user-space via this patching.
      The only safe thing for now is to remove the patching.
      
      This code was broken since v2.6.21.
      Reported-by: default avatarArne Georg Gleditsch <arne.gleditsch@dolphinics.no>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      ce28b986
    • Ingo Molnar's avatar
      x86: rename KERNEL_TEXT_SIZE => KERNEL_IMAGE_SIZE · d4afe414
      Ingo Molnar authored
      The KERNEL_TEXT_SIZE constant was mis-named, as we not only map the kernel
      text but data, bss and init sections as well.
      
      That name led me on the wrong path with the KERNEL_TEXT_SIZE regression,
      because i knew how big of _text_ my images have and i knew about the 40 MB
      "text" limit so i wrongly thought to be on the safe side of the 40 MB limit
      with my 29 MB of text, while the total image size was slightly above 40 MB.
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      d4afe414
    • Ingo Molnar's avatar
      x86: fix spontaneous reboot with allyesconfig bzImage · 88f3aec7
      Ingo Molnar authored
      recently the 64-bit allyesconfig bzImage kernel started spontaneously
      rebooting during early bootup.
      
      after a few fun hours spent with early init debugging, it turns out
      that we've got this rather annoying limit on the size of the kernel
      image:
      
            #define KERNEL_TEXT_SIZE  (40*1024*1024)
      
      which limit my vmlinux just happened to pass:
      
             text           data       bss        dec       hex   filename
         29703744        4222751   8646224c   42572719   2899baf   vmlinux
      
      40 MB is 42572719 bytes, so my vmlinux was just 1.5% above this limit :-/
      
      So it happily crashed right in head_64.S, which - as we all know - is
      the most debuggable code in the whole architecture ;-)
      
      So increase the limit to allow an up to 128MB kernel image to be mapped.
      (should anyone be that crazy or lazy)
      
      We have a full 4K of pagetable (level2_kernel_pgt) allocated for these
      mappings already, so there's no RAM overhead and the limit was rather
      pointless and arbitrary.
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      88f3aec7
    • Yinghai Lu's avatar
      x86: remove double-checking empty zero pages debug · 3b57bc46
      Yinghai Lu authored
      so far no one complained about that.
      Signed-off-by: default avatarYinghai Lu <yinghai.lu@sun.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      3b57bc46
    • Pavel Machek's avatar
      x86: notsc is ignored on common configurations · 7265b6f1
      Pavel Machek authored
      notsc is ignored in 32-bit kernels if CONFIG_X86_TSC is on.. which is
      bad, fix it.
      Signed-off-by: default avatarPavel Machek <pavel@suse.cz>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      7265b6f1
    • Randy Dunlap's avatar
      x86/mtrr: fix kernel-doc missing notation · f5106d91
      Randy Dunlap authored
      Fix mtrr kernel-doc warning:
      Warning(linux-2.6.24-git12//arch/x86/kernel/cpu/mtrr/main.c:677): No description found for parameter 'end_pfn'
      Signed-off-by: default avatarRandy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      f5106d91
    • H. Peter Anvin's avatar
      x86: handle BIOSes which terminate e820 with CF=1 and no SMAP · 829157be
      H. Peter Anvin authored
      The proper way to terminate the e820 chain is with %ebx == 0 on the
      last legitimate memory block.  However, several BIOSes don't do that
      and instead return error (CF = 1) when trying to read off the end of
      the list.  For this error return, %eax doesn't necessarily return the
      SMAP signature -- correctly so, since %ah should contain an error code
      in this case.
      
      To deal with some particularly broken BIOSes, we clear the entire e820
      chain if the SMAP signature is missing in the middle, indicating a
      plain insane e820 implementation.  However, we need to make the test
      for CF = 1 before the SMAP check.
      
      This fixes at least one HP laptop (nc6400) for which none of the
      memory-probing methods (e820, e801, 88) functioned fully according to
      spec.
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      829157be
    • H. Peter Anvin's avatar
      x86: add comments for NOPs · 4cd20952
      H. Peter Anvin authored
      Add comments describing the various NOP sequences.
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      4cd20952
    • H. Peter Anvin's avatar
      x86: don't use P6_NOPs if compiling with CONFIG_X86_GENERIC · 959b3be6
      H. Peter Anvin authored
      P6_NOPs are definitely not supported on some VIA CPUs, and possibly
      (unverified) on AMD K7s.  It is also the only thing that prevents a
      686 kernel from running on Transmeta TM3x00/5x00 (Crusoe) series.
      
      The performance benefit over generic NOPs is very small, so when
      building for generic consumption, avoid using them.
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      959b3be6
    • H. Peter Anvin's avatar
      x86: require family >= 6 if we are using P6 NOPs · 7343b3b3
      H. Peter Anvin authored
      The P6 family of NOPs are only available on family >= 6 or above, so
      enforce that in the boot code.
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      7343b3b3
    • H. Peter Anvin's avatar
      x86: do not promote TM3x00/TM5x00 to i686-class · a7ef94e6
      H. Peter Anvin authored
      We have been promoting Transmeta TM3x00/TM5x00 chips to i686-class
      based on the notion that they contain all the user-space visible
      features of an i686-class chip.  However, this is not actually true:
      they lack the EA-taking long NOPs (0F 1F /0).  Since this is a
      userspace-visible incompatibility, downgrade these CPUs to the
      manufacturer-defined i586 level.
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      a7ef94e6
    • Pavel Machek's avatar
      x86: hpet fix docbook comment · b02a7f22
      Pavel Machek authored
      Signed-off-by: default avatarPavel Machek <Pavel@suse.cz>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      b02a7f22
    • Ingo Molnar's avatar
      x86: make DEBUG_PAGEALLOC and CPA more robust · 92cb54a3
      Ingo Molnar authored
      Use PF_MEMALLOC to prevent recursive calls in the DBEUG_PAGEALLOC
      case. This makes the code simpler and more robust against allocation
      failures.
      
      This fixes the following fallback to non-mmconfig:
      
         http://lkml.org/lkml/2008/2/20/551
         http://bugzilla.kernel.org/show_bug.cgi?id=10083
      
      Also, for DEBUG_PAGEALLOC=n reduce the pool size to one page.
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      92cb54a3
    • Ahmed S. Darwish's avatar
      x86/lguest: fix pgdir pmd index calculation · 1ce70c4f
      Ahmed S. Darwish authored
      Hi all,
      
      Beginning from commits close to v2.6.25-rc2, running lguest always oopses
      the host kernel. Oops is at [1].
      
      Bisection led to the following commit:
      
      commit 37cc8d7f
      
          x86/early_ioremap: don't assume we're using swapper_pg_dir
      
          At the early stages of boot, before the kernel pagetable has been
          fully initialized, a Xen kernel will still be running off the
          Xen-provided pagetables rather than swapper_pg_dir[].  Therefore,
          readback cr3 to determine the base of the pagetable rather than
          assuming swapper_pg_dir[].
      
       static inline pmd_t * __init early_ioremap_pmd(unsigned long addr)
       {
      -	pgd_t *pgd = &swapper_pg_dir[pgd_index(addr)];
      +	/* Don't assume we're using swapper_pg_dir at this point */
      +	pgd_t *base = __va(read_cr3());
      +	pgd_t *pgd = &base[pgd_index(addr)];
       	pud_t *pud = pud_offset(pgd, addr);
       	pmd_t *pmd = pmd_offset(pud, addr);
      
      Trying to analyze the problem, it seems on the guest side of lguest,
      %cr3 has a different value from &swapper_pg-dir (which
      is AFAIK fine on a pravirt guest):
      
      Putting some debugging messages in early_ioremap_pmd:
      
      /* Appears 3 times */
      [    0.000000] ***************************
      [    0.000000] __va(%cr3) = c0000000, &swapper_pg_dir = c02cc000
      [    0.000000] ***************************
      
      After 8 hours of debugging and staring on lguest code, I noticed something
      strange in paravirt_ops->set_pmd hypercall invocation:
      
      static void lguest_set_pmd(pmd_t *pmdp, pmd_t pmdval)
      {
      	*pmdp = pmdval;
      	lazy_hcall(LHCALL_SET_PMD, __pa(pmdp)&PAGE_MASK,
      		   (__pa(pmdp)&(PAGE_SIZE-1))/4, 0);
      }
      
      The first hcall parameter is global pgdir which looks fine. The second
      parameter is the pmd index in the pgdir which is suspectful.
      
      AFAIK, calculating the index of pmd does not need a divisoin over four.
      Removing the division made lguest work fine again . Patch is at [2].
      
      I am not sure why the division over four existed in the first place. It
      seems bogus, maybe the Xen patch just made the problem appear ?
      
      [2]: The patch:
      
      [PATCH] lguest: fix pgdir pmd index cacluation
      
      Remove an error in index calculation which leads to removing
      a not existing shadow page table (leading to a Null dereference).
      Signed-off-by: default avatarAhmed S. Darwish <darwish.07@gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      1ce70c4f
    • Tony Breeds's avatar
      lguest: fix build breakage · db342d21
      Tony Breeds authored
      [ mingo@elte.hu: merged to Rusty's patch ]
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      db342d21
    • Harvey Harrison's avatar
      lguest: include function prototypes · cbc34973
      Harvey Harrison authored
      Added a declaration to asm-x86/lguest.h and moved the extern arrays there
      as well.  As an alternative to including asm/lguest.h directly, an
      include could be put in linux/lguest.h
      Signed-off-by: default avatarHarvey Harrison <harvey.harrison@gmail.com>
      Cc: "rusty@rustcorp.com.au" <rusty@rustcorp.com.au>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      cbc34973
  2. 24 Feb, 2008 16 commits