1. 20 May, 2010 11 commits
    • ACPI, APEI, Use ERST for persistent storage of MCE · 482908b4
      Huang Ying authored
      Traditionally, a fatal MCE causes Linux to print the error log to the
      console and then reboot. Because the MCE registers preserve their
      content across a warm reboot, the hardware error can be logged to disk
      or network after the reboot. But the system may fail to warm reboot,
      in which case the hardware error log may be lost. ERST can help here.
      By saving the hardware error log to flash via ERST before panicking,
      the log can be retrieved from flash after the system next boots
      successfully.
      
      The fatal MCE processing procedure with ERST involved is as follows:
      
      - Hardware detects an error, MCE is raised
      - MCE handler reads MCE registers, checks error severity (fatal),
        prepares error record
      - MCE handler writes the MCE error record into flash via ERST
      - Kernel panics, then triggers a system reboot
      - System reboots, /sbin/mcelog runs; it reads /dev/mcelog to check flash
        for error records of the previous boot via ERST, and outputs and
        clears them if available
      - /sbin/mcelog logs the error records to disk or network
      
      ERST only accepts the CPER record format, but no pre-defined CPER
      section can accommodate all the information in struct mce, so a
      customized section type is defined to hold struct mce inside a CPER
      record as an error section.
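
      The procedure above can be sketched as a small user-space simulation.
      This is only an illustration under stated assumptions, not the kernel
      implementation: FlashStore, wrap_in_cper, on_fatal_mce, after_reboot
      and the section-type tag are hypothetical names, and the register
      values are made-up examples.

```python
import copy

# Hypothetical section-type tag; the real patch defines its own identifier.
MCE_SECTION_TYPE = "custom-mce-section"


class FlashStore:
    """Stands in for ERST-backed flash; it survives the simulated reboot."""

    def __init__(self):
        self._records = []

    def write(self, record):
        # Store a copy so later changes by the caller do not alter the log.
        self._records.append(copy.deepcopy(record))

    def read_and_clear(self):
        # Return all stored records and leave the store empty.
        records, self._records = self._records, []
        return records


def wrap_in_cper(mce):
    """Wrap the struct mce contents as one section of a CPER-like record."""
    return {"sections": [{"type": MCE_SECTION_TYPE, "data": dict(mce)}]}


def on_fatal_mce(flash, mce):
    """Before panic: persist the error record while the machine still runs."""
    flash.write(wrap_in_cper(mce))


def after_reboot(flash):
    """After reboot: fetch the previous boot's records and clear them."""
    return [section["data"]
            for record in flash.read_and_clear()
            for section in record["sections"]
            if section["type"] == MCE_SECTION_TYPE]


flash = FlashStore()
on_fatal_mce(flash, {"bank": 4, "status": 0xB200000000000000})  # example values
recovered = after_reboot(flash)
print(recovered)
```

      Reading clears the store, so a second call to after_reboot() returns
      nothing, mirroring the "output and clear" step in the list.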
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
    • ACPI, APEI, Error Record Serialization Table (ERST) support · a08f82d0
      Huang Ying authored
      ERST is a way provided by APEI to save and retrieve hardware error
      records to and from some simple persistent storage (such as flash).
      
      The Linux kernel support implementation is quite simple and workable
      in NMI context. So it can be used to save hardware error records into
      flash in a hardware error exception or NMI handler, where other more
      complex persistent storage such as disk is not usable. After saving
      hardware error records via ERST in a hardware error exception or NMI
      handler, the error records can be retrieved and logged to disk or
      network after a clean reboot.
      
      For more information about ERST, please refer to ACPI Specification
      version 4.0, section 17.4.
      
      This patch incorporates fixes from Jin Dongming.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
    • ACPI, APEI, Generic Hardware Error Source memory error support · d334a491
      Huang Ying authored
      Generic Hardware Error Source provides a way to report platform
      hardware errors (such as those from the chipset). It works in so-called
      "Firmware First" mode, that is, hardware errors are reported to
      firmware first, then reported to Linux by firmware. This way, some
      non-standard hardware error registers or non-standard hardware links
      can be checked by firmware to produce more valuable hardware error
      information for Linux.
      
      Currently, only the SCI notification type and memory errors are
      supported. More notification types and hardware error types will be
      added later. These memory errors are reported to user space through
      /dev/mcelog by faking a corrected Machine Check, so that the erroneous
      memory page can be offlined by /sbin/mcelog if the error count for one
      page exceeds the threshold.
      
      On some machines, Machine Check cannot report the physical address for
      some corrected memory errors, but GHES can. So this simplified GHES is
      implemented first.
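
      The per-page threshold idea mentioned above can be sketched as
      follows. This is a rough illustration only: PageErrorTracker is a
      hypothetical name, the 4 KiB page size and the threshold value are
      assumptions, and the actual policy in /sbin/mcelog may differ.

```python
from collections import Counter

PAGE_SHIFT = 12  # assumption: 4 KiB pages


class PageErrorTracker:
    """Counts corrected memory errors per page and flags pages to offline."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.counts = Counter()
        self.offlined = set()

    def report(self, phys_addr):
        """Record one corrected error; return True if the page is offlined now."""
        page = phys_addr >> PAGE_SHIFT
        if page in self.offlined:
            return False  # already offlined, nothing more to do
        self.counts[page] += 1
        if self.counts[page] > self.threshold:
            self.offlined.add(page)
            return True
        return False


tracker = PageErrorTracker(threshold=2)
# Three errors land in page 1 (0x1000, 0x1008, 0x1ff8), one in page 2.
results = [tracker.report(addr) for addr in (0x1000, 0x1008, 0x2000, 0x1FF8)]
print(results)
```

      The sketch depends on a physical address being available for each
      error, which is the piece GHES supplies when Machine Check cannot.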
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
    • ACPI, APEI, UEFI Common Platform Error Record (CPER) header · 06d65dea
      Huang Ying authored
      CPER stands for Common Platform Error Record. It is the hardware error
      record format used by various APEI tables, such as ERST, BERT and
      HEST, to describe platform hardware errors.
      
      For more information about CPER, please refer to Appendix N of UEFI
      Specification version 2.3.
      
      This patch mainly includes the data structure definition header file
      used by other files.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
    • Unified UUID/GUID definition · fab1c232
      Huang Ying authored
      There are many different UUID/GUID definitions in the kernel, such as
      those in EFI, many file systems, some drivers, etc. Every kernel
      component that needs a UUID/GUID has its own definition. This patch
      provides a unified definition for UUID/GUID.
      
      UUID is defined via typedef. This makes UUID appear more like a
      primitive type, and makes the data type explicit (compared with an
      implicit "u8 uuid[16]").
      
      The binary representation of UUID/GUID can be little-endian (used by
      EFI, etc) or big-endian (defined by RFC4122), so both are defined.
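
      The difference between the two binary representations can be shown
      with Python's standard uuid module, used here purely for
      illustration (the kernel patch itself is C). The UUID value is an
      arbitrary example.

```python
import uuid

# Arbitrary example value, chosen so the byte order is easy to follow.
u = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")

# Big-endian layout (RFC 4122): every field is stored in network byte order.
uuid_be = u.bytes

# Little-endian layout (as used by EFI): the first three fields
# (32-bit, 16-bit, 16-bit) are byte-swapped; the last 8 bytes are unchanged.
uuid_le = u.bytes_le

print(uuid_be.hex())  # 00112233445566778899aabbccddeeff
print(uuid_le.hex())  # 33221100554477668899aabbccddeeff
```

      The same 16 bytes therefore identify different UUIDs depending on
      which layout is assumed, which is why an explicit type for each
      representation helps.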
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
    • ACPI Hardware Error Device (PNP0C33) support · 801eab81
      Huang Ying authored
      Hardware Error Device (PNP0C33) is used to report some hardware errors
      notified via SCI, mainly corrected errors. Some APEI Generic Hardware
      Error Sources (GHES) may use SCI on the hardware error device to
      notify the kernel of hardware errors.
      
      After a notification is received from the ACPI core, it is forwarded
      to all listeners via a notifier chain. A listener such as APEI GHES
      should check the corresponding error source for new events when
      notified.
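
      The forwarding pattern described above can be sketched in a few
      lines. This is a simplified model, not the kernel's notifier API:
      NotifierChain, ghes_listener and the event value 0x80 are
      hypothetical.

```python
class NotifierChain:
    """Minimal notifier chain: forwards each event to every listener."""

    def __init__(self):
        self._listeners = []

    def register(self, listener):
        self._listeners.append(listener)

    def unregister(self, listener):
        self._listeners.remove(listener)

    def call(self, event):
        # Iterate over a copy so a listener may unregister itself safely.
        for listener in list(self._listeners):
            listener(event)


seen = []


def ghes_listener(event):
    # A real listener would poll its error source for new records here.
    seen.append(("ghes", event))


chain = NotifierChain()
chain.register(ghes_listener)
chain.call(0x80)  # hypothetical notification value from the ACPI core
print(seen)
```

      The device driver does not need to know which error sources exist;
      each listener decides for itself whether the notification concerns it.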
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
    • ACPI, APEI, PCIE AER, use general HEST table parsing in AER firmware_first setup · affb72c3
      Huang Ying authored
      Currently, dedicated HEST table parsing code is used for PCIE AER
      firmware_first setup. It is rebased on the general HEST table parsing
      code of APEI. The firmware_first setup code is also moved from the PCI
      core to the AER driver, because it is only AER related.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Len Brown <len.brown@intel.com>
    • ACPI, APEI, Document for APEI · ea8c071c
      Huang Ying authored
      Add documentation for APEI, including kernel parameters and the EINJ
      debug file system interface.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
    • ACPI, APEI, EINJ support · e4021345
      Huang Ying authored
      EINJ provides a hardware error injection mechanism. This is useful for
      debugging and testing other APEI and RAS features.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
    • ACPI, APEI, HEST table parsing · 9dc96664
      Huang Ying authored
      HEST describes error sources in detail; communicating operational
      parameters (i.e. severity levels, masking bits, and threshold values)
      to OS as necessary. It also allows the platform to report error
      sources for which OS would typically not implement support (for
      example, chipset-specific error registers).
      
      HEST information may be needed by other subsystems. For example, HEST
      PCIE AER error source information describes whether a PCIE root port
      works in "firmware first" mode; this is needed by the general PCIE AER
      error subsystem. So a public HEST table parsing interface is
      provided.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
    • ACPI, APEI, APEI supporting infrastructure · a643ce20
      Huang Ying authored
      APEI stands for ACPI Platform Error Interface, which allows errors
      (for example from the chipset) to be reported to the operating system.
      This especially improves NMI handling. In addition, it supports error
      serialization and error injection.
      
      For more information about APEI, please refer to ACPI Specification
      version 4.0, chapter 17.
      
      This patch provides some common functions used by more than one APEI
      table, mainly the interpreter framework for EINJ and ERST.
      
      A machine-readable language is defined for EINJ and ERST for the OS to
      execute, and so drive the firmware to fulfill the corresponding
      functions. The machine languages for EINJ and ERST are compatible, so
      a common framework is defined for them.
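
      The idea of one interpreter shared by two instruction tables can be
      sketched as below. This is a heavily simplified model: the opcode
      names are loosely modeled on the register instructions in the ACPI
      specification, while run(), the register names, and the two sample
      programs are hypothetical and do not reflect the kernel's data
      structures.

```python
def run(program, regs, value=0):
    """Execute a list of (opcode, register, operand) entries.

    `value` plays the role of the interpreter's working value; it is
    returned after the last instruction.
    """
    for op, reg, operand in program:
        if op == "READ_REGISTER":
            value = regs[reg]
        elif op == "READ_REGISTER_VALUE":
            # Result is 1 if the register equals the operand, else 0.
            value = int(regs[reg] == operand)
        elif op == "WRITE_REGISTER":
            regs[reg] = value
        elif op == "WRITE_REGISTER_VALUE":
            regs[reg] = operand
        elif op == "NOOP":
            pass
        else:
            raise ValueError(f"unknown instruction: {op}")
    return value


# Two hypothetical tables executed by the same interpreter.
einj_like = [("WRITE_REGISTER_VALUE", "cmd", 1),
             ("READ_REGISTER", "status", None)]
erst_like = [("WRITE_REGISTER", "offset", None),
             ("READ_REGISTER_VALUE", "busy", 0)]

einj_regs = {"cmd": 0, "status": 7}
erst_regs = {"offset": 0, "busy": 0}
einj_result = run(einj_like, einj_regs)
erst_result = run(erst_like, erst_regs, value=0x40)
print(einj_result, erst_result)
```

      Because both tables use the same entry format, adding a table-specific
      instruction only means extending the dispatch, not writing a second
      interpreter.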
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
  2. 19 May, 2010 1 commit
    • ACPI, IO memory pre-mapping and atomic accessing · 15651291
      Huang Ying authored
      Some ACPI IO accesses need to be done in atomic context. For example,
      APEI ERST operations may be used for permanent storage in a hardware
      error handler. That is, they may be called in atomic contexts such as
      IRQ or NMI. ERST/EINJ implement their operations via IO memory/port
      access. But the IO memory access method provided by ACPI
      (acpi_read/acpi_write) maps the IO memory at the time it is accessed,
      so it cannot be used in atomic context. To solve the issue, the IO
      memory should be pre-mapped during EINJ/ERST initialization. A linked
      list is used to record which memory areas have been mapped; when
      memory is accessed in a hardware error handler, the linked list is
      searched for the mapped virtual address corresponding to the given
      physical address.
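
      The lookup described above can be modeled as follows. This is a
      sketch of the idea only: PreMap, Mapping and the addresses are
      hypothetical, and a Python list stands in for the kernel's linked
      list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mapping:
    phys: int  # start of the physical range
    size: int  # length of the range in bytes
    virt: int  # virtual address the range was mapped to


class PreMap:
    """Records ranges mapped at init time for lookup in atomic context."""

    def __init__(self):
        self._maps: List[Mapping] = []

    def pre_map(self, phys, size, virt):
        # Done during initialization, where mapping (and sleeping) is allowed.
        self._maps.append(Mapping(phys, size, virt))

    def lookup(self, phys, size=1) -> Optional[int]:
        # Done in the error handler: a plain search, no new mapping.
        for m in self._maps:
            if m.phys <= phys and phys + size <= m.phys + m.size:
                return m.virt + (phys - m.phys)
        return None  # the range was never pre-mapped


premap = PreMap()
premap.pre_map(phys=0x1000, size=0x100, virt=0xFFFF0000)
print(hex(premap.lookup(0x1010, 4)))
```

      A lookup that is not fully covered by a recorded range returns None
      instead of mapping on demand, since mapping is exactly what cannot
      be done in atomic context.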
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
  3. 16 May, 2010 6 commits
  4. 15 May, 2010 17 commits
  5. 14 May, 2010 5 commits
    • x86, mrst: Don't blindly access extended config space · e9b1d5d0
      H. Peter Anvin authored
      Do not blindly access extended configuration space unless we actively
      know we're on a Moorestown platform.  The fixed-size BAR capability
      lives in the extended configuration space, and thus is not applicable
      if the configuration space isn't appropriately sized.
      
      This fixes booting certain VMware configurations with CONFIG_MRST=y.
      
      Moorestown will add a fake PCI-X 266 capability to advertise the
      presence of extended configuration space.
      Reported-and-tested-by: Petr Vandrovec <petr@vandrovec.name>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Acked-by: Jacob Pan <jacob.jun.pan@intel.com>
      Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      LKML-Reference: <AANLkTiltKUa3TrKR1M51eGw8FLNoQJSLT0k0_K5X3-OJ@mail.gmail.com>
    • Merge branch 'x86-fixes-for-linus' of... · ef0e9180
      Linus Torvalds authored
      Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        x86, cacheinfo: Turn off L3 cache index disable feature in virtualized environments
        x86, k8: Fix build error when K8_NB is disabled
        x86, amd: Check X86_FEATURE_OSVW bit before accessing OSVW MSRs
        x86: Fix fake apicid to node mapping for numa emulation
    • x86, cacheinfo: Turn off L3 cache index disable feature in virtualized environments · 7f284d3c
      Frank Arnold authored
      When running a guest kernel on Xen we get:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
      IP: [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x2ca/0x3df
      PGD 0
      Oops: 0000 [#1] SMP
      last sysfs file:
      CPU 0
      Modules linked in:
      
      Pid: 0, comm: swapper Tainted: G        W  2.6.34-rc3 #1 /HVM domU
      RIP: 0010:[<ffffffff8142f2fb>]  [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x2ca/0x3df
      RSP: 0018:ffff880002203e08  EFLAGS: 00010046
      RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000060
      RDX: 0000000000000000 RSI: 0000000000000040 RDI: 0000000000000000
      RBP: ffff880002203ed8 R08: 00000000000017c0 R09: ffff880002203e38
      R10: ffff8800023d5d40 R11: ffffffff81a01e28 R12: ffff880187e6f5c0
      R13: ffff880002203e34 R14: ffff880002203e58 R15: ffff880002203e68
      FS:  0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000000038 CR3: 0000000001a3c000 CR4: 00000000000006f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a44020)
      Stack:
       ffffffff810d7ecb ffff880002203e20 ffffffff81059140 ffff880002203e30
      <0> ffffffff810d7ec9 0000000002203e40 000000000050d140 ffff880002203e70
      <0> 0000000002008140 0000000000000086 ffff880040020140 ffffffff81068b8b
      Call Trace:
       <IRQ>
       [<ffffffff810d7ecb>] ? sync_supers_timer_fn+0x0/0x1c
       [<ffffffff81059140>] ? mod_timer+0x23/0x25
       [<ffffffff810d7ec9>] ? arm_supers_timer+0x34/0x36
       [<ffffffff81068b8b>] ? hrtimer_get_next_event+0xa7/0xc3
       [<ffffffff81058e85>] ? get_next_timer_interrupt+0x19a/0x20d
       [<ffffffff8142fa23>] get_cpu_leaves+0x5c/0x232
       [<ffffffff8106a7b1>] ? sched_clock_local+0x1c/0x82
       [<ffffffff8106a9a0>] ? sched_clock_tick+0x75/0x7a
       [<ffffffff8107748c>] generic_smp_call_function_single_interrupt+0xae/0xd0
       [<ffffffff8101f6ef>] smp_call_function_single_interrupt+0x18/0x27
       [<ffffffff8100a773>] call_function_single_interrupt+0x13/0x20
       <EOI>
       [<ffffffff8143c468>] ? notifier_call_chain+0x14/0x63
       [<ffffffff810295c6>] ? native_safe_halt+0xc/0xd
       [<ffffffff810114eb>] ? default_idle+0x36/0x53
       [<ffffffff81008c22>] cpu_idle+0xaa/0xe4
       [<ffffffff81423a9a>] rest_init+0x7e/0x80
       [<ffffffff81b10dd2>] start_kernel+0x40e/0x419
       [<ffffffff81b102c8>] x86_64_start_reservations+0xb3/0xb7
       [<ffffffff81b103c4>] x86_64_start_kernel+0xf8/0x107
      Code: 14 d5 40 ff ae 81 8b 14 02 31 c0 3b 15 47 1c 8b 00 7d 0e 48 8b 05 36 1c 8b
       00 48 63 d2 48 8b 04 d0 c7 85 5c ff ff ff 00 00 00 00 <8b> 70 38 48 8d 8d 5c ff
       ff ff 48 8b 78 10 ba c4 01 00 00 e8 eb
      RIP  [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x2ca/0x3df
       RSP <ffff880002203e08>
      CR2: 0000000000000038
      ---[ end trace a7919e7f17c0a726 ]---
      
      The L3 cache index disable feature of AMD CPUs has to be disabled if
      the kernel is running as a guest on top of a hypervisor, because
      northbridge devices are not available to the guest. Currently, this
      fixes a boot crash on top of Xen. In the future this will become an
      issue on KVM as well.
      
      Check if northbridge devices are present and do not enable the feature
      if there are none.
      
      [ hpa: backported to 2.6.34 ]
      Signed-off-by: Frank Arnold <frank.arnold@amd.com>
      LKML-Reference: <1271945222-5283-3-git-send-email-bp@amd64.org>
      Acked-by: Borislav Petkov <borislav.petkov@amd.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: <stable@kernel.org>
    • x86, k8: Fix build error when K8_NB is disabled · ade029e2
      Borislav Petkov authored
      K8_NB depends on PCI, and when the latter is disabled (allnoconfig) we
      fail at the final linking stage because the exported
      num_k8_northbridges is missing. Add a header stub for that.
      Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
      LKML-Reference: <20100503183036.GJ26107@aftab>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: <stable@kernel.org>
    • Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify · 4fc4c3ce
      Linus Torvalds authored
      * 'for-linus' of git://git.infradead.org/users/eparis/notify:
        inotify: don't leak user struct on inotify release
        inotify: race use after free/double free in inotify inode marks
        inotify: clean up the inotify_add_watch out path
        Inotify: undefined reference to `anon_inode_getfd'
      
      Manual merge to remove duplicate "select ANON_INODES" from Kconfig file