1. 22 Jun, 2017 8 commits
    • efi: print unrecognized CPER section · 0fc300f4
      Tyler Baicar authored
      The UEFI spec allows for non-standard sections in a Common Platform
      Error Record. This is defined in section N.2.3 of UEFI version 2.5.
      
      Currently, if the CPER section's type (UUID) does not match one of
      the section types that the kernel knows how to parse, the section
      is skipped. The user is therefore unable to see such CPER data, for
      instance the error record of a non-standard section.
      
      This change prints out the raw data in hex in the dmesg buffer so
      that non-standard sections are reported to the user. Non-standard
      section type errors should be reported to the user because these
      can include errors which are vendor-specific. The data length is
      taken from the Error Data Length field of the Generic Error Data
      Entry.
      
      The following is a sample output from dmesg:
       Hardware error from APEI Generic Hardware Error Source: 2
       It has been corrected by h/w and requires no further action
       event severity: corrected
        time: precise 2017-03-15 20:37:35
        Error 0, type: corrected
         section type: unknown, d2e2621c-f936-468d-0d84-15a4ed015c8b
         section length: 0x238
         00000000: 4d415201 4d492031 453a4d45 435f4343  .RAM1 IMEM:ECC_C
         00000010: 53515f45 44525f42 00000000 00000000  E_QSB_RD........
         00000020: 00000000 00000000 00000000 00000000  ................
         00000030: 00000000 00000000 01010000 01010000  ................
         00000040: 00000000 00000000 00000005 00000000  ................
         00000050: 01010000 00000000 00000001 00dddd00  ................
      ...
      
      The raw data from the error can then be decoded using vendor
      specific tools.
      Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
      CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
      Reviewed-by: James Morse <james.morse@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
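The dump rows in the sample output above can be reproduced with a small user-space sketch. It assumes each row covers 16 bytes, shown as four little-endian 32-bit words followed by an ASCII column, which is what the sample suggests. `format_dump_row` is a hypothetical helper written for illustration; it is not the kernel's own printing code.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format one 16-byte row of raw section data as:
 *   <offset>: <word0> <word1> <word2> <word3>  <ascii>
 * Words are assembled little-endian; non-printable bytes become '.'.
 * Returns 0 on success, -1 if the output buffer is too small.
 * (Hypothetical helper, for illustration only.)
 */
int format_dump_row(const uint8_t *row, size_t offset, char *out, size_t outsz)
{
    uint32_t w[4];
    char ascii[17];

    for (size_t i = 0; i < 4; i++) {
        w[i] = (uint32_t)row[4 * i] |
               (uint32_t)row[4 * i + 1] << 8 |
               (uint32_t)row[4 * i + 2] << 16 |
               (uint32_t)row[4 * i + 3] << 24;
    }

    for (size_t i = 0; i < 16; i++)
        ascii[i] = (row[i] >= 0x20 && row[i] < 0x7f) ? (char)row[i] : '.';
    ascii[16] = '\0';

    int n = snprintf(out, outsz,
                     "%08zx: %08" PRIx32 " %08" PRIx32 " %08" PRIx32
                     " %08" PRIx32 "  %s",
                     offset, w[0], w[1], w[2], w[3], ascii);

    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Feeding it the 16 bytes `01 'R' 'A' 'M' '1' ' ' 'I' 'M' 'E' 'M' ':' 'E' 'C' 'C' '_' 'C'` at offset 0 yields the first row of the sample dump.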
    • acpi: apei: panic OS with fatal error status block · 2fb5853e
      Jonathan (Zhixiong) Zhang authored
      Even if an error status block's severity is fatal, the kernel does not
      honor the severity level and panic.
      
      With the firmware first model, the platform could inform the OS about a
      fatal hardware error through the non-NMI GHES notification type. The OS
      should panic when a hardware error record is received with this
      severity.
      
      If the severity is fatal, call panic() after the CPER data in the
      error status block is printed and before each error section is
      handled.
      Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
      Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
      Reviewed-by: James Morse <james.morse@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
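The ordering described in this commit (print the CPER data first, then panic on a fatal severity instead of going on to handle the sections) can be illustrated with a minimal sketch. The enums and `plan_steps` are hypothetical names invented for this example; the actual GHES code is structured differently.

```c
/* Hypothetical severity levels and processing steps, for illustration. */
enum severity { SEV_CORRECTED, SEV_RECOVERABLE, SEV_FATAL };
enum step { STEP_PRINT, STEP_PANIC, STEP_HANDLE_SECTIONS };

/*
 * Fill out[] with the two steps taken for an error status block:
 * the CPER data is always printed first; a fatal severity then leads
 * to a panic, anything else to normal per-section handling.
 */
void plan_steps(enum severity sev, enum step out[2])
{
    out[0] = STEP_PRINT;
    out[1] = (sev == SEV_FATAL) ? STEP_PANIC : STEP_HANDLE_SECTIONS;
}
```

Printing before the panic matters because the record would otherwise never reach the log.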
    • acpi: apei: handle SEA notification type for ARMv8 · 7edda088
      Tyler Baicar authored
      The ARM APEI extension proposal added the SEA (Synchronous External
      Abort) notification type for ARMv8.

      Add a new GHES error source handling function for SEA. If an error
      source's notification type is SEA, then this function can be registered
      into the SEA exception handler. That way GHES will parse and report
      SEA exceptions when they occur.

      An SEA can interrupt code that had interrupts masked and is treated as
      an NMI. To aid this, the page of address space for mapping APEI buffers
      while in_nmi() is always reserved, and ghes_ioremap_pfn_nmi() is
      changed to use the helper methods to find the prot_t to map with in
      the same way as ghes_ioremap_pfn_irq().
      Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
      CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
      Reviewed-by: James Morse <james.morse@arm.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: exception: handle Synchronous External Abort · 32015c23
      Tyler Baicar authored
      SEA exceptions are often caused by an uncorrected hardware
      error, and are handled when data abort and instruction abort
      exception classes have specific values for their Fault Status
      Code.

      When SEA occurs, before killing the process, report the error
      in the kernel logs.

      Update fault_info[] with specific SEA faults so that the
      new SEA handler is used.
      Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
      CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
      Reviewed-by: James Morse <james.morse@arm.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      [will: use NULL instead of 0 when assigning si_addr]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • efi: parse ARM processor error · 2f74f09b
      Tyler Baicar authored
      Add support for ARM Common Platform Error Record (CPER).
      The UEFI 2.6 specification adds support for ARM-specific
      processor error information to be reported as part of the
      CPER records. This provides more detail for processor error logs.
      Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
      CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
      Reviewed-by: James Morse <james.morse@arm.com>
      Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • cper: add timestamp print to CPER status printing · 8a94471f
      Tyler Baicar authored
      The ACPI 6.1 spec added a timestamp to the generic error data
      entry structure. Print the timestamp out when printing out the
      error information.
      Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
      CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • ras: acpi/apei: cper: add support for generic data v3 structure · bbcc2e7b
      Tyler Baicar authored
      The ACPI 6.1 spec adds a new revision of the generic error data
      entry structure. Add support to handle the new structure as well
      as properly verify and iterate through the generic data entries.
      Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
      CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
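Verifying and iterating through entries of mixed revisions could look roughly like the sketch below. It rests on assumptions that are not stated in the commit message: the entry header is taken to be 64 bytes before revision 3 and 72 bytes from revision 3 on (the extra 8 bytes being the timestamp), the major revision is taken to be the high byte of the 16-bit revision field at offset 20, and the error data length is taken to be at offset 24. `count_entries` and the constants are hypothetical names, not kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout constants; see the note above. */
enum {
    HDR_SIZE_V2  = 64,  /* header size before revision 3 */
    HDR_SIZE_V3  = 72,  /* header size with the added timestamp */
    OFF_REVISION = 20,  /* 16-bit revision field */
    OFF_DATA_LEN = 24   /* 32-bit error data length field */
};

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | p[1] << 8);
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * Walk a buffer of back-to-back generic data entries.
 * Returns the number of entries, or -1 if any entry's header or
 * payload would run past the end of the buffer.
 */
int count_entries(const uint8_t *buf, size_t len)
{
    size_t pos = 0;
    int n = 0;

    while (pos < len) {
        /* The revision and length fields both sit in the first 64 bytes. */
        if (len - pos < HDR_SIZE_V2)
            return -1;

        size_t hdr = (rd16(buf + pos + OFF_REVISION) >> 8) >= 3
                         ? HDR_SIZE_V3 : HDR_SIZE_V2;
        if (len - pos < hdr)
            return -1;

        uint32_t data_len = rd32(buf + pos + OFF_DATA_LEN);
        if (data_len > len - pos - hdr)
            return -1;

        pos += hdr + data_len;  /* step to the next entry */
        n++;
    }
    return n;
}
```

The point of the sketch is that the stride between entries depends on each entry's own revision, so the header size has to be chosen per entry before the payload length is trusted.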
    • acpi: apei: read ack upon ghes record consumption · 42aa5604
      Tyler Baicar authored
      A RAS (Reliability, Availability, Serviceability) controller
      may be a separate processor running in parallel with OS
      execution, and may generate error records for consumption by
      the OS. If the RAS controller produces multiple error records,
      then they may be overwritten before the OS has consumed them.
      
      The Generic Hardware Error Source (GHES) v2 structure
      introduces the capability for the OS to acknowledge the
      consumption of the error record generated by the RAS
      controller. A RAS controller supporting GHESv2 shall wait for
      the acknowledgment before writing a new error record, thus
      eliminating the race condition.
      
      Add support for parsing of GHESv2 sub-tables as well.
      Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
      CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
      Reviewed-by: James Morse <james.morse@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
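The acknowledgment handshake described in this commit can be modeled with a minimal single-threaded sketch. `struct error_mailbox`, `controller_post` and `os_consume` are hypothetical names for illustration. The real mechanism goes through a firmware-described register and runs across separate processors, which this model does not attempt to capture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared slot between the RAS controller and the OS. */
struct error_mailbox {
    uint32_t record;   /* stand-in for an error status block */
    bool     pending;  /* true until the OS acknowledges the record */
};

/*
 * Controller side: only write a new record once the previous one has
 * been acknowledged. Returns false if it must keep waiting.
 */
bool controller_post(struct error_mailbox *mb, uint32_t record)
{
    if (mb->pending)
        return false;
    mb->record = record;
    mb->pending = true;
    return true;
}

/*
 * OS side: consume the record, then acknowledge it so the controller
 * may write the next one. Returns false if nothing is pending.
 */
bool os_consume(struct error_mailbox *mb, uint32_t *out)
{
    if (!mb->pending)
        return false;
    *out = mb->record;
    mb->pending = false;  /* the acknowledgment */
    return true;
}
```

Without the `pending` check on the controller side, a second record could replace the first before the OS had read it, which is the overwrite this commit sets out to prevent.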
  2. 20 Jun, 2017 1 commit
  3. 09 Jun, 2017 2 commits
  4. 08 Jun, 2017 1 commit
  5. 07 Jun, 2017 1 commit
  6. 05 Jun, 2017 27 commits