1. 11 Mar, 2024 11 commits
  2. 05 Mar, 2024 3 commits
  3. 04 Mar, 2024 4 commits
  4. 01 Mar, 2024 1 commit
  5. 24 Feb, 2024 4 commits
    • vfio: Convey kvm that the vfio-pci device is wc safe · a39d3a96
      Ankit Agrawal authored
      The VM_ALLOW_ANY_UNCACHED flag is implemented for ARM64,
      allowing KVM stage 2 device mapping attributes to use Normal-NC
      rather than DEVICE_nGnRE, which allows guest mappings supporting
      write-combining attributes (WC). ARM does not architecturally
      guarantee this is safe, and indeed some MMIO regions like the GICv2
      VCPU interface can trigger uncontained faults if Normal-NC is used.
      
      To safely use VFIO in KVM the platform must guarantee full safety
      in the guest where no action taken against a MMIO mapping can
      trigger an uncontained failure. The expectation is that most VFIO PCI
      platforms support this for both mapping types, at least in common
      flows, based on some expectations of how PCI IP is integrated. So
      make vfio-pci set the VM_ALLOW_ANY_UNCACHED flag.
      Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Jason Gunthorpe <jgg@nvidia.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Alex Williamson <alex.williamson@redhat.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
      Link: https://lore.kernel.org/r/20240224150546.368-5-ankita@nvidia.com
      Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
    • KVM: arm64: Set io memory s2 pte as normalnc for vfio pci device · 8c47ce3e
      Ankit Agrawal authored
      To give a VM the ability to map device IO memory with the NormalNC
      property, map device MMIO in KVM for ARM64 at stage2 as NormalNC.
      Having a NormalNC S2 default puts guests in control (based on [1],
      "Combining stage 1 and stage 2 memory type attributes") of device
      MMIO region memory mappings. The rules are summarized below:
      ([(S1) - stage1], [(S2) - stage 2])
      
      S1           |  S2           | Result
      NORMAL-WB    |  NORMAL-NC    | NORMAL-NC
      NORMAL-WT    |  NORMAL-NC    | NORMAL-NC
      NORMAL-NC    |  NORMAL-NC    | NORMAL-NC
      DEVICE<attr> |  NORMAL-NC    | DEVICE<attr>
      
      Still, this cannot be generalized to non-PCI devices such as GICv2.
      There is insufficient information about, and uncertainty in, the
      behavior of non-PCI drivers. A driver must indicate support using the
      new flag VM_ALLOW_ANY_UNCACHED.
      
      Adapt KVM to use the VM_ALLOW_ANY_UNCACHED flag as the indicator to
      activate the S2 setting to NormalNC.
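
The flag hand-off described above can be pictured with a minimal, self-contained sketch. It is not the kernel code: `struct mock_vma`, `mock_driver_mmap()`, `mock_pick_s2_prot()`, the `enum s2_prot` values and the bit position chosen for `VM_ALLOW_ANY_UNCACHED` are all hypothetical stand-ins used only to show the producer/consumer relationship.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, chosen for this sketch only. */
#define VM_ALLOW_ANY_UNCACHED (UINT64_C(1) << 0)

/* Hypothetical stage 2 protection choices. */
enum s2_prot {
    S2_PROT_CACHEABLE,  /* ordinary RAM */
    S2_PROT_DEVICE,     /* DEVICE_nGnRE, the conservative choice */
    S2_PROT_NORMAL_NC,  /* Normal non-cacheable */
};

/* Stand-in for a VMA: only the flags matter here. */
struct mock_vma {
    uint64_t vm_flags;
};

/* Producer side: a driver that considers Normal-NC safe marks its VMA. */
void mock_driver_mmap(struct mock_vma *vma)
{
    assert(vma);
    vma->vm_flags |= VM_ALLOW_ANY_UNCACHED;
}

/* Consumer side: pick the stage 2 attribute for a faulting region. */
enum s2_prot mock_pick_s2_prot(const struct mock_vma *vma, bool is_device)
{
    assert(vma);
    if (!is_device)
        return S2_PROT_CACHEABLE;
    if (vma->vm_flags & VM_ALLOW_ANY_UNCACHED)
        return S2_PROT_NORMAL_NC;
    return S2_PROT_DEVICE;  /* no opt-in: keep the device mapping */
}
```

In this model a device region whose driver never sets the flag keeps the device mapping, so the relaxation stays opt-in per driver.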
      
      [1] section D8.5.5 of DDI0487J_a_a-profile_architecture_reference_manual.pdf
      Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Jason Gunthorpe <jgg@nvidia.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
      Link: https://lore.kernel.org/r/20240224150546.368-4-ankita@nvidia.com
      Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
    • mm: Introduce new flag to indicate wc safe · 5c656fcd
      Ankit Agrawal authored
      The VM_ALLOW_ANY_UNCACHED flag is implemented for ARM64, allowing KVM
      stage 2 device mapping attributes to use NormalNC rather than
      DEVICE_nGnRE, which allows guest mappings supporting write-combining
      attributes (WC). ARM does not architecturally guarantee this is safe,
      and indeed some MMIO regions like the GICv2 VCPU interface can trigger
      uncontained faults if NormalNC is used.
      
      Even worse, the expectation is that there are platforms where even
      DEVICE_nGnRE can allow uncontained faults in corner cases. Unfortunately
      existing ARM IP requires platform integration to take responsibility to
      prevent this.
      
      To safely use VFIO in KVM the platform must guarantee full safety in the
      guest where no action taken against a MMIO mapping can trigger an
      uncontained failure. The assumption is that most VFIO PCI platforms
      support this for both mapping types, at least in common flows, based
      on some expectations of how PCI IP is integrated. This can be enabled
      more broadly, for instance into vfio-platform drivers, but only after
      the platform vendor completes auditing for safety.
      
      The VMA flag VM_ALLOW_ANY_UNCACHED was found to be the simplest and
      cleanest way to communicate the information from VFIO to KVM that
      mapping the region in S2 as NormalNC is safe. KVM consumes it to
      activate the code that does the S2 mapping as NormalNC.
      Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
      Reviewed-by: Marc Zyngier <maz@kernel.org>
      Acked-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
      Link: https://lore.kernel.org/r/20240224150546.368-3-ankita@nvidia.com
      Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
    • KVM: arm64: Introduce new flag for non-cacheable IO memory · c034ec84
      Ankit Agrawal authored
      Currently, KVM for ARM64 maps at stage 2 memory that is considered device
      (i.e. it is not RAM) with DEVICE_nGnRE memory attributes; this setting
      overrides (as per the ARM architecture [1]) any device MMIO mapping
      present at stage 1, resulting in a set-up whereby a guest operating
      system cannot determine device MMIO mapping memory attributes on its
      own; they are always overridden by the KVM stage 2 default.
      
      This set-up does not allow guest operating systems to select device
      memory attributes independently from KVM stage-2 mappings
      (refer to [1], "Combining stage 1 and stage 2 memory type attributes").
      This turns out to be an issue because guest operating systems
      (e.g. Linux) may request to map device MMIO regions with memory
      attributes, such as the NormalNC memory type, that guarantee better
      performance (e.g. the gathering attribute, which for some devices can
      generate larger PCIe memory write TLPs) and allow specific operations
      (e.g. unaligned transactions).
      
      The default device stage 2 mapping was chosen in KVM for ARM64 since
      it was considered safer (i.e. it would not allow guests to trigger
      uncontained failures ultimately crashing the machine), but the resulting
      aborts turned out to be asynchronous (SError), defeating the purpose.
      
      Failure containability is a property of the platform and is independent
      of the memory type used for MMIO device memory mappings.
      
      Actually, the DEVICE_nGnRE memory type is even more problematic than
      the Normal-NC memory type in terms of fault containability in that e.g.
      aborts triggered on DEVICE_nGnRE loads cannot be made, architecturally,
      synchronous (i.e. that would imply that the processor should issue at
      most 1 load transaction at a time - it cannot pipeline them - otherwise
      the synchronous abort semantics would break the no-speculation attribute
      attached to DEVICE_XXX memory).
      
      This means that regardless of the combined stage1+stage2 mappings a
      platform is safe if and only if device transactions cannot trigger
      uncontained failures and that in turn relies on platform capabilities
      and the device type being assigned (i.e. PCIe AER/DPC error containment
      and RAS architecture[3]); therefore the default KVM device stage 2
      memory attributes play no role in making device assignment safer
      for a given platform (if the platform design adheres to design
      guidelines outlined in [3]) and therefore can be relaxed.
      
      For all these reasons, relax the KVM stage 2 device memory attributes
      from DEVICE_nGnRE to Normal-NC.
      
      NormalNC was chosen over a different Normal memory type default
      at stage-2 (e.g. Normal Write-through) to avoid cache allocation/snooping.
      
      Relaxing S2 KVM device MMIO mappings to Normal-NC is not expected to
      trigger any issue on guest device reclaim use cases either (i.e. device
      MMIO unmap followed by a device reset) at least for PCIe devices, in that
      in PCIe a device reset is architected and carried out through PCI config
      space transactions that are naturally ordered with respect to MMIO
      transactions according to the PCI ordering rules.
      
      Having a Normal-NC S2 default puts guests in control of device MMIO
      region memory mappings, according to the stage1+stage2 combined memory
      attribute rules described in [1] and summarized here
      ([(S1) - stage1], [(S2) - stage 2]):
      
      S1           |  S2           | Result
      NORMAL-WB    |  NORMAL-NC    | NORMAL-NC
      NORMAL-WT    |  NORMAL-NC    | NORMAL-NC
      NORMAL-NC    |  NORMAL-NC    | NORMAL-NC
      DEVICE<attr> |  NORMAL-NC    | DEVICE<attr>
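
The table can be read as "the more restrictive of the two stages wins". A minimal sketch of that reading follows, under stated assumptions: the `enum mem_type` ordering and `combine_s1_s2()` are hypothetical names for illustration, Device sub-attributes are collapsed into a single value, and features that change the combining rules (such as stage 2 forced write-back) are not modelled.

```c
#include <assert.h>

/* Memory types ordered from least to most restrictive (simplified). */
enum mem_type {
    MT_NORMAL_WB = 0,  /* Normal, write-back cacheable */
    MT_NORMAL_WT = 1,  /* Normal, write-through cacheable */
    MT_NORMAL_NC = 2,  /* Normal, non-cacheable */
    MT_DEVICE    = 3,  /* any DEVICE<attr>; sub-attributes not modelled */
};

/* Combined attribute: whichever stage is more restrictive. */
enum mem_type combine_s1_s2(enum mem_type s1, enum mem_type s2)
{
    assert(s1 <= MT_DEVICE && s2 <= MT_DEVICE);
    return s1 > s2 ? s1 : s2;
}
```

With `s2 = MT_NORMAL_NC` the function reproduces the four rows above. With `s2 = MT_DEVICE` every stage 1 choice collapses to Device, which is the previous behaviour where the guest's selection was always overridden.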
      
      It is worth noting that currently, to map device MMIO space to user
      space in a device pass-through use case, the VFIO framework applies memory
      attributes derived from pgprot_noncached() settings applied to VMAs, which
      result in device-nGnRnE memory attributes for the stage-1 VMM mappings.
      
      This means that a userspace mapping for device MMIO space carried
      out with the current VFIO framework and a guest OS mapping for the same
      MMIO space may result in a mismatched alias as described in [2].
      
      Defaulting KVM device stage-2 mappings to Normal-NC attributes does not
      change anything in this respect, in that the mismatched aliases would
      only affect (refer to [2] for a detailed explanation) ordering between
      the streams of transactions resulting from the userspace and GuestOS
      mappings (i.e. it does not cause loss of property for either stream of
      transactions on its own), which is harmless given that the userspace
      and GuestOS accesses to the device are carried out through independent
      transaction streams.
      
      A Normal-NC flag is not present today. So add a new kvm_pgtable_prot
      (KVM_PGTABLE_PROT_NORMAL_NC) flag for it, along with its
      corresponding PTE value 0x5 (0b101) determined from [1].
      
      Lastly, adapt the stage2 PTE property setter function
      (stage2_set_prot_attr) to handle the NormalNC attribute.
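
As an illustration of what the PTE value 0x5 means in practice, the sketch below packs a 4-bit memory attribute into a 64-bit descriptor. It is not the kernel's setter: the helper names are hypothetical, and the field position (bits [5:2]) and the DEVICE_nGnRE encoding (0x1) are assumptions taken from the architecture's stage 2 descriptor layout rather than from this commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed position of the stage 2 MemAttr field: bits [5:2]. */
#define S2_MEMATTR_SHIFT 2
#define S2_MEMATTR_MASK  (UINT64_C(0xf) << S2_MEMATTR_SHIFT)

/* Attribute encodings; 0x5 is the value named in the text above. */
#define S2_MEMATTR_DEVICE_nGnRE UINT64_C(0x1)  /* assumed encoding */
#define S2_MEMATTR_NORMAL_NC    UINT64_C(0x5)

/* Replace the MemAttr field of a descriptor, leaving other bits alone. */
uint64_t s2_pte_set_memattr(uint64_t pte, uint64_t memattr)
{
    assert(memattr <= 0xf);
    return (pte & ~S2_MEMATTR_MASK) | (memattr << S2_MEMATTR_SHIFT);
}

/* Read the MemAttr field back out of a descriptor. */
uint64_t s2_pte_get_memattr(uint64_t pte)
{
    return (pte & S2_MEMATTR_MASK) >> S2_MEMATTR_SHIFT;
}
```

Under these assumptions, switching a descriptor from DEVICE_nGnRE to Normal-NC changes only the four MemAttr bits; address and permission bits are untouched.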
      
      The entire discussion leading to this patch series may be followed through
      the following links.
      Link: https://lore.kernel.org/all/20230907181459.18145-3-ankita@nvidia.com
      Link: https://lore.kernel.org/r/20231205033015.10044-1-ankita@nvidia.com
      
      [1] section D8.5.5 - DDI0487J_a_a-profile_architecture_reference_manual.pdf
      [2] section B2.8 - DDI0487J_a_a-profile_architecture_reference_manual.pdf
      [3] sections 1.7.7.3/1.8.5.2/appendix C - DEN0029H_SBSA_7.1.pdf
      Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Will Deacon <will@kernel.org>
      Reviewed-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
      Link: https://lore.kernel.org/r/20240224150546.368-2-ankita@nvidia.com
      Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
  6. 22 Feb, 2024 11 commits
  7. 18 Feb, 2024 6 commits
    • Linux 6.8-rc5 · b401b621
      Linus Torvalds authored
    • Merge tag 'kbuild-fixes-v6.8-2' of... · 6c160f16
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Reformat nested if-conditionals in Makefiles with 4 spaces
      
       - Fix CONFIG_DEBUG_INFO_BTF builds for big endian
      
       - Fix modpost for module srcversion
      
       - Fix an escape sequence warning in gen_compile_commands.py
      
       - Fix kallsyms to ignore ARMv4 thunk symbols
      
      * tag 'kbuild-fixes-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kallsyms: ignore ARMv4 thunks along with others
        modpost: trim leading spaces when processing source files list
        gen_compile_commands: fix invalid escape sequence warning
        kbuild: Fix changing ELF file type for output of gen_btf for big endian
        docs: kconfig: Fix grammar and formatting
        kbuild: use 4-space indentation when followed by conditionals
    • Merge tag 'x86_urgent_for_v6.8_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ddac3d8b
      Linus Torvalds authored
      Pull x86 fix from Borislav Petkov:
      
       - Use a GB page for identity mapping only when memory of this size is
         requested so that mapping of reserved regions is prevented which
         would otherwise lead to system crashes on UV machines
      
      * tag 'x86_urgent_for_v6.8_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mm/ident_map: Use gbpages only where full GB page should be mapped.
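
The idea in the bullet above can be sketched as a simple range check. This is an illustration only, not the code from the referenced commit: `covers_full_gb_page()` and the `GB_PAGE_*` macros are hypothetical names, and the real change may express the condition differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GB_PAGE_SIZE (UINT64_C(1) << 30)
#define GB_PAGE_MASK (~(GB_PAGE_SIZE - 1))

/*
 * Return true only when the requested range [addr, end) covers the
 * whole 1 GiB region that contains addr. Otherwise a GB page would
 * also map memory that was never requested (possibly reserved).
 */
bool covers_full_gb_page(uint64_t addr, uint64_t end)
{
    uint64_t start = addr & GB_PAGE_MASK;

    assert(addr < end);
    return addr == start && end - start >= GB_PAGE_SIZE;
}
```

A caller in this model would fall back to smaller page sizes whenever the check fails, so partially requested regions are never mapped wholesale.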
    • Merge tag 'irq_urgent_for_v6.8_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7cb7c32d
      Linus Torvalds authored
      Pull irq fixes from Borislav Petkov:
      
       - Fix GICv4.1 affinity update
      
       - Restore a quirk for ACPI-based GICv4 systems
      
       - Handle non-coherent GICv4 redistributors properly
      
       - Prevent spurious interrupts on Broadcom devices using GIC v3
         architecture
      
       - Other minor fixes
      
      * tag 'irq_urgent_for_v6.8_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/gic-v3-its: Fix GICv4.1 VPE affinity update
        irqchip/gic-v3-its: Restore quirk probing for ACPI-based systems
        irqchip/gic-v3-its: Handle non-coherent GICv4 redistributors
        irqchip/qcom-mpm: Fix IS_ERR() vs NULL check in qcom_mpm_init()
        irqchip/loongson-eiointc: Use correct struct type in eiointc_domain_alloc()
        irqchip/irq-brcmstb-l2: Add write memory barrier before exit
    • Merge tag 'i2c-for-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 626721ed
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Two fixes for i801 and qcom-geni devices. Meanwhile, a fix from Arnd
        addresses a compilation error encountered during compile test on
        powerpc"
      
      * tag 'i2c-for-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: i801: Fix block process call transactions
        i2c: pasemi: split driver into two separate modules
        i2c: qcom-geni: Correct I2C TRE sequence
    • Merge tag 'powerpc-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · c02197fc
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "This is a bit of a big batch for rc4, but just due to holiday hangover
        and because I didn't send any fixes last week due to a late revert
        request. I think next week should be back to normal.
      
         - Fix ftrace bug on boot caused by exit text sections with
           '-fpatchable-function-entry'
      
         - Fix accuracy of stolen time on pseries since the switch to
           VIRT_CPU_ACCOUNTING_GEN
      
         - Fix a crash in the IOMMU code when doing DLPAR remove
      
         - Set pt_regs->link on scv entry to fix BPF stack unwinding
      
         - Add missing PPC_FEATURE_BOOKE on 64-bit e5500/e6500, which broke
           gdb
      
         - Fix boot on some 6xx platforms with STRICT_KERNEL_RWX enabled
      
         - Fix build failures with KASAN enabled and 32KB stack size
      
         - Some other minor fixes
      
        Thanks to Arnd Bergmann, Benjamin Gray, Christophe Leroy, David
        Engraf, Gaurav Batra, Jason Gunthorpe, Jiangfeng Xiao, Matthias
        Schiffer, Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nysal Jan K.A,
        R Nageswara Sastry, Shivaprasad G Bhat, Shrikanth Hegde, Spoorthy,
        Srikar Dronamraju, and Venkat Rao Bagalkote"
      
      * tag 'powerpc-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/iommu: Fix the missing iommu_group_put() during platform domain attach
        powerpc/pseries: fix accuracy of stolen time
        powerpc/ftrace: Ignore ftrace locations in exit text sections
        powerpc/cputable: Add missing PPC_FEATURE_BOOKE on PPC64 Book-E
        powerpc/kasan: Limit KASAN thread size increase to 32KB
        Revert "powerpc/pseries/iommu: Fix iommu initialisation during DLPAR add"
        powerpc: 85xx: mark local functions static
        powerpc: udbg_memcons: mark functions static
        powerpc/kasan: Fix addr error caused by page alignment
        powerpc/6xx: set High BAT Enable flag on G2_LE cores
        selftests/powerpc/papr_vpd: Check devfd before get_system_loc_code()
        powerpc/64: Set task pt_regs->link to the LR value on scv entry
        powerpc/pseries/iommu: Fix iommu initialisation during DLPAR add
        powerpc/pseries/papr-sysparm: use u8 arrays for payloads